AI Strategy & Leadership

AI Is Not a Strategy: Why Blind Adoption Is Burning Your Budget

We did this before with cloud. Organisations lifted and shifted their on-prem habits into AWS and wondered why the bill tripled. Now leadership is mandating AI the same way — buy the tool, issue the decree, and call it a transformation. The results are going to be identical.

Tags: AI Strategy, Leadership, FinOps, Cost Management, Org Design

What's in this article

  1. We have been here before
  2. Mistake 1 — AI as a vending machine
  3. Mistake 2 — The headcount replacement spreadsheet
  4. Mistake 3 — AI usage as a KPI
  5. Real companies, real costs, real surprises
  6. What thoughtful AI adoption actually looks like
01

We have been here before

Cast your mind back to 2012. Cloud was the new religion. Every CIO had "cloud-first" in their strategy deck. The pitch was irresistible: elasticity, agility, pay only for what you use. The vendor demos were flawless. Boards signed off. Budgets were approved.

Then the bills started arriving. For a lot of organisations, moving to cloud did not save money. It increased costs — significantly. Not because cloud was a bad idea, but because teams took their on-premises habits with them. They provisioned servers the way they bought hardware: big, always-on, and never decommissioned. They ran dev environments 24/7 because forgetting to turn off a server in a rack costs nothing, but forgetting to turn off an EC2 instance costs you every hour. They built architectures that looked exactly like what they had in the data centre — just hosted somewhere else — and paid a premium for the privilege.

The tool changed. The thinking did not. That was the root cause of every cloud cost disaster of the last decade. And we are about to repeat the same mistake, at speed, with AI.

Today, the word "cloud" has been replaced with "AI" in every leadership presentation. The structure of the conversation is identical: an exciting new capability, enormous projected savings, a board mandate to adopt it, and a complete absence of a plan for how. Teams are being told to "use AI" the same way they were once told to "move to cloud" — as if the instruction itself is the strategy.

It is not. And the bills are starting to prove it.

02

Mistake 1 — AI as a vending machine

The vending machine mental model goes like this: you have a problem, you put a prompt in, an answer comes out, and the problem is solved. Leadership approves a suite of AI tools, distributes access to the team, and waits for the productivity graphs to go up.

This is not how AI works. It is not how any tool works at scale.

A vending machine gives the same output every time you press the same button. AI tools are probabilistic, context-sensitive, and expensive at volume. Every interaction with a large language model consumes compute. The cost of that compute is not flat — it scales with the complexity of the prompt, the length of the context window, and the number of times your team presses the button. If you give 5,000 engineers unrestricted access to an AI coding tool with no governance, no usage tracking, and no clear definition of what "good use" looks like, you do not get 5,000 productive engineers. You get 5,000 engineers running experiments at your expense, some useful, many not, and a finance team scrambling to explain why the AI line item looks nothing like the vendor proposal.

The question leadership never asks: "What problem, specifically, are we solving with this tool — and how will we know if we've solved it?" Without an answer to that, you are not deploying AI. You are buying access and hoping something good happens.

Vending machine thinking also shows up in how organisations choose AI tools. The demo always works. Vendors are exceptionally good at showing you the best-case scenario on a well-prepared dataset, with a skilled operator, on a problem the tool was designed to handle. The real question is how the tool performs on your messy data, with your team's actual skill level, on your specific workflows. That question requires a properly designed pilot. Most organisations skip it, buy the enterprise licence, and discover the answer six months later when the invoice arrives and the productivity gains are nowhere to be found.

[Chart: Vending machine AI vs. intentional AI — how costs behave differently. Two cost curves over 12 months (Month 1 through Month 12), from $0 to very high: "Vending machine AI (no governance)" vs "Intentional AI (clear use cases)".]
03

Mistake 2 — The headcount replacement spreadsheet

This one is popular with boards and CFOs. The logic looks clean on paper: AI can do the work of X people. We have X people doing that work. We therefore no longer need those people, and we save their salaries. Net: a positive number in the savings column. Simple.

The real world is not a spreadsheet.

Human roles are not a collection of discrete, automatable tasks that you can simply hand to a model. They are bundles of judgment, context, relationship management, escalation handling, and institutional knowledge. When you replace a human role with AI, you often discover — too late — that the AI handles the easy 70% of that role adequately, and handles the hard 30% badly. That hard 30% is frequently the part that matters most: the complex customer issue, the ethical judgement call, the situation that does not fit any of the training examples.

And here is the cost dimension that never makes it into the savings spreadsheet: when the AI fails on that hard 30%, the failure is more expensive than the original human cost. You now have unhappy customers, brand damage, the cost of re-hiring the people you just let go (who have since found other jobs), and an AI system that still needs to be maintained, fine-tuned, and monitored. The people who did that maintenance? They were part of the headcount you cut.

Firing people to save money with AI is not a strategy. It is an accounting entry with hidden liabilities. The liabilities arrive on a delay — usually 12 to 18 months after the announcement, when the quarterly numbers for customer satisfaction and retention start showing up.

None of this means AI cannot genuinely reduce the need for certain kinds of work. It can. But the path to that outcome runs through understanding your work deeply — mapping which tasks are genuinely repetitive and well-defined enough for AI to handle reliably, which tasks require human judgment and should be augmented rather than replaced, and what happens to your service quality when the AI encounters something outside its training distribution. That analysis takes time and honesty. The headcount replacement spreadsheet takes an afternoon and looks much better in a board deck.


04

Mistake 3 — AI usage as a KPI

This is perhaps the most insidious pattern, because it sounds like good management. Leadership sets a goal: by end of Q3, 80% of the team will be actively using AI tools. Usage is tracked. Dashboards are built. Managers are held accountable. Adoption goes up.

Costs go up with it. And outcomes? Nobody is quite sure.

Measuring AI adoption tells you nothing about whether AI is creating value. It tells you that people are using the tool. What you do not know is: are they using it on problems where it genuinely helps, or are they using it to satisfy the dashboard? Are they reviewing AI outputs carefully, or are they shipping whatever the model generates because that is faster and the metric only tracks usage, not quality? Are the AI-generated outputs actually better than what the team produced before, or just faster to produce — with quality problems that will show up later in production, in customer complaints, or in technical debt?

Usage is a leading indicator at best and a vanity metric at worst. The only number that matters is whether outcomes improved — and that requires you to have defined what "improved outcomes" means before you started, which most organisations did not do.

Uber made this mistake in a very specific way. They built internal leaderboards ranking engineers by AI tool usage. This is gamification of adoption — and it worked exactly as you would expect gamification to work: engineers used the tools more, because they were incentivised to use the tools more. Whether that usage created better software or just more AI-generated code is a harder question. The annual AI budget was gone in four months. The CTO described the situation as being "back to the drawing board."

When you measure usage, you get usage. If you want outcomes, measure outcomes. Define them first. That conversation is harder, takes longer, and forces you to be honest about what you are actually trying to achieve. It is also the only conversation worth having.

05

Real companies, real costs, real surprises

These are not hypotheticals. This is happening now, publicly, to well-resourced organisations with capable leadership teams. The pattern is consistent enough that it should be taken seriously as a structural warning, not a collection of isolated mistakes.

Klarna: the 700-person headline that became a quiet reversal

In early 2024, Klarna published what became the most cited AI announcement of the year. Their AI assistant, built on OpenAI's models, was handling the equivalent work of 700 customer service agents. The CEO was vocal about the result. Press coverage was extensive. It was framed as proof that AI replacement of knowledge work had arrived.

By early 2025, Klarna's customer satisfaction scores on complex interactions had deteriorated. Repeat contact rates — customers who had to reach out multiple times for the same issue — were up. The company began quietly rebuilding its human customer service capacity. The CEO later acknowledged publicly that "cost unfortunately seems to have been a too predominant evaluation factor," and that the result was "lower quality." Klarna began re-hiring, adopting what they described as an "Uber-style" on-demand model for human agents — adding cost back on top of the AI investment that had not been removed.

The AI did handle the easy 70% well. Fast responses, high volume, consistent answers for common queries. The 30% it could not handle — complex disputes, nuanced situations, emotionally sensitive interactions — turned out to matter considerably more to customer loyalty than the volume numbers suggested. The savings projected from removing 700 roles did not fully materialise. The cost of re-hiring, retraining, and managing a hybrid model was not in the original business case.

Klarna did not fail because AI is bad at customer service. They failed because the decision was driven by a cost model that treated human roles as undifferentiated units, rather than an analysis of which specific interactions AI could handle without degrading customer experience. That is a leadership and process failure, not a technology failure.

Uber: the annual budget, gone in four months

In December 2025, Uber rolled out Anthropic's Claude Code to its engineering organisation. The intent was genuine and reasonable: give engineers access to a powerful agentic coding tool, improve productivity, accelerate development. They measured success by adoption. They built internal leaderboards. They encouraged engineers to compete on usage.

By February 2026, usage had roughly doubled from the rollout baseline. By April, Uber's CTO, Praveen Neppalli Naga, was on record saying the company had already exhausted its entire annual AI budget. "I'm back to the drawing board," he said, "because the budget I thought I would need is blown away already."

The mechanism is worth understanding, because it will happen to your organisation if you deploy agentic AI tools without accounting for it. Claude Code is not priced like a SaaS seat. It is priced on token consumption. An engineer using it to ask occasional questions consumes a modest amount of tokens. An engineer using it to orchestrate agentic workflows — reading entire codebases, planning changes across dozens of files, running tests, opening pull requests autonomously — consumes tokens at an order of magnitude higher rate. Multiply that by 5,000 engineers on leaderboards incentivised to maximise usage, and you have a cost model that bears no resemblance to the per-seat figure in the vendor proposal.

How AI tool costs can diverge from initial estimates

| Scenario | Assumed cost/engineer/month | Actual cost/engineer/month | Multiplier |
| --- | --- | --- | --- |
| Light usage — occasional queries | $20 | $40–80 | 2–4× |
| Medium usage — daily assistance | $20 | $150–250 | 7–12× |
| Heavy usage — agentic workflows, leaderboard incentives | $20 | $500–2,000 | 25–100× |

Based on published reports on Uber's Claude Code deployment (April 2026). Actual figures vary by tool and usage pattern.
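The divergence in that table is reproducible with back-of-envelope arithmetic: monthly spend is interactions per day × tokens per interaction × price per token, and the token term is what explodes under agentic workflows. The sketch below is illustrative only — every rate, token count, and price in it is an assumption for demonstration, not real vendor pricing.

```python
# Illustrative sketch: why consumption-priced AI tools diverge from the
# per-seat estimate. All numbers are assumptions, not real vendor pricing.

def monthly_cost_per_engineer(interactions_per_day, tokens_per_interaction,
                              price_per_million_tokens, workdays=22):
    """Estimated monthly spend for one engineer on a token-priced tool."""
    tokens = interactions_per_day * tokens_per_interaction * workdays
    return tokens / 1_000_000 * price_per_million_tokens

PRICE = 15.0  # assumed blended $/1M tokens (input + output)

profiles = {
    # profile: (interactions/day, avg tokens per interaction)
    "light (occasional queries)": (10, 18_000),
    "medium (daily assistance)":  (30, 20_000),
    "heavy (agentic workflows)":  (60, 50_000),  # reads codebases, runs tests
}

seat_estimate = 20.0  # the per-seat figure in the vendor proposal
for name, (per_day, tokens) in profiles.items():
    cost = monthly_cost_per_engineer(per_day, tokens, PRICE)
    print(f"{name}: ${cost:,.0f}/month ({cost / seat_estimate:.0f}x the seat estimate)")
```

The seat estimate never changes; the token term moves by orders of magnitude depending on how the tool is used — which is exactly why a usage forecast, not a seat count, is the number that matters.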

Uber is not a cash-strapped startup. Their R&D budget is $3.4 billion. If a company of that size and sophistication could not predict the cost trajectory of an AI tool rollout, the problem is not specific to Uber. The problem is that usage-based, consumption-driven AI pricing is genuinely difficult to forecast — especially when you actively incentivise consumption without governance guardrails.

Microsoft Copilot: the licence you're paying for but not using

Microsoft 365 Copilot is the most widely deployed enterprise AI tool among large organisations. It is also, for many of those organisations, one of the least understood costs on their technology budget. The headline price — now moving to A$60 per user per month — sounds straightforward. In practice, for many organisations, you cannot purchase Copilot without first upgrading to a higher Microsoft 365 tier, which adds cost before a single AI feature is enabled. The actual all-in cost per user is meaningfully higher than the advertised per-seat number.

For a 5,000-person organisation deploying Copilot company-wide, the licences alone come to A$3.6 million a year (5,000 × A$60 × 12) — before the tier upgrades that push the all-in figure higher. The question that more than 40% of technology leaders surveyed by Deloitte could not confidently answer: is it generating measurable value?

Adoption numbers tell part of the story. Microsoft's own data suggests that only around 3% of Microsoft 365 users have chosen to pay for Copilot. Organisations that have deployed it are reporting that ROI depends heavily on which users have access. For knowledge workers who spend their entire day in Word, Excel, and Outlook, the productivity gains are real and documentable. For operational staff, frontline workers, or anyone whose primary tools are not Microsoft 365 applications, the gain approaches zero — while the cost per seat is identical.

The flat-licence model means you pay the same for the engineer who generates genuine productivity gains as you do for the operations manager who opens Copilot twice a month. Most organisations deployed Copilot broadly because the mandate came from the top. The analysis of which roles would actually benefit came later — if it came at all.

A licence unused is still a licence paid for. The question to ask before any AI tool purchase is not "could this be useful?" but "for which specific roles, in which specific workflows, has a pilot demonstrated measurable improvement — and how many of those roles do we actually have?"
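The flat-licence arithmetic is worth making explicit: inactive seats do not reduce the bill, they inflate the real cost of every active one. The sketch below uses the 5,000-seat example from this section; the active-user splits are illustrative assumptions, not survey data.

```python
# Sketch of the "licence paid for but not used" arithmetic.
# Seat count and price follow the example above; the active-user
# figures are illustrative assumptions.

SEATS = 5_000
PRICE_PER_SEAT_PER_YEAR = 60 * 12  # A$60/user/month

def effective_cost_per_active_user(seats, active_users, price_per_year):
    """Flat licensing means inactive seats inflate the real unit cost."""
    return seats * price_per_year / active_users

for active in (5_000, 2_000, 500):
    unit = effective_cost_per_active_user(SEATS, active, PRICE_PER_SEAT_PER_YEAR)
    print(f"{active:>5} active users -> A${unit:,.0f} per active user per year")
# At 500 genuinely active users, the A$720 seat effectively costs A$7,200.
```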

The broader picture: organisations spending more, measuring less

Across the enterprise landscape, a pattern is emerging that should concern every technology leader. Average monthly AI spend among mid-to-large organisations rose from roughly $63,000 per month in 2024 to an expected $85,000 in 2025 — a 36% increase in a single year. The proportion of organisations planning to spend over $100,000 per month on AI tools more than doubled in the same period. Meanwhile, only 51% of those organisations can confidently evaluate the ROI of that spending.

That gap — between the pace of spending and the ability to measure return — is where the damage accumulates. Unchecked, it compounds. Licences renew. Usage grows. New tools are added on top of old ones that nobody has evaluated. The AI budget becomes a fixed political cost rather than a managed investment.

06

What thoughtful AI adoption actually looks like

None of the above is an argument against AI. It is an argument against deploying AI without thinking. The organisations that are seeing genuine returns from AI investment have one thing in common: they treated AI adoption as an operational challenge, not a purchasing decision.

Here is what that looks like in practice.

Start with the problem, not the tool

Before any AI procurement, the question is: what specific process are we trying to improve, and what does "improved" look like in measurable terms? Not "we want to be more productive" — that is not measurable. Something like: "Our support team spends an average of 12 minutes per ticket on routine password and account access queries. We want to reduce that to under 3 minutes, without a measurable drop in first-contact resolution." That is a testable hypothesis. You can run a pilot. You can measure the outcome. You can decide whether the tool justifies its cost based on evidence.
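A testable hypothesis like that one can be encoded directly as a pass/fail check before the pilot starts. The sketch below uses the support-ticket example above; the field names and thresholds are illustrative assumptions.

```python
# Minimal sketch of an up-front pilot success criterion, using the
# support-ticket example above. Field names and thresholds are
# illustrative assumptions.

def pilot_passed(baseline, pilot,
                 target_handle_minutes=3.0,
                 max_fcr_drop=0.0):
    """Pass only if handle time hit the target AND first-contact
    resolution (FCR) did not measurably drop."""
    handle_ok = pilot["avg_handle_minutes"] <= target_handle_minutes
    fcr_ok = pilot["fcr_rate"] >= baseline["fcr_rate"] - max_fcr_drop
    return handle_ok and fcr_ok

baseline = {"avg_handle_minutes": 12.0, "fcr_rate": 0.82}
pilot    = {"avg_handle_minutes": 2.4,  "fcr_rate": 0.79}

# Faster, but FCR dropped 3 points: the tool does not yet justify its cost.
print(pilot_passed(baseline, pilot))  # False
```

The point of writing the criterion down first is that the answer is binary and agreed in advance — there is no room to reinterpret "more productive" after the invoice arrives.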

Design pilots that actually test the hard cases

Vendor demos test the easy cases. Your pilots should test the hard ones. Define in advance what the failure modes look like — the queries the AI will struggle with, the edge cases in your data, the situations that require human judgment. Measure how the AI performs on those cases, not just on the comfortable majority. The economics of AI adoption depend almost entirely on how you handle the cases where it fails, because those cases determine whether you need humans alongside the AI (cost addition) or can genuinely remove them (cost saving).
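One way to make hard cases visible is to score the pilot stratified by case difficulty rather than as one blended average. The sketch below is illustrative — the labels and accuracy figures are assumptions, not measured results.

```python
# Sketch of a stratified pilot evaluation: score the AI separately on
# easy and hard cases instead of one blended average. The case labels
# and accuracy numbers are illustrative assumptions.

def stratified_accuracy(results):
    """results: list of (difficulty, correct) pairs -> accuracy per stratum."""
    buckets = {}
    for difficulty, correct in results:
        hits, total = buckets.get(difficulty, (0, 0))
        buckets[difficulty] = (hits + int(correct), total + 1)
    return {d: hits / total for d, (hits, total) in buckets.items()}

results = (
    [("easy", True)] * 68 + [("easy", False)] * 2 +   # ~97% on routine cases
    [("hard", True)] * 12 + [("hard", False)] * 18    # 40% on edge cases
)
print(stratified_accuracy(results))
# A blended "80% accurate" headline would hide the 40% hard-case rate
# that decides whether humans must stay in the loop.
```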

Price for actual usage, not advertised seats

For any consumption-based AI tool — and most of the powerful ones are consumption-based — the budget conversation needs to happen at two levels: the per-unit cost and the usage forecast. The usage forecast is the harder number. It requires you to estimate how often each user will invoke the tool, how complex their typical interactions will be, and how that usage will change over time as people become more comfortable with the tool. Budget 2 to 3 times your initial estimate and set up usage alerts at 50%, 75%, and 90% of monthly budget. Discover surprises early, not at invoice time.
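The tiered alerts described above are trivial to implement, which is precisely why not having them is inexcusable. A minimal sketch, with plain `print` standing in for whatever billing webhook or FinOps dashboard would receive the alert in practice:

```python
# Sketch of tiered budget alerts at 50%, 75%, and 90% of monthly budget.
# The notification mechanism (returning/printing thresholds) is a stand-in
# for a real billing webhook or dashboard integration.

THRESHOLDS = (0.50, 0.75, 0.90)

def check_spend(month_to_date, monthly_budget, already_fired=frozenset()):
    """Return the alert thresholds newly crossed since the last check."""
    ratio = month_to_date / monthly_budget
    fired = {t for t in THRESHOLDS if ratio >= t} - set(already_fired)
    return sorted(fired)

# Budgeted $30k for the month; spend is already $24.6k on day 11.
print(check_spend(24_600, 30_000))  # -> [0.5, 0.75]
```

Run on every billing-data refresh, this surfaces a runaway month in days. The alternative is the Uber pattern: discovering at invoice time that the annual budget is gone.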

Measure outcomes, not adoption

If your AI strategy is being reported to the board as a usage percentage, you do not have an AI strategy. You have an adoption metric. Define the business outcomes you are trying to move — customer satisfaction, time to resolution, defect rates, cycle time — and measure those. If AI is working, those numbers will move. If they are not moving, adoption numbers are a distraction from the honest conversation you need to have.

Augment first, replace later

The Klarna story is instructive precisely because Klarna is not a company that made a naive mistake. They made a sophisticated, data-driven decision that missed a critical variable: the qualitative difference between interactions that AI handles well and interactions where human judgment is the product. The organisations that have navigated this well started by putting AI alongside humans — using it to make human agents faster, better-informed, and more consistent — before making any decisions about headcount. That sequence gives you real performance data before you make irreversible decisions. Doing it the other way around gives you a press release and a problem you discover a year later.

Treat developer access as a cost control point, not just a technical one

This dynamic is not entirely new. In cloud computing, developers have long had the ability to silently drive up infrastructure costs — spinning up compute instances, calling APIs that provision storage, triggering workflows that consume resources — all without explicit budget approval and often without visibility until the monthly bill arrived. Organisations that managed cloud well learned to treat developer access as a cost control point, not just a technical one.

AI compounds this. With cloud, a careless action creates a resource that continues to accrue cost until it is stopped. With AI, every single interaction has a cost — every query, every prompt, every agentic task the tool runs on your behalf. There is no idle state. There is no resource you can switch off. Usage is the cost. This means that understanding how your people are using AI tools — what they are asking, how often, and with what complexity — is not an IT metric. It is a financial control. Defining appropriate usage is not a limitation on productivity; it is the exercise that determines whether your AI investment returns value or quietly drains it.
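Treating usage as a financial control means attributing a cost to every interaction as it happens, per team, rather than reconciling a lump sum at invoice time. A minimal sketch of that idea — the token counts, pricing, and team names below are illustrative assumptions:

```python
# Sketch of real-time cost attribution: price every AI interaction as it
# happens, tagged by team. Token counts and pricing are illustrative
# assumptions, not real vendor rates.

from collections import defaultdict

PRICE_PER_MILLION_TOKENS = 15.0  # assumed blended $/1M tokens

class UsageLedger:
    """Accumulate per-team AI spend interaction by interaction."""
    def __init__(self):
        self.spend = defaultdict(float)

    def record(self, team, input_tokens, output_tokens):
        cost = (input_tokens + output_tokens) / 1_000_000 * PRICE_PER_MILLION_TOKENS
        self.spend[team] += cost
        return cost

ledger = UsageLedger()
ledger.record("payments", input_tokens=8_000, output_tokens=2_000)     # small query
ledger.record("payments", input_tokens=400_000, output_tokens=60_000)  # agentic task
print({team: round(cost, 2) for team, cost in ledger.spend.items()})
```

Note the two interactions: the agentic task costs roughly 46 times the small query. That spread is invisible in a seat count and obvious in a ledger — which is the whole argument for seeing spend as it happens.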


A checklist for leadership before any AI investment

🎯
Define the problem

Write down the specific process you are improving and what "improved" means in a number you can measure. If you cannot write it down, you are not ready to buy.

🧪
Run a real pilot

Test on your actual data, with your actual team, on your actual edge cases. Give it 60 to 90 days. Measure before and after. Be honest about the results.

💰
Price for actual usage

Get the vendor's consumption data from a comparable customer, not just the per-seat number. Budget 2–3× the headline figure and set usage alerts before you go live.

📊
Measure outcomes, not usage

Replace adoption dashboards with outcome dashboards. Define the three business metrics that AI is supposed to move. Track them monthly from day one.

🤝
Augment before you replace

Put AI alongside your people first. Measure the combined performance. Only make structural changes when you have 6+ months of real data on how the hybrid model performs under pressure.

🔍
Govern access and spend

Know who has access to which tools, how much they are using them, and what it is costing you — in real time. If you cannot see the spend as it happens, you cannot manage it.

The cloud lesson, restated for AI: The technology is not the transformation. The transformation is the change in how your organisation thinks, decides, and operates. No tool — cloud or AI — does that for you. If your practices are not thoughtful, no tool can help you. It can only make the bill bigger.

I hope you found this useful, please share it!


About the Author

Mayank Pandey

AWS Community Hero and Cloud Architect with 15+ years of experience. AWS Solutions Architect Professional, FinOps Practitioner, and AWS Authorized Instructor. Creator of the KnowledgeIndia YouTube channel (80,000+ subscribers). Based in Melbourne, Australia.