AI Costs Out of Control: How to Budget Tokens
- The moment the AI bill arrived on the table
- The numbers that count: tokens, costs, and actual consumption
- Strategic Reading: Why SMEs Are More Exposed Than Large Companies
- Problem Architecture: Where Costs are Hidden
- Operational implications for Italian SMEs
- What nobody tells you: the hidden cost of iterative prompting
The AI market has experienced years of accelerated experimentation. However, by 2026, the bill has come due: token consumption costs are becoming a significant budget item for many companies. Therefore, the conversation has shifted from optimism to
The moment the AI bill arrived on the table
For years, the tech industry ran at full speed. The dominant logic was tokenmaxxing: maximize the use of language models, experiment, scale. However, as a recent TechCrunch analysis reports, the industry now faces out-of-control AI infrastructure costs. One quote sums up the change: «The whole conversation shifted from tokenmaxxing and ‘go fast’ to ‘we need guardrails, how do we control this?’»
Therefore, 2026 marks a concrete discontinuity. It's no longer about exploring the possibilities of artificial intelligence. It's about understanding the real cost of integrating it into business processes. This applies to large corporations, but it applies even more so to Italian SMEs.
The numbers that count: tokens, costs, and actual consumption
The concept of a token is central to this analysis. A token corresponds approximately to three or four characters of text. Each call to a language model, whether it's GPT-4o, Claude 3.5, or Gemini, consumes tokens for both input and output. Consequently, the cost grows with the length and frequency of requests.
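To make the arithmetic concrete, here is a minimal sketch of how per-call and monthly costs can be estimated. The model tiers and the prices in `PRICE_PER_MILLION` are hypothetical placeholders, not any provider's real price list, and `estimate_tokens` is only the rough characters-per-token heuristic mentioned above, not a real tokenizer.

```python
# Hypothetical prices in EUR per 1 million tokens (assumption for illustration).
# Real prices vary by provider and model and change over time.
PRICE_PER_MILLION = {
    "flagship": {"input": 2.50, "output": 10.00},
    "small": {"input": 0.15, "output": 0.60},
}


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (heuristic, not a tokenizer)."""
    return max(1, round(len(text) / chars_per_token))


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one call in EUR; input and output tokens are priced separately."""
    price = PRICE_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def monthly_cost(model: str, input_tokens: int, output_tokens: int,
                 calls_per_day: int, days: int = 30) -> float:
    """Monthly cost in EUR for a workflow with a steady daily call volume."""
    return call_cost(model, input_tokens, output_tokens) * calls_per_day * days


if __name__ == "__main__":
    # Same workload (2,000 input + 500 output tokens, 1,000 calls a day)
    # priced on the two hypothetical tiers.
    for model in PRICE_PER_MILLION:
        print(model, round(monthly_cost(model, 2_000, 500, 1_000), 2))
```

Under these assumed prices, the same workload costs very different amounts depending on the tier, which is why both volume and model choice belong in the estimate.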
According to aggregated industry data, a medium-sized company integrating AI into three or four workflows can spend between €2,000 and €15,000 per month on API calls alone. Furthermore, these costs tend to grow over time as use cases multiply. Without a monitoring system, spending becomes opaque and difficult to justify during budget reviews.
Research by McKinsey confirms that AI cost management has become one of the top concerns for CTOs in 2026. Specifically, the difficulty is not just technical; it's organizational. Who approves expenses? Who monitors consumption? Who decides when a model is too expensive compared to the value generated?
Strategic Reading: Why SMEs Are More Exposed Than Large Companies
Large companies have dedicated MLOps teams. They have engineers who optimize prompts, reduce context length, and choose the most efficient model for each task. In contrast, Italian SMEs often adopt AI solutions in a fragmented way, relying on SaaS tools that hide the underlying costs.
This creates a specific risk. In fact, an SME using an AI writing tool, a chatbot for customer service, and a data analysis system might not realize they are paying for tokens three times over. Therefore, the fragmentation of tools translates into a fragmentation of costs, which is difficult to aggregate and even harder to optimize.
In addition to this, the AI model market is rapidly evolving. Newer, more capable models tend to cost more per token. Therefore, automatically updating integrations to the latest available version can mean significant, often unplanned, cost increases.
At SHM Studio, we observe this pattern with increasing frequency. SMEs that turn to our AI services often do not have a clear mapping of their consumption. The first step of our work is always to build this visibility.
Problem Architecture: Where Costs are Hidden
To understand where to intervene, it is useful to map the main sources of token consumption in a typical company. There are three main categories of spending.
- Inefficient prompt engineering: Prompts that are too long, redundant contexts, instructions repeated with every call. Every extra character has a cost.
- Oversized model for the task: Using GPT-4o to classify simple emails is like using a scalpel to cut bread. Smaller, specialized models cost less and often perform better on specific tasks.
- No caching: Many implementations repeat identical or nearly identical calls without leveraging caching mechanisms. This multiplies costs without any additional benefit.
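The caching point can be illustrated with a small application-side cache. This is a sketch under stated assumptions: `CachedClient` and the `call_model` callable are hypothetical names standing in for whatever function actually sends the request to the provider, and the cache only treats prompts as equal after collapsing whitespace, which is a deliberately conservative notion of "nearly identical".

```python
import hashlib
from typing import Callable, Dict


class CachedClient:
    """Wraps a model-calling function and reuses answers for repeated prompts."""

    def __init__(self, call_model: Callable[[str, str], str]):
        # call_model(model, prompt) -> answer is a hypothetical stand-in
        # for the real API call.
        self._call_model = call_model
        self._cache: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Collapse whitespace so trivially different prompts share one entry.
        normalized = " ".join(prompt.split())
        return hashlib.sha256(f"{model}\n{normalized}".encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._cache:
            self.hits += 1  # no tokens are spent on a cache hit
            return self._cache[key]
        self.misses += 1
        answer = self._call_model(model, prompt)
        self._cache[key] = answer
        return answer
```

An in-memory dictionary is enough to show the idea; a production setup would likely need expiry and shared storage, and caching is only appropriate where a repeated prompt should yield the same answer.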
Similarly, the absence of spending limits configured at the API account level is a concrete operational risk. Gartner has identified AI cost governance as one of the ten technological priorities for the 2026-2027 biennium, according to its technology forecasts.
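Account-level limits are configured in the provider's own console, so they cannot be shown here. As a complement, the sketch below shows what an application-side guard might look like. `BudgetGuard`, `BudgetExceeded`, and the 80% alert threshold are illustrative assumptions, not part of any provider's API.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a call would push spending past the monthly limit."""


class BudgetGuard:
    """Tracks spending against a monthly limit and signals when it gets close."""

    def __init__(self, monthly_limit_eur: float, alert_ratio: float = 0.8):
        self.monthly_limit_eur = monthly_limit_eur
        self.alert_ratio = alert_ratio  # assumed threshold for an early warning
        self.spent_eur = 0.0

    def record(self, cost_eur: float) -> str:
        """Register a cost; return 'ok' or 'alert', or raise if over the limit."""
        if self.spent_eur + cost_eur > self.monthly_limit_eur:
            raise BudgetExceeded(
                f"limit of {self.monthly_limit_eur:.2f} EUR would be exceeded"
            )
        self.spent_eur += cost_eur
        if self.spent_eur >= self.alert_ratio * self.monthly_limit_eur:
            return "alert"
        return "ok"

    def reset(self) -> None:
        """Start a new billing period."""
        self.spent_eur = 0.0
```

The guard refuses the call before the money is spent rather than reporting the overrun afterwards, which is the behavior a hard limit is meant to provide.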
Operational implications for Italian SMEs
Translating this analysis into concrete actions requires a structured approach. Below are the priority areas for intervention for an SME that wants to bring its AI spending under control.
First of all, mapping. It is necessary to inventory all AI API touchpoints: SaaS tools, custom integrations, automations. Without this visibility, any optimization is blind.
Next, segmentation by value. Not all AI tasks have the same strategic value. Therefore, it is useful to classify use cases based on the return generated and the cost incurred. This allows for rational budget allocation.
Then, the choice of the right model. Today, there are open-source and mid-range commercial models that offer excellent performance on specific tasks at a fraction of the cost of flagship models. The choice of model is an economic as well as a technical decision.
Finally, continuous monitoring. AI costs are not static. They grow with adoption, change with model updates, and vary with usage volumes. A monitoring dashboard, even a simple one, is an essential tool.
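A simple dashboard can start from very little. The sketch below assumes that each touchpoint's spending has already been collected into records. The `UsageRecord` fields and the sample touchpoint names are hypothetical. It aggregates costs by month and touchpoint and computes month-over-month growth.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class UsageRecord:
    month: str        # e.g. "2026-01"
    touchpoint: str   # e.g. "chatbot", "writing-tool" (illustrative names)
    cost_eur: float


def monthly_report(records: Iterable[UsageRecord]) -> Dict[str, Dict[str, float]]:
    """Total cost per month and per touchpoint."""
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for record in records:
        totals[record.month][record.touchpoint] += record.cost_eur
    return {month: dict(by_tool) for month, by_tool in totals.items()}


def growth(report: Dict[str, Dict[str, float]],
           previous: str, current: str) -> Optional[float]:
    """Relative change in total spending between two months, or None if unknown."""
    prev_total = sum(report.get(previous, {}).values())
    curr_total = sum(report.get(current, {}).values())
    if prev_total == 0:
        return None
    return (curr_total - prev_total) / prev_total
```

Aggregating across touchpoints is what makes fragmented spending visible in one place, and the growth figure gives budget reviews a single number to track.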
Our digital marketing and SEO activities are increasingly integrating AI components. Similarly, our clients are also exploring these integrations. For this reason, we have developed an approach that balances innovation and cost control.
What nobody tells you: the hidden cost of iterative prompting
There's an aspect that rarely emerges in discussions about AI costs: iterative prompting, the practice of progressively refining a prompt through dozens of test calls. In a team of three or four people working on an AI integration, this process can consume hundreds of thousands of tokens even before the system goes into production.
Despite this, few companies track these development costs separately from operating costs. The result is a systematic underestimation of the total cost of ownership of an AI integration. Therefore, a company's true AI budget is often double what is perceived.
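One way to separate the two kinds of cost is to tag every call with the phase it belongs to. The sketch below uses made-up figures (three developers, 40 test calls each, about 3,000 tokens per test call, and 500 production calls of 1,200 tokens) purely to show the arithmetic. They are assumptions, not measurements.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def tokens_by_phase(calls: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Sum tokens per phase from (phase, tokens) pairs."""
    totals: Counter = Counter()
    for phase, tokens in calls:
        totals[phase] += tokens
    return dict(totals)


# Hypothetical workload: 3 developers x 40 test calls x ~3,000 tokens each.
dev_calls = [("development", 3_000)] * (3 * 40)
# Hypothetical first weeks in production: 500 calls x ~1,200 tokens each.
prod_calls = [("production", 1_200)] * 500

totals = tokens_by_phase(dev_calls + prod_calls)
dev_share = totals["development"] / sum(totals.values())

if __name__ == "__main__":
    print(totals)
    print(f"development share of tokens: {dev_share:.1%}")
```

With these assumed numbers, development accounts for 360,000 of 960,000 tokens. None of that would appear in a report that only counts production traffic.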
This does not mean AI is not worth the investment. On the contrary, it means the ROI must be calculated more rigorously. Our web services and our activities in SEO copywriting demonstrate daily that AI, used well, generates measurable value. However,