If you’ve ever paid for a digital subscription, AI chatbots are probably challenging your expectations. As a consumer, you’re used to a simple arrangement: you pay $19.99 for Netflix and get unrestricted access for a month, with hundreds of movies available to watch.
Now imagine this. You’re 14 days into your subscription, 40 minutes into a 90-minute movie, and a notice pops up saying you’ve hit your viewing limit. You’re told to wait a few hours for it to reset. That would be frustrating. It breaks the core promise of what a subscription is supposed to be.
That is the shift AI products are introducing, and users are struggling with it.
Take Claude, for example, the rapidly adopted AI assistant developed by Anthropic. Even on a paid subscription plan, Claude users hit usage limits and are prompted either to pay more to continue or to wait for the limit to reset, which can take minutes or hours.
This deviates from the typical software subscription experience. A subscription used to mean predictability, but Claude introduces variability in the middle of usage. The interruption breaks productivity and erodes the trust users have historically placed in subscription products.
Now, while this may feel different or unfamiliar to users, we shouldn’t be quick to assume it is an exploitative pricing technique; we need to look further to truly understand why.
What I see here is a reflection of the underlying economics of AI product commercialisation. Of the several factors that could be at play, three are very plausible to me because they are quite connected.
One is infrastructural, rooted in how these AI models actually run. Another is a business maths concern at scale. And the third is experimental, about how companies learn what customers will actually pay. The three factors are:
- Expensive inference cost
- Unit economics discipline
- Demand elasticity discovery
Expensive inference cost
When you enter a prompt in Claude, ChatGPT, or any other AI assistant, the model has to process your input and generate a response. Running a trained model in this way is called inference. It requires significant computing resources, and it is expensive.

This compute cost is not typical of traditional software products, where the marginal cost of serving one more user is nearly zero. When you subscribe to a movie streaming app or something like Microsoft Office, it costs Netflix or Microsoft almost nothing to serve an additional user. For example, once a movie is acquired, Netflix can distribute it to millions of customers through its app at almost zero cost, and even if many people stream at once, the only infrastructure affected is content delivery.
AI labs, on the other hand, keep incurring costs after a model is trained. Every user interaction requires inference, so the more people query the models, the higher the inference bill. Patterson et al. (2022) find that, across three weeks sampled in 2019, 2020, and 2021, inference accounted for about 60% of the total ML compute expenditure at Google data centers.
Essentially, software companies benefit from strong economies of scale, but that is not the reality for AI products, given how the technology works.
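The contrast between the two cost structures can be made concrete with a small calculation. The sketch below is illustrative only: the fixed cost, the per-user and per-prompt costs, and the prompt volume are hypothetical numbers chosen for the example, not figures from any company.

```python
def software_total_cost(users: int, fixed_cost: float, marginal_cost: float) -> float:
    """Total monthly cost when serving one more user costs almost nothing."""
    return fixed_cost + marginal_cost * users


def ai_total_cost(users: int, fixed_cost: float,
                  prompts_per_user: int, cost_per_prompt: float) -> float:
    """Total monthly cost when every prompt triggers paid inference compute."""
    return fixed_cost + users * prompts_per_user * cost_per_prompt


def average_cost_per_user(total_cost: float, users: int) -> float:
    return total_cost / users


# Hypothetical inputs: $1M of fixed cost for both products,
# $0.01 per user for software, 300 prompts per user at $0.02 each for AI.
for users in (1_000, 1_000_000):
    software = average_cost_per_user(
        software_total_cost(users, 1_000_000, 0.01), users)
    ai = average_cost_per_user(
        ai_total_cost(users, 1_000_000, 300, 0.02), users)
    print(f"{users:>9} users: software ${software:,.2f}/user, AI ${ai:,.2f}/user")
```

Under these assumptions, the software product's average cost per user keeps falling toward the one-cent marginal cost as the user base grows, while the AI product's average cost can never drop below the inference spend per user (here 300 prompts at $0.02 each), no matter how large it gets.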
It is this inference compute constraint that has led to some recent key decisions across the industry. Google, for instance, limited free access to Gemini 3 following a surge in demand that happened last year. OpenAI has had to sunset Sora, their video model, just to free up scarce compute, and has introduced usage limits for their Plus users to prevent long, resource-intensive sessions. Claude operates under similar constraints, with token limits that tighten during peak weekday hours.
Inference compute cost is why AI firms now ration access, even within subscriptions, and why they offer plans that cost up to $200 per month. It is also why OpenAI and Anthropic now layer usage-based billing into their subscription plans, passing variable inference costs directly to users. This protects their margins, but it makes costs less predictable for heavy users.
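To see why layering usage-based billing on top of a subscription makes bills less predictable for heavy users, here is a minimal sketch of a hybrid plan: a flat fee that covers an included allowance, plus a charge for anything beyond it. The function name, the allowance, and the overage rate are assumptions made up for illustration; they do not describe OpenAI's or Anthropic's actual pricing.

```python
def monthly_bill(base_fee: float, included_tokens: int,
                 tokens_used: int, overage_price_per_million: float) -> float:
    """Flat subscription fee plus a usage charge for tokens beyond the allowance."""
    overage_tokens = max(0, tokens_used - included_tokens)
    return base_fee + overage_tokens / 1_000_000 * overage_price_per_million


# Hypothetical plan: $20 per month, 5M tokens included, $10 per extra million tokens.
light_user = monthly_bill(20.0, 5_000_000, 3_000_000, 10.0)
heavy_user = monthly_bill(20.0, 5_000_000, 8_000_000, 10.0)
print(f"light user pays ${light_user:.2f}, heavy user pays ${heavy_user:.2f}")
```

The light user stays inside the allowance and pays only the flat fee. The heavy user's bill moves with usage, which is exactly the variable cost the provider is passing on.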
Unit economics discipline
Everything discussed about inference compute cost and AI subscription pricing ultimately comes down to unit economics. Unit economics is the maths of a business at the most basic level. What does it cost to serve one user, and how much revenue does that user generate? If that equation does not hold, the business is effectively loss-making.
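That question can be written as a short calculation. The sketch below assumes a flat $20 plan and a blended inference cost of $0.02 per prompt; both numbers are hypothetical and serve only to show how the same price produces very different margins depending on usage.

```python
def monthly_margin(price: float, prompts: int, cost_per_prompt: float) -> float:
    """Revenue from one subscriber minus the inference cost of serving them."""
    return price - prompts * cost_per_prompt


def break_even_prompts(price: float, cost_per_prompt: float) -> float:
    """Number of prompts at which a flat-price subscriber stops being profitable."""
    return price / cost_per_prompt


PRICE = 20.0            # hypothetical flat monthly price
COST_PER_PROMPT = 0.02  # hypothetical blended inference cost per prompt

print("light user margin:", monthly_margin(PRICE, 200, COST_PER_PROMPT))
print("heavy user margin:", monthly_margin(PRICE, 3_000, COST_PER_PROMPT))
print("break-even prompts:", break_even_prompts(PRICE, COST_PER_PROMPT))
```

With these inputs the light user is comfortably profitable, the heavy user costs more to serve than they pay, and the break-even point marks the usage level above which a flat price loses money on that subscriber.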
A loss-making equation is not unusual for high-growth technology companies; many scale before they fully optimise for profitability. But there still needs to be a clear path toward building a financially sustainable business over time.
The maths is, however, unusually tight for AI companies because of the sheer scale of global demand.
ChatGPT is arguably the fastest-growing technology product in history. It reached over 100 million users about two months after launch, and has since scaled to more than 800 million weekly active users globally. For context, it took Facebook roughly 4.5 years to reach 100 million users, and Instagram about 2.5 years to get there.
The demand for AI products has also driven revenue growth in the industry. Anthropic’s run rate has already crossed $30 billion this year, up from about $9 billion at the end of 2025. The company went from roughly $1 billion in annualised revenue in December 2024 to over $30 billion in less than 16 months.
But even at this scale of adoption and revenue, cracking the unit economics remains difficult. OpenAI and Anthropic are projected to spend a combined $65 billion on model training and inference this year alone. That is more than the two companies are generating in revenue.
Every prompt has a real cost because inference is expensive. So if you offer unlimited access at a flat price, your heaviest users quickly become unprofitable. You are effectively losing money the more people use your product.
It’s like running a buffet business, where every plate you serve costs you money. If a few customers keep coming back for unlimited refills, your revenue might grow, but your losses grow faster. At some point, you either limit portions or change the pricing.
This is what has left AI companies with two choices: raise prices for everyone, or introduce usage sensitivity through limits and usage-based billing.
The usage limits and $200 subscriptions are AI companies trying to make the maths work while still capturing the massive market demand.
Demand elasticity discovery
As AI companies work through the maths of unit economics, they are also trying to answer a more fundamental question: how much are customers actually willing to pay before they walk away? For a relatively new product category, this insight is critical, and companies are leaning on basic economic principles to figure it out.
In basic economics, when the price of a product increases, demand is expected to fall. When demand is highly sensitive to price changes, it is called elastic. But there are well-known exceptions.
In some categories, prices go up and demand barely moves. Think insulin, water, or even addictive goods like cigarettes. In these categories, the demand curve is stubborn. People may complain, but they continue to pay because the need, habit, or lack of substitutes keeps them locked in.
That is where pricing power lives. This is why firms (pharmaceuticals, utility companies, etc.) fight to operate in categories with limited substitutes. Inelastic demand, in many ways, is a licence to extract value.
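Elasticity can be measured directly from a price change and the change in demand that follows. The sketch below uses the standard midpoint (arc) formula, which divides the percentage change in quantity by the percentage change in price; the subscriber counts and prices are invented for illustration.

```python
def arc_elasticity(p1: float, p2: float, q1: float, q2: float) -> float:
    """Price elasticity of demand using the midpoint (arc) formula."""
    pct_change_quantity = (q2 - q1) / ((q1 + q2) / 2)
    pct_change_price = (p2 - p1) / ((p1 + p2) / 2)
    return pct_change_quantity / pct_change_price


def classify(elasticity: float) -> str:
    """Demand is inelastic when quantity moves proportionally less than price."""
    return "inelastic" if abs(elasticity) < 1 else "elastic"


# Hypothetical scenario: the price rises from $20 to $25.
stubborn = arc_elasticity(20, 25, 1_000, 950)   # few subscribers leave
sensitive = arc_elasticity(20, 25, 1_000, 600)  # many subscribers leave

print(f"{stubborn:.2f} -> {classify(stubborn)}")
print(f"{sensitive:.2f} -> {classify(sensitive)}")
```

In the first scenario revenue rises despite the lost subscribers (950 customers at $25 bring in more than 1,000 at $20), which is what pricing power looks like in numbers. In the second, the same price increase shrinks revenue.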
What we are seeing in AI today looks like early-stage elasticity testing. People complain about rate limits, but many still pay, upgrade to higher tiers, or simply wait for their limits to reset.
A few weeks ago, I saw this interaction on X (Twitter) that lends credence to my point here. Anthropic's Head of Growth was responding publicly after users called out a pricing change where Claude Code was pulled from the Pro plan (base subscription), albeit for a small percentage of new signups.
His response was candid. He said usage patterns have changed fundamentally since their Max plan launched over a year ago. The product had evolved, but the pricing had not caught up.
If you read the full interaction, you will notice that Anthropic’s position was not to reverse course. They held the line, explained their reasoning, and kept experimenting in public. That is a company that believes its demand curve is stubborn enough to absorb the backlash.
From the company’s perspective, this behaviour suggests the product may already be crossing into an inelastic category.
This is not exactly surprising considering how AI is becoming deeply embedded into our lives and work. As dependency increases, switching becomes harder, and willingness to pay rises, resulting in vendor lock-in.
Still, these companies can’t get too far ahead of themselves. Short- to medium-term inelasticity can look like pricing power, but over time users adapt, alternatives emerge, and demand can shift again.
For consumers, businesses, and AI companies
So for now, usage limits and interruptions on paid subscriptions are likely to remain the new normal as you use AI products. If you work at an organisation that has adopted AI company-wide, some of that pressure may shift away from personal subscriptions, depending on how heavily you use these tools.
Even at the enterprise level, costs can still rise quickly, but this is not entirely new to businesses. Most organisations are already familiar with usage-based pricing through cloud service providers.
Nonetheless, enterprise buyers should still assess AI products not only on features or current pricing, but on how sustainable their business usage patterns are under variable cost structures.
AI companies, for their part, need to deliver value to customers while bringing down the cost of serving them. There is ongoing work to reduce inference compute cost, from techniques like quantisation to better hardware and more efficient infrastructure. But until that cost efficiency improves, the pricing and limits we are experiencing are not going away.
Ultimately, if you follow the money, you end up at inference compute cost. It is the invisible hand driving every usage limit and every subscription pricing decision.