OpenAI booked about $4.3 billion in revenue in the first half of 2025 and still burned roughly $2.5 billion in cash. The company says it’s on track for $13 billion this year, but the cost of running and improving ChatGPT remains enormous. Below, we unpack the why, the risks, and what to watch if you build on OpenAI.
The numbers in context
The headline is a study in contrasts. Revenue is rising fast: by mid-year it had already topped last year’s full-year total. Losses are still heavy because two things dominate the bill: running today’s models for hundreds of millions of users and training tomorrow’s bigger ones. By mid-2025, OpenAI also sat on a large cash cushion, enough to keep building at this pace. On the demand side, usage is massive and still growing, with hundreds of millions of weekly users sending billions of messages daily. That scale is both the moat and the money pit.
Why the cash burn is so high
When people hear “AI costs,” they think of training the next giant model. That’s part of it. The less obvious part is inference — answering your prompts in real time. Inference eats compute all day, every day. Each chat, image request, or code fix spins up GPUs, memory, and networking. Multiply that by global usage and you get a power-hungry machine that never sleeps.
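A rough back-of-envelope shows why. The message volume below is the figure cited later in this piece; the per-message serving cost is a hypothetical placeholder, not a disclosed number.

```python
# Back-of-envelope: what serving at ChatGPT scale could cost per day.
# messages_per_day is the usage figure cited in this article; the
# per-message cost is a HYPOTHETICAL placeholder -- real costs vary
# widely by model, context length, and hardware.
messages_per_day = 2.5e9          # from the usage stats cited below
cost_per_message_usd = 0.002      # hypothetical serving cost
daily_bill = messages_per_day * cost_per_message_usd
print(f"~${daily_bill / 1e6:.0f}M/day, ~${daily_bill * 365 / 1e9:.1f}B/year")
```

Even at a fraction of a cent per message, the annual bill lands in the billions, which is why inference, not just training, dominates the cost base.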
Infrastructure adds its own gravity. OpenAI is tying up supply for years through deals that bundle chips, networking gear, and datacenter capacity. The upside is predictable access to GPUs. The trade-off is that the bill arrives up front.
What’s new on infra
OpenAI and NVIDIA laid out a path to deploy systems capable of at least 10 gigawatts of AI compute over the next few years. The plan includes millions of GPUs and staged financing tied to build-outs. In parallel, Oracle and SoftBank are helping stand up additional sites under the Stargate banner. If you run AI apps, this matters because supply stability can ease model quotas and reduce wait times. If you’re counting dollars, it means depreciation and power contracts have a long tail.
Microsoft’s role and margins
Microsoft remains the key strategic partner. Historically, OpenAI shared a chunk of top-line revenue with Microsoft. That share is expected to fall later this decade as new terms kick in. Lower rev-share should help gross margins, but it won’t erase the reality that compute is still the main cost driver. In short, OpenAI’s near-term margin story hinges more on serving costs per request and model efficiency than on accounting mechanics.
Competition watch: Anthropic
Anthropic says its annualized run-rate topped $5 billion by August 2025, helped by developer-friendly coding tools and long-running “agent” tasks. It just launched Claude Sonnet 4.5, which the company claims beats rivals on practical coding work. For buyers, the picture is nuanced: OpenAI still has broader distribution and product breadth; Anthropic may be the sharper pick for certain coding and agent workflows. Expect price-feature moves on both sides.
Is this sustainable? Three 2026–2030 scenarios
- Bull case: hardware and software efficiency reduce per-request costs; rev-share falls; enterprise spend keeps rising. Net: margins improve, cash burn moderates even as usage grows.
- Base case: hybrid cloud and owned capacity coexist; per-token costs drift down, but not dramatically. Net: growth continues; profitability waits on another step-change in efficiency.
- Bear case: GPU supply tightens or energy costs jump; pricing wars intensify; large enterprise contracts consolidate on a few “good enough” models. Net: revenue grows slower than costs.
What this means for customers and builders
- Expect smarter tiering. Watch for clearer price breaks by latency, context length, and reliability SLAs.
- Plan for model swaps. Keep your apps portable across at least two vendors. Abstract your calls and log evals so you can switch without rewriting your stack (a minimal sketch follows this list).
- Track total cost, not just API price. Latency, retries, and context inflation add up. Measure real cost per successful task, not per token.
- Lock in capacity when it helps. If you have stable workloads, negotiated commitments can yield better pricing and priority access.
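As a concrete starting point, here is a minimal sketch of the portability and cost-tracking advice above. Provider names, prices, the token estimate, and the success check are all hypothetical placeholders; in practice you would wire in each vendor’s real SDK and your own evals.

```python
# Minimal sketch: vendor-agnostic model calls plus cost-per-successful-task
# tracking. Provider names, prices, the token estimate, and the success
# check are HYPOTHETICAL placeholders, not any real vendor's API.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Provider:
    name: str
    usd_per_1k_tokens: float  # hypothetical blended input+output price

    def complete(self, prompt: str) -> str:
        # Swap in the vendor's real SDK call here.
        return f"[{self.name}] response to: {prompt[:40]}"

@dataclass
class Router:
    providers: Dict[str, Provider]
    spend_usd: float = 0.0
    successes: int = 0

    def run(self, key: str, prompt: str,
            is_success: Callable[[str], bool]) -> str:
        provider = self.providers[key]
        reply = provider.complete(prompt)
        tokens = (len(prompt) + len(reply)) / 4  # crude ~4 chars/token estimate
        self.spend_usd += tokens / 1000 * provider.usd_per_1k_tokens
        self.successes += is_success(reply)
        return reply

    def cost_per_successful_task(self) -> float:
        # The metric recommended above: dollars per task that actually
        # worked, not dollars per token.
        return self.spend_usd / max(self.successes, 1)

router = Router({
    "vendor_a": Provider("vendor_a", usd_per_1k_tokens=0.005),  # hypothetical
    "vendor_b": Provider("vendor_b", usd_per_1k_tokens=0.003),  # hypothetical
})
router.run("vendor_a", "Summarize ticket #123", is_success=lambda r: len(r) > 0)
print(f"${router.cost_per_successful_task():.4f} per successful task")
```

Because every call goes through the router, swapping vendors is a one-line config change, and the cost metric travels with you.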
A quick example
A customer-support team moves from a generic “answer any question” bot to a retrieval-tuned model with a shorter context window. Ticket resolution time drops 18%, hallucinations fall, and monthly tokens shrink by a third. The API line item is smaller, but the bigger win is deflecting 12% of tickets entirely. That’s how to beat headline token prices.
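To make that math concrete, here is a hedged sketch of the trade-off. The one-third token cut and 12% deflection come from the scenario above; the volumes and unit prices are hypothetical placeholders.

```python
# Hedged arithmetic for the support-bot example. The 1/3 token cut and
# 12% deflection are from the scenario above; every other number is a
# HYPOTHETICAL placeholder.
monthly_tokens_before = 300e6   # hypothetical monthly token volume
usd_per_1m_tokens = 5.0         # hypothetical blended price
tickets_per_month = 50_000      # hypothetical ticket volume
cost_per_human_ticket = 6.0     # hypothetical agent-handling cost

api_before = monthly_tokens_before / 1e6 * usd_per_1m_tokens  # $1,500/mo
api_after = api_before * (2 / 3)          # tokens shrink by a third
api_savings = api_before - api_after      # $500/mo
deflection_savings = 0.12 * tickets_per_month * cost_per_human_ticket  # $36,000/mo

print(f"API savings:        ${api_savings:,.0f}/month")
print(f"Deflection savings: ${deflection_savings:,.0f}/month")
```

Under these assumptions the deflection win is roughly two orders of magnitude larger than the API saving, which is the point: optimize for outcomes, not headline token prices.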
Frequently Asked Questions
Is OpenAI profitable today?
No. Revenue is rising, but operating losses remain large due to compute and R&D.
Why not slow spending?
The race for capacity and model quality rewards early movers. Falling behind on GPUs or research can be hard to recover from.
Will prices drop?
Maybe. Efficiency gains and long-term chip contracts can reduce unit costs, but demand growth often soaks up savings.
What would change the cost curve?
More efficient architectures, better KV-cache handling, sparse/mixture-of-experts routing, custom silicon, and cheaper power.
Does user growth help margins?
At scale, yes, if infrastructure is well utilized and models get more efficient per request.
The Bottom Line
OpenAI is growing quickly and spending even faster. The path to healthier margins runs through efficiency per request, steadier GPU supply, and smarter revenue sharing. For builders, design your stack for portability and measure cost per successful task, not just per token.
Glossary
- Cash burn: Net cash outflow from operations and investment in a period.
- Inference: Running a trained model to answer real-time requests.
- Run-rate: Annualized revenue based on the most recent month/quarter.
- Stargate: OpenAI’s multi-site datacenter build-out with partners to secure long-term compute.
Quick Answers
What is OpenAI’s cash burn in 2025?
About $2.5B in the first half of 2025, driven mostly by the cost of running ChatGPT at huge scale and ongoing R&D. The company still targets $13B in full-year revenue but remains unprofitable due to heavy compute and infrastructure spending.
Is OpenAI profitable?
No. Rapid revenue growth is offset by high compute (inference) costs, training new models, and stock-based compensation. Profitability depends on lowering per-request cost, sharing less revenue with partners, and locking in cheaper, long-term compute.
Why are inference costs so high?
Every prompt consumes GPU time, memory bandwidth, and power. With hundreds of millions of users and long contexts, those micro-costs add up. Efficiency features (like sparse routing, better caching) and custom hardware can lower the bill.
What is Project Stargate?
OpenAI’s multi-gigawatt datacenter build-out with partners like NVIDIA, Oracle, and SoftBank to secure compute for training and serving future models. It’s designed to stabilize supply and reduce per-unit costs over time.
How many people use ChatGPT?
Roughly 700M weekly active users as of mid-2025, sending about 2.5–2.6B messages per day. That scale drives both OpenAI’s revenue and its massive running costs.
Source: Reuters

