Microsoft Spent Years Paying OpenAI and Anthropic. Now It's Competing With Both.

For the last four years, Microsoft's AI strategy has been legible: write large checks to frontier labs, deploy their models across Azure and Copilot, and collect margin on top. The checks were large. Microsoft took multibillion-dollar equity stakes in both OpenAI and Anthropic. It bet the consumer surface of its business on models it did not build and could not fully control.

On June 2, at its Build developer conference in San Francisco, Microsoft announced it is done waiting for the invoice.

The company announced a family of seven new models across image, voice, transcription, thinking, and coding. Two matter most. MAI-Thinking-1 is Microsoft AI's first reasoning model: a mid-sized model with 35 billion active parameters and a 128K context window, trained from scratch without distillation on clean, commercially licensed, enterprise-grade data. MAI-Code-1-Flash is a new coding model built for fast, efficient assistance in everyday developer workflows, which Microsoft says it built end-to-end using clean and appropriately licensed data.

The benchmark numbers Microsoft chose to lead with are pointed. MAI-Thinking-1 reaches 97.0 percent on AIME 2025 and 94.5 percent on AIME 2026. On SWE-Bench Pro, Microsoft says it matches Claude Opus 4.6 on coding tasks, and in blind side-by-side evaluations, MAI-Thinking-1 was preferred over Claude Sonnet 4.6. Anthropic is one of Microsoft's investees. The comparison was not accidental.

The cost argument underneath the benchmark argument

The performance claims are the headline. The cost logic is the story.

After refining its models for the needs of consulting firm McKinsey, Microsoft was able to outperform OpenAI's GPT 5.5 with 10 times better cost efficiency, said Mustafa Suleyman, CEO of Microsoft AI. Specifically, Microsoft projected a 10x improvement in output tokens per dollar from the fine-tuned MAI model compared to GPT 5.5. That 10x figure is the one Microsoft's enterprise sales team will be repeating for the next six months.

MAI-Code-1-Flash is already shipping. The model is rolling out to GitHub Copilot individual users in Visual Studio Code in the model picker and under the default auto picker. In the new token-based billing in GitHub Copilot, MAI-Code-1-Flash is priced cheaper than Claude Haiku 4.5. It outperforms Claude Haiku 4.5 with better price-to-performance across coding benchmarks.

GitHub Copilot switched to token-based billing at the start of June. The timing is not subtle. Microsoft now controls which model the default routing in Copilot selects, what that model costs per token, and who built it. That is three degrees of margin the company did not hold six months ago.

For MAI-Thinking-1, the deployment is more cautious. Microsoft has published a preprint describing the evaluation methodology, but full reproduction of the results by independent labs has not yet occurred, leaving those benchmark claims open to challenge until confirmed externally. MAI-Thinking-1 is now available in private preview through Microsoft Foundry. The reasoning model is not yet a product. It is a claim with a model card attached to it.

Seven models, one message

The full MAI family announced at Build extends well past reasoning and coding. Microsoft previously announced MAI-Voice-1 and MAI-1-preview, followed by MAI-Transcribe-1 and MAI-Image-2 earlier this year. The Build additions fill the remaining surface area. MAI-Transcribe-1.5 offers what Microsoft describes as state-of-the-art accuracy across 43 languages, with streaming coming soon. MAI-Voice-2 and its flash variant are now available in more than 15 additional languages with new voice options.

MAI-Image-2.5 and its flash variant are Microsoft's first models to serve both text-to-image and image-to-image workloads — ranking third and second on the Arena AI leaderboard respectively. The flash variant is for super-efficient production workloads at scale, while 2.5 gives maximum fidelity and professional-grade performance. This model is already available in PowerPoint and is rolling out to OneDrive.

The distribution plan for the MAI family is also expanding beyond Microsoft's own surfaces. MAI models will also become available through Fireworks AI, Baseten, and OpenRouter. That means Microsoft is entering the third-party model marketplace — selling against the same companies whose models it has historically distributed.

The structural shift underneath all of this is that Microsoft is now competing at every layer of the AI stack simultaneously: infrastructure via Azure, orchestration via Copilot Studio and Foundry, distribution via Fireworks and OpenRouter partnerships, and now models themselves. Microsoft is attempting to play at more layers of the AI stack as OpenAI and Anthropic continue to record historic growth and push toward the public market. Anthropic confidentially filed for an IPO on June 1, and OpenAI is also pursuing an offering, potentially this year. Once both companies are public, Microsoft's equity positions become liquid — and so does the question of whether the partnership model still serves Microsoft's interests.

For small businesses and independent developers, the immediate practical effect is simpler than the corporate chess match suggests. MAI-Code-1-Flash is now rolling out to approximately 10% of individual Copilot users as a starting point, with users who select Auto in the VS Code model picker potentially routed to the model. If you use GitHub Copilot and let the auto-picker choose, you may already be running Microsoft's own model without having selected it. It costs less than Haiku. Microsoft says it outperforms Haiku. The question small teams actually need to answer is whether the output quality holds in their specific codebases, not on SWE-Bench Pro, which has nothing to do with a two-person startup's Rails monolith.

The Microsoft post highlights token-efficiency features, claiming MAI-Code-1-Flash can solve harder coding tasks with up to 60% fewer tokens. For developers paying per token under the new Copilot billing model, that number has a direct dollar value — one that will either validate the switch or quietly get walked back in the next model update.
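
To put a rough dollar figure on that claim, a team can run the arithmetic against its own usage. The sketch below is illustrative only: the per-token price, task volume, and tokens per task are hypothetical placeholders, not published Copilot rates, and the 60% figure is Microsoft's stated best case ("up to"), not a measured average.

```python
def monthly_cost(tokens_per_task: int, tasks_per_month: int,
                 price_per_million: float) -> float:
    """Monthly output-token spend in dollars."""
    return tokens_per_task * tasks_per_month * price_per_million / 1_000_000


def cost_after_token_reduction(baseline_cost: float, reduction: float) -> float:
    """Spend if the same tasks use `reduction` (0.0 to 1.0) fewer tokens
    at an unchanged per-token price."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0.0 and 1.0")
    return baseline_cost * (1.0 - reduction)


# All three inputs are hypothetical; substitute your own billing data.
baseline = monthly_cost(tokens_per_task=4_000,
                        tasks_per_month=500,
                        price_per_million=5.00)

# 0.60 is the best-case reduction Microsoft claims, not a guarantee.
best_case = cost_after_token_reduction(baseline, 0.60)

print(f"baseline: ${baseline:.2f}  best case: ${best_case:.2f}")
```

Because the claim is an upper bound on harder tasks, the more useful check is to rerun the same calculation with the token counts a team actually observes on its own work before and after the switch.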
