Microsoft breaks from OpenAI: Launches own AI models for speech, transcription, and images
Microsoft launched three in-house AI models under its MAI (Microsoft AI) initiative on April 5th: MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2. The release signals a clear strategic move to reduce the company's dependence on OpenAI and to compete directly with Google and Anthropic.
MAI-Transcribe-1 is an enterprise-grade speech recognition model supporting 25 languages at approximately 50% lower GPU cost than leading alternatives. It achieves a lower average word error rate than GPT-Transcribe and Gemini 3.1 Flash on accuracy benchmarks.
MAI-Voice-1 generates 60 seconds of expressive audio in under one second on a single GPU. It can create custom voices from just a few seconds of sample audio, enabling voice personalization at scale.
MAI-Image-2 is Microsoft's second-generation text-to-image model and debuted at the top of the Arena.ai leaderboard. It generates images at least twice as fast as its predecessor on Foundry and Copilot.
The models are already integrated into Copilot, Bing, Azure Speech, and PowerPoint, and are available via Microsoft Foundry and the MAI Playground for developers and enterprises.
For enterprise decision-makers, this matters for three reasons: it intensifies competition among frontier-model vendors, it puts downward pressure on enterprise AI pricing, and it means Microsoft products will increasingly run on first-party models rather than OpenAI infrastructure.
📬 Did you like this?
AI news for leaders. Curated by a CIO who builds it himself. Daily in your inbox.