Microsoft Breaks from OpenAI: Launches Own AI Models for Speech, Voice, and Images
On April 2, 2026, Microsoft announced three new proprietary AI models available through Microsoft Foundry: MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2. The launch signals that the company is building its own AI model stack, independent of OpenAI.
MAI-Transcribe-1 is a speech recognition model delivering state-of-the-art transcription, already integrated into Microsoft Teams. MAI-Voice-1 is a voice generation model allowing businesses to create custom voices with just seconds of audio. MAI-Image-2 improves performance and speed for image generation, with better controls and face preservation.
All three models are available through Foundry and a new MAI Playground, making integration into custom solutions straightforward.
The strategy is clear: Microsoft wants to reduce its dependency on OpenAI and build its own capability in foundational AI models. For enterprises, this means the Microsoft platform now offers speech, voice, and image models without relying on third parties.
MAI-Image-2 was first introduced on March 19, 2026, and reached broad commercial availability on April 2.