Microsoft Lets GPT and Claude Fact-Check Each Other in New Copilot Cowork
Microsoft today launched Copilot Cowork, a new Microsoft 365 capability that lets AI agents handle long-running, multi-step tasks autonomously. The headline feature is the "Critique" layer: OpenAI's GPT drafts a response, then Anthropic's Claude reviews it for accuracy and correct citations. Roles can be reversed, and a new "Model Council" feature lets users compare outputs from both models side by side.
The approach delivered a 13.8 percent improvement on the DRACO benchmark for the Researcher agent. Microsoft frames it as a step toward more reliable AI: rival models quality-checking each other's work to reduce hallucinations.
Copilot Cowork is available through Microsoft's Frontier program for early access. Users describe their workflow, and the AI creates a plan and executes tasks across Word, Excel, Outlook, Teams, and SharePoint, while humans can monitor and course-correct along the way.
For CIOs, this means the multi-model strategy is now a reality in productivity tools. The question is whether this becomes the norm: AI systems fact-checking themselves using competitors' models.
📬 Did you like this one?
AI news for leaders. Curated by a CIO who builds it himself. Daily in your inbox.