
DeepSeek makes 75% price cut permanent on flagship V4-Pro

Joachim Hogby

May 23, 2026 | 4 min read | Source: Reuters

DeepSeek confirmed on May 23 that the 75% discount on its flagship V4-Pro model will become permanent. The promotional pricing, which had been set to expire on May 31, will not revert: the lower price is now the base price indefinitely.

This means V4-Pro, a model designed to compete directly with OpenAI, Anthropic, and Google on quality, now costs roughly 20 to 35 times less than Western frontier models for comparable workloads.

New permanent pricing:

Input: $0.435 per million tokens (down from $1.74)
Output: $0.87 per million tokens (down from $3.48)
Cache-hit: in some cases one-tenth of the original price
In yuan: 0.025–6 yuan per million tokens, down from 0.1–24 yuan

For an enterprise processing billions of tokens monthly, the annual savings could run into millions of dollars.
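As a rough illustration of how the size of the saving depends on volume, the sketch below compares the old and new V4-Pro list prices quoted above. The monthly token volumes are hypothetical, and cache-hit pricing is left out because the article gives no exact figure for it.

```python
# Prices in USD per million tokens, taken from the pricing list above.
OLD_PRICES = {"input": 1.74, "output": 3.48}
NEW_PRICES = {"input": 0.435, "output": 0.87}


def monthly_cost(prices, input_tokens, output_tokens):
    """Return the monthly API cost in USD for the given token volumes."""
    return (input_tokens / 1e6) * prices["input"] + (output_tokens / 1e6) * prices["output"]


def annual_savings(input_tokens, output_tokens):
    """Return the yearly difference between old and new V4-Pro pricing."""
    old = monthly_cost(OLD_PRICES, input_tokens, output_tokens)
    new = monthly_cost(NEW_PRICES, input_tokens, output_tokens)
    return 12 * (old - new)


# Hypothetical workload (an assumption, not a figure from the article):
# 5 billion input tokens and 1 billion output tokens per month.
savings = annual_savings(5_000_000_000, 1_000_000_000)
print(f"Estimated annual savings: ${savings:,.0f}")
```

At this assumed volume the saving against DeepSeek's own previous price is in the low six figures per year. Reaching millions would take considerably higher volumes or a more expensive baseline, so it is worth running the numbers with your own token counts.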

Not a budget model

V4-Pro is not a lightweight system. Its Mixture-of-Experts architecture uses an estimated 1.6 trillion total parameters with around 49 billion activated during inference. The model supports a one-million-token context window and can output up to 384,000 tokens in a single request.

By comparison, GPT-5.5 is estimated at $8–15 per million input tokens and $30–50 per million output, with Anthropic's Claude Opus series even more expensive for heavy reasoning and long-context tasks.
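The price gap can be checked with simple arithmetic. The sketch below divides the estimated GPT-5.5 price ranges quoted above by the new V4-Pro prices. The GPT-5.5 figures are the article's estimates rather than confirmed list prices, and the function and variable names are illustrative only.

```python
# USD per million tokens. V4-Pro figures come from the pricing list above;
# the GPT-5.5 ranges are the article's estimates, not confirmed list prices.
V4_PRO = {"input": 0.435, "output": 0.87}
GPT_55_ESTIMATE = {"input": (8.0, 15.0), "output": (30.0, 50.0)}


def price_ratio_range(kind):
    """Return (low, high) multiples of the V4-Pro price for one token type."""
    low, high = GPT_55_ESTIMATE[kind]
    return low / V4_PRO[kind], high / V4_PRO[kind]


for kind in ("input", "output"):
    low, high = price_ratio_range(kind)
    print(f"{kind}: {low:.1f}x to {high:.1f}x the V4-Pro price")
```

Under these estimates the input-token multiple works out to roughly 18x to 34x, which is close to the "20 to 35 times" figure cited earlier. The output-token multiple is higher, at roughly 34x to 57x, so the effective ratio for a given workload depends on its mix of input and output tokens.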

Huawei chips underneath

A key factor behind DeepSeek's ability to make the cut permanent is infrastructure. The V4 family is the company's first major AI model line optimized to run on Huawei's Ascend AI accelerators rather than primarily NVIDIA hardware.

Chinese tech giants Tencent, Alibaba, and ByteDance are reportedly racing to secure Ascend 950 and 950PR chips following the V4 launch, according to Reuters. Production remains constrained because US export controls continue to limit China's access to advanced chipmaking. Huawei aims to ship approximately 750,000 Ascend 950PR units during 2026.

What this means for leaders

The permanent pricing structure changes the equation for any organization paying full price for Western AI models. CIOs and CFOs should consider several things:

Cost opportunity: If DeepSeek delivers comparable quality on relevant tasks, the savings are real. Test V4-Pro against your own workloads before drawing conclusions.

Vendor risk: As Chinese models permanently pressure Western pricing, the case for locking all AI traffic to a single supplier weakens. Multi-model sourcing provides negotiating leverage and flexibility.

Data governance and regulation: The EU and Norway have stricter data-handling requirements than China. Accessing DeepSeek from Europe is not as simple as making an API call from the US. Assess whether data sovereignty, privacy, and audit requirements prevent adoption.

Infrastructure fragmentation: With China's AI stack now running on Huawei chips rather than NVIDIA hardware, the global AI ecosystem is fragmenting. For organizations building their own AI infrastructure, or dependent on NVIDIA supply, this changes vendor power dynamics and pricing expectations.

This is no longer a promotion. It is a permanent pricing structure from a model claiming to be in the top tier — powered by a semiconductor stack outside the Western ecosystem.

Sources and media

Reuters: China's DeepSeek to make permanent 75% price cut on flagship V4-Pro AI model (May 23, 2026)
Bloomberg: DeepSeek to make permanent 75% discount on flagship AI model (May 23, 2026)
The Tech Portal: China's DeepSeek permanently cuts prices of flagship V4-Pro AI model by 75% (May 23, 2026)
Thumbnail: GPT/OpenAI Image 2 / hogby.ai
