DeepSeek V4
DeepSeek's flagship V3 successor, served as V4-Pro (1M context) and V4-Flash, whose 75% price cut became the permanent list price on May 22, 2026. Per [DeepSeek V4-Pro 75% Price Cut Goes Permanent](https://codersera.com/blog/deepseek-v4-pro-permanent-price-cut-may-2026/).
Summary
DeepSeek's flagship V3 successor, served as V4-Pro (1M context) and V4-Flash, whose 75% price cut became the permanent list price on May 22, 2026. Per [DeepSeek V4-Pro 75% Price Cut Goes Permanent](https://codersera.com/blog/deepseek-v4-pro-permanent-price-cut-may-2026/).
It is a model-class node on the cost-of-inference axis of the site. Its permanent price floor ($0.435/$0.87 per MTok for Pro) makes token-heavy patterns — agentic retrieval, validators, large-batch data processing on S3-backed corpora — economically viable, and it sets a reference price competitors must answer.
- The flagship API name is V4-Pro, not bare "V4"; V4-Flash is a separate, cheaper tier — don't conflate their prices.
- "Permanent" is the publisher's framing for a non-expiring list price, not a contractual guarantee; verify live rates before quoting.
- The cache-hit price ($0.003625/M Pro, $0.0028/M Flash) only applies to cached input, not all input — blended cost depends on cache-hit ratio.
- Competitor figures (GPT-5.5, Opus 4.7) come from third-party analysis, not DeepSeek's docs; treat multipliers as reported, not official.
extendsDeepSeek V3 — V4 is the direct successor flagship line.solvesHigh Cloud Inference Cost — its permanent floor is the clearest 2026 datapoint on collapsing inference cost.optimizes_forPerformance-per-Dollar — positioned explicitly on price/capability against US frontier models.competes_withGPT-5.5 / Claude Opus — third-party pricing puts it ~11-34x cheaper per token.
Definition
DeepSeek V4 is DeepSeek's flagship model family and the successor to V3, served via the first-party API in two tiers: V4-Pro (1M context, thinking and non-thinking modes) and the cheaper V4-Flash. On May 22, 2026 DeepSeek made its 75% V4-Pro discount permanent — converting a promotion that was set to expire May 31, 2026 into the standing list price. Per [DeepSeek V4-Pro 75% Price Cut Goes Permanent](https://codersera.com/blog/deepseek-v4-pro-permanent-price-cut-may-2026/).
Cheap, capable inference is the economic precondition for agent-heavy data infrastructure — agentic retrieval (DCI), validators, and LLM-assisted catalog/data systems all burn tokens. By publishing a non-expiring price floor, V4 directly drives down the cost-of-inference for self-hosted-adjacent AI-data pipelines. Per [DeepSeek's steep V4-Pro price cut escalates AI pricing war (InfoWorld)](https://www.infoworld.com/article/4176709/deepseeks-steep-v4-pro-price-cut-escalates-ai-pricing-war.html).
Cost-sensitive high-volume inference, agentic retrieval and search loops, batch document/data processing, long-context (1M) reasoning, LLM-assisted data and catalog tooling.
Recent developments
- V4 preview ships open weights at 1.6T parameters — within 0.2 points of the closed US frontier. The V4 preview (April 24, 2026) released V4-Pro at 1.6 trillion total parameters (49B active) plus a 284B V4-Flash, both as open weights — making V4-Pro the largest open-weights model available to download and self-host. It scores 80.6% on SWE-bench Verified (within 0.2 pts of Claude Opus 4.6), 93.5% on LiveCodeBench, a Codeforces rating of 3206, and a 1M-token context. DeepSeek frames it as trailing state-of-the-art closed models by only 3–6 months — the capability half of the pricing story below: frontier-adjacent quality you can run behind your own data-residency boundary. Per the DeepSeek V4-Pro complete guide.
- 75% V4-Pro price cut made permanent on May 22, 2026. What was a promo expiring May 31 became the standing rate: $0.435/M input (cache miss), $0.87/M output, $0.003625/M cached input. Per Models & Pricing | DeepSeek API Docs.
- V4-Flash tier published at a still-lower point. $0.14/M input (cache miss), $0.28/M output, $0.0028/M cached input. Per Models & Pricing | DeepSeek API Docs.
- Roughly an order of magnitude under US frontier APIs. V4-Pro is reported ~11.5x cheaper than GPT-5.5 on input and ~34.5x cheaper on output, and ~28.7x cheaper per output token than Claude Opus 4.7 ($0.435/$0.87 vs GPT-5.5 $5/$30 and Opus 4.7 $5/$25). Per DeepSeek V4-Pro 75% Price Cut Goes Permanent.
- Framed as an efficiency pass-through, not a loss-leader. Analysts characterize it as "an efficiency gain being passed through," which is why it is permanent rather than promotional, pressuring OpenAI, Anthropic, and Google. Per DeepSeek's steep V4-Pro price cut escalates AI pricing war (InfoWorld).
Connections 8
Outbound 7
scoped_to1extends1implements2optimizes_for1solves1competes_with1Inbound 1
competes_with1Resources 3
Authoritative first-party pricing for V4-Pro and V4-Flash (input cache-hit/miss and output).
Reputable trade-press confirmation of the permanent framing and pricing-war context.
Provides the May 22, 2026 date and the head-to-head multipliers vs GPT-5.5 and Claude Opus 4.7.