New Delhi | Apr 26, 2026 11:11 AM IST
More than a year after taking the AI world by storm, Chinese AI startup DeepSeek has unveiled two versions of its all-new large language model (LLM), called DeepSeek V4 Flash and DeepSeek V4 Pro.
Both the Flash and Pro models are open-weight models with context windows of over 1 million tokens each, which allows users to input large documents or entire codebases in prompts. The Pro model has a total of 1.6 trillion parameters (49 billion active), reportedly making it the biggest open-weight model available.
It is larger than Moonshot AI’s Kimi K2.6 model (1.1 trillion parameters) and MiniMax’s M1 (456 billion parameters), and is more than double the size of its predecessor, DeepSeek V3.2 (671 billion parameters).
The Flash model, on the other hand, is the smaller of the two, with over 284 billion parameters (13 billion active). Both models have been released under research preview. Unlike many of its closed rival AI models, the V4 Flash and V4 Pro reportedly cannot be used to generate audio, video, and images, as they support text outputs only.
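As a back-of-the-envelope check on these figures, the short Python sketch below computes the share of parameters that is active in each V4 model and the size ratio between models. The parameter counts are the ones reported in this article; the dictionary and function names are illustrative and not part of any DeepSeek tooling.

```python
# Parameter counts as reported in the article, in billions.
TOTAL_PARAMS_B = {
    "DeepSeek V4 Pro": 1600,
    "DeepSeek V4 Flash": 284,
    "Kimi K2.6": 1100,
    "MiniMax M1": 456,
    "DeepSeek V3.2": 671,
}
ACTIVE_PARAMS_B = {
    "DeepSeek V4 Pro": 49,
    "DeepSeek V4 Flash": 13,
}


def active_fraction(model: str) -> float:
    """Share of a model's total parameters that is reported as active."""
    return ACTIVE_PARAMS_B[model] / TOTAL_PARAMS_B[model]


def size_ratio(model_a: str, model_b: str) -> float:
    """How many times larger model_a is than model_b, by total parameters."""
    return TOTAL_PARAMS_B[model_a] / TOTAL_PARAMS_B[model_b]


if __name__ == "__main__":
    for name in ACTIVE_PARAMS_B:
        print(f"{name}: {active_fraction(name):.1%} of parameters active")
    ratio = size_ratio("DeepSeek V4 Pro", "DeepSeek V3.2")
    print(f"V4 Pro is {ratio:.2f}x the size of V3.2")
```

On the reported numbers, only a few percent of each model's parameters are active at once, and the Pro model is a little under 2.4 times the size of V3.2.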
DeepSeek’s latest LLM arrives as a long-awaited upgrade to last year’s V3.2 and R1 reasoning model. That earlier launch shook markets and thrust DeepSeek into the spotlight, as it demonstrated that an open-weight model could compete with cutting-edge models from OpenAI and Google while using far fewer resources. Their debut challenged long-standing assumptions about training costs and performance while reshaping competition and pricing across the AI industry.
It marked a major inflection point in AI development, as the company found ways to extract more compute from less advanced Nvidia H20 GPUs by combining advanced machine learning techniques such as distillation, mixture-of-experts (MoE), and multi-head latent attention (MLA).
Benchmark performance, pricing
The DeepSeek V4 Flash and V4 Pro models use an MoE architecture, in which a router sends each input token to a small number of specialised ‘expert’ sub-networks instead of running the full model. This is why only a fraction of each model’s parameters is active at a time. Both models are more efficient and performant than DeepSeek V3.2 due to architectural improvements, according to DeepSeek. Unlike its predecessor models, which were trained on H20 GPUs, the V4 models run on the latest chips designed by Chinese chipmaker Huawei, even as shipments of Nvidia’s H200 GPUs to the country are reportedly stymied by disagreements between China and the US over the terms of the sales.
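To make the expert-delegation idea concrete, here is a minimal Python sketch of a generic top-k MoE layer. It is a toy illustration under stated assumptions, not DeepSeek's implementation: the article does not describe the V4 router, so the softmax gating, the `top_k=2` default, and the scalar "experts" are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=2):
    """Pick the top_k experts for one token and renormalise their weights.

    Returns a dict mapping expert index -> gate weight (weights sum to 1).
    """
    probs = softmax(router_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    norm = sum(probs[i] for i in chosen)
    return {i: probs[i] / norm for i in chosen}


def moe_layer(token, experts, router_logits, top_k=2):
    """Weighted sum of the selected experts' outputs.

    Experts that are not selected are never called, which is the source
    of the gap between total and active parameters in MoE models.
    """
    gates = route_token(router_logits, top_k)
    return sum(weight * experts[i](token) for i, weight in gates.items())


if __name__ == "__main__":
    # Toy experts (assumption): each one just scales its input.
    experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
    logits = [0.1, 2.0, -1.0, 1.5]  # hypothetical router scores for one token
    print(route_token(logits))
    print(moe_layer(1.0, experts, logits))
```

In this toy run, only two of the four experts are evaluated for the token; a production model applies the same principle across many more experts and layers.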
In terms of performance, DeepSeek claimed that its new V4-Pro-Max model outperforms open-weight peers across reasoning benchmarks, and outpaces OpenAI’s GPT-5.2 and Gemini 3.0 Pro on some tasks. On certain coding benchmarks, DeepSeek said both V4 models’ performance was “akin to GPT-5.4.”
However, the models appear to trail slightly behind frontier models such as OpenAI’s GPT-5.4 and Google’s latest Gemini 3.1 Pro when it comes to knowledge tests. This is because of a “developmental trajectory that trails state-of-the-art frontier models by roughly 3 to 6 months,” the company said.
Notably, DeepSeek has sought to maintain its cost advantage by making its V4 pair of models more affordable than any frontier model available today.
The smaller V4 Flash model costs $0.14 per million input tokens and $0.28 per million output tokens, undercutting GPT-5.4 Nano, Gemini 3.1 Flash, GPT-5.4 Mini, and Claude Haiku 4.5. Meanwhile, the larger V4 Pro model costs $0.145 per million input tokens and $3.48 per million output tokens, coming in lower than Gemini 3.1 Pro, GPT-5.5, Claude Opus 4.7, and GPT-5.4.
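To see what these per-million-token rates mean for a single request, the Python sketch below estimates the cost of one call. The prices are the ones quoted in this article; the `request_cost` helper and the example token counts are illustrative assumptions, and real bills may differ (for instance, if a provider applies caching discounts or other tiers that the article does not mention).

```python
# Prices in US dollars per million tokens, as quoted in the article.
PRICES_USD_PER_MILLION = {
    "V4 Flash": {"input": 0.14, "output": 0.28},
    "V4 Pro": {"input": 0.145, "output": 3.48},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD of one request at the quoted list prices."""
    price = PRICES_USD_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a full 1M-token prompt with a 10,000-token reply.
    for model in PRICES_USD_PER_MILLION:
        cost = request_cost(model, input_tokens=1_000_000, output_tokens=10_000)
        print(f"{model}: ${cost:.4f}")
```

For this hypothetical workload, the input tokens dominate the Flash bill, while on Pro the much higher output rate makes long replies the main cost driver.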
© IE Online Media Services Pvt Ltd


