CoreWeave Delivers Leading Inference Performance in MLPerf® Benchmark
Latest submissions featuring NVIDIA Grace Blackwell architectures demonstrate how
This press release features multimedia. View the full release here: https://www.businesswire.com/news/home/20260401967118/en/
CoreWeave leads MLPerf v6.0, doubling performance and delivering top results.
The AI industry is undergoing a fundamental shift, with inference as its new focal point. As enterprises move AI from experimentation into production and agentic workloads become the standard, inference has emerged as the critical measure of performance. At the same time, demand for inference is growing faster than the underlying hardware can be deployed, and the gap between theoretical system performance and real-world output has become a defining constraint on how quickly AI companies can grow. CoreWeave's MLPerf v6.0 results reflect the company's continued investment in full-stack optimization, consistently turning cutting-edge hardware into real-world inference performance.
"Inference is the defining layer in AI. It's where models are actually put to work and where performance in production shows up. Benchmarks like MLPerf help measure how theoretical performance translates into real-world output," said
CoreWeave’s v6.0 submissions reflected NVIDIA’s reference configurations as a verified, production-ready baseline across two of the most demanding reasoning models available: DeepSeek-R1 and GPT-OSS-120B. Key results include:
- Continued NVIDIA GB200 NVL72 Leadership: Led performance for DeepSeek-R1 in server and offline modes in tokens per second per GPU¹. The GB200 NVL72 configuration demonstrated standout throughput on DeepSeek-R1’s sparse Mixture-of-Experts architecture, where efficient serving requires dynamic expert routing and high-bandwidth inter-node communication.
- NVIDIA GB300 NVL72 Portfolio Leadership: Delivered leading server throughput (tokens per second per GPU) and per-GPU efficiency in the portfolio on DeepSeek-R1, doubling CoreWeave’s own MLPerf® v5.1 results on the same hardware footprint².
- Innovation at Speed: Today, eight of the top 10 model providers rely on CoreWeave Cloud, enabling customers to innovate at speed.
"The gap between benchmark performance and production reality has been one of the most persistent challenges in AI,” said
CoreWeave’s MLPerf v6.0 results add to the company’s standing as the only AI cloud to earn the top Platinum ranking in both SemiAnalysis ClusterMAX™ 1.0 and 2.0, which evaluate AI cloud performance, efficiency, and reliability. These benchmark results reflect CoreWeave’s platform strategy: delivering infrastructure purpose-built for the demands of production AI, from high-performance compute through the software layer that builders depend on to develop, test, and deploy at scale.
About CoreWeave
CoreWeave is The Essential Cloud for AI™. Built for pioneers by pioneers, CoreWeave delivers a platform of technology, tools, and teams that enables innovators to move at the pace of innovation, building and scaling AI with confidence. Established in 2017, CoreWeave completed its public listing on Nasdaq (CRWV) in
¹ CoreWeave MLPerf 6.0-0022, server and offline modes. TPS/GPU is not an official MLPerf metric; it is used in this article to normalize submissions that use different numbers of GPUs.
² Verified MLPerf score of v5.1 Inference Closed DeepSeek-R1 server. Retrieved from https://mlcommons.org/benchmarks/inference,
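As the footnote above notes, TPS/GPU is a derived metric: aggregate tokens per second divided by the number of GPUs in the submission, so that systems of different sizes can be compared head-to-head. A minimal sketch of that normalization, with purely illustrative figures (not actual MLPerf results):

```python
def tokens_per_second_per_gpu(total_tps: float, num_gpus: int) -> float:
    """Normalize aggregate throughput by GPU count so submissions
    that use different numbers of GPUs can be compared directly."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return total_tps / num_gpus

# Hypothetical submissions with different GPU counts (illustrative only):
submissions = {
    "72-GPU rack-scale system": {"total_tps": 288_000.0, "num_gpus": 72},
    "8-GPU node": {"total_tps": 36_000.0, "num_gpus": 8},
}
for name, s in submissions.items():
    tps_gpu = tokens_per_second_per_gpu(s["total_tps"], s["num_gpus"])
    print(f"{name}: {tps_gpu:.0f} tokens/s/GPU")
```

Dividing by GPU count is what makes a 72-GPU rack-scale result and an 8-GPU node result comparable on a per-accelerator basis, which is why the article reports leadership in this normalized form.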
View source version on businesswire.com: https://www.businesswire.com/news/home/20260401967118/en/
Source: