We ran 10,000 representative prompts through both APIs and broke down the cost per million tokens. The results might surprise you.
Everyone in AI knows DeepSeek is cheaper than OpenAI. How much cheaper depends heavily on your workload, so we measured the total cost of producing equivalent-quality outputs across five task categories.
Test methodology
We used a dataset of 10,000 prompts split across five categories: code generation (25%), summarization (25%), multi-step reasoning (20%), creative writing (20%), and data extraction (10%). We measured the cost of each model's responses and scored output quality with a GPT-4 judge.
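To make the split concrete, here is a minimal sketch that turns the stated percentages into prompt counts. The category shares and the 10,000 total come from the methodology above; the names `CATEGORY_SHARES` and `prompts_per_category` are ours, chosen for illustration, and are not part of any benchmark tooling.

```python
# Category shares as stated in the methodology above.
CATEGORY_SHARES = {
    "code generation": 0.25,
    "summarization": 0.25,
    "multi-step reasoning": 0.20,
    "creative writing": 0.20,
    "data extraction": 0.10,
}

TOTAL_PROMPTS = 10_000


def prompts_per_category(total, shares):
    """Return the number of prompts in each category for a given total."""
    return {name: round(total * share) for name, share in shares.items()}


counts = prompts_per_category(TOTAL_PROMPTS, CATEGORY_SHARES)
for name, n in counts.items():
    print(f"{name}: {n}")
```

That works out to 2,500 prompts each for code generation and summarization, 2,000 each for multi-step reasoning and creative writing, and 1,000 for data extraction.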
Results
For code generation, DeepSeek R1 cost 94% less than GPT-4o per task while matching GPT-4o's quality in 89% of cases. For summarization, quality was essentially indistinguishable at a 91% cost reduction. Multi-step reasoning was the closest category, and even there DeepSeek R1 came in 87% cheaper.
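For readers who want to run the same comparison on their own workload, the sketch below shows one way a per-task cost reduction like the figures above could be computed. It assumes the usual per-million-token pricing with separate input and output rates. The function names are ours, and the token counts and prices in the example are placeholders, not the rates or measurements from this benchmark.

```python
def task_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Dollar cost of one task, given prices quoted per million tokens."""
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


def cost_reduction(baseline_cost, candidate_cost):
    """Fraction by which the candidate is cheaper than the baseline."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return 1 - candidate_cost / baseline_cost


# Placeholder token counts and prices, for illustration only.
baseline = task_cost(1_000, 500, input_price_per_m=10.0, output_price_per_m=30.0)
candidate = task_cost(1_000, 500, input_price_per_m=1.0, output_price_per_m=3.0)

print(f"{cost_reduction(baseline, candidate):.0%} cheaper per task")
```

Token counts are passed in per call because two models can consume different numbers of tokens on the same prompt, so comparing list prices alone can misstate the per-task difference.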
When to use OpenAI
OpenAI's tooling ecosystem, function calling reliability, and vision capabilities still lead in some specific cases. If you're doing complex multi-modal workflows that require GPT-4V or need the latest GPT-4o features, the premium may be justified.
For everything else, which covers most workloads, DeepSeek R1 via Nova is the rational default.
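As a rough illustration, that rule of thumb can be written as a small routing function. This is a sketch only: the flag names and the returned labels are hypothetical, not real API model identifiers or part of Nova's API.

```python
def choose_model(needs_vision: bool, needs_latest_gpt4o_features: bool) -> str:
    """Pick a model label using the rule of thumb from this section.

    The returned strings are illustrative labels, not API model identifiers.
    """
    if needs_vision or needs_latest_gpt4o_features:
        return "gpt-4o"
    return "deepseek-r1"


# A text-only workload falls through to the default.
print(choose_model(needs_vision=False, needs_latest_gpt4o_features=False))
```

In practice you would extend the conditions with whatever requirements matter for your own workload, such as how much you rely on function calling.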
Raj Patel
Lead Engineer at Nova