Signal

Frontier AI Models Outperform Specialized Clinical Tools on Every Benchmark — With Implications Beyond Medicine

June 12, 2026
5 min read

A peer-reviewed study published in Nature Medicine finds that general-purpose frontier AI models — GPT-5.2, Gemini 3.1 Pro, and Claude Opus 4.6 — consistently outperform specialized clinical AI tools in medical settings. The findings carry a broader implication for any organization evaluating whether to buy purpose-built AI tools or rely on frontier models.

Researchers from NYU Langone Health ran three evaluations: 500 US Medical Licensing Examination-style questions (MedQA), 500 clinician-alignment items (HealthBench), and 100 real physician queries (RCQ) drawn from live clinical deployments. The clinical tools evaluated were OpenEvidence and UpToDate Expert AI — both built on large language models and designed specifically for medical use.

Frontier models won across all three evaluations. On MedQA, Gemini scored 97.4%, GPT 94.2%, and Claude 90.2%, compared with 89.6% for OpenEvidence and 88.4% for UpToDate. On HealthBench, GPT scored 88.0 versus 62.6 and 61.3 for the clinical tools. On real physician queries, clinical tools had 49–87% lower odds of receiving a higher clinician rating than Gemini. Google Search AI Overview matched, but did not exceed, the clinical AI tools in the real-world query evaluation.
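To make the "49–87% lower odds" figure easier to read, the short sketch below converts a "percent lower odds" statement into the equivalent odds ratio. The function names are illustrative and the 1:1 baseline odds are a hypothetical chosen only for the arithmetic; neither comes from the study, and the study's own statistical model may differ.

```python
def odds_ratio_from_percent_lower(percent_lower: float) -> float:
    """Convert 'X% lower odds' into the equivalent odds ratio, 1 - X/100."""
    if not 0 <= percent_lower < 100:
        raise ValueError("percent_lower must be in [0, 100)")
    return 1.0 - percent_lower / 100.0


def probability_from_odds(odds: float) -> float:
    """Convert odds (p / (1 - p)) back into a probability p."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1.0 + odds)


# Hypothetical baseline: assume the reference model has 1:1 odds
# (a 50% chance) of receiving the higher rating. This value is an
# assumption for illustration, not a number reported in the article.
BASELINE_ODDS = 1.0

# The two endpoints of the range quoted in the article.
for percent_lower in (49, 87):
    ratio = odds_ratio_from_percent_lower(percent_lower)
    tool_probability = probability_from_odds(BASELINE_ODDS * ratio)
    print(
        f"{percent_lower}% lower odds -> odds ratio {ratio:.2f}, "
        f"probability {tool_probability:.3f} under the assumed baseline"
    )
```

Read this way, the quoted range corresponds to odds ratios of roughly 0.51 to 0.13 relative to Gemini. The probabilities printed depend entirely on the assumed baseline and should not be taken as results from the paper.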

Key Takeaways

  • Specialized AI tools did not outperform frontier models on medical knowledge, expert clinical alignment, or real-world physician queries.
  • Scale and alignment may outweigh domain-specific tuning for tasks that primarily involve knowledge retrieval and reasoning.
  • Procurement and regulatory implications: the authors call for independent evaluation of AI tools before clinical adoption — a principle that applies to AI procurement in any sector.

The study is open access and the code is publicly available at github.com/nyuolab/clinical-llm-benchmarks.

Read the full article on Nature Medicine