OpenAI highlights GPT-5 scores on math, coding, and health benchmarks: 94.6% on AIME 2025 without tools, 74.9% on SWE-bench Verified, 46.2% on HealthBench Hard (Carl Franzen/VentureBeat)
12d ago
Technology
Techmeme

OpenAI has published benchmark results for its GPT-5 model covering mathematics, coding, and health. The model scored 94.6% on the AIME 2025 math competition without external tools, 74.9% on the SWE-bench Verified coding benchmark, and 46.2% on HealthBench Hard. OpenAI highlights these scores as evidence of the model's improved ability to handle complex tasks in each of these domains. The release follows prolonged anticipation and speculation about OpenAI's next-generation large language model (LLM).