AI Models from Google DeepMind and OpenAI Secure Gold at International Mathematical Olympiad

In a landmark achievement, artificial intelligence systems from Google DeepMind and OpenAI have attained gold‑medal scores at this year’s International Mathematical Olympiad, a striking demonstration of how far AI has advanced in high‑level reasoning. For the first time, each system solved five of the six IMO problems, a score that places them firmly in the gold‑medal band alongside the top high‑school competitors.
Google’s entry, a general‑purpose reasoning model known as Gemini Deep Think, was officially entered, and its solutions were graded and verified by IMO coordinators. OpenAI, by contrast, opted not to enter officially but reported that its experimental reasoning system achieved an equivalent gold‑level score, with its proofs graded by three former IMO gold medallists. Both systems read the problems and write out complete proofs in ordinary mathematical language, a departure from earlier AI provers that depended heavily on formal symbol manipulation.
The significance of this breakthrough goes beyond historic firsts. According to Brown University mathematics professor Junehyuk Jung, an IMO veteran and visiting researcher with DeepMind, these developments suggest that AI could soon assist in addressing unsolved problems in research‑level mathematics. “The moment we can solve hard reasoning problems in natural language,” he explained, “we open doors to collaboration between AI and mathematicians.”
OpenAI’s strategy was notably resource‑intensive: it deployed an experimental model configured to maximize test‑time computation, running many reasoning chains in parallel and letting the model “think” for far longer than usual. That combination reached gold‑level accuracy, albeit at considerable computational cost. Though OpenAI withheld detailed hardware figures, the approach underscores a broader shift toward depth and rigor in AI reasoning.
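OpenAI has not published details of its setup, but "parallel reasoning chains" matches a well‑known pattern from the research literature, often called self‑consistency: sample several independent chains of thought, then keep the final answer the chains most often agree on. The sketch below is a minimal, hypothetical illustration of that pattern in Python, not OpenAI's actual method; query_model is an invented stand‑in for whatever inference call a real system would make.

    # Hypothetical sketch of parallel test-time compute via self-consistency:
    # sample several independent reasoning chains, then keep the final answer
    # that appears most often. Nothing here reflects OpenAI's undisclosed system.
    from collections import Counter
    from concurrent.futures import ThreadPoolExecutor
    import random

    def query_model(problem: str, seed: int) -> str:
        # Placeholder: a real implementation would call a language model
        # and extract the final answer from its chain of thought.
        rng = random.Random(seed)
        return rng.choice(["7", "7", "12"])  # toy candidate answers

    def solve_with_parallel_chains(problem: str, n_chains: int = 8) -> str:
        # Run n_chains independent "thinking" passes concurrently...
        with ThreadPoolExecutor(max_workers=n_chains) as pool:
            answers = list(pool.map(lambda seed: query_model(problem, seed),
                                    range(n_chains)))
        # ...then return the most frequent final answer (majority vote).
        return Counter(answers).most_common(1)[0][0]

    if __name__ == "__main__":
        print(solve_with_parallel_chains("toy problem"))

Majority voting is only one possible way to aggregate parallel chains; a system producing full proofs rather than short answers would more plausibly rank candidate solutions with some form of verifier, a detail neither lab has disclosed.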
The broader AI community is taking note. Systems that once made headlines by mastering games such as Go and poker are now proving capable in abstract intellectual challenges traditionally dominated by humans. The success of these reasoning models marks a pivotal evolution from narrowly focused agents to general‑purpose AI capable of tackling complex, multi‑step problems in natural language.
For academia and industry alike, the implications are vast. AI tools that can interpret problems, reason through them, and articulate mathematical proofs could transform scientific research, accelerating discoveries across fields that demand deep analytical insight, from physics and biology to economics.
AI firms have signaled plans to extend these capabilities further. OpenAI noted that the model behind its result may not be publicly available for months, while Google plans to make Gemini Deep Think available to a select group of partners ahead of a wider release.
Despite the optimism, experts emphasize caution: AI’s reasoning proficiency is advancing rapidly but remains imperfect. Careful validation, transparency and collaboration with the academic community will be essential to ensuring these systems augment rather than mislead human researchers.
Still, this milestone marks a turning point. When machines not only calculate but reason with mathematical sophistication comparable to human experts, a future of AI–human co‑creation in science moves from hypothetical to achievable. The gold‑medal performances at the IMO are proof, in more ways than one, that we’re entering a new chapter in intelligent computation.