* AI models solved math problems by processing them using natural language
* AI could soon tackle unsolved research problems, says math professor and former champion
* OpenAI self-published results before official verification
By Kenrick Cai and Jaspreet Singh
July 21 (Reuters) - Alphabet's Google and
OpenAI said their artificial-intelligence models won gold medals
at a global mathematics competition, signaling a breakthrough in
math capabilities in the race to build powerful systems that can
rival human intelligence.
The results marked the first time that AI systems crossed the
gold-medal scoring threshold at the International Mathematical
Olympiad for high-school students. Both companies' models solved
five of the six problems using general-purpose "reasoning"
models that process mathematical concepts in natural language,
in contrast to the approaches AI firms had previously relied on.
The achievement suggests AI is less than a year away from
being used by mathematicians to crack unsolved research problems
at the frontier of the field, according to Junehyuk Jung, a math
professor at Brown University and visiting researcher in
Google's DeepMind AI unit.
"I think the moment we can solve hard reasoning problems in
natural language will enable the potential for collaboration
between AI and mathematicians," Jung told Reuters.
The same idea can apply to research quandaries in other
fields such as physics, said Jung, who won an IMO gold medal as
a student in 2003.
Of the 630 students participating in the 66th IMO on the
Sunshine Coast in Queensland, Australia, 67 contestants, or
about 11%, achieved gold-medal scores.
Google's DeepMind AI unit last year achieved a silver medal
score using AI systems specialized for math. This year, Google
used a general-purpose model called Gemini Deep Think, a version
of which was previously unveiled at its annual developer
conference in May.
Unlike previous AI attempts that relied on formal languages
and lengthy computation, Google's approach this year operated
entirely in natural language and solved the problems within the
official 4.5-hour time limit, the company said in a blog post.
OpenAI, which has its own set of reasoning models, similarly
built an experimental version for the competition, according to
a post by researcher Alexander Wei on social media platform X.
He noted that the company does not plan to release anything with
this level of math capability for several months.
This year marked the first time the competition coordinated
officially with some AI developers, who have for years used
prominent math competitions like IMO to test model capabilities.
IMO judges certified the results of those companies, including
Google, and asked them to publish results on July 28.
"We respected the IMO Board's original request that all AI
labs share their results only after the official results had
been verified by independent experts and the students had
rightly received the acclamation they deserved," Google DeepMind
CEO Demis Hassabis said on X on Monday.
However, OpenAI, which did not work with the IMO,
self-published its results on Saturday, making it the first
among AI firms to claim gold-medal status.
In turn, the competition on Monday allowed cooperating
companies to publish results, Gregor Dolinar, president of IMO's
board, told Reuters.