Current AI tools like ChatGPT and Gemini can do some remarkable things, but solving even basic math problems isn't one of them. Researchers at Google, however, are making progress on improving the mathematical capabilities of such tools.
The researchers found that their two math programs could provide proofs for IMO puzzles as well as a silver medalist could. The programs solved two algebra problems and one number theory problem out of six in total. They solved one problem in minutes but took up to several days to figure out others. Google DeepMind has not disclosed how much computer power it threw at the problems.
Google DeepMind calls the approach used for both AlphaProof and AlphaGeometry “neuro-symbolic” because they combine the pure machine learning of an artificial neural network, the technology that underpins most progress in AI of late, with the language of conventional programming.
“What we’ve seen here is that you can combine the approach that was so successful in things like AlphaGo with large language models and produce something that is extremely capable,” says David Silver, the Google DeepMind researcher who led work on AlphaZero. Silver says the techniques demonstrated with AlphaProof should, in theory, extend to other areas of mathematics.
Ars Technica also reports on the Google research.
Despite Google's claims, Sir Timothy Gowers offered a more nuanced perspective on the Google DeepMind models in a thread posted on X. While acknowledging the achievement as "well beyond what automatic theorem provers could do before," Gowers pointed out several key qualifications.
"The main qualification is that the program needed a lot longer than the human competitors—for some of the problems over 60 hours—and of course much faster processing speed than the poor old human brain," Gowers wrote. "If the human competitors had been allowed that sort of time per problem they would undoubtedly have scored higher."
The New York Times reports (gift link) on other companies that are using different approaches to improving AI tools' math skills.
For more than a year, ChatGPT has used a similar workaround for some math problems. For tasks like large-number division and multiplication, the chatbot summons help from a calculator program.
Math is an “important ongoing area of research,” OpenAI said in a statement, and a field where its scientists have made steady progress. Its new version of GPT achieved nearly 64 percent accuracy on a public database of thousands of problems requiring visual perception and mathematical reasoning, the company said. That is up from 58 percent for the previous version.