DeepMind’s LLM Just Solved an Unsolvable Math Problem

Google’s AI division, DeepMind, has used an LLM-powered tool called FunSearch to solve one of the world’s most famous unsolved math problems.

In a paper published in Nature, researchers from the AI firm said it is the first time a large language model has been used to solve a long-standing puzzle by producing verifiable and valuable new insights that did not previously exist.

“When we started the project there was no indication that it would produce something genuinely new,” said Pushmeet Kohli, the head of AI for science at DeepMind.

“As far as we know, this is the first time that a genuine, new scientific discovery has been made by a large language model. It’s not in the training data—it wasn’t even known.”

As part of the research, DeepMind scientists tested FunSearch on two puzzles. The first was a challenge in pure mathematics known as the cap set problem, which is about finding the largest set of points in space where no three points form a straight line.

FunSearch was able to solve this by churning out programs that generate new large cap sets that go beyond the best that mathematicians have come up with.
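To make the "no three points on a line" condition concrete, here is a minimal Python sketch of a cap set checker. It assumes the standard formulation, which the article does not spell out: points are vectors whose coordinates are 0, 1 or 2, and three distinct points lie on a line exactly when their coordinates add up to a multiple of 3 in every position. The function name is_cap_set is illustrative and is not taken from the FunSearch paper.

```python
from itertools import combinations


def is_cap_set(points):
    """Return True if no three distinct points are collinear.

    Assumption: each point is a tuple with entries in {0, 1, 2}, and
    arithmetic is done modulo 3 (the usual cap set setting).
    """
    pts = [tuple(p) for p in points]
    if len(set(pts)) != len(pts):
        return False  # a set cannot contain the same point twice
    for a, b, c in combinations(pts, 3):
        # Three distinct points are on one line iff every coordinate
        # of a + b + c is divisible by 3.
        if all((x + y + z) % 3 == 0 for x, y, z in zip(a, b, c)):
            return False
    return True
```

For example, the four points (0, 0), (0, 1), (1, 0) and (1, 1) pass the check, while adding (2, 2) breaks it because (0, 0), (1, 1) and (2, 2) line up. A checker like this is what makes candidate cap sets automatically verifiable.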

The second puzzle was the bin packing problem, which is about finding the best ways to pack items of different sizes into containers. The problem is typically tackled with one of two rules of thumb: packing each item either into the first bin that has room or into the bin with the least available space where the item will still fit.

But, according to the researchers’ paper, FunSearch found a better approach by avoiding leaving small gaps that were unlikely to ever be filled.
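The two conventional rules mentioned above are usually called first fit and best fit. The Python sketch below shows both under simple assumptions: items arrive one at a time, every item fits in an empty bin, and sizes are integers. The function names are illustrative, and the heuristic that FunSearch itself discovered is not reproduced here.

```python
def first_fit(items, capacity):
    """Put each item into the first open bin that has room."""
    bins = []  # remaining free space in each open bin
    for item in items:
        for i, free in enumerate(bins):
            if item <= free:
                bins[i] -= item
                break
        else:
            # No open bin had room, so open a new one.
            bins.append(capacity - item)
    return len(bins)


def best_fit(items, capacity):
    """Put each item into the fullest open bin that can still hold it."""
    bins = []  # remaining free space in each open bin
    for item in items:
        candidates = [i for i, free in enumerate(bins) if item <= free]
        if candidates:
            tightest = min(candidates, key=lambda i: bins[i])
            bins[tightest] -= item
        else:
            bins.append(capacity - item)
    return len(bins)
```

On the items 4, 7, 3, 6 with bins of size 10, first_fit uses three bins while best_fit uses two, which shows how much the choice of rule can matter even on a tiny input.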

How does DeepMind's LLM FunSearch work?

FunSearch, short for “searching the function space,” uses an LLM called Codey to write solutions to maths problems in the form of computer programs. Codey is paired with an “evaluator” that automatically ranks the programs based on how well they perform.

The best programs are then combined and fed back to the LLM to improve on. Over a couple of million suggestions and a few dozen repetitions of the overall process, the system steadily evolves programs into more powerful ones that can make discoveries.

“Instead of generating a solution, FunSearch generates a program that finds the solution,” said Jordan Ellenberg, professor of mathematics at the University of Wisconsin-Madison, and co-author on the paper.

“A solution to a specific problem might give me no insight into how to solve other related problems. But a program that finds the solution, that’s something a human being can read and interpret and hopefully thereby generate ideas for the next problem and the next and the next.”
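The propose, score and feed-back loop described in this section can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not DeepMind's implementation: the LLM is replaced by a random perturbation (the hypothetical propose function), each "program" is reduced to a pair of weights for a bin-choosing rule, and the INSTANCES benchmark is made up for the example.

```python
import random

CAPACITY = 10
# Hypothetical benchmark instances used by the evaluator.
INSTANCES = [[4, 7, 3, 6], [5, 5, 4, 6, 3, 7], [2, 8, 3, 3, 4, 6, 1, 3]]


def pack(items, weights):
    """Pack items one by one into the bin with the highest priority score."""
    w_free, w_exact = weights
    bins = []  # remaining free space in each open bin
    for item in items:
        best_i, best_score = None, None
        for i, free in enumerate(bins):
            if item > free:
                continue
            left = free - item
            score = w_free * left + w_exact * (left == 0)
            if best_score is None or score > best_score:
                best_i, best_score = i, score
        if best_i is None:
            bins.append(CAPACITY - item)  # open a new bin
        else:
            bins[best_i] -= item
    return len(bins)


def evaluate(weights):
    """Evaluator: higher is better, so negate the total number of bins."""
    return -sum(pack(items, weights) for items in INSTANCES)


def propose(parent, rng):
    """Stand-in for the LLM (assumption): randomly perturb a parent."""
    return tuple(w + rng.uniform(-1.0, 1.0) for w in parent)


def search(rounds=20, proposals_per_round=30, keep=5, seed=0):
    rng = random.Random(seed)
    population = [(0.0, 0.0)]  # all bins tie, which behaves like first fit
    for _ in range(rounds):
        children = [propose(rng.choice(population), rng)
                    for _ in range(proposals_per_round)]
        # Rank old and new candidates together and keep only the best.
        ranked = sorted(population + children, key=evaluate, reverse=True)
        population = ranked[:keep]
    return population[0], evaluate(population[0])
```

Calling search() returns the best weight pair found and its score. Because the surviving candidates are always ranked alongside the new ones, the best score never gets worse from one round to the next, which mirrors the steady improvement the article describes.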

A new age for DeepMind LLMs

Large language models (LLMs) are the AI models behind gen AI tools like OpenAI’s ChatGPT and Google’s Bard. They’re not known for making discoveries or providing new facts, because they recombine information from their training data to generate responses rather than creating new information.

They’re also known to make things up from time to time, producing information that is factually incorrect or outdated, a behaviour that has become known as ‘hallucination’.

DeepMind’s researchers, however, are trying to change that. Their game-playing AI AlphaTensor found a way to speed up a calculation at the heart of many different kinds of code, beating a 50-year record. Its AlphaDev system also recently found ways to make key algorithms used trillions of times a day run faster.

Researchers are now exploring the scientific problems FunSearch can handle – but there are still big hurdles to overcome.

One of them, according to the paper, is that the problems need to have solutions that can be verified automatically, which rules out many questions in biology, where hypotheses often need to be tested with lab experiments.

Still, DeepMind’s researchers are excited about how the new tool will impact computer science.

“This is actually going to be transformational in how people approach computer science and algorithmic discovery,” said Kohli. “For the first time, we’re seeing LLMs not taking over, but definitely assisting in pushing the boundaries of what is possible in algorithms.”