
Researchers built an ‘AI Scientist’ — what can it do?

Could science be fully automated? A team of machine-learning researchers has now tried.
‘AI Scientist’, created by a team at the Tokyo-based company Sakana AI and at laboratories in Canada and the United Kingdom, can perform the full cycle of research, from scanning the literature on a topic and formulating hypotheses, to trying out solutions and writing a paper. AI Scientist even does some of the job of peer reviewers and evaluates its own results.
AI Scientist joins a slew of efforts to automate parts of the scientific process using artificial intelligence (AI) agents. “To my knowledge, no one has yet done the total scientific research cycle, all in one system,” says AI Scientist co-creator Cong Lu, a machine-learning researcher at the University of British Columbia in Vancouver, Canada. The results1 were posted on the arXiv preprint server last month.
“It’s impressive that they’ve done this end-to-end,” says Jevin West, a computational social scientist at the University of Washington in Seattle. “And I think we should be playing around with these ideas, because there could be potential for helping science.”
The output has not been earth-shattering so far, and the system can do research only in the field of machine learning itself. In particular, AI Scientist lacks what most scientists would consider a crucial part of doing science — the ability to do lab work. “There’s still a lot of work to go from AI that makes a hypothesis to implementing that in a robot scientist,” says Gerbrand Ceder, a materials scientist at Lawrence Berkeley National Laboratory in California. Still, Ceder adds, “if you look into the future, I have zero doubt in my mind that this is where much of science will go”.
AI Scientist is based on a large language model (LLM). It uses a machine-learning algorithm to search the literature for similar work. To generate and refine ideas, the team used evolutionary computation, a technique inspired by the mutations and natural selection of Darwinian evolution: it applies small, random changes to an algorithm and selects those that improve the model’s efficiency.
To do so, AI Scientist conducts its own ‘experiments’ by running the algorithms and measuring how well they perform. At the end, it produces a paper, and evaluates it in a sort of automated peer review. After ‘augmenting the literature’ this way, the algorithm can then start the cycle again, building on its own results.
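The loop described above — mutate, measure, keep what improves — can be sketched in a few lines of Python. This is a toy illustration of evolutionary search in general, not Sakana’s actual code; the one-dimensional ‘score’ function and all names here are invented for demonstration:

```python
import random

def evolve(score, candidate, mutate, generations=200, seed=0):
    """Simple evolutionary search: apply small random mutations to a
    candidate and keep only those that improve its score."""
    rng = random.Random(seed)
    best, best_score = candidate, score(candidate)
    for _ in range(generations):
        child = mutate(best, rng)            # small, random change
        child_score = score(child)           # 'experiment': measure performance
        if child_score > best_score:         # selection: keep improvements only
            best, best_score = child, child_score
    return best, best_score

# Toy stand-in for "model efficiency": maximise -(x - 3)^2, optimum at x = 3.
score = lambda x: -(x - 3.0) ** 2
mutate = lambda x, rng: x + rng.gauss(0, 0.1)

best, best_score = evolve(score, 0.0, mutate)
```

In AI Scientist itself, the ‘candidate’ would be a machine-learning algorithm and the ‘experiment’ a training run, but the select-the-improvement logic is the same.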
The authors admit that the papers that AI Scientist has produced contained only incremental developments. Some other people were scathing in their comments on social media. “As an editor of a journal, I would likely desk-reject them. As a reviewer, I would reject them,” said one commenter on the online forum Hacker News.
West says that the authors took a reductive view of how researchers learn about the current state of their field. A lot of what academics know comes from going to conferences or chatting to colleagues at the water cooler, for example, as opposed to a literature review. “Science is more than a pile of papers,” says West. “You can have a 5-minute conversation that will be better than a 5-hour study of the literature.”
West’s colleague at Washington, Shahan Memon, agrees — but both West and Memon praise the authors for having made their code and results fully open. This has enabled them to analyse AI Scientist’s results. They’ve found, for example, that it has a ‘popularity bias’, favouring papers with high citation counts. Memon and West say that they are also looking into measuring whether AI Scientist’s choices were the most relevant ones.
AI Scientist is, of course, not the first attempt at automating parts of the job of a researcher. The dream of automating scientific discovery is as old as artificial intelligence itself — dating back to the 1950s, says Tom Hope, a computer scientist at the Allen Institute for AI based in Jerusalem. Almost a decade ago, for example, the ‘Automatic Statistician’2 was able to analyse sets of data and write papers. Ceder and his colleagues have even automated some bench work: the ‘robot chemist’ they unveiled last year can synthesize new materials and experiment with them3.
Hope says that current LLMs “are not able to formulate novel and useful scientific directions beyond basic superficial combinations of buzzwords”. Still, Ceder says that, even if AI won’t be able to do the more creative part of the work any time soon, it could still automate the repetitive aspects of research. “At the low level, you’re trying to analyse what something is, how something responds. That’s not the creative part of science, but it’s 90% of what we do.” Lu says that he got similar feedback from a lot of other researchers. “People will say, I have 100 ideas that I don’t have time for. Get the AI Scientist to do those.”
Lu says that, to broaden AI Scientist’s capabilities — even to abstract fields beyond machine learning, such as pure mathematics — it might need to include techniques beyond LLMs. Work by Google’s London-based DeepMind, for example, has shown the power of combining LLMs with ‘symbolic’ AI, which builds logical rules into a system rather than merely relying on it learning from statistical patterns in data, to solve mathematics problems. But the current iteration is just a start, he says. “We really believe this is the GPT-1 of AI science,” he says, referring to an early LLM built by OpenAI in San Francisco, California.
The results feed into a debate that is at the top of many researchers’ concerns these days, says West. “All my colleagues in different sciences are trying to figure out, where does AI fit in in what we do? It does force us to think what is science in the twenty-first century — what it could be, what it is, what it is not,” he says.