SAN FRANCISCO, June 18 (Reuters) - SandboxAQ, an
artificial intelligence startup spun out of Alphabet's
Google and backed by Nvidia (NVDA), on Wednesday released a
trove of data it hopes will speed up the discovery of new
medical treatments by helping scientists understand how drugs
stick to proteins.
The goal is to help scientists predict whether a drug will
bind to its target in the human body.
But while the data is backed up by real-world scientific
experiments, it did not come from a lab. Instead, SandboxAQ,
which has raised nearly $1 billion in venture capital, generated
the data using Nvidia's chips and will feed it back into AI
models. It hopes scientists can use those models to rapidly
predict whether a small-molecule pharmaceutical will bind to the
protein that researchers are targeting, a key question that must
be answered before a drug candidate can move forward.
For example, if a drug is meant to inhibit a biological
process like the progression of a disease, scientists can use
the tool to predict whether the drug molecule is likely to bind
to the proteins involved in that process.
The approach is part of an emerging field that combines
traditional scientific computing techniques with advancements in
AI. In many
fields, scientists have long had equations that can precisely
predict how atoms combine into molecules.
But even for relatively small three-dimensional
pharmaceutical molecules, the potential combinations become far
too vast to calculate directly, even with today's fastest
computers. So SandboxAQ's approach was to use existing
experimental data to calculate about 5.2 million new,
"synthetic" three-dimensional molecules - molecules that haven't
been observed in the real world, but were calculated with
equations based on real-world data.
That synthetic data, which SandboxAQ is releasing publicly,
can be used to train AI models that predict whether a new drug
molecule is likely to stick to the protein researchers are
targeting. Such models could do so in a fraction of the time a
direct calculation would take, while retaining accuracy.
SandboxAQ will charge for its own AI models developed with the
data, which it hopes will deliver results that rival those of
lab experiments, but virtually.
"This is a long-standing problem in biology that we've all,
as an industry, been trying to solve for," Nadia Harhen, general
manager of AI simulation at SandboxAQ, told Reuters on Tuesday.
"All of these computationally generated structures are tagged to
a ground-truth experimental data, and so when you pick this data
set and you train models, you can actually use the synthetic
data in a way that's never been done before."