Photo: Gorodenkoff Productions OU/Getty Images
Silicon Valley giant NVIDIA announced a partnership with the Arc Institute and Stanford University to launch a new foundation model, Evo 2, that understands the genetic code for all domains of life.
The model, which is being touted as the largest publicly available AI model for genomic data, was built using the NVIDIA DGX Cloud platform on Amazon Web Services (AWS) in collaboration with Arc and Stanford.
According to NVIDIA, Evo 2 is accessible to global developers on the NVIDIA BioNeMo platform, including as an NVIDIA NIM microservice for easy and safe AI deployment.
NVIDIA says Evo 2 was trained on a dataset of nearly 9 trillion nucleotides, the building blocks of DNA and RNA, and "can be applied to biomolecular research applications including predicting the form and function of proteins based on their genetic sequence, identifying novel molecules for healthcare and industrial applications and evaluating how gene mutations affect their function."
Additionally, the NVIDIA NIM microservice for Evo 2 allows users to generate an assortment of biological sequences, with settings to adjust model parameters.
Meanwhile, developers who are interested in fine-tuning Evo 2 on their proprietary datasets can download the model via the open-source NVIDIA BioNeMo Framework, which is a collection of accelerated computing tools for biomolecular research.
"Evo 2 represents a major milestone for generative genomics," Patrick Hsu, Arc Institute cofounder and core investigator and assistant professor of bioengineering at the University of California, Berkeley, said in a statement.
"By advancing our understanding of these fundamental building blocks of life, we can pursue solutions in healthcare and environmental science that are unimaginable today."
Dave Burke, Arc’s chief technology officer, said that deploying a model like Evo 2 is like sending a powerful new telescope out to the farthest reaches of the universe.
"We know there’s immense opportunity for exploration, but we don’t yet know what we’re going to discover," Burke said in a statement.
The Arc Institute helps researchers take on long-term scientific challenges by providing multiyear funding, which lets scientists focus on creative research instead of grant writing.
THE LARGER TREND
In 2022, NVIDIA expanded its portfolio of products for healthcare with the launch of Clara Holoscan MGX, a tool designed to help medical device organizations develop artificial intelligence tools. The new technology was created to help industry players meet regulatory standards.
The platform builds on NVIDIA's previously launched Clara Holoscan, which was developed to give industry stakeholders a computational infrastructure to stream data from medical devices. According to the company, Clara Holoscan MGX is able to process "high-throughput data streams for real-time insights."
In 2021, NVIDIA teamed up with pharma company AstraZeneca and the University of Florida on new artificial intelligence research projects that were aimed at boosting drug discovery and patient care.
NVIDIA and AstraZeneca revealed a new drug-discovery model called MegaMolBART, which is aimed at "reaction prediction, molecular optimization and de novo molecular generation."
MegaMolBART was deployed on NVIDIA's platform for computational drug discovery, Clara Discovery, and used a new kind of technology called transformer neural networks.
That same year, NVIDIA joined forces with Harvard University on an AI-based toolkit designed to give researchers greater access to, and insight into, DNA. Researchers were also able to run a whole genome analysis in 30 minutes.
The tool, AtacWorks, was able to identify specific sequencing data and pinpoint areas with easy-access DNA, meaning functional DNA that is not surrounded by proteins.
This month, the Arc Institute launched the Arc Virtual Cell Atlas, a resource for computation-ready single-cell measurements, beginning with data from more than 300 million cells.
The initial release of the Atlas is Arc's first step toward assembling, curating and generating large-scale cellular data to fuel new insights from AI-driven biological discovery.