Article Details

Title
Large genome model: Open source AI trained on trillions of bases
Impact Score
6 / 10
AI Summary (Processed Content)

The AI system Evo, originally trained on bacterial genomes to predict protein sequences, has been significantly upgraded to Evo 2. This new, open-source model was trained on trillions of DNA base pairs across all domains of life, including complex eukaryotes.

As a result, Evo 2 has learned to identify intricate genomic features in eukaryotes, such as regulatory DNA and splice sites, which are hard to detect because their sequence patterns are weak and scattered. This removes the original limitation that kept the method from being applied to more complex genome structures.

The main topics covered are the evolution of the Evo AI system, the structural differences between bacterial and eukaryotic genomes, and the new capabilities of Evo 2 in understanding complex genetic features.

Original URL
https://arstechnica.com/science/2026/03/large-genome-model-open-source-ai-trained-on-trillions-of-bases/
Source Feed
Ars Technica
Published Date
2026-03-04 22:14
Fetched Date
2026-03-04 23:31
Processed Date
2026-03-04 23:35
Embedding Status
Present
Cluster ID
Not Clustered
Raw Extracted Content