In a significant advance for pandemic preparedness, a new international scientific effort has demonstrated the power—and current limitations—of artificial intelligence in accelerating drug discovery against coronaviruses. A newly released preprint from the Open Molecular Software Foundation (OMSF) and collaborators details the results of the ASAP-Polaris-OpenADMET Challenge, a blind community competition focused on pan-coronavirus drug discovery.
Backed by the NIH’s Antiviral Drug Discovery (AViDD) program, the challenge invited researchers worldwide to test machine learning models on previously unreleased drug discovery data targeting the main proteases (Mpro) of SARS-CoV-2 and MERS-CoV—enzymes essential to coronavirus replication. The effort was co-organized by the AI-driven Structure-enabled Antiviral Platform (ASAP), the Polaris benchmarking platform, and the OpenADMET project (funded by ARPA-H), and attracted 66 participating teams from academia, industry, and government.
What the Challenge Tested
The challenge focused on three core tasks that mirror real-world antiviral drug development:
How potent are candidate drugs?
Teams predicted the strength of molecular inhibition against SARS-CoV-2 and MERS-CoV Mpro targets.
How safe and drug-like are these compounds?
Participants modeled drug metabolism and related properties grouped under ADMET (absorption, distribution, metabolism, excretion, and toxicity).
How do the molecules bind?
Competitors predicted the 3D binding poses of small molecules in the active site of viral enzymes.
All predictions were made against a held-out, blinded test set compiled from ASAP’s real drug discovery campaign, ensuring rigorous, unbiased evaluation.
Key Findings: What Worked, What Didn’t
Machine Learning Can Approach Lab-Level Precision
Top-performing AI models predicted molecular potency (how well a compound blocks a viral enzyme) with nearly lab-level precision. The best models had average errors of just half a log unit, roughly a threefold difference in measured potency, which is within the range of typical experimental variability. This suggests that, with high-quality data, machine learning can support critical decisions about which drug candidates to synthesize and test.
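To make "half a log unit" concrete: potency is commonly reported on a negative log10 scale such as pIC50, so an error of 0.5 log units corresponds to roughly a 3.2-fold difference in IC50. The short Python sketch below computes a mean absolute error on that scale. The values are invented for illustration, the function names are hypothetical, and the preprint's exact scoring procedure may differ.

```python
# Illustrative only: the numbers below are invented, not challenge data.

def mean_absolute_error(measured, predicted):
    """Average absolute difference between paired values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)


def fold_error(log_units):
    """Convert an error in log10 units into a fold difference in concentration."""
    return 10 ** log_units


# Hypothetical pIC50 values (negative log10 of IC50 in molar units).
measured_pic50 = [6.0, 7.0, 5.0, 8.0]
predicted_pic50 = [6.5, 6.5, 5.5, 7.5]

mae = mean_absolute_error(measured_pic50, predicted_pic50)
print(f"MAE: {mae:.2f} log units")
print(f"Fold error in IC50: {fold_error(mae):.1f}x")
```

Because the scale is logarithmic, an error that looks small in log units can still represent a severalfold difference in concentration, which is why it is judged against the spread seen between repeated lab measurements.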
Some Drug-Like Properties Are Easier to Predict Than Others
AI models performed well in predicting some pharmacokinetic properties—especially lipophilicity (LogD) and cell permeability, both important for oral drugs. However, solubility and liver clearance proved harder to predict accurately, likely due to sparse or noisy training data. This highlights a need for better datasets and modeling approaches for these challenging endpoints.
Structure-Based Predictions Are Advancing—but Not Fully Reliable Yet
When asked to predict how drug molecules fit inside their viral targets (ligand poses), some models—especially AI-powered “co-folding” methods—performed remarkably well, correctly predicting the binding mode in over 80% of cases. Still, performance varied across targets and compound types, and some physics-based and deep-learning approaches failed to generate accurate poses, pointing to room for further refinement.
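A pose "success rate" of this kind is usually a fraction of predictions that fall within some distance of the experimentally determined pose. The Python sketch below assumes a root-mean-square deviation (RMSD) cutoff of 2 Å, a common convention in docking evaluation that the article itself does not state. It also assumes atoms are already matched and in the same coordinate frame, with no symmetry correction. All coordinates and function names are hypothetical.

```python
import math

# Illustrative only: coordinates are invented, and the 2.0 angstrom cutoff
# is an assumed convention, not a value taken from the challenge.


def rmsd(coords_a, coords_b):
    """RMSD between two matched lists of (x, y, z) atom positions."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("inputs must be non-empty and the same length")
    squared = sum(
        (a - b) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for a, b in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def success_rate(pairs, threshold=2.0):
    """Fraction of (predicted, reference) poses with RMSD at or below threshold."""
    hits = sum(1 for predicted, reference in pairs if rmsd(predicted, reference) <= threshold)
    return hits / len(pairs)


reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
close_pose = [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)]  # shifted by 0.5 angstrom
far_pose = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]    # shifted by 3.0 angstrom

pairs = [(close_pose, reference), (far_pose, reference)]
print(f"Success rate: {success_rate(pairs):.0%}")
```

A single cutoff keeps the metric easy to compare across methods, but it hides how far off the failures are, so per-target breakdowns remain useful.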
Why This Matters for National Health Security
This challenge directly supports efforts to improve preparedness for future coronavirus outbreaks and other viral threats. The ability to rapidly evaluate antiviral candidates using AI can shorten the drug discovery timeline—from years to months—and enable more agile responses during emerging pandemics.
For public health professionals, policymakers, and biodefense strategists, this work offers a model for integrating AI and open science into national countermeasure pipelines. The ASAP Discovery Consortium’s commitment to equitable access, which combines early open science with protective IP strategies, demonstrates how innovation and global health equity can go hand in hand.
The use of blinded, real-world data rather than synthetic benchmarks also sets a new gold standard for evaluating computational methods. It ensures models are tested under conditions that mimic true drug development environments, providing insights that are both scientifically robust and operationally meaningful.
A Template for Future Collaboration
The challenge was powered by Polaris, a benchmarking platform purpose-built for drug discovery. It facilitated seamless submission, scoring, and community engagement. Participants used shared tools, APIs, and Discord-based support to iterate and improve their models over the course of the competition.
Importantly, all final datasets and evaluation metrics were made public, and many participants have published their approaches in a dedicated special issue of the Journal of Chemical Information and Modeling.
The initiative also supports the broader OpenADMET project, which seeks to build open models and datasets for predicting drug properties—a cornerstone of the ARPA-H mission.
Looking Ahead
While no single model proved universally superior, the challenge identified clear best practices—including the value of pretraining on large chemical datasets, multi-task learning, and ensemble modeling. It also exposed gaps in current approaches, especially for complex endpoints like solubility and metabolic clearance.
As the ASAP Discovery Consortium moves toward clinical development of its lead pan-coronavirus candidate, this challenge demonstrates how community-led, open-science efforts can contribute to faster, more trustworthy therapeutic development. Similar challenges are planned for future phases of the OpenADMET initiative.
MacDermott-Opeskin H, Scheen J, Wognum C, et al. A Computational Community Blind Challenge on Pan-Coronavirus Drug Discovery Data. ChemRxiv [Preprint]; July 2025
Edited by Stephanie Lizotte