Confidence Score Gaps

AlphaFold 3 can produce persuasive interaction models whose confidence scores may miss large errors, especially in novel or flexible cases.

On this page

  • What confidence scores appear to promise
  • Where benchmarking finds hidden structural errors
  • How researchers should treat predictions as hypotheses

Introduction

AlphaFold 3 can generate molecular interaction models that look remarkably precise. Proteins appear neatly docked to drugs, antibodies seem to fit their targets cleanly, and confidence maps often suggest that the AI “knows” the answer. That visual authority is part of why the system has generated such excitement in drug discovery and structural biology. But one of the most important lessons from early benchmarking is that confidence scores are not the same thing as biological truth.

In some cases, AlphaFold 3 assigns high confidence to structures that later turn out to contain major errors, especially in flexible proteins, novel interaction types, poorly represented chemistries, or systems that can adopt multiple conformations. Researchers increasingly treat the model not as an oracle, but as a fast hypothesis generator whose outputs still require experimental testing. That distinction matters well beyond one AI tool. If advanced AI is to accelerate medicine and scientific discovery as part of a broader “AI bloom”, scientists need systems that not only produce persuasive answers, but also communicate uncertainty honestly.

What confidence scores appear to promise

AlphaFold 3 inherited and extended several confidence metrics from earlier AlphaFold systems. The best known is pLDDT, a score estimating confidence in local atomic positions on a 0–100 scale. High values are intended to indicate that the predicted structure is likely reliable. The system also uses metrics such as predicted aligned error (PAE) and interface confidence estimates for molecular interactions. EMBL-EBI guidance describes pLDDT values above 90 as highly confident, while values below 50 often indicate unreliable regions. [ebi.ac.uk]
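As a rough illustration of how the pLDDT scale is usually read, the sketch below sorts per-residue scores into bands. The 90 and 50 cut-offs come from the guidance above; the 70 cut-off follows the convention used by the AlphaFold Protein Structure Database. The function names, band labels and example scores are made up for illustration and are not part of any AlphaFold tool.

```python
from collections import Counter

BANDS = ("very_high", "confident", "low", "very_low")  # illustrative labels


def plddt_band(score: float) -> str:
    """Map one per-residue pLDDT value (0-100) to a confidence band."""
    if not 0.0 <= score <= 100.0:
        raise ValueError(f"pLDDT must be in [0, 100], got {score}")
    if score > 90:
        return "very_high"
    if score > 70:
        return "confident"
    if score > 50:
        return "low"
    return "very_low"


def summarise_bands(scores: list[float]) -> dict[str, float]:
    """Return the fraction of residues that fall in each band."""
    if not scores:
        raise ValueError("need at least one score")
    counts = Counter(plddt_band(s) for s in scores)
    return {band: counts.get(band, 0) / len(scores) for band in BANDS}


# Made-up scores for an eight-residue toy example.
example_scores = [95.2, 91.0, 88.4, 72.5, 61.3, 44.0, 38.9, 93.7]
print(summarise_bands(example_scores))
```

A summary like this only describes what the model reports about itself. It says nothing about whether the residues in the top band are actually placed correctly.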

These metrics are genuinely useful. AlphaFold’s success partly came from the fact that its confidence estimates often correlate reasonably well with experimental accuracy. Researchers can rapidly identify likely stable regions and deprioritise obviously weak predictions. That ability dramatically reduces wasted effort compared with blind modelling.

The problem is subtler: confidence scores can appear more trustworthy than they really are.

AlphaFold 3 produces detailed atomic models with smooth geometries, chemically plausible interactions and colour-coded certainty maps. For non-specialists, and sometimes even specialists, this creates a strong psychological impression that the model has “solved” the interaction. In reality, the confidence metric only estimates how self-consistent the prediction is relative to patterns learned during training. It does not independently verify that the biology is correct.

That distinction becomes especially important in exactly the kinds of frontier problems that matter most for future drug discovery: novel targets, flexible binding states, intrinsically disordered proteins, allosteric regulation and unusual chemistries.

Where benchmarking finds hidden structural errors

Independent benchmarking studies have repeatedly found cases where AlphaFold 3 looks confident while still producing materially incorrect structures.

A 2025 benchmarking study across multiple biomolecular datasets found that although AlphaFold 3 improved local structural accuracy over AlphaFold 2, gains in global accuracy were often limited and some interaction categories remained difficult. Protein multimers, antibody-antigen systems and nucleic-acid-related interactions still showed significant weaknesses. [OUP Academic] [PMC]

Another evaluation using the SKEMPI protein interaction database found that some AlphaFold 3 complex structures contained “large errors” not captured by the model’s interface confidence metrics. The same study concluded that predictions involving intrinsically flexible regions or domains were not reliably assessed by the confidence system. [ACS Publications]
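The kind of check behind such findings can be sketched in a few lines: take predictions that have a known experimental structure, and list those where the interface confidence is high but the measured error is still large. Everything here is an assumption made for illustration, including the record fields, the 0.8 confidence cut-off, the 4.0 Å error cut-off and the example values. It is not the protocol used in the cited study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkEntry:
    complex_id: str              # hypothetical identifier
    interface_confidence: float  # 0-1, higher means the model is more confident
    interface_rmsd: float        # angstroms, measured against an experimental structure


def confident_but_wrong(entries, min_confidence=0.8, max_rmsd=4.0):
    """Return entries the model scored as confident whose measured error is large."""
    return [
        e for e in entries
        if e.interface_confidence >= min_confidence and e.interface_rmsd > max_rmsd
    ]


# Made-up benchmark records.
entries = [
    BenchmarkEntry("complex_A", 0.91, 1.2),  # confident and accurate
    BenchmarkEntry("complex_B", 0.88, 7.5),  # confident but far from experiment
    BenchmarkEntry("complex_C", 0.42, 9.1),  # inaccurate, but flagged as uncertain
]

for entry in confident_but_wrong(entries):
    print(entry.complex_id, entry.interface_confidence, entry.interface_rmsd)
```

The third record is wrong too, but the low score already warns the reader. The second is the worrying case, because nothing in the score signals a problem.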

This matters because flexibility is not a minor edge case in biology. Many proteins only become functional when changing shape. Others fluctuate between multiple biologically relevant conformations. Drug molecules may stabilise one state while suppressing another. Immune recognition often depends on transient geometries rather than rigid lock-and-key fits.

AlphaFold 3, however, still tends to output a single dominant structure. EMBL-EBI training materials explicitly note that the system predicts static structures and does not fully capture the dynamic behaviour of molecules in solution. [ebi.ac.uk]

In practice, this can produce a dangerous combination:

  • a visually convincing structure
  • a high confidence score [x.com]
  • but an incorrect biological interpretation

Researchers studying autoinhibited proteins (proteins that switch between inactive and active forms) found that AlphaFold systems struggled with conformational diversity because training data over-represent stable states captured in crystallographic databases. [Nature]

Similarly, studies on proteins with alternative folds have argued that AlphaFold can produce high-confidence predictions that contradict experimental evidence when proteins adopt unusual or underrepresented conformations. [arXiv]

Flexible and disordered biology remains especially hard

One recurring failure mode involves intrinsically disordered regions: protein segments that do not maintain one stable structure.

These regions are central to cell signalling, gene regulation and disease biology. They are also unusually difficult for AI structure predictors because they violate the assumption that one sequence maps cleanly to one stable shape.

AlphaFold often marks disordered regions with low confidence, which can be useful. But newer studies suggest the opposite problem can also occur: the system may generate highly confident structures for regions that are experimentally known to remain disordered. [arXiv]
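One simple way to look for this second failure mode is to compare per-residue pLDDT against regions that experiments have annotated as disordered. The sketch below assumes the annotations are available as inclusive, 1-based residue ranges (for instance, exported from a disorder database) and uses a pLDDT of 70 as an illustrative cut-off. The function name and the example data are hypothetical.

```python
def confident_in_disorder(plddt, disordered_regions, threshold=70.0):
    """Return 1-based residue numbers that are annotated as disordered
    but predicted with pLDDT at or above `threshold`."""
    flagged = []
    for start, end in disordered_regions:  # inclusive, 1-based ranges
        for resnum in range(start, end + 1):
            in_range = 1 <= resnum <= len(plddt)
            if in_range and plddt[resnum - 1] >= threshold:
                flagged.append(resnum)
    return flagged


# Made-up per-residue scores for an eight-residue toy protein.
plddt = [92.0, 90.5, 85.0, 40.2, 35.8, 88.1, 91.3, 45.0]
# Hypothetical annotation: residues 4-7 are experimentally disordered.
disordered_regions = [(4, 7)]

print(confident_in_disorder(plddt, disordered_regions))
```

Residues flagged this way are not automatically wrong, since some disordered segments fold on binding. They are, however, places where the apparent order deserves a second look before anyone interprets it.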

That creates a serious interpretive risk. Scientists may mistake AI-generated order for genuine biological structure simply because the output looks coherent and precise.

This is not merely a technical inconvenience. Some of the most medically important biological systems involve disorder, flexibility and transient interactions:

  • cancer signalling pathways
  • transcription factors
  • immune recognition
  • viral-host interactions
  • allosteric drug regulation

These are precisely the systems where an AI-driven acceleration of biology could have enormous long-term value for human health and longevity. But they are also systems where overconfidence can misdirect experiments, waste resources or encourage premature conclusions.

Novel chemistry exposes another weakness

AlphaFold 3 performs best when new problems resemble patterns already present in training data.

That is true for most machine learning systems, but biology makes the issue especially consequential because the highest-value discoveries often involve genuinely novel chemistry.

Several studies have warned that AlphaFold-style systems may partially rely on memorised statistical regularities rather than deeper physical understanding of molecular energetics. Accuracy can fall sharply when evaluating interactions unlike those seen during training. [Nexco] [Wikipedia]

This becomes especially important in drug discovery.

A pharmaceutical researcher does not mainly care whether AlphaFold can reproduce familiar interactions already well represented in structural databases. They care whether it can predict new binding modes, unusual chemistries or previously unknown targets.

Benchmarking work on protein-ligand systems has shown that AlphaFold 3 still struggles with allosteric systems and certain binding-pocket configurations. [Nature]

In some cases, the model generates chemically plausible but incorrect docking arrangements that receive relatively strong confidence scores. Researchers have described these as “hallucinations”: outputs that look realistic but are not physically correct. [Drug Discovery Trends]

The danger is not that scientists blindly trust every prediction. Structural biologists are generally cautious. The larger risk is subtler:

  • visually persuasive AI outputs can shape research priorities
  • confidence scores can narrow perceived uncertainty too early
  • institutions may overestimate how automated molecular discovery has become

That distinction matters for public narratives around AI-enabled scientific acceleration.

Why this matters for the broader AI bloom argument

AlphaFold 3 is often presented as evidence that AI could dramatically accelerate medicine, biotechnology and eventually human flourishing itself. In many ways, it genuinely supports that case. Predicting molecular interactions faster and more cheaply could reduce years of experimental work, expand access to structural biology and help researchers explore diseases that previously lacked detailed molecular maps.

But the confidence-score problem reveals something important about the current stage of AI progress.

Scientific acceleration is not simply about generating more answers. It is about generating reliable knowledge under uncertainty.

AlphaFold 3 demonstrates that AI systems can compress enormous amounts of biological pattern recognition into highly useful predictions. At the same time, it shows that scientific reasoning still depends heavily on experimental validation, physical interpretation and careful treatment of uncertainty.

This tempers some of the more exaggerated narratives around “solving biology”. Even extraordinarily capable AI systems may remain uneven across domains where:

  • training data are sparse
  • physical dynamics matter
  • multiple states coexist
  • or novel chemistry exceeds historical examples

That does not weaken the broader possibility of AI-enabled abundance and scientific flourishing. In some ways it strengthens it by clarifying where future advances are still needed. The path toward radically accelerated science is likely to involve combinations of:

  • generative AI models
  • laboratory robotics
  • molecular simulation
  • high-throughput experimentation
  • and improved uncertainty estimation

rather than one model simply replacing experimental science.

How researchers increasingly treat AlphaFold predictions

The emerging norm in structural biology is to treat AlphaFold 3 outputs as powerful hypotheses rather than final answers.

Researchers increasingly combine AI predictions with:

  • cryo-electron microscopy
  • X-ray crystallography
  • molecular dynamics simulations
  • mutational experiments
  • biochemical assays
  • and orthogonal computational methods

Confidence scores are useful guides, but they are now interpreted alongside broader biological context.

For example:

  • a high-confidence prediction in a rigid conserved protein family may deserve substantial trust
  • the same score in a flexible signalling complex or novel ligand system may deserve far more scepticism
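The idea that one score can warrant different levels of trust can be written down as a small triage rule. This is a toy sketch only: the context flags, the cut-off of 70 and the wording of the recommendations are assumptions chosen for illustration, not an established protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictionContext:
    conserved_rigid_family: bool  # well-charted, rigid protein family
    flexible_complex: bool        # e.g. a flexible signalling complex
    novel_ligand: bool            # chemistry unlike well-studied examples


def review_level(confidence: float, ctx: PredictionContext) -> str:
    """Suggest how much scrutiny a prediction needs before it guides experiments.

    `confidence` is on a 0-100 scale, like pLDDT.
    """
    if confidence < 70:
        return "low confidence: treat as speculative"
    if ctx.flexible_complex or ctx.novel_ligand:
        return "confident score, risky context: validate before relying on it"
    if ctx.conserved_rigid_family:
        return "confident score, well-charted context: reasonable working model"
    return "confident score, unclear context: seek supporting evidence"


rigid = PredictionContext(conserved_rigid_family=True, flexible_complex=False, novel_ligand=False)
flexible = PredictionContext(conserved_rigid_family=False, flexible_complex=True, novel_ligand=False)

# The same score leads to different recommendations in different contexts.
print(review_level(92.0, rigid))
print(review_level(92.0, flexible))
```

The risky-context check runs before the rigid-family check on purpose, so that a flexible or novel system is never waved through just because part of it belongs to a familiar family.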

Some groups are also developing refined confidence metrics that better account for flexible interfaces and partially ordered interactions. [arXiv]

This reflects a broader lesson likely to matter across future AI-enabled science. As AI systems become more capable, the central challenge may shift from producing plausible outputs to calibrating confidence correctly.

A system that occasionally says “I do not know” may ultimately accelerate science more safely and effectively than one that always produces a polished answer.
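Calibration can be checked whenever predictions have been scored against experiment. The sketch below groups predictions into confidence bins and compares the average stated confidence in each bin with the fraction that turned out correct. It then summarises the gap as an expected calibration error, a general-purpose measure from the machine learning literature that is used here for illustration and is not taken from the studies cited above. The confidences are assumed to be rescaled to 0–1, and the definition of "correct" and the example data are made up.

```python
def reliability_table(confidences, correct, n_bins=5):
    """Bin predictions by stated confidence (0-1) and report, per non-empty bin:
    (bin_low, bin_high, count, mean_confidence, observed_accuracy)."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # a confidence of 1.0 goes in the top bin
        bins[index].append((conf, ok))
    rows = []
    for i, members in enumerate(bins):
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        rows.append((i / n_bins, (i + 1) / n_bins, len(members), mean_conf, accuracy))
    return rows


def expected_calibration_error(rows):
    """Count-weighted average gap between stated confidence and observed accuracy."""
    total = sum(count for _, _, count, _, _ in rows)
    return sum(count * abs(mean_conf - acc) for _, _, count, mean_conf, acc in rows) / total


# Made-up example: six predictions and whether each matched experiment.
confidences = [0.95, 0.92, 0.90, 0.85, 0.55, 0.30]
correct = [True, False, True, False, True, False]

rows = reliability_table(confidences, correct)
for row in rows:
    print(row)
print(expected_calibration_error(rows))
```

In a well-calibrated system the two numbers in each row would be close. A top bin whose observed accuracy sits far below its mean confidence is the pattern this article describes.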

The deeper lesson: persuasive AI is not the same as solved science

AlphaFold 3 remains one of the most important AI systems ever built for biology. It has already changed how many scientists approach molecular structure prediction, and it may contribute to major advances in medicine over time. [Nature] [blog.google]

But its confidence gaps are equally important to understand.

The system’s outputs often look more definitive than the underlying evidence justifies. High-confidence predictions can still conceal incorrect interfaces, missed conformational states or biologically unrealistic interactions. The risk grows in exactly the frontier domains where future medical breakthroughs are most needed.

That tension captures a larger truth about advanced AI and scientific progress. AI can massively expand humanity’s ability to search possibility space, generate hypotheses and compress scientific labour. Yet scientific understanding still depends on reality pushing back through experiment, replication and physical constraints.

For advocates of an AI-enabled human bloom, AlphaFold 3 is therefore both an encouraging signal and a cautionary one. It shows that AI can already amplify scientific capability dramatically. But it also shows that genuine knowledge, especially in complex living systems, remains harder than generating convincing predictions.

Endnotes

  1. Source: ebi.ac.uk
    Link: https://www.ebi.ac.uk/training/online/courses/alphafold/alphafold-3-and-alphafold-server/how-to-assess-the-quality-of-alphafold-3-predictions/
    Source snippet

    It uses a 0-100 scale, where higher values indicate higher confidence.Read more...

  2. Source: academic.oup.com
    Link: https://academic.oup.com/bib/article/26/6/bbaf616/8351050
    Source snippet

    OUP Academiccomprehensive benchmarking of the AlphaFold3 for predicting...by C Peng · 2025 · Cited by 7 — In this work, we benchmark Alp...

  3. Source: nature.com
    Link: https://www.nature.com/articles/s41467-025-67127-3
    Source snippet

    NatureBenchmarking all-atom biomolecular structure prediction...by S Xu · 2025 · Cited by 21 — We find that AlphaFold 3 leads overall, y...

  4. Source: pubs.acs.org
    Link: https://pubs.acs.org/doi/10.1021/acs.jcim.4c00976
    Source snippet

    ACS PublicationsEvaluation of AlphaFold 3's Protein–Protein Complexes for...by JJ Wee · 2024 · Cited by 79 — In this work, we evaluate A...

  5. Source: arxiv.org
    Link: https://arxiv.org/abs/2406.03979
    Source snippet

    arXivBenchmarking AlphaFold3's protein-protein complex accuracy and machine learning prediction reliability for binding free energy chang...

  6. Source: ebi.ac.uk
    Link: https://www.ebi.ac.uk/training/online/courses/alphafold/alphafold-3-and-alphafold-server/introducing-alphafold-3/what-alphafold-3-struggles-with/
    Source snippet

    What AlphaFold 3 struggles withA key limitation of protein structure prediction models is that they typically predict static structures a...

  7. Source: nature.com
    Link: https://www.nature.com/articles/s42004-025-01763-0
    Source snippet

    NatureChallenging AlphaFold in predicting proteins with large-...by BH Perkins-Jechow · 2025 · Cited by 6 — Here, we benchmarked AlphaFo...

  8. Source: arxiv.org
    Link: https://arxiv.org/abs/2410.14898
    Source snippet

    arXivProteins with alternative folds reveal blind spots in AlphaFold-based protein structure predictionOctober 18, 2024...

    Published: October 18, 2024

  9. Source: arxiv.org
    Link: https://arxiv.org/html/2510.15939v2
    Source snippet

    (a) DisProt shows order in the residue, but AF3 predicts the residue with low confidence...Read more...

  10. Source: nexco.ch
    Title: The Limitations of Protein Ligand Co folding with Alpha Fold 3, Unveiled
    Link: https://nexco.ch/blog/The-Limitations-of-Protein-Ligand-Co-folding-with-AlphaFold-3%2C-Unveiled
    Source snippet

    NexcoThe Limitations of Protein-Ligand Co-folding with...Nov 17, 2025 — In brief, these analyses suggest that while AlphaFold 3 is defin...

  11. Source: Wikipedia
    Title: Alpha Fold
    Link: https://en.wikipedia.org/wiki/AlphaFold
    Source snippet

    AlphaFoldAlphaFold is an artificial intelligence (AI) program developed by DeepMind, a subsidiary of Alphabet, which performs predicti...

  12. Source: nature.com
    Link: https://www.nature.com/articles/s41467-025-63947-5
    Source snippet

    Nature, 630,493–500 (2024). Krishna, R. et al. Generalized...Read more...

  13. Source: nature.com
    Link: https://www.nature.com/articles/s41467-024-48837-6
    Source snippet

    Structure prediction of protein-ligand complexes from...by P Bryant · 2024 · Cited by 85 — Here we develop an AI system that can predict...

  14. Source: arxiv.org
    Link: https://arxiv.org/abs/2412.15970
    Source snippet

    arXivactifpTM: a refined confidence metric of AlphaFold2 predictions involving flexible regionsDecember 20, 2024...

    Published: December 20, 2024

  15. Source: nature.com
    Link: https://www.nature.com/articles/s41586-024-07487-w
    Source snippet

    NatureAccurate structure prediction of biomolecular interactions...by J Abramson · 2024 · Cited by 14701 — Here we describe our AlphaFol...

  16. Source: blog.google
    Title: Alpha Fold 3 predicts the structure and interactions of all
    Link: https://blog.google/innovation-and-ai/products/google-deepmind-isomorphic-alphafold-3-ai-model/
    Source snippet

    AlphaFold 3 predicts the structure and interactions of all...May 8, 2024 — Our new AI model AlphaFold 3 can predict the structure and in...

    Published: May 8, 2024

  17. Source: ebi.ac.uk
    Link: https://www.ebi.ac.uk/training/online/courses/alphafold/an-introductory-guide-to-its-strengths-and-limitations/what-is-alphafold/
    Source snippet

    What is AlphaFold?AlphaFold2 is a multicomponent artificial intelligence (AI) system that uses machine learning to predict a protein's 3D...

  18. Source: pubs.acs.org
    Link: https://pubs.acs.org/doi/10.1021/acs.jcim.5c00906
    Source snippet

    Prediction of Alternate Frame Folding Systems with...Jul 27, 2025 — In this work, we use a family of green fluorescent proteins engineer...

  19. Source: pubs.acs.org
    Link: https://pubs.acs.org/doi/10.1021/acs.jcim.5c01084
    Source snippet

    is a Comprehensive Benchmarking Framework...12 Aug 2025 — Metrics for protein–peptide complex structure prediction.... Benchmarking Alp...

  20. Source: nature.com
    Link: https://www.nature.com/articles/s41586-021-03819-2
    Source snippet

    Highly accurate protein structure prediction with AlphaFoldby J Jumper · 2021 · Cited by 49425 — The AlphaFold network directly predicts...

  21. Source: nature.com
    Link: https://www.nature.com/articles/s41586-024-07487-w_reference.pdf
    Source snippet

    257. We note model limitations of AlphaFold 3 with respect to stereochemistry, hallucinations. 258 dynamics, and accuracy...Read more...

  22. Source: arxiv.org
    Link: https://arxiv.org/html/2508.18446v1
    Source snippet

    AlphaFold 3 as a Differentiable Framework for Structural...25 Aug 2025 — Indeed, even AlphaFold's impressive performance falters for pro...

  23. Source: academic.oup.com
    Link: https://academic.oup.com/bib/article/26/4/bbaf324/8190210
    Source snippet

    A key challenge in protein engineering is understanding how mutations affect protein fitness and stability...

  24. Source: academic.oup.com
    Link: https://academic.oup.com/pcm/article/8/3/pbaf015/8180385
    Source snippet

    3: an unprecedent opportunity for fundamental...by Z Fang · 2025 · Cited by 36 — This limitation restricts the application of AF3 in the...

  25. Source: drugdiscoverytrends.com
    Link: https://www.drugdiscoverytrends.com/meet-alphafold-3-which-can-accurately-model-more-than-99-of-molecular-types-in-the-protein-data-bank/
    Source snippet

    Drug Discovery TrendsAlphaFold 3 offers even more accurate protein structure...8 May 2024 — One of the key challenges in computational s...

    Published: May 2024

  26. Source: deepmind.google
    Link: https://deepmind.google/science/alphafold/
    Source snippet

    Google DeepMindAlphaFold — Google DeepMindAlphaFold has revealed millions of intricate 3D protein structures, and is helping scientists u...

  27. Source: alphafold.ebi.ac.uk
    Link: https://alphafold.ebi.ac.uk/
    Source snippet

    Protein Structure DatabaseAlphaFold is an AI system developed by Google DeepMind that predicts a protein's 3D structure from its amino ac...

  28. Source: alphafold.ebi.ac.uk
    Title: ebi.ac.uk FA Qs
    Link: https://alphafold.ebi.ac.uk/faq
    Source snippet

    AlphaFold Protein Structure DatabaseRegions with pLDDT between 50 and 70 are low confidence and should be treated with caution.... For p...

  29. Source: deepmind.google
    Title: alphafold five years of impact
    Link: https://deepmind.google/blog/alphafold-five-years-of-impact/
    Source snippet

    AlphaFold: Five Years of ImpactNov 25, 2025 — Explore five years of AlphaFold's impact on biology. Learn how this Nobel Prize-winning AI...

Additional References

  1. Source: researchgate.net
    Link: https://www.researchgate.net/publication/398103542_A_comprehensive_benchmarking_of_the_AlphaFold3_for_predicting_biomacromolecules_and_their_interactions
    Source snippet

    A comprehensive benchmarking of the AlphaFold3 for...5 Dec 2025 — In this work, we benchmark AlphaFold3's performance across nine datase...

  2. Source: medium.com
    Link: https://medium.com/%40cognidownunder/alphafold-changed-biology-forever-when-it-solved-protein-folding-78bb8768483a
    Source snippet

    AlphaFold 3 Predicts Everything Now, Not Just Proteins...AlphaFold changed biology forever when it solved protein folding. Now AlphaFold...

  3. Source: alphafoldserver.com
    Link: https://alphafoldserver.com/
    Source snippet

    AlphaFold ServerAlphaFold Server is a web-service that can generate highly accurate biomolecular structure predictions containing protein...

  4. Source: creative-biostructure.com
    Link: https://www.creative-biostructure.com/alphafold3-accurate-molecular-interaction-prediction.htm?srsltid=AfmBOoqAlFPWyvCVKU6_vG1gPbTJV9t-x6BFgA9_M10jDgNaJB5t4aiZ
    Source snippet

    AlphaFold3: Accurate Structure Prediction of Molecular...A fundamental limitation of AF3 is its focus on predicting static structures, w...

  5. Source: x.com
    Link: https://x.com/BiologyAIDaily/status/1941486774834037247
    Source snippet

    An Evaluation of Biomolecular Energetics Learned by...Importantly, AlphaFold's confidence scores (pLDDT) were high even for residues wit...

  6. Source: semanticscholar.org
    Link: https://www.semanticscholar.org/paper/Assessing-scoring-metrics-for-AlphaFold2-and-Genz-Nair/8e81365097d4eea07a8a6fe5d3df35615271cc7c
    Source snippet

    Assessing scoring metrics for AlphaFold2 and AlphaFold3...The new C2Qscore developed in this study improves the reliability of AlphaFold...

  7. Source: github.com
    Link: https://github.com/google-deepmind/alphafold
    Source snippet

    Open source code for AlphaFold 2.This package provides an implementation of the inference pipeline of AlphaFold v2. For simplicity, we re...

  8. Source: frontiersin.org
    Link: https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2026.1739303/full
    Source snippet

    The transformative impact of AI-enabled AlphaFold 3by C Chakraborty — The model achieved approximately 76% accuracy in predicting protein...

  9. Source: medium.com
    Link: https://medium.com/data-science/sparks-of-chemical-intuition-and-gross-limitations-in-alphafold-3-8487ba4dfb53
    Source snippet

    “Sparks of Chemical Intuition”—and Gross Limitations!“Sparks of Chemical Intuition”—and Gross Limitations!—in AlphaFold 3. Observations a...

  10. Source: 3decision.discngine.com
    Link: https://3decision.discngine.com/blog/2024/8/8/evaluating-protein-protein-interactions-in-af3-predicted-complexes-a-pd-1-case-study
    Source snippet

    protein-protein interactions in AF3 predicted...by E Martino — The latest release of AlphaFold (AF3) has addressed some limitations of t...
