AlphaFold 2, but not AlphaFold 3, predicts confident but unrealistic β-solenoid structures for repeat proteins

Olivia S Pratt; Luc G Elliott; Margaux Haon; Shahram Mesdaghi; Rebecca M Price; Adam J Simpkin; Daniel J Rigden

doi:10.1016/j.csbj.2025.01.016

Comput Struct Biotechnol J. 2025 Jan 22:27:467-477. doi: 10.1016/j.csbj.2025.01.016. eCollection 2025.

Authors

Olivia S Pratt¹, Luc G Elliott¹, Margaux Haon¹ ², Shahram Mesdaghi¹ ³, Rebecca M Price¹, Adam J Simpkin¹, Daniel J Rigden¹

Affiliations

¹ Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, United Kingdom.
² Department of Chemistry, University of Liverpool, Crown Street, Liverpool L69 7ZD, United Kingdom.
³ Computational Biology Facility, MerseyBio, University of Liverpool, Crown Street, Liverpool L69 7ZB, United Kingdom.

Abstract

AlphaFold 2 (AF2) has revolutionised protein structure prediction but, like any new tool, its performance on specific classes of targets, especially those potentially under-represented in its training data, merits attention. Prompted by a highly confident prediction for a biologically meaningless, randomly permuted repeat sequence, we assessed AF2 performance on sequences composed of perfect repeats of random sequences of different lengths. AF2 frequently folds such sequences into β-solenoids which, while ascribed high confidence, contain unusual and implausible features such as internally stacked and uncompensated charged residues. A number of sequences confidently predicted as β-solenoids are predicted by other advanced methods to be intrinsically disordered. The instability of some predictions is demonstrated by molecular dynamics. Importantly, other deep learning-based structure prediction tools predict different structures, or β-solenoids with much lower confidence, suggesting that AF2 alone has an unreasonable tendency to predict confident but unrealistic β-solenoids for perfect repeat sequences. The potential implications for structure prediction of natural (near-)perfect sequence repeat proteins are also explored.
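As an illustration of what a "perfect repeat of a random sequence" means, the following is a minimal sketch of how such a test sequence could be built. It is not the authors' pipeline: the function name `make_perfect_repeat`, the uniform sampling over the 20 standard amino acids, and the example unit length and copy number are all assumptions made here, since the abstract does not state how the sequences were generated.

```python
import random

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def make_perfect_repeat(unit_length, n_copies, seed=None):
    """Return (unit, sequence): a random unit and n_copies exact copies of it.

    Hypothetical helper: residues are drawn uniformly at random, which is
    an assumption and may differ from the composition used in the study.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    unit = "".join(rng.choices(AMINO_ACIDS, k=unit_length))
    return unit, unit * n_copies


# Example: a 20-residue random unit repeated 10 times (illustrative values).
unit, seq = make_perfect_repeat(unit_length=20, n_copies=10, seed=0)
print(unit)
print(len(seq))
```

The resulting string can be written to a FASTA file and submitted to any structure prediction tool; varying `unit_length` gives a series of repeat periods to compare.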

Keywords: Alphafold; Beta-solenoid; Model confidence; Repeat proteins; Structure prediction.