Machine learning models offer an informed and cost-effective strategy for designing bioactive peptides. They have notably been applied to discover antimicrobial sequences, but rarely to predict their mechanism(s) of action.1 Antimicrobial peptides (AMPs) kill bacteria either by disrupting their lipid membranes, leading to cell death, or by translocating into bacterial cells to perturb intracellular targets. Most biophysical experiments and molecular simulations supporting our understanding of AMP interactions with lipid membranes use α-helical probes.2 While the resulting mechanisms would apply to the majority of AMPs, they may not generalize to other structures.
Here, we developed machine learning models that predict the membrane activity of AMPs, trained on membrane-disrupting and membrane-penetrating peptides. Our in-depth analysis revealed that α-helical structures were overrepresented in our best-performing models (86-88% accuracy), biasing predictions towards that structural class.3 To mitigate this structural bias, we used ProteinMPNN4 at different sampling temperatures to generate sequences sharing 37 underrepresented folds while preserving their antimicrobial nature. We then developed models with similar performance (85-90% accuracy) capable of predicting the membrane activity of a broader range of peptide structures.3
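
As an illustration of the kind of structural-bias check described above, the following minimal sketch reports, for each structural class, its share of a dataset and the prediction accuracy within it. It is not the analysis used in this work: the function name `per_class_report`, the class labels, and the toy data are all hypothetical, chosen only to show how a high overall accuracy can hide poor performance on underrepresented structures.

```python
from collections import Counter, defaultdict


def per_class_report(structures, y_true, y_pred):
    """Return {structure_class: (share_of_dataset, accuracy_within_class)}.

    Hypothetical helper for illustration; inputs are parallel sequences.
    """
    counts = Counter(structures)
    correct = defaultdict(int)
    for s, t, p in zip(structures, y_true, y_pred):
        correct[s] += int(t == p)
    total = len(structures)
    return {s: (counts[s] / total, correct[s] / counts[s]) for s in counts}


# Toy data (invented): helices dominate the set and are predicted well,
# while the rarer classes are predicted poorly.
structures = ["helix"] * 6 + ["sheet"] * 2 + ["coil"] * 2
y_true = ["disrupting", "penetrating", "disrupting", "disrupting",
          "penetrating", "disrupting",          # helix
          "penetrating", "disrupting",          # sheet
          "penetrating", "penetrating"]         # coil
y_pred = ["disrupting", "penetrating", "disrupting", "disrupting",
          "penetrating", "disrupting",          # helix: all correct
          "penetrating", "penetrating",         # sheet: one of two correct
          "disrupting", "disrupting"]           # coil: none correct

report = per_class_report(structures, y_true, y_pred)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

for cls, (share, acc) in report.items():
    print(f"{cls:>6}: share={share:.2f} accuracy={acc:.2f}")
print(f"overall accuracy={overall:.2f}")
```

In this toy case the overall accuracy is dominated by the helical class, which is why a per-class breakdown is more informative than a single aggregate figure when one structural class is overrepresented.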