Human bone morphogenetic proteins (BMPs) are morphogens that bind to receptors on mesenchymal stem cells (MSCs), leading to their differentiation into osteocytes. BMP2 is commonly used in grafts for applications such as osteoporosis treatment and lumbar spinal fusion. However, its use has several drawbacks, including high cost, a tendency to diffuse away from the site of action, and a short half-life, so it must be administered at doses far higher than natural levels. This often results in side effects such as bone overgrowth and neurological complications (1). Consequently, researchers have turned to peptides as an alternative source of biomolecules, as they are smaller, more stable, and less expensive than proteins. One such peptide is the bioactive domain of BMP2, known as the knuckle epitope (BMP2-KEP). However, in its free state this sequence adopts a collapsed structure, unlike its native form within the protein, resulting in lower observed osteogenic activity (2). Machine learning (ML) techniques can be employed to design new 20-mer sequences, but there is insufficient structure-property data on peptides to train ML models adequately. In our study, we used a mesoscale simulation model, SIMFIM, to generate structure-property data for multiple sequences (3). These 20-mer sequences were derived by modifying the BMP2-KEP sequence using three strategies: a) replacing amino acids in sheet-forming regions, b) replacing amino acids in coil-forming regions, and c) replacing amino acids at other sites with hydrophilic, hydrophobic, β-sheet-forming, and charged residues. Structural properties, namely the radius of gyration (Rg) and end-to-end distance (EtE), were computed at equilibrium time points, averaged (<Rg> and <EtE>), and used as target outputs in the database. The distributions of these properties suggested that altering the sheet-forming regions did not change the structural properties significantly relative to free BMP2-KEP, whereas altering the coil regions increased the median <Rg> and <EtE> of the sequences. Introducing charged residues caused the sequence to collapse further, except when multiple positively charged histidine residues were used. The amino acids in these sequences were represented by amino acid descriptors (AADs), such as the z-scale and t-scale, which reflect properties like hydrophilicity, size, and electronic nature. Feature engineering techniques, including normalization and principal component analysis (PCA), were applied. We trained and tested several ML models, including linear regression, regularized linear regression, support vector regression, random forests, and neural networks, using GridSearchCV with 5-fold cross-validation to optimize hyperparameters. Each model was evaluated using the R² metric; for models with similar R² values, additional metrics, namely mean squared error (MSE), Pearson's correlation coefficient (PCC), mean absolute error (MAE), and mean bias error (MBE), were calculated to determine the best-performing models. Important residue locations and properties influencing model predictions were identified through permutation importance and Shapley interaction analysis, which showed that the size and hydrophilicity of residues at several locations were important. Pairs of important residue locations were then modified in BMP2-KEP, and the <Rg> and <EtE> values of the resulting sequences were predicted. The trained models suggested certain sequences that, with minimal modification of BMP2-KEP, had higher <Rg> and <EtE> values than the free BMP2-KEP sequence.
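The two structural targets described above can be illustrated with a minimal Python sketch. It assumes each equilibrium frame is a list of (x, y, z) bead coordinates ordered along the chain and that all beads carry equal mass. The function names and the toy coordinates are hypothetical and are not taken from SIMFIM.

```python
import math


def radius_of_gyration(coords):
    """Unweighted Rg: root-mean-square distance of the beads from their centroid.

    Assumption: equal bead masses (a mass-weighted form would differ).
    """
    n = len(coords)
    centroid = [sum(c[i] for c in coords) / n for i in range(3)]
    sq_sum = sum(
        sum((c[i] - centroid[i]) ** 2 for i in range(3)) for c in coords
    )
    return math.sqrt(sq_sum / n)


def end_to_end_distance(coords):
    """EtE: Euclidean distance between the first and last bead of the chain."""
    return math.dist(coords[0], coords[-1])


def time_averages(frames):
    """Average Rg and EtE over a list of equilibrium frames: (<Rg>, <EtE>)."""
    rg_values = [radius_of_gyration(f) for f in frames]
    ete_values = [end_to_end_distance(f) for f in frames]
    return sum(rg_values) / len(rg_values), sum(ete_values) / len(ete_values)


# Toy example: one extended and one folded-back three-bead chain.
extended = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
folded = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
mean_rg, mean_ete = time_averages([extended, folded])
```

In this toy case the folded chain has the smaller Rg and an EtE of zero, which mirrors the sense in which a collapsed peptide yields lower <Rg> and <EtE> values than an extended one.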
(1) Halloran, D.; Durbano, H. W.; Nohe, A. Bone Morphogenetic Protein-2 in Development and Bone Homeostasis. Journal of Developmental Biology 2020, 8 (3), 19. DOI: 10.3390/jdb8030019.
(2) Moeinzadeh, S.; Barati, D.; Sarvestani, S. K.; Karimi, T.; Jabbari, E. Experimental and Computational Investigation of the Effect of Hydrophobicity on Aggregation and Osteoinductive Potential of BMP-2-Derived Peptide in a Hydrogel Matrix. Tissue Engineering Part A 2015, 21 (1-2), 134-146. DOI: 10.1089/ten.tea.2013.0775.
(3) Dash, R. A.; Jabbari, E. A Structure Independent Molecular Fragment Interfuse Model for Mesoscale Dissipative Particle Dynamics Simulation of Peptides. ACS Omega 2024, 9 (16), 18001-18022. DOI: 10.1021/acsomega.3c09534.