A Topological Data Analysis of the Protein Structure

Zakaria Lamine, My Ismail Mamouni, Mohammed Wadia Mansouri

Abstract

Persistent homology, one of the methods of topological data analysis, has been applied in biology with considerable success, in part because it uses metrics only to measure similarity. Its strength lies in embedding geometric detail while focusing on global shape. We investigate this property here, building on the large body of existing work that has established persistent homology as an efficient topological data analysis tool. We support the claim that topology embeds geometry by displaying the structure of coiled serine, a protein estimated to constitute 3-5 percent of the encoded residues in most genomes, and we propose a substitute for the optimal characteristic distance used in the atomic rigidity functions of the flexibility-rigidity index, a classic method for simulating molecular motion and flexible behavior. We also analyze patterns in the binding site of the beta sheet generated from the PDB file 2JOX. The patterns are detected and briefly described using barcodes generated with javaPlex, with linear statistical results serving as summary statistics.
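
To make the notion of a barcode concrete, the following is a minimal sketch of how the dimension-0 bars of a Vietoris-Rips filtration can be computed for a small point cloud. It is not the javaPlex pipeline used in the paper: the function name `h0_barcode` and the toy coordinates are hypothetical, and the sketch assumes the convention that an edge enters the filtration at a scale equal to its length. Each bar records the scale at which a connected component is born and the scale at which it merges into another.

```python
import itertools
import math


def h0_barcode(points):
    """Return the dimension-0 bars (birth, death) of a Vietoris-Rips filtration.

    Assumption: an edge appears at a filtration value equal to its length.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest, one tree per component

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by Euclidean length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )

    bars = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge: one of them dies at this scale.
            parent[root_i] = root_j
            bars.append((0.0, length))

    # The last surviving component never dies.
    bars.append((0.0, math.inf))
    return bars


# Hypothetical toy coordinates: two well-separated clusters of atoms.
toy_points = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (5.0, 0.0, 0.0),
    (5.0, 1.0, 0.0),
]
bars = h0_barcode(toy_points)
for birth, death in bars:
    print(f"[{birth}, {death})")
```

In this toy example the long finite bar marks the scale at which the two clusters join, which is the kind of global feature a barcode summarizes. Higher-dimensional bars (loops, cavities) require a full boundary-matrix reduction and are not covered by this sketch.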
