Unraveling and Designing Biomolecular Interactions Using Direct Couplings From Global Probabilistic Models







Coevolution plays a fundamental role in determining the folding, structure, interactions and functionality of proteins. Structurally or functionally related residues coevolve over evolutionary history to maintain similar structures, interactions and functional properties within the same protein family. Direct coupling models for coevolutionary analysis have demonstrated outstanding performance in predicting contacting residues in proteins and have therefore been commonly used to predict protein structures and interactions. In this dissertation, a global statistical inference framework, direct coupling analysis (DCA), was used to infer coevolutionary couplings in datasets covering protein-protein interactions and protein module-module compatibility. The couplings were then used to build two computational models: a Hamiltonian energy function H(S) and a compatibility score C(S). The Hamiltonian energy function successfully predicted the specificity strength of protein-protein interactions in two-component systems (TCS) and suggested a novel cross-talk model between two different TCS, VanRS and CroRS, to explain strain-specific antibiotic phenotypes in Enterococcus faecalis. The C(S) score, in turn, predicts the compatibility between two protein subdomains in terms of allosteric communication and function between DNA-binding and ligand-binding modules originating from different proteins in the LacI protein family. This model facilitates screening for functional hybrids of different LacI homologs, which are used to engineer and rewire the connection between signal sensing and genetic output. The compatibility score can also predict mutational effects in hybrid proteins, with the aim of improving the functionality of a hybrid protein. Moreover, the application of the DCA framework was extended to non-sequence datasets, including pharmacogenomics and clinicopathological data, for the first time.
The study demonstrated that the direct coupling approach can capture important connectivities between gene mutations and drug responses, as well as between different clinicopathological features. The direct coupling approach provides a new means of analyzing pharmacogenomics and clinicopathological data and thereby offers new insights for personalized medicine.
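
As an illustration of how a Hamiltonian such as H(S) can be evaluated from inferred couplings, the following is a minimal sketch assuming the generic Potts-model form commonly used in DCA, H(S) = -sum_i h_i(s_i) - sum_{i<j} J_ij(s_i, s_j). The function name `potts_energy`, the array layout, and the toy parameters are hypothetical; the exact set of residue pairs and the parameters used in the dissertation are not specified here.

```python
import numpy as np

def potts_energy(seq, h, J):
    """Evaluate H(S) = -sum_i h_i(s_i) - sum_{i<j} J_ij(s_i, s_j).

    seq : sequence of integer-encoded states, length L
    h   : local fields, shape (L, q)
    J   : pairwise couplings, shape (L, L, q, q); only i < j entries are read
    (This layout is an assumption made for illustration.)
    """
    L = len(seq)
    energy = 0.0
    for i in range(L):
        energy -= h[i, seq[i]]
        for j in range(i + 1, L):
            energy -= J[i, j, seq[i], seq[j]]
    return energy

# Toy parameters (made up): 3 positions, 2 states per position.
h = np.zeros((3, 2))
J = np.zeros((3, 3, 2, 2))
J[0, 2, 1, 1] = 1.5  # positions 0 and 2 are favorably coupled when both are in state 1

e_coupled = potts_energy([1, 0, 1], h, J)    # satisfies the coupling
e_uncoupled = potts_energy([1, 0, 0], h, J)  # does not satisfy it
```

Under this sign convention a lower energy corresponds to a more favorable sequence, so the sequence that satisfies the coupling scores lower than the one that does not.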



Coevolution, Drug resistance, Protein engineering, Pharmacogenomics, Personalized medicine