I am a rising senior with a dual major in Physics and Chemistry at the University of Wisconsin – Stevens Point, and I have minors in Mathematics and Peace Studies. My research interests include complexity, the physics of information and consciousness, and applications of information science (classical and quantum) in areas such as high energy physics, black holes, artificial intelligence, and biophysics. I will pursue a Ph.D. in computational/theoretical physics immediately after completing my undergraduate degree. In addition to research, I am passionate about science policy and advocacy, and I actively work both on and off campus to promote diversity, transparency, and sustainability in STEM and beyond. Apart from academics and advocacy, I enjoy baking, crocheting, reading, and working on programming and machine learning projects.
Latent Space Modeling of Heterochromatin Protein One
Abigail M. Adams (1, 2), Xingcheng Lin (3), and Bin Zhang (4)
(1) Department of Chemistry, University of Wisconsin – Stevens Point
(2) Department of Physics & Astronomy, University of Wisconsin – Stevens Point
(3, 4) Department of Chemistry, Massachusetts Institute of Technology
The heterochromatin protein one (HP1) family is one of the most highly conserved protein families and plays a critical role in heterochromatin formation, gene silencing, and essential cell function. Each full-length sequence consists of a poorly defined intrinsically disordered region (IDR) between a chromo (CHRomatin Organization MOdifier) domain and a chromoshadow domain. Here, we use a latent space generative model, trained as a variational autoencoder, to infer the evolutionary relationships between sequences with HP1-like sequence structures. Each sequence is projected onto a point in latent space, and the distribution of these points reflects the evolutionary relationships between sequences: ancestral sequences lie near the origin, and phylogenetically close sequences lie close together. Using K-means clustering of the latent space representation, we identified four distinct clusters. Each of HP1’s three isoforms is represented in a separate cluster, and the fourth cluster is largely devoid of known HP1 sequences. Moreover, we show that the sequences in each cluster have varying degrees of amino acid residue conservation, particularly in the disordered region, which is believed to play a central role in the structural and dynamic properties of HP1 proteins. These conserved regions likely indicate areas of essential function, some unique to each isoform and others shared by HP1 in general. Further analysis is needed to determine the properties associated with the latent attributes underlying this latent space. Understanding the relationships between clusters will likely lead to a better understanding of the structural and functional differences between HP1 isoforms.
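The clustering and conservation steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the 2-D latent coordinates and the aligned sequences are invented toy data (not real HP1 sequences), the K-means routine is a plain Lloyd's algorithm with a deterministic start, and conservation is scored as one minus the normalized Shannon entropy of each alignment column, which is an assumed measure rather than the one used in this work. The names `kmeans`, `column_conservation`, `latent`, and `aligned` are hypothetical.

```python
import math
from collections import Counter


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's algorithm; the first k points seed the centers (toy choice)."""
    centers = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(x) / len(members) for x in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers


def column_conservation(sequences):
    """Per-column score in [0, 1]; 1.0 means the column is fully conserved.

    Assumes equal-length, gap-free aligned sequences over the 20 amino acids.
    """
    scores = []
    for column in zip(*sequences):
        n = len(column)
        counts = Counter(column)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        scores.append(1.0 - entropy / math.log2(20))
    return scores


# Toy latent coordinates (assumed 2-D) and toy aligned sequences, in the same order.
latent = [(-2.0, 0.1), (2.0, -0.1), (-1.8, -0.2),
          (2.1, 0.3), (-2.2, 0.0), (1.9, 0.2)]
aligned = ["MKEEA", "MRDQA", "MKEEA", "MSNLA", "MKEDA", "MTGPA"]

labels, centers = kmeans(latent, k=2)

# Group sequences by cluster label, then score conservation within each cluster.
per_cluster = {}
for lab in sorted(set(labels)):
    members = [seq for seq, l in zip(aligned, labels) if l == lab]
    per_cluster[lab] = column_conservation(members)

for lab, scores in per_cluster.items():
    print(lab, [round(s, 2) for s in scores])
```

Comparing the per-column scores across clusters is one way to see where conservation differs between groups. In practice a library implementation such as scikit-learn's `KMeans` with several random restarts would be a more robust choice than this fixed initialization.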