Faculty Mentor: Jonathan Kelner
Direct Supervisor: Adrian Vladu
Home University: University of Puerto Rico
I am from Puerto Rico and I am a senior in the University of Puerto Rico studying Mathematics. I have had a passion for mathematics since middle school, and in the past three years I have developed an interest for abstract algebra. I also like playing guitar, watching films, and reading Agatha Christie books.
Analysis of HapCut: An algorithm for the Haplotype Assembly Problem
The determination of DNA sequences in individuals can help track gene expression patterns and consequently varying susceptibility to a disease. A method to obtain DNA information is using sequencing technologies to read DNA fragments and reconstruct combinations of alleles, known as haplotypes. The DNA reads contain errors and gaps that make the reassembling of fragments into haplotypes more difficult. This is modeled into a combinatorial problem called the Haplotype Assembly Problem, which is known to be NP-hard. A recent study attempts to solve this problem using a combinatorial approach, HapCut, that is based on computing maximum cuts in certain graphs derived from the sequenced fragments. The test data was generated using the haplotype generator ReHap and it was found that the haplotypes reconstructed by HapCut are very close to the optimal haplotype reconstruction solutions provided by ReHap. However, because HapCut works by initially generating a random haplotype and performing local moves to improve the haplotype, its running time is considerably long. A haplotype chosen from a gapless data set can provide a 2-approximation and may reduce the running time of the algorithm significantly. Although a gapless data set does not appear in practice, it is plausible to modify this method for a practical instance.