Room 6C/6E Challenges and improvements in bioinformatics analysis of Next Generation Sequencing applied to natural population

Friday, October 12, 2012: 8:00 PM
6C/6E (WSCC)
Jose Medina, High Performance Computing Facility, University of Puerto Rico, Rio Piedras, Rio Piedras, PR
Mayté Ruiz, PhD, Biology, University of Puerto Rico, Rio Piedras, Rio Piedras, PR
Riccardo Papa, PhD, Biology, University of Puerto Rico, Rio Piedras, Rio Piedras, PR
Humberto Ortiz Zuazaga, PhD, High Performance Computing Facility, University of Puerto Rico, Rio Piedras, Rio Piedras, PR
The ability to produce gigabases of DNA sequence in a short time and at minimal cost using Next Generation Sequencing (NGS) platforms makes it possible to sequence an entire genome with relative ease. Thus, with the advent of NGS technologies, it is possible to empirically answer questions in biology that have been intractable until now. The ability to genotype thousands of genetic markers allows researchers to obtain population genetic data as a continuous distribution across a genome and to identify genomic regions and candidate genes of evolutionary significance. However, while genome-wide NGS data are relatively easy to obtain, their analysis remains a major challenge, especially in natural populations. Lower-complexity genome screening using restriction site-associated DNA sequencing (RADSeq) has been shown to provide data that can be used for sophisticated genome-wide population genetic analysis.

In our work, we take advantage of the natural diversity in the wing patterns of Heliconius butterflies to develop and test novel bioinformatic tools and algorithms to characterize the genomic basis, distribution, and interactions of complex phenotypes segregating in natural hybrid populations. We use a combination of bioinformatic tools such as Bowtie, BWA, Velvet, and Stacks together with novel algorithms to provide a powerful statistical approach for unraveling the genomic architecture of phenotypic variation in natural populations.

Finally, we demonstrate the utility of a reference genome for mapping and analyzing RADSeq data from populations of closely related species. Our strategy represents an improvement in the growing field of natural population genomic analysis.