SAT-79 CSUSM SAGE and EST Analysis Pipeline Project

Saturday, October 13, 2012: 9:40 PM
Hall 4E/F (WSCC)
Theo Crouch, Biology Department, California State University, San Marcos, San Marcos, CA
Denise Garcia, PhD, Biology Department, California State University, San Marcos, San Marcos, CA
Xiaoyu Zhang, PhD, Computer Science & Information Systems Department, California State University, San Marcos, San Marcos, CA
Suzanne Hizer, Biology Department, California State University, San Marcos, San Marcos, CA
New methods in the biological sciences generate large datasets that require computational analysis, making the field increasingly quantitative. Transcriptomic techniques such as serial analysis of gene expression (SAGE), coupled with next-generation sequencing, generate libraries of tens to hundreds of thousands of sequenced tags from expressed genes in a single reaction. Extracting biological meaning from these large datasets can be streamlined by using a dedicated server that integrates numerous computational algorithms. In this project, my goal is to create a pipeline designed to address the specific computational needs of the lab.

The initial steps focused on file types and on designing and integrating all software used by the researchers into a text-based interface that accepts the various file formats and outputs the necessary data. This project will be expanded: the initial web-based dashboard will act as a “proof of concept” for researchers in other labs at this institution. Other pipelines will be integrated into the same web-based dashboard to make all of this institution’s research data available to other researchers and participants through web-based accounts, in an effort to foster interdisciplinary practices. This is an ongoing project; the aim is to develop a web-based dashboard that gives researchers in the lab a quick and easy tool for data analysis.