Gaussian Kernel Estimation of Mutual Information in the Max-Min Parents and Children (MMPC) Algorithm for Extrapolating Gene Networks

Thursday, October 27, 2011: 7:35 PM
Room A7 (San Jose Convention Center)
Yogesh Saletore, BS, Tri-I Program in Computational Biology, Cornell University, Weill Cornell Medical College, and Memorial Sloan-Kettering Cancer Center, New York, NY
Jason Mezey, PhD, Department of Genetic Medicine, Weill Cornell Medical College, New York, NY
Large networks of genes co-regulate one another's transcription and translation and thereby affect the observed phenotypic expression. High-throughput microarray technologies can measure the mRNA transcription level of each gene, and these measurements can be used to determine the structure and nature of the underlying gene network. The ARACNE algorithm uses a Gaussian kernel to estimate the mutual information between genes, then uses the resulting Mutual Information Matrix (MIM) and the Data Processing Inequality (DPI) to extract the gene network. The Max-Min Parents and Children (MMPC) algorithm traditionally uses partial correlation as its primary statistic for continuous data when uncovering the gene network. Our hypothesis is that the Gaussian kernel estimator of Margolin et al. can be used to estimate mutual information within the MMPC algorithm to achieve better results. In addition, we propose an alternative Gaussian kernel to measure the conditional mutual information between genes and gene sets. Our method is to simulate gene networks of different sizes and densities from a gene correlation matrix and to measure the receiver operating characteristic (ROC) of each of the algorithms. Our preliminary results on small networks are promising and show that the Gaussian kernel estimator is relatively successful at extrapolating the underlying gene network. In the future, we hope to analyze larger networks and to determine the efficacy of the Gaussian kernel estimator in terms of speed and power.
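The pairwise estimator mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `gaussian_kernel_mi`, the fixed bandwidth `h = 0.3`, the standardization step, and the two-gene simulation from a 2 x 2 correlation matrix are all assumptions made for illustration. The estimate is the sample average of log f(x_i, y_i) / (f(x_i) f(y_i)), with each density approximated by a Gaussian kernel density estimate evaluated at the sample points.

```python
import numpy as np


def gaussian_kernel_mi(x, y, h=0.3):
    """Gaussian-kernel estimate of mutual information (in nats) between
    two 1-D samples. The bandwidth h is an illustrative, untuned choice."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # Standardize so that one bandwidth is sensible for both variables.
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    # Pairwise Gaussian kernel weights between all sample points.
    kx = np.exp(-((x[:, None] - x[None, :]) ** 2) / (2.0 * h * h))
    ky = np.exp(-((y[:, None] - y[None, :]) ** 2) / (2.0 * h * h))
    # Unnormalized density estimates at each sample point.
    sx = kx.sum(axis=1)
    sy = ky.sum(axis=1)
    sxy = (kx * ky).sum(axis=1)
    # The kernel normalizing constants cancel in the ratio
    # f(x, y) / (f(x) f(y)), leaving n * sxy / (sx * sy).
    return float(np.mean(np.log(n * sxy / (sx * sy))))


# Hypothetical two-gene simulation: expression values drawn from a
# bivariate normal with a chosen correlation, plus an independent pair.
rng = np.random.default_rng(0)
rho = 0.9
corr = np.array([[1.0, rho], [rho, 1.0]])
dependent = rng.multivariate_normal(np.zeros(2), corr, size=500)
independent = rng.standard_normal((500, 2))

mi_dep = gaussian_kernel_mi(dependent[:, 0], dependent[:, 1])
mi_ind = gaussian_kernel_mi(independent[:, 0], independent[:, 1])
print("MI estimate, correlated pair: ", mi_dep)
print("MI estimate, independent pair:", mi_ind)
```

In this sketch the correlated pair should score clearly higher than the independent pair, which is the property a network-inference statistic relies on. The estimate is biased for a finite sample and a fixed bandwidth, so it is better suited to ranking gene pairs than to reporting absolute mutual information values.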