Seminar: Deep neural network for Genome-Wide Association Studies and the impact of SNP locations
Supervisors: Drs. Minglun Gong, Yuanzhu Chen, Ting Hu
Department of Computer Science
Wednesday, September 20, 2017, 10:00 a.m., Room EN 2022
The study of single nucleotide polymorphisms (SNPs) associated with human diseases is important for identifying pathogenic genetic variants and illuminating the genetic architecture of complex diseases. A genome-wide association study (GWAS) examines sequence variation, in the form of SNPs, across different individuals in order to detect disease-related SNPs. Most existing GWAS use univariate analyses that examine one SNP at a time, and thus may overlook the complex interactions among multiple genetic factors. In this thesis, we propose a new hybrid deep learning approach to identifying susceptibility SNPs associated with colorectal cancer. First, a set of SNP variants was selected by a hybrid feature selection algorithm and then organized as two-dimensional images using a selection of space-filling curve models. A multi-layer deep convolutional neural network (CNN) was then constructed with an existing implementation (TensorFlow) and trained on those images. We found that images generated using the space-filling curve model that preserved the original SNP locations in the genome yielded the best classification performance. We also report a set of high-risk SNPs associated with colorectal cancer as a result of the deep neural network training.
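
To illustrate the image-construction step, the following minimal sketch lays an ordered list of SNP values onto a square grid along a Hilbert curve, so that SNPs adjacent in the genome stay adjacent in the image. The abstract does not say which space-filling curves were used, so the choice of the Hilbert curve, the 0/1/2 genotype encoding, and the function names hilbert_d2xy and snps_to_image are all assumptions made for illustration, not the thesis's actual method.

```python
def hilbert_d2xy(side, d):
    """Map position d along a Hilbert curve to (x, y) on a side x side grid.

    side must be a power of two; d ranges over 0 .. side*side - 1.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the quadrant so consecutive sub-curves connect.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def snps_to_image(genotypes, side, fill=0):
    """Place genotypes (in genome order) on a side x side grid, indexed image[y][x].

    genotypes: hypothetical per-SNP codes, e.g. 0, 1, 2 copies of the minor allele.
    Cells beyond len(genotypes) keep the fill value.
    """
    if side < 1 or side & (side - 1):
        raise ValueError("side must be a power of two")
    if len(genotypes) > side * side:
        raise ValueError("too many SNPs for this grid")
    image = [[fill] * side for _ in range(side)]
    for d, value in enumerate(genotypes):
        x, y = hilbert_d2xy(side, d)
        image[y][x] = value
    return image


if __name__ == "__main__":
    # One individual's (made-up) genotypes for 16 SNPs, in genome order.
    example = [0, 1, 2, 1, 0, 0, 1, 2, 2, 1, 0, 1, 0, 2, 1, 0]
    for row in snps_to_image(example, 4):
        print(row)
```

Each individual yields one such grid, which could then be fed to a CNN as a single-channel image. A row-by-row layout would be simpler, but it places SNPs that are far apart in the genome next to each other at row boundaries.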