Seminar: Known Knowns, Known Unknowns, & Unknown Unknowns
Dr. Steven M Carr
Department of Biology
Memorial University of Newfoundland
“Known Knowns, Known Unknowns, & Unknown Unknowns”
CS for analysis of multidimensional DNA matrices in Evolutionary Genomics
Department of Computer Science
Thursday, March 7, 2013, 1:00 p.m., Room EN-2022
“There are known knowns; there are things we know we know. We also know there are known unknowns; ...we know there are some things we do not know. But there are also unknown unknowns – the ones we don’t know we don’t know” Donald Rumsfeld (2002)
The advent of so-called NextGen DNA sequencing methods has massively increased the rate at which DNA sequence information can be generated, and the volume and complexity of the data matrices that apply to biological questions, including molecular and organismal evolution. My lab has adopted such methods to analyze complete mitochondrial DNA (mtDNA) genomes from multiple species simultaneously, by means of a “sequencing by hybridization” microarray biotechnology, the “ArkChip” (Carr et al. 2007, 2009).
A typical ArkChip experiment generates ca. 1,000,000 features that comprise four ACGT hybridization signals for the forward and reverse DNA strands of single individuals from each of seven species [4 x 2 x 17,000 x 7] (Flynn & Carr 2007). Projects may include 100s to 1000s of individuals per species (Carr & Marshall 2008). Known knowns include algorithms that extract individual genome sequences from a 4 x 2 x 17,000 matrix. Known unknowns compare gene patterns along the 17K genome vector within and among species (Marshall et al. 2008), based on external algorithms applied to exported data. Unknown unknowns include algorithms for detection of molecular and evolutionary patterns implicit in fully-annotated higher-order dimensions.
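As a rough illustration of the matrix dimensions above, the following Python sketch computes the feature count for one experiment (4 x 2 x 17,000 x 7 = 952,000, i.e. ca. 1,000,000) and extracts a sequence from a single-individual 4 x 2 x L signal matrix. The base-calling rule is an assumption made for illustration only: it sums forward and reverse signals and calls the strongest base when it exceeds the runner-up by a hypothetical `min_ratio` threshold. It is not the ArkChip's actual algorithm, and the names `feature_count`, `call_sequence`, and `demo` are invented here.

```python
from typing import List

BASES = "ACGT"

# A signal matrix is indexed as signals[base][strand][position]:
# 4 bases (A, C, G, T) x 2 strands (forward, reverse) x genome length.
SignalMatrix = List[List[List[float]]]


def feature_count(n_bases: int = 4, n_strands: int = 2,
                  genome_length: int = 17_000, n_species: int = 7) -> int:
    """Number of hybridization features in one multi-species experiment."""
    return n_bases * n_strands * genome_length * n_species


def call_sequence(signals: SignalMatrix, min_ratio: float = 1.5) -> str:
    """Naive base caller (illustrative assumption, not the ArkChip method).

    At each position, forward and reverse signals are summed per base.
    The strongest base is called only if its combined signal is at least
    min_ratio times the second strongest; otherwise 'N' is emitted.
    """
    length = len(signals[0][0])
    calls = []
    for p in range(length):
        combined = [signals[b][0][p] + signals[b][1][p] for b in range(4)]
        ranked = sorted(range(4), key=lambda b: combined[b], reverse=True)
        best, second = ranked[0], ranked[1]
        if combined[best] > 0 and combined[best] >= min_ratio * combined[second]:
            calls.append(BASES[best])
        else:
            calls.append("N")
    return "".join(calls)


# Toy 4 x 2 x 3 matrix with made-up intensities (not real data).
demo: SignalMatrix = [
    [[9.0, 1.0, 5.0], [8.0, 1.0, 5.0]],  # A: forward, reverse
    [[1.0, 9.0, 5.0], [1.0, 7.0, 5.0]],  # C
    [[1.0, 1.0, 5.0], [1.0, 1.0, 5.0]],  # G
    [[1.0, 1.0, 5.0], [1.0, 1.0, 5.0]],  # T
]

if __name__ == "__main__":
    print(feature_count())      # 952000
    print(call_sequence(demo))  # ACN
```

In the toy matrix, the third position has equal signal for all four bases, so the sketch reports it as ambiguous ('N') rather than guessing.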
Here, I review the biotechnology, describe known knowns of data matrices and their computational challenges, review biological knowns and their bearing on unknowns, and speculate about CS unknown unknowns.
download at [http://www.terranovagenomics.com/id62.html]
SM Carr, AT Duggan, & HD Marshall. 2009. Iterative DNA sequencing on microarrays: a high-throughput NextGen technology for ecological and evolutionary mitogenomics. Laboratory Focus 13, 8-12.
HD Marshall, MW Coulson, & SM Carr. 2008. Near neutrality, rate heterogeneity, and linkage govern mitochondrial genome evolution in Atlantic Cod (Gadus morhua) and other gadine fish. Molecular Biology & Evolution 26, 579-589.
SM Carr & HD Marshall. 2008. Intraspecific phylogeographic genomics from multiple complete mtDNA genomes in Atlantic Cod (Gadus morhua): Origins of the “Codmother,” trans-Atlantic vicariance, and mid-glacial population expansion. Genetics 108, 381-389.
SM Carr, HD Marshall, AT Duggan, SMC Flynn, KA Johnstone, AM Pope, & CD Wilkerson. 2008. Phylogeographic genomics of mitochondrial DNA: patterns of intraspecific evolution and a multi-species, microarray-based DNA sequencing strategy for biodiversity studies. Comparative Biochemistry and Physiology, D: Genomics and Proteomics 3, 1-11.
SMC Flynn & SM Carr. 2007. Interspecies hybridization on DNA resequencing microarrays: efficiency of sequence recovery and accuracy of SNP detection in human, ape, and codfish mitochondrial DNA genomes sequenced on a human-specific MitoChip. BMC Genomics 8, 339.