Anderson, Michael P.; Dubnicka, Suzanne R.
2015-03-04; 2015-03-04; 2015-03-04
http://hdl.handle.net/2097/18860

DNA barcodes are short strands of 255–700 nucleotide bases taken from the cytochrome c oxidase subunit 1 (COI) region of the mitochondrial DNA. It has been proposed that these barcodes may be used to differentiate between biological species. Current methods of species classification use distance measures that depend heavily both on evolutionary model assumptions and on a clearly defined “gap” between intra- and interspecies variation. Such distance measures neither quantify classification uncertainty nor indicate how much of the barcode is necessary for classification. We propose a sequential naïve Bayes classifier for species classification to address these limitations. The proposed method is shown to provide accurate species-level classification on real and simulated data. It also quantifies the uncertainty of each classification and addresses how much of the barcode is necessary.

en-US
The final publication is available at www.degruyter.com
Naïve Bayes classifier; DNA barcoding; Phylogenetic analysis; Sequential analysis; Species classification; Species discovery
A sequential naïve Bayes classifier for DNA barcodes
Article (publisher version)
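
The abstract does not give the details of the classifier, so the following is only a minimal sketch of how a sequential naïve Bayes classification of aligned barcodes could look, not the authors' algorithm. It assumes that positions are conditionally independent given the species, that per-position base probabilities are estimated with add-one smoothing, that the prior over species is uniform, and that reading stops once the leading posterior reaches a fixed threshold. The names `train`, `classify_sequentially`, `TOY_TRAINING`, and the threshold value are hypothetical, and the toy sequences are made up for illustration.

```python
import math
from collections import Counter

ALPHABET = "ACGT"

# Made-up, pre-aligned toy sequences (not real barcode data).
TOY_TRAINING = {
    "sp_a": ["ACGTAC", "ACGTAC", "ACGTAT"],
    "sp_b": ["ACCTGC", "ACCTGC", "ACCTGA"],
}


def train(sequences_by_species, pseudocount=1.0):
    """Estimate smoothed per-position base probabilities for each species."""
    model = {}
    for species, seqs in sequences_by_species.items():
        table = []
        for pos in range(len(seqs[0])):
            counts = Counter(seq[pos] for seq in seqs)
            # Only unambiguous bases contribute to the estimate.
            total = sum(counts[b] for b in ALPHABET) + pseudocount * len(ALPHABET)
            table.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
        model[species] = table
    return model


def classify_sequentially(model, query, threshold=0.9):
    """Update the posterior one position at a time; stop at the threshold.

    Returns (best species, its posterior probability, positions read).
    """
    species = list(model)
    # Assumed uniform prior over the candidate species.
    log_post = {s: -math.log(len(species)) for s in species}
    posterior = {s: 1.0 / len(species) for s in species}
    used = 0
    for pos, base in enumerate(query):
        used = pos + 1
        if base not in ALPHABET:
            continue  # ambiguous base: carries no information in this sketch
        for s in species:
            log_post[s] += math.log(model[s][pos][base])
        # Normalise in log space to avoid underflow on long barcodes.
        peak = max(log_post.values())
        norm = sum(math.exp(v - peak) for v in log_post.values())
        posterior = {s: math.exp(log_post[s] - peak) / norm for s in species}
        if max(posterior.values()) >= threshold:
            break
    best = max(posterior, key=posterior.get)
    return best, posterior[best], used


if __name__ == "__main__":
    model = train(TOY_TRAINING)
    print(classify_sequentially(model, "ACGTAC"))
```

In this sketch the returned posterior serves as a measure of classification uncertainty, and the count of positions read indicates how much of the sequence was needed before the stopping rule fired.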