Separating homeologs by phasing in the tetraploid wheat transcriptome

K-REx Repository


dc.contributor.author Krasileva, Ksenia V.
dc.contributor.author Buffalo, Vince
dc.contributor.author Bailey, Paul
dc.contributor.author Pearce, Stephen
dc.contributor.author Ayling, Sarah
dc.contributor.author Tabbita, Facundo
dc.contributor.author Soria, Marcelo
dc.contributor.author Wang, Shichen
dc.contributor.author Akhunov, Eduard D.
dc.contributor.author Uauy, Cristobal
dc.contributor.author Dubcovsky, Jorge
dc.contributor.author IWGS Consortium
dc.date.accessioned 2014-03-03T21:44:07Z
dc.date.available 2014-03-03T21:44:07Z
dc.date.issued 2014-03-03
dc.identifier.uri http://hdl.handle.net/2097/17202
dc.description.abstract Background: The high level of identity among duplicated homoeologous genomes in tetraploid pasta wheat presents substantial challenges for de novo transcriptome assembly. To solve this problem, we develop a specialized bioinformatics workflow that optimizes transcriptome assembly and separation of merged homoeologs. To evaluate our strategy, we sequence and assemble the transcriptome of one of the diploid ancestors of pasta wheat, and compare both assemblies with a benchmark set of 13,472 full-length, non-redundant bread wheat cDNAs. Results: A total of 489 million 100 bp paired-end reads from tetraploid wheat assemble in 140,118 contigs, including 96% of the benchmark cDNAs. We used a comparative genomics approach to annotate 66,633 open reading frames. The multiple k-mer assembly strategy increases the proportion of cDNAs assembled full-length in a single contig by 22% relative to the best single k-mer size. Homoeologs are separated using a post-assembly pipeline that includes polymorphism identification, phasing of SNPs, read sorting, and re-assembly of phased reads. Using a reference set of genes, we determine that 98.7% of SNPs analyzed are correctly separated by phasing. Conclusions: Our study shows that de novo transcriptome assembly of tetraploid wheat benefits more from multiple k-mer assembly strategies than that of diploid wheat. Our results also demonstrate that phasing approaches originally designed for heterozygous diploid organisms can be used to separate the close homoeologous genomes of tetraploid wheat. The predicted tetraploid wheat proteome and gene models provide a valuable tool for the wheat research community and for those interested in comparative genomic studies. en_US
dc.language.iso en_US en_US
dc.relation.uri http://genomebiology.com/content/14/6/R66 en_US
dc.subject Transcriptome assembly en_US
dc.subject Multiple k-mer assembly en_US
dc.subject Wheat en_US
dc.subject Polyploid en_US
dc.subject Triticum urartu en_US
dc.subject Triticum turgidum en_US
dc.subject Pseudogenes en_US
dc.title Separating homeologs by phasing in the tetraploid wheat transcriptome en_US
dc.type Article (publisher version) en_US
dc.date.published 2013 en_US
dc.citation.doi doi:10.1186/gb-2013-14-6-r66 en_US
dc.citation.issue 6 en_US
dc.citation.jtitle Genome Biology en_US
dc.citation.spage R66 en_US
dc.citation.volume 14 en_US
dc.contributor.authoreid wangsc en_US
dc.contributor.authoreid eakhunov en_US

