High-accuracy splice sites prediction based on sequence component and position features

dc.citation.doi: doi:10.4238/2012.September.25.12 [en_US]
dc.citation.epage: 3451 [en_US]
dc.citation.issue: 3 [en_US]
dc.citation.jtitle: Genetics and Molecular Research [en_US]
dc.citation.spage: 3432 [en_US]
dc.citation.volume: 11 [en_US]
dc.contributor.author: Li, Jinliang
dc.contributor.author: Wang, Lifeng
dc.contributor.author: Wang, Haiyan
dc.contributor.author: Bai, Lianyang
dc.contributor.author: Yuan, Zheming
dc.contributor.authoreid: hwang [en_US]
dc.date.accessioned: 2013-04-03T16:09:30Z
dc.date.available: 2013-04-03T16:09:30Z
dc.date.issued: 2013-04-03
dc.date.published: 2012 [en_US]
dc.description.abstract: Identification of splice sites plays a key role in the annotation of genes; hence, improving the accuracy of computational splice site prediction is of great significance. In this article, we first quantitatively determined the window length and the number and positions of the consensus bases by a Chi-square test, and then extracted the sequence multi-scale component (MSC) features and the position (Pos) and adjacent positions relationship (APR) features of consensus sites. We then constructed a novel classification model using an SVM with the above features and applied it to the HS³D dataset. Compared with results in the current literature, our method gives a large improvement in 10-fold cross-validation accuracy for training sets with true and spurious splice sites in both equal and unequal proportions. The method was also applied to the NN269 dataset for further evaluation and an independent test. The results obtained are superior to those in the literature, which demonstrates the stability and superiority of this method. These results show that our method predicts splice sites with high accuracy. [en_US]
dc.identifier.uri: http://hdl.handle.net/2097/15449
dc.language.iso: en_US [en_US]
dc.relation.uri: http://www.geneticsmr.com/articles/1890 [en_US]
dc.rights: Permission to archive granted by Genetics and Molecular Research, March 18, 2013 [en_US]
dc.subject: Splice site prediction [en_US]
dc.subject: Multi-scale component features [en_US]
dc.subject: Position features [en_US]
dc.subject: Adjacent positions relationship features [en_US]
dc.subject: Support Vector Machine (SVM) [en_US]
dc.title: High-accuracy splice sites prediction based on sequence component and position features [en_US]
dc.type: Article (author version) [en_US]

Files

Original bundle

Name: WangGMR2012.pdf
Size: 400.71 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.62 KB
Format: Item-specific license agreed upon to submission
Description: