Bioinformatics analyses of alternative splicing, EST-based and machine learning-based prediction

Date

2008-12-22T14:36:12Z

Publisher

Kansas State University

Abstract

Alternative splicing is a mechanism for generating different gene transcripts (called isoforms) from the same genomic sequence. Finding alternative splicing events experimentally is both expensive and time-consuming. Computational methods in general, and EST analysis and machine learning algorithms in particular, can complement experimental methods in identifying alternative splicing events. In this thesis, I first identify alternatively spliced exons by analyzing EST-genome alignments. Next, I explore the predictive power of a rich set of features that have been experimentally shown to affect alternative splicing. I use these features to build support vector machine (SVM) classifiers for distinguishing between alternatively spliced exons and constitutive exons. My results show that simple, linear SVM classifiers built from a rich set of features give results comparable to those of more sophisticated SVM classifiers that use more basic sequence features. Finally, I use feature selection methods to computationally identify the most informative features for the prediction problem considered.
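
The EST-based step described above can be illustrated with a minimal sketch. It is not the procedure used in the thesis: the function name, the toy coordinates, and the assumption that each EST-genome alignment has already been reduced to a sorted list of exon intervals are all hypothetical. The sketch flags an internal exon as alternatively spliced (skipped) when some other alignment joins its two flanking exons directly.

```python
from typing import List, Set, Tuple

Exon = Tuple[int, int]  # (start, end) genomic coordinates, hypothetical toy format


def find_skipped_exons(alignments: List[List[Exon]]) -> Set[Exon]:
    """Return internal exons that at least one other alignment skips.

    Each alignment is assumed to be a list of exons sorted by position
    and taken from the same gene and strand (an illustrative assumption).
    """
    # Collect every observed splice junction as (end of left exon, start of right exon).
    junctions = set()
    for exons in alignments:
        for left, right in zip(exons, exons[1:]):
            junctions.add((left[1], right[0]))

    # An internal exon is skipped if some alignment splices its
    # upstream neighbour directly to its downstream neighbour.
    skipped = set()
    for exons in alignments:
        for prev, exon, nxt in zip(exons, exons[1:], exons[2:]):
            if (prev[1], nxt[0]) in junctions:
                skipped.add(exon)
    return skipped


# Toy data: est_b omits the middle exon that est_a and est_c include.
est_a = [(100, 200), (300, 400), (500, 600)]
est_b = [(100, 200), (500, 600)]
est_c = [(100, 200), (300, 400), (500, 600), (700, 800)]

skipped_exons = find_skipped_exons([est_a, est_b, est_c])
print(sorted(skipped_exons))
```

Exons never flagged by such a procedure would be the candidates for the constitutive class; real EST data would additionally require handling of alignment errors, partial transcripts, and other event types such as alternative 5' and 3' splice sites.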

Keywords

support vector machine, alternative splicing

Graduation Month

December

Degree

Master of Science

Department

Department of Computing and Information Sciences

Major Professor

William H. Hsu

Date

2008

Type

Thesis
