Hierarchical Bayesian topic modeling with sentiment and author extension

Date

2016-05-01

Publisher

Kansas State University

Abstract

While the Hierarchical Dirichlet Process (HDP) has recently been widely applied to topic modeling tasks, most current hybrid models for concurrent inference of topics and other factors are not based on HDP.

In this dissertation, we present two new models that extend an HDP topic modeling framework to incorporate other learning factors. One model injects Latent Dirichlet Allocation (LDA) based sentiment learning into HDP. This model preserves the benefits of nonparametric Bayesian models for topic learning while simultaneously learning latent sentiment aspects. It automatically learns a separate word distribution for each sentiment polarity within each generated topic.
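As a rough illustration of what "a separate word distribution for each sentiment polarity within each topic" means, the sketch below draws words from a table indexed by topic and sentiment. It is only a toy generative story under stated assumptions: the number of topics is fixed (a true HDP leaves it unbounded), each document is assumed to carry its own sentiment distribution, and the names `generate_words`, `doc_topics`, `doc_sentiments`, and `word_dists` are hypothetical. It does not reproduce the dissertation's model or its inference procedure.

```python
import random


def generate_words(doc_topics, doc_sentiments, word_dists, vocab, n_words, seed=0):
    """Draw (topic, sentiment, word) triples for one document.

    word_dists[k][s] is the word distribution for topic k under
    sentiment s (an assumed layout, for illustration only).
    """
    rng = random.Random(seed)
    topic_ids = list(range(len(doc_topics)))
    sentiment_ids = list(range(len(doc_sentiments)))
    triples = []
    for _ in range(n_words):
        # Pick a topic, then a sentiment, then a word conditioned on both.
        k = rng.choices(topic_ids, weights=doc_topics)[0]
        s = rng.choices(sentiment_ids, weights=doc_sentiments)[0]
        w = rng.choices(vocab, weights=word_dists[k][s])[0]
        triples.append((k, s, w))
    return triples


# Toy data: 2 topics x 2 sentiment polarities over a 4-word vocabulary.
vocab = ["battery", "screen", "great", "poor"]
word_dists = [
    [[0.5, 0.1, 0.4, 0.0], [0.5, 0.1, 0.0, 0.4]],  # topic 0: positive, negative
    [[0.1, 0.5, 0.4, 0.0], [0.1, 0.5, 0.0, 0.4]],  # topic 1: positive, negative
]
sample = generate_words([0.6, 0.4], [0.7, 0.3], word_dists, vocab, n_words=5)
```

Each topic here has one word distribution per polarity, so the same topic can favor "great" under positive sentiment and "poor" under negative sentiment.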

The other model combines an existing HDP framework for learning topics from free text with latent authorship learning within a generative model using author list information. This model adds one more layer to the current hierarchy of HDPs to represent topic groups shared by authors, and each document's topic distribution is represented as a mixture of the topic distributions of its authors. In addition to topics, this model automatically learns how each document's content is partitioned among its authors.
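The mixture idea can be sketched with a few lines of arithmetic. The example below assumes a fixed, truncated set of three topics and known contribution weights; in the model described above the weights are latent and learned, and the number of topics is not fixed in advance. The function name `mix_author_topics` and the numbers are hypothetical.

```python
def mix_author_topics(author_topics, weights):
    """Return the weighted mixture of per-author topic distributions.

    author_topics: one topic distribution (list of floats) per author.
    weights: non-negative contribution weight per author; normalized here.
    """
    if len(author_topics) != len(weights):
        raise ValueError("need exactly one weight per author")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    num_topics = len(author_topics[0])
    doc_topics = [0.0] * num_topics
    for dist, w in zip(author_topics, weights):
        if len(dist) != num_topics:
            raise ValueError("all authors must share the same topic set")
        for k, p in enumerate(dist):
            doc_topics[k] += (w / total) * p
    return doc_topics


# Two authors over three topics; the first author contributes 75%.
author_topics = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
contributions = [0.75, 0.25]
doc_topics = mix_author_topics(author_topics, contributions)
```

Because each author's distribution sums to one and the weights are normalized, the resulting document distribution also sums to one.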

Keywords

Computer science

Graduation Month

May

Degree

Doctor of Philosophy

Department

Computing and Information Sciences

Major Professor

William H. Hsu

Type

Dissertation