Learning to detect named entities in bilingual code-mixed open speech corpora

Theis, Yihong

dc.contributor.author	Theis, Yihong
dc.date.accessioned	2019-07-24T15:58:32Z
dc.date.available	2019-07-24T15:58:32Z
dc.date.graduationmonth	August
dc.date.issued	2019-08-01
dc.description.abstract	This research addresses the problem of code-mixing in speech-based cognitive services, and the subtasks of language identification in multilingual speech commands, search, and named entity recognition. According to the American Community Survey (ACS) published by the United States Census Bureau, more than 20 percent of U.S. residents speak a language other than English at home. Many bilingual speakers habitually and even subconsciously switch languages in mid-sentence and mix them in successive sentences. For example, this happens when a user wants to listen to popular music by artists from different countries and uses the native pronunciation of the artist's name. Misrecognition of these embedded named entities by an automatic speech recognition (ASR) system can lead to wrong search results. For instance, when a user wants to play songs by Chinese singers on Spotify, home assistants frequently play the wrong songs because they recognize only English. When callers leave voicemail messages on Google Voice that are transcribed to text, specific named entities (people, places, and things) and the surrounding context of messages are often misinterpreted. Malfunctions of this kind are inconvenient and detract from the overall experience of home assistant users. To develop a machine learning-driven approach to coping with such usability issues, I developed a research test bed centered around code-mixed bilingual sentences. We collected voice recordings from 40 individual participants for multiple commands, multiple streaming music service names, and about 100 Chinese names. We segmented and recombined these samples automatically using sound editing software to combinatorially enumerate a set of utterances, each of which is a short command phrase.
Instead of the traditional approach of using hidden Markov models (HMMs), I used a deep learning model which is part of the Baidu DeepSpeech Project and is developed by contributors to the Mozilla DeepSpeech open source repository on GitHub. This narrows the focus of our code-mixing task, and the associated supervised learning task, to language identification and segmentation of utterances in different languages at the phrase level. It also facilitates development of a prototype web application through which users can contribute their voice data to improve the system. In current and continuing work, I am improving the phrasal model using deep learning to develop a working prototype that integrates with cognitive service APIs (e.g., Amazon Alexa, Google Home) for Chinese/English music search.
dc.description.advisor	William H. Hsu
dc.description.degree	Master of Science
dc.description.department	Department of Computer Science
dc.description.level	Masters
dc.identifier.uri	http://hdl.handle.net/2097/39830
dc.language.iso	en_US
dc.publisher	Kansas State University
dc.rights	© the author. This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/
dc.subject	Code-mixed
dc.subject	Speech recognition
dc.subject	Deep learning
dc.subject	Recurrent neural networks
dc.subject	Cognitive services
dc.subject	Bilingual named entities
dc.title	Learning to detect named entities in bilingual code-mixed open speech corpora
dc.type	Thesis

Files

Original bundle

Name: YihongTheis2019.pdf
Size: 2.11 MB
Format: Adobe Portable Document Format
Description: Thesis

License bundle

Name: license.txt
Size: 1.62 KB
Format: Item-specific license agreed upon to submission
Description:

Collections

K-State Electronic Theses, Dissertations, and Reports: 2004 -