Web Harvest of Minimal Intonational Pairs

Author
Howell, Jonathan; Rooth, Mats
Abstract
This paper describes experiments on gathering spoken-language data on the web that bear on issues of the phonetics-phonology and semantics-pragmatics of intonation. The target data are tokens of fixed word strings like "than I did", where intonation varies in a way that correlates with grammatical and pragmatic context. In a web harvest procedure, audio files were identified using a search engine based on speech-to-text, downloaded, and cut to a relevant segment under program control. In an application of such a database, an SVM classifier was trained to make a grammatically determined distinction in intonation based on purely acoustic cues. Sources of error in the retrieval are quantified.
Description
Preliminary version of paper to be presented at Web as Corpus 5, September 2009. Final version will be substituted on July 17, 2009.
Date Issued
2009-07-02
Subject
intonation; focus; web as corpus; machine learning; prosody; comparatives; speech recognition; linguistics
Type
article