eCommons

 

Web Harvest of Minimal Intonational Pairs

Other Titles

Abstract

This paper describes experiments on gathering spoken-language data on the web that bears on issues of the phonetics-phonology and semantics-pragmatics of intonation. The target data are tokens of fixed word strings like "than I did", where intonation varies in a way which correlates with grammatical and pragmatic context. In a web harvest procedure, audio files were identified using a search engine based in speech-to-text, downloaded, and cut to a relevant segment under program control. In an application of such a database, an SVM classifier was trained to make a grammatically determined distinction in intonation based on purely acoustic cues. Sources of error in the retrieval are quantified.

Journal / Series

Volume & Issue

Description

Preliminary version of paper to be presented at Web as Corpus 5, September 2009. Final version will be substituted on July 17, 2009.

Sponsorship

Date Issued

2009-07-02T12:50:31Z

Publisher

Keywords

intonation; focus; web as corpus; machine learning; prosody; comparatives; speech recognition; linguistics

Location

Effective Date

Expiration Date

Sector

Employer

Union

Union Local

NAICS

Number of Workers

Committee Chair

Committee Co-Chair

Committee Member

Degree Discipline

Degree Name

Degree Level

Related Version

Related DOI

Related To

Related Part

Based on Related Item

Has Other Format(s)

Part of Related Item

Related To

Related Publication(s)

Link(s) to Related Publication(s)

References

Link(s) to Reference(s)

Previously Published As

Government Document

ISBN

ISMN

ISSN

Other Identifiers

Rights

Rights URI

Types

article

Accessibility Feature

Accessibility Hazard

Accessibility Summary

Link(s) to Catalog Record