Harvesting speech datasets for linguistic research on the web
Rooth, Mats; Howell, Jonathan; Wagner, Michael
This white paper describes a project that harvested audio and transcribed data from podcasts and news broadcasts on the web. Tools were developed that use phonetic analysis and machine learning to analyze the different uses of prosody (rhythm, stress, and intonation) in spoken communication.
prosody; intonation; comparatives; machine learning; spoken language; web science
Previously Published As
Final project white paper, Digging into Data Challenge