Corpus collected for analyzing differences between two speaking styles: "clear" and "conversational" speech. It provides 140 sentences and parallel recordings of clear and conversational speech, along with associated phonetic labels and manually verified pitch marks, i.e. glottal closure instants.


The Center for Spoken Language Understanding (CSLU) distributes corpora to commercial entities and academic institutions. Corporate members can use these corpora not only for research but also for developing commercial products, for example generating acoustic models for speech recognition.


Developing a successful spoken language system typically requires vast amounts of data, and CSLU has established itself as a collector and distributor of speech corpora. Recognizing that speech corpora are important resources for anyone conducting research in voice processing, we have collected and transcribed telephone and cellular speech data in over 20 languages. CSLU usually has at least one data collection under way at any given time.

