
The Corpus of Australian and New Zealand Spoken English (CoANZSE) is a 195-million-word corpus of geolocated automatic speech recognition (ASR) transcripts of video content from local governments in Australia and New Zealand, created for the study of lexical, grammatical, phonetic, and discourse-pragmatic phenomena of spoken language. In addition to the complete textual content of the corpus, CoANZSE Audio contains audio files and forced alignments in Praat's TextGrid format for most transcripts.


To access the corpus, log in via the CLARIN Service Provider Federation with your institution's credentials.

If your university is not listed or does not support CLARIN Federation login, or if you do not have an account at an academic institution, you can register for an account at clarin.eu.