9 speech databases for training and testing multilingual speech recognition applications in the car environment. Contains parallel 4 channel in-car recordings and a GSM channel. Contains interesting phonetically rich material. All orthographically transcribed. Speaker information included for gender, age, accent. Including pronunciation lexicon.
28 speech databases containing broadband recordings from 550 adults and 50 children per language. Contains interesting phonetically rich material. All orthographically transcribed. Speaker information included for gender, age, accent. Including pronunciation lexicon.
Mainly written Swedish corpora (all time periods except Runic Swedish; various genres, including learner corpora) and lexicons; some non-Swedish corpora (Faroese, Old Icelandic, Latin, Spanish); Swedish corpora (appr. 200 MW); Swedish lexicons (appr. 220,000 entries total); non-Swedish corpora (appr. 15 MW