Available Languages

Each language in MuNeCo is set up as a separate corpus, so you have to select a language before you can search.

Language                Size    NLP Status
Adyghe                  4M
Afrikaans               8.1M    tld
Albanian, Tosk          8.9M    tl
Amharic                 1.4M
Arabic, Standard        16M     tld
Aragonese               817k
Armenian                9.1M    tld
Asturian                4.1M    xl
Aymara, Central         2.2k
Azerbaijani, Northern   5M
Bashkir                 3.4M
Basque                  32M     tld
Belarusian              11M     tld
Bengali                 1.3M
Breton                  667k
Bulgarian               4.9M    tld
Buryat, Russian         1.8M
Catalan                 32M     tld
Chechen                 614k
Chuvash                 2.5M
Crimean Tatar           199k
Croatian                16M     tld
Czech                   29M     tld
Danish                  12M     tld
Dari                    5.4M
Dutch                   32M     tld
English                 32M     tld
Estonian                5.6M    tld
Faroese                 4.9M    tld
Finnish                 7.9M    tld
French                  11M     tld
Frisian, West           6M      tl
Friulian                2.7M
Gagauz                  320k
Galician                32M     tld
Ganda                   5.3M
Georgian                3.9M
German                  15M     tld
Greek                   3.3M    tld
Guaraní, Paraguayan     548k
Haitian Creole          252k
Hausa                   4.4M
Hebrew                  9.6M    tld
Hungarian               14M     tld
Ibanag                  28k
Icelandic               24M     tld
Igbo                    1M
Ilocano                 780k
Indonesian              2.5M    tld
Inuktitut, Greenlandic  7.8M
Irish                   3M      tld
Italian                 23M     tld
Kabuverdianu            64k
Kalmyk Oirat            635k
Karakalpak              1.1M
Kazakh                  3.1M
Khmer, Central          811k
Kinyarwanda             2.6M
Kirghiz                 3M
Korean                  4.2M    tld
Kurdish, Central        585k
Ladin                   1.4M
Ladino                  753k
Lao                     490k
Latvian                 13M     tld
Lezghian                565k
Lithuanian              1.5M    tld
Luxembourgish           7.9M
Macedonian              4M      tl
Malayalam               6.7M
Malaysian, Standard     5.8M
Maltese                 14M     td
Maori                   2.4M
Mauritian Kreol         978k
Maya, Yucatán           287k
Mirandese               192k
Moksha                  367k
Mongolian               1.6M
Ndebele                 1.7M
Ndonga                  4.7M
Nenets                  513k
Norwegian               31M     tld
Nyanja                  559k
Occitan                 3.5M
Oromo                   290k
Ossetian                134k
Pangasinan              52k
Papiamento              3.2M
Pashto, Northern        2.3M
Pnar                    1.1M
Polish                  13M     tld
Portuguese              32M     tld
Romani, Balkan          23k
Romanian                5.3M    tld
Romansch                3.3M
Russian                 13M     tld
Sami, Northern          1.5M    tld
Samoan                  2.9M
Sardinian, Logudorese   536k
Serbian                 2.5M    tld
Shona                   1.8M
Slovak                  25M     tld
Slovenian               15M     tld
Somali                  3.3M
Sorbian, Upper          1.3M
Spanish                 32M     tld
Swahili                 17M
Swedish                 11M     tld
Tagalog                 7.4M
Tajiki                  5.2M
Tamil                   979k    tld
Tatar                   4.9M
Tetun                   3.2M
Tigrinya                1.4M
Tok Pisin               349k
Tongan                  263k
Tswana                  1.4M
Turkish                 3.1M    tld
Turkmen                 2.6M
Tuva                    1M
Ukrainian               7.5M    tld
Udmurt                  720k
Urdu                    5.5M    tld
Uyghur                  4.8M    tld
Uzbek, Northern         2.5M
Veps                    103k
Vietnamese              5.6M    tld
Võro                    374k
Waray-Waray             16k
Welsh                   8.2M    tld
Wolof                   17k
Yakut                   2.2M
Yiddish, Eastern        2M
Yoruba                  1.1M
Zazaki, Kirmanjki       75k
Zulu                    711k

136 languages - 845M tokens in total