Search Results
- Creator: Rychlý, Pavel
- Publisher: Masaryk University, NLP Centre
- Type: text and corpus
- Subject: text corpora, Ethiopian languages, web corpora, under-resourced languages, and Amharic
- Language: Amharic
- Description: Substantially cleaned version of existing morphologically annotated WIC Corpus.
- Rights: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB

- Creator: Suchomel, Vít and Rychlý, Pavel
- Publisher: Masaryk University, NLP Centre
- Type: text and corpus
- Subject: text corpora, Ethiopian languages, Oromo, Web corpus, and under-resourced language
- Language: Oromo
- Description: Oromo web corpus. Crawled by SpiderLing in January 2016. Encoded in UTF-8, cleaned, deduplicated.
- Rights: NLP Centre Web Corpus License, https://lindat.mff.cuni.cz/repository/xmlui/page/license-NLPC-WeC, and ACA

- Creator: Suchomel, Vít and Rychlý, Pavel
- Publisher: Masaryk University, NLP Centre
- Type: text and corpus
- Subject: text corpora, Ethiopian languages, web corpora, under-resourced languages, and Somali
- Language: Somali
- Description: Somali web corpus. Crawled by SpiderLing in January 2016. Encoded in UTF-8, cleaned, deduplicated.
- Rights: NLP Centre Web Corpus License, https://lindat.mff.cuni.cz/repository/xmlui/page/license-NLPC-WeC, and ACA

- Creator: Suchomel, Vít and Rychlý, Pavel
- Publisher: Masaryk University, NLP Centre
- Type: text and corpus
- Subject: text corpora, Ethiopian languages, web corpora, under-resourced languages, Tigrinya, and Tigrigna
- Language: Tigrinya
- Description: Tigrinya web corpus. Crawled by SpiderLing in January 2016. Encoded in UTF-8, cleaned, deduplicated.
- Rights: NLP Centre Web Corpus License, https://lindat.mff.cuni.cz/repository/xmlui/page/license-NLPC-WeC, and ACA