Search Results
Creator:
Rychlý, Pavel
Publisher:
Masaryk University, NLP Centre
Type:
text and corpus
Subject:
text corpora, Ethiopian languages, web corpora, under-resourced languages, and Amharic
Language:
Amharic
Description:
A substantially cleaned version of the existing morphologically annotated WIC Corpus.
Rights:
Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB
Creator:
Suchomel, Vít and Rychlý, Pavel
Publisher:
Masaryk University, NLP Centre
Type:
text and corpus
Subject:
text corpora, Ethiopian languages, web corpora, under-resourced languages, and Somali
Language:
Somali
Description:
Somali web corpus. Crawled by SpiderLing in January 2016. Encoded in UTF-8, cleaned, deduplicated.
Rights:
NLP Centre Web Corpus License, https://lindat.mff.cuni.cz/repository/xmlui/page/license-NLPC-WeC, and ACA
Creator:
Suchomel, Vít and Rychlý, Pavel
Publisher:
Masaryk University, NLP Centre
Type:
text and corpus
Subject:
text corpora, Ethiopian languages, web corpora, under-resourced languages, Tigrinya, and Tigrigna
Language:
Tigrinya
Description:
Tigrinya web corpus. Crawled by SpiderLing in January 2016. Encoded in UTF-8, cleaned, deduplicated.
Rights:
NLP Centre Web Corpus License, https://lindat.mff.cuni.cz/repository/xmlui/page/license-NLPC-WeC, and ACA