W2C – Web to Corpus – Corpora
Please use the following text to cite this item:
Majliš, Martin, 2011,
W2C – Web to Corpus – Corpora, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11858/00-097C-0000-0022-6133-9.
Authors
Majliš, Martin
Item identifier
http://hdl.handle.net/11858/00-097C-0000-0022-6133-9
Date issued
2011-12-20
Size
55 GB
Description
A set of corpora for 120 languages, automatically collected from Wikipedia and the web.
Collected using the W2C toolset: http://hdl.handle.net/11858/00-097C-0000-0022-60D6-1


