Deltacorpus
- Title:
- Deltacorpus
- Creator:
- Mareček, David, Yu, Zhiwei, Zeman, Daniel, and Žabokrtský, Zdeněk
- Contributor:
- Grantová agentura České republiky, grant GA15-10472S, "Morphologically and Syntactically Annotated Corpora of Many Languages" (national funds)
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Identifier:
- http://hdl.handle.net/11234/1-1662
- Subject:
- part of speech, tagging, semi-supervised, and cross-language
- Type:
- text and corpus
- Description:
- Texts in 107 languages from the W2C corpus (http://hdl.handle.net/11858/00-097C-0000-0022-6133-9), limited to the first 1,000,000 tokens per language and tagged with parts of speech by the delexicalized tagger described in Yu et al. (2016, LREC, Portorož, Slovenia).
- Language:
- Belarusian, Bosnian, Bulgarian, Czech, Serbo-Croatian, Croatian, Upper Sorbian, Macedonian, Polish, Russian, Slovak, Slovenian, Serbian, Ukrainian, Latvian, Lithuanian, Afrikaans, Danish, German, English, Faroese, Western Frisian, Swiss German, Icelandic, Limburgan, Luxembourgish, Low German, Dutch, Norwegian Nynorsk, Norwegian, Scots, Swedish, Yiddish, Aragonese, Asturian, Catalan, French, Galician, Haitian, Italian, Latin, Lombard, Neapolitan, Piemontese, Portuguese, Romanian, Spanish, Venetian, Walloon, Breton, Welsh, Scottish Gaelic, Irish, Modern Greek (1453-), Armenian, Albanian, Dimli (individual language), Persian, Gilaki, Kurdish, Tajik, Bengali, Bishnupriya, Gujarati, Fiji Hindi, Hindi, Marathi, Nepali (macrolanguage), Urdu, Amharic, Arabic, Egyptian Arabic, Hebrew, Estonian, Finnish, Hungarian, Basque, Georgian, Chuvash, Azerbaijani, Turkish, Uzbek, Kazakh, Tatar, Yakut, Korean, Mongolian, Telugu, Kannada, Malayalam, Tamil, Newari, Vietnamese, Indonesian, Javanese, Malagasy, Maori, Malay (macrolanguage), Pampanga, Sundanese, Tagalog, Waray (Philippines), Swahili (macrolanguage), Esperanto, Ido, Interlingua (International Auxiliary Language Association), and Volapük
- Rights:
- Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
- http://creativecommons.org/licenses/by-sa/4.0/
- PUB
- Relation:
- http://hdl.handle.net/11234/1-1743
- Source:
- http://ufal.mff.cuni.cz/deltacorpus
- Harvested from:
- LINDAT/CLARIAH-CZ repository
- Metadata only:
- false
- Date:
- 2016-03-17