Search Results
Creator:
Zeman, Daniel and Droganova, Kira
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
semantic dependency and universal dependencies
Language:
Afrikaans , Assyrian Neo-Aramaic , Akkadian , Amharic , Arabic , Belarusian , Breton , Bulgarian , Russia Buriat , Catalan , Czech , Church Slavic , Mandarin Chinese , Coptic , Welsh , Danish , German , Modern Greek (1453-) , English , Estonian , Basque , Faroese , Finnish , French , Irish , Gothic , Ancient Greek (to 1453) , Mbyá Guaraní , Hebrew , Hindi , Croatian , Upper Sorbian , Hungarian , Armenian , Indonesian , Italian , Japanese , Kazakh , Northern Kurdish , Korean , Komi-Zyrian , Karelian , Latin , Latvian , Lithuanian , Literary Chinese , Marathi , Erzya , Dutch , Norwegian , Old Russian , Nigerian Pidgin , Polish , Portuguese , Romanian , Russian , Sanskrit , Slovak , Slovenian , Northern Sami , Spanish , Serbian , Swedish , Tamil , Tagalog , Turkish , Ukrainian , Urdu , Vietnamese , Warlpiri , Wolof , Yoruba , Galician , Bhojpuri , Komi-Permyak , Livvi , Moksha , Scottish Gaelic , Skolt Sami , Icelandic , Albanian , Persian , Akuntsu , Apurinã , Khunsari , Manx , Mundurukú , Nayini , Soi , South Levantine Arabic , Tupinambá , Beja , Western Frisian , Urubú-Kaapor , Kangri , K'iche' , Low German , Makuráp , Western Armenian , and Central Siberian Yupik
Description:
Deep Universal Dependencies is a collection of treebanks derived semi-automatically from Universal Dependencies (http://hdl.handle.net/11234/1-3687). It contains additional deep-syntactic and semantic annotations. The version of Deep UD corresponds to the version of UD it is based on. Note, however, that some UD treebanks have been omitted from Deep UD.
Rights:
Licence Universal Dependencies v2.8 , https://lindat.mff.cuni.cz/repository/xmlui/page/license-ud-2.8 , and PUB
Creator:
Mareček, David , Yu, Zhiwei , Zeman, Daniel , and Žabokrtský, Zdeněk
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
part of speech , tagging , semi-supervised , and cross-language
Language:
Belarusian , Bosnian , Bulgarian , Czech , Serbo-Croatian , Croatian , Upper Sorbian , Macedonian , Polish , Russian , Slovak , Slovenian , Serbian , Ukrainian , Latvian , Lithuanian , Afrikaans , Danish , German , English , Faroese , Western Frisian , Swiss German , Icelandic , Limburgan , Luxembourgish , Low German , Dutch , Norwegian Nynorsk , Norwegian , Scots , Swedish , Yiddish , Aragonese , Asturian , Catalan , French , Galician , Haitian , Italian , Latin , Lombard , Neapolitan , Piemontese , Portuguese , Romanian , Spanish , Venetian , Walloon , Breton , Welsh , Scottish Gaelic , Irish , Modern Greek (1453-) , Armenian , Albanian , Dimli (individual language) , Persian , Gilaki , Kurdish , Tajik , Bengali , Bishnupriya , Gujarati , Fiji Hindi , Hindi , Marathi , Nepali (macrolanguage) , Urdu , Amharic , Arabic , Egyptian Arabic , Hebrew , Estonian , Finnish , Hungarian , Basque , Georgian , Chuvash , Azerbaijani , Turkish , Uzbek , Kazakh , Tatar , Yakut , Korean , Mongolian , Telugu , Kannada , Malayalam , Tamil , Newari , Vietnamese , Indonesian , Javanese , Malagasy , Maori , Malay (macrolanguage) , Pampanga , Sundanese , Tagalog , Waray (Philippines) , Swahili (macrolanguage) , Esperanto , Ido , Interlingua (International Auxiliary Language Association) , and Volapük
Description:
Texts in 107 languages from the W2C corpus (http://hdl.handle.net/11858/00-097C-0000-0022-6133-9), first 1,000,000 tokens per language, tagged by the delexicalized tagger described in Yu et al. (2016, LREC, Portorož, Slovenia).
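The "first 1,000,000 tokens per language" cut-off can be illustrated with a minimal sketch. It assumes whitespace tokenization and a hypothetical helper name `first_n_tokens`; the actual tokenization and tooling used for the corpus are not stated in this record.

```python
from itertools import islice
from typing import Iterable, Iterator


def first_n_tokens(lines: Iterable[str], limit: int = 1_000_000) -> Iterator[str]:
    """Yield at most `limit` tokens from an iterable of text lines.

    Assumption: tokens are whitespace-separated. The real corpus
    may have been tokenized differently.
    """
    tokens = (token for line in lines for token in line.split())
    return islice(tokens, limit)


# Usage with a tiny in-memory sample and a small limit.
sample = ["Toto je věta .", "A tady je další ."]
print(list(first_n_tokens(sample, limit=5)))
```

Because the function is lazy, it stops reading input as soon as the limit is reached, which matters for large per-language files.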
Rights:
Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) , http://creativecommons.org/licenses/by-sa/4.0/ , and PUB
Creator:
Mareček, David , Yu, Zhiwei , Zeman, Daniel , and Žabokrtský, Zdeněk
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
part of speech , tagging , semi-supervised , and cross-language
Language:
Belarusian , Bosnian , Bulgarian , Czech , Serbo-Croatian , Croatian , Upper Sorbian , Macedonian , Polish , Russian , Slovak , Slovenian , Serbian , Ukrainian , Latvian , Lithuanian , Afrikaans , Danish , German , English , Faroese , Western Frisian , Swiss German , Icelandic , Limburgan , Luxembourgish , Low German , Dutch , Norwegian Nynorsk , Norwegian , Scots , Swedish , Yiddish , Aragonese , Asturian , Catalan , French , Galician , Haitian , Italian , Latin , Lombard , Neapolitan , Piemontese , Portuguese , Romanian , Spanish , Venetian , Walloon , Breton , Welsh , Scottish Gaelic , Irish , Modern Greek (1453-) , Armenian , Albanian , Dimli (individual language) , Persian , Gilaki , Kurdish , Tajik , Bengali , Bishnupriya , Gujarati , Fiji Hindi , Hindi , Marathi , Nepali (macrolanguage) , Urdu , Amharic , Arabic , Egyptian Arabic , Hebrew , Estonian , Finnish , Hungarian , Basque , Georgian , Chuvash , Azerbaijani , Turkish , Uzbek , Kazakh , Tatar , Yakut , Korean , Mongolian , Telugu , Kannada , Malayalam , Tamil , Newari , Vietnamese , Indonesian , Javanese , Malagasy , Maori , Malay (macrolanguage) , Pampanga , Sundanese , Tagalog , Waray (Philippines) , Swahili (macrolanguage) , Esperanto , Ido , Interlingua (International Auxiliary Language Association) , and Volapük
Description:
Texts in 107 languages from the W2C corpus (http://hdl.handle.net/11858/00-097C-0000-0022-6133-9), first 1,000,000 tokens per language, tagged by the delexicalized tagger described in Yu et al. (2016, LREC, Portorož, Slovenia).
Changes in version 1.1:
1. Universal Dependencies tagset instead of the older and smaller Google Universal POS tagset.
2. SVM classifier trained on Universal Dependencies 1.2 instead of HamleDT 2.0.
3. Balto-Slavic languages, Germanic languages and Romance languages were tagged by a classifier trained only on the respective group of languages. Other languages were tagged by a classifier trained on all available languages. The "c7" combination from version 1.0 is no longer used.
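The classifier selection in item 3 can be sketched as a simple lookup with a fallback. The group labels, the function name `classifier_for`, and the deliberately partial language map below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical, partial mapping from language name to language group.
# Only a few languages per group are listed for illustration.
GROUP_OF = {
    "Czech": "balto-slavic",
    "Polish": "balto-slavic",
    "Lithuanian": "balto-slavic",
    "German": "germanic",
    "Dutch": "germanic",
    "Swedish": "germanic",
    "French": "romance",
    "Spanish": "romance",
    "Italian": "romance",
}


def classifier_for(language: str) -> str:
    """Return the label of the classifier that would tag `language`.

    Languages in one of the three groups get the group-specific
    classifier; every other language falls back to the classifier
    trained on all available languages (labelled "all" here).
    """
    return GROUP_OF.get(language, "all")


print(classifier_for("Czech"))  # group-specific classifier
print(classifier_for("Tamil"))  # fallback classifier
```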
Rights:
Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) , http://creativecommons.org/licenses/by-sa/4.0/ , and PUB
Creator:
Kločurak, Stepan
Type:
text , memoirs , and autobiography
Subject:
Domestic political development, political life , Biography , Kločurak, Stepan , Rusyns , Rusyn politicians , independence movements , Ukraine , world history 1789-1918 , political history, politicians , other national minorities (Poles, Rusyns, Lusatian Sorbs, etc.) , and Czechoslovakia 1918-1938
Language:
Ukrainian
Rights:
unknown
Creator:
Papakin, Heorhij
Type:
study
Subject:
Historical science. Auxiliary historical sciences. Archival science , Ukrainian archives , archival fonds , Ukrainian Insurgent Army , Ukrainian nationalism , archival science abroad , Ukraine , world history from 1918 to the present , military history , and nationalities, relations between nationalities, and national movements
Language:
Ukrainian
Description:
Sources for the History of Ukrainian Liberation Combats 1939-1956 in the Ukrainian Archives: Specifics, Types and Research Perspectives. and Prameny k dějinám ukrajinských osvobozeneckých bojů v letech 1939-1956 v archivech Ukrajiny: specifika, umístění a perspektivy bádání.
Rights:
unknown
Creator:
Popel, Martin , Novák, Michal , Balhar, Jiří , Košarko, Ondřej , Mayer, Jiří , Poláková, Lucie , Kloudová, Věra , and Anisimova, Mariia
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
tool and toolService
Subject:
machine translation and Ukrainian
Language:
Ukrainian and Czech
Description:
This software package includes three tools: a web frontend for machine translation featuring phonetic transcription of Ukrainian suitable for Czech speakers, an API server, and a tool for translating documents with markup (html, docx, odt, pptx, odp,...). These tools are used in the Charles Translator service (https://translator.cuni.cz).
This software was developed within the EdUKate project, which aims to help mitigate the language barriers that non-Czech-speaking children in the Czech Republic face in the Czech school system. The project focuses on the development and dissemination of multilingual digital learning materials for students in primary and secondary schools.
Rights:
BSD 2-Clause "Simplified" or "FreeBSD" license , http://opensource.org/licenses/BSD-2-Clause , and PUB
Creator:
Kolesnyk, Natalia
Type:
text and study
Subject:
Linguistics. Languages , onomastics , folklore , historical onomastics , and ethnography, folklore studies and ethnology, historical and social anthropology
Language:
Ukrainian
Description:
Folklore onomastics: the problem of its status. and Folklorní onomastika: problém statusu.
Rights:
unknown
Creator:
Darevyč, Darija
Type:
study
Subject:
Graphic arts. Printmaking , Chasevyč, Nìl , Ukrainian painters , Ukrainian Insurgent Army , prints , Ukraine , world history from 1918 to the present , painting, painters , and army, military units, soldiers
Language:
Ukrainian
Description:
Nil Khasevych's Graphic Art as a Representation of the Struggle of the Ukrainian Insurgent Army. and Grafika Nila Chasevyče jako obraz bojů UPA.
Rights:
unknown
Creator:
Hruševs'kyj, Mychajlo Serhìjovyč
Type:
text and monograph
Subject:
History of Eastern European countries , history of states , Ukraine , surveys (thematic) , and surveys of world history (chronological)
Language:
Ukrainian
Rights:
unknown
Creator:
Ostaš, Ljubov
Type:
text and study
Subject:
East Slavic languages , Ukrainians in Poland , proper names , Ukrainian language , and historical onomastics
Language:
Ukrainian
Description:
Rodná jména Ukrajinců ve vsi Berezno na Cholmščyně (Given names of Ukrainians in the village of Berezno in the Chełm region).
Rights:
unknown