Search Results
Creator:
Žabokrtský, Zdeněk , Bafna, Nyati , Bodnár, Jan , Kyjánek, Lukáš , Svoboda, Emil , Ševčíková, Magda , Vidra, Jonáš , Angle, Sachi , Ansari, Ebrahim , Arkhangelskiy, Timofey , Batsuren, Khuyagbaatar , Bella, Gábor , Bertinetto, Pier Marco , Bonami, Olivier , Celata, Chiara , Daniel, Michael , Fedorenko, Alexei , Filko, Matea , Giunchiglia, Fausto , Haghdoost, Hamid , Hathout, Nabil , Khomchenkova, Irina , Khurshudyan, Victoria , Levonian, Dmitri , Litta, Eleonora , Medvedeva, Maria , Muralikrishna, S. N. , Namer, Fiammetta , Nikravesh, Mahshid , Padó, Sebastian , Passarotti, Marco , Plungian, Vladimir , Polyakov, Alexey , Potapov, Mihail , Pruthwik, Mishra , Rao B, Ashwath , Rubakov, Sergei , Samar, Husain , Sharma, Dipti Misra , Šnajder, Jan , Šojat, Krešimir , Štefanec, Vanja , Talamo, Luigi , Tribout, Delphine , Vodolazsky, Daniil , Vydrin, Arseniy , Zakirova, Aigul , and Zeller, Britta
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text , lexicon , and lexicalConceptualResource
Subject:
universal segmentations , morphological segmentation , word segmentation , segmentation , morphology , morphemes , morphological dictionary , unisegments , morph , and multilingual
Language:
Czech , Catalan , German , English , Persian , Finnish , French , Serbo-Croatian , Croatian , Hungarian , Italian , Komi-Zyrian , Latin , Moksha , Mari (Russia) , Mongolian , Erzya , Polish , Portuguese , Russian , Spanish , Swedish , Tajik , Udmurt , Armenian , Bengali , Hindi , Malayalam , Marathi , and Kannada
Description:
Universal Segmentations (UniSegments) is a collection of lexical resources capturing morphological segmentations harmonised into a cross-linguistically consistent annotation scheme for many languages. The annotation scheme consists of simple tab-separated columns that store a word and its morphological segmentations, along with information about the word and the segmented units, e.g., part-of-speech categories and types of morphs/morphemes. The current public version of the collection contains 38 harmonised segmentation datasets covering 30 different languages.
Rights:
Universal Segmentations 1.0 License Terms , https://lindat.mff.cuni.cz/repository/xmlui/page/licence-unisegs-1.0 , and PUB
Creator:
Stulli, Bernard
Type:
text and monograph
Subject:
History of states and territories on the Balkan Peninsula , First World War (1914-1918) , Austro-Hungarian army (1867-1918) , anti-war uprisings , Habsburg Monarchy , army, military forces, soldiers , and world history 1914-1918
Language:
Croatian
Rights:
unknown
Creator:
Marcelić, Jasna
Type:
text and biography
Subject:
Politics , Biography , Havel, Václav , Czech politicians , dissent , presidents , Czechoslovakia 1945-1992 , Czech lands from 1993 to the present , and political history, politicians
Language:
Croatian
Rights:
unknown
Creator:
Rogošić, Roko
Type:
text and monograph
Subject:
History of the countries of the ancient world , state administration , antiquity , Roman Empire , history of religion , history of administration , Etruscans, ancient Rome , and churches, sects
Language:
Croatian
Rights:
unknown
Subject:
Bulić, Frane , Croatian historiography , Croatian archaeology , scholarly journals , Croatia , historiography, historical sciences, historians , world history 1918-1945 , and archaeology
Language:
Croatian
Rights:
unknown
Subject:
excerpts from journals and Czech periodicals
Language:
Croatian
Rights:
unknown
Publisher:
[Croatian Provincial Archive]
Subject:
scholarly journals , archives , Croatian archival science , Croatia , historiography, historical sciences, historians , world history 1789-1918 , foreign periodicals and proceedings , and foreign archival science
Language:
Croatian
Rights:
unknown
Creator:
Majliš, Martin
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
multilingual corpora
Language:
Afrikaans , Tosk Albanian , Amharic , Arabic , Aragonese , Egyptian Arabic , Asturian , Azerbaijani , Belarusian , Bengali , Bosnian , Bishnupriya , Breton , Buginese , Bulgarian , Catalan , Cebuano , Czech , Chuvash , Corsican , Welsh , Danish , German , Dimli (individual language) , Modern Greek (1453-) , English , Esperanto , Estonian , Basque , Faroese , Persian , Finnish , French , Western Frisian , Gan Chinese , Scottish Gaelic , Irish , Galician , Gilaki , Gujarati , Haitian , Serbo-Croatian , Hebrew , Fiji Hindi , Hindi , Croatian , Upper Sorbian , Hungarian , Armenian , Ido , Interlingua (International Auxiliary Language Association) , Indonesian , Icelandic , Italian , Javanese , Japanese , Kannada , Georgian , Kazakh , Korean , Kurdish , Latin , Latvian , Limburgan , Lithuanian , Lombard , Luxembourgish , Malayalam , Marathi , Macedonian , Malagasy , Mongolian , Maori , Malay (macrolanguage) , Burmese , Neapolitan , Low German , Nepali (macrolanguage) , Newari , Dutch , Norwegian Nynorsk , Norwegian , Occitan (post 1500) , Ossetian , Pampanga , Piemontese , Polish , Portuguese , Quechua , Romanian , Russian , Yakut , Sicilian , Scots , Slovak , Slovenian , Spanish , Albanian , Serbian , Sundanese , Swahili (macrolanguage) , Swedish , Tamil , Tatar , Telugu , Tajik , Tagalog , Thai , Turkish , Ukrainian , Urdu , Uzbek , Venetian , Vietnamese , Volapük , Waray (Philippines) , Walloon , Yiddish , Yoruba , and Chinese
Description:
A set of corpora for 120 languages automatically collected from Wikipedia and the web.
Collected using the W2C toolset: http://hdl.handle.net/11858/00-097C-0000-0022-60D6-1
Rights:
Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) , http://creativecommons.org/licenses/by-sa/3.0/ , and PUB
Creator:
Jan Patočka
Publisher:
Pp. 7–112. Article, in German. [Compiled by I. Chvatík in 1982 from letters of the 1970s to H. Ballauff.] — 2nd printing in: Transit 2 (Wien 1991), pp. 87–104. [Abridged and linguistically revised.] — 3rd printing in: Schriften zur tschechischen Kultur und Geschichte, Stuttgart 1992, pp. 29–106 (see 1992/2). [Linguistically revised.] — 4th printing in: Co jsou Češi? / Was sind die Tschechen?, Praha 1992, pp. 111–224 (see 1992/9). [Bilingual edition. German text follows the 3rd printing. Translated into Czech from the 1st printing by V. Jochmann.] — Cf. 1989/4.
Type:
Text
Subject:
1968/11 , 1969/1 , 1969/11 , 1969/7 , 1989 , 1989/4 , 1991/4 , 1991/8 , 1992/2 , 1992/9 , 1996/7 , 1997/4 , 2000/32 , 2006/21 , cs , de , fr , hr , hu , pl , SS-13/Češi-II , and article
Language:
Czech , French , Croatian , Hungarian , Polish , and German
Rights:
open access and Rights holder: Archiv Jana Patočky, z.s.
Publisher:
University of Leipzig
Type:
corpus
Language:
Afrikaans , Albanian , Bulgarian , Catalan , Chinese , Croatian , Czech , Danish , Dutch , English , Esperanto , Estonian , Finnish , French , German , Hungarian , Icelandic , Indonesian , Italian , Japanese , Korean , Latin , Latvian , Lithuanian , Malay (macrolanguage) , Norwegian , Occitan (post 1500) , Romanian , Russian , Slovak , Slovenian , Spanish , Sundanese , Swedish , Tagalog , Turkish , Vietnamese , and Welsh
Description:
Collected from newspaper texts, web crawling, etc. Contents: words (with frequency), co-occurrences (with graph), left/right neighbours, and example sentences.
Rights:
Not specified