Search Results
Creator:
Savary, Agata , Ramisch, Carlos , Cordeiro, Silvio Ricardo , Sangati, Federico , Vincze, Veronika , QasemiZadeh, Behrang , Candito, Marie , Cap, Fabienne , Giouli, Voula , Stoyanova, Ivelina , Doucet, Antoine , Adalı, Kübra , Barbu Mititelu, Verginica , Bejček, Eduard , El Maarouf, Ismail , Eryiğit, Gülşen , Galea, Luke , Ha-Cohen Kerner, Yaakov , Liebeskind, Chaya , Monti, Johanna , Parra Escartín, Carla , Kovalevskaitė, Jolanta , Krek, Simon , van der Plas, Lonneke , Aceta, Cristina , Aduriz, Itziar , Antoine, Jean-Yves , Attard, Greta , Azzopardi, Kirsty , Boizou, Loic , Bonnici, Janice , Boz, Mert , Bumbulienė, Ieva , Busuttil, Jael , Caruso, Valeria , Cherchi, Manuela , Constant, Matthieu , Czerepowicka, Monika , De Santis, Anna , Dimitrova, Tsvetana , Dinç, Tutkum , Elyovich, Hevi , Fabri, Ray , Farrugia, Alison , Findlay, Jamie , Fotopoulou, Aggeliki , Foufi, Vassiliki , Galea, Sara Anne , Gantar, Polona , Gatt, Albert , Gatt, Anabelle , Herrero, Carlos , Iñurrieta, Uxoa , Jagfeld, Glorianna , Hnátková, Milena , Ionescu, Mihaela , Klyueva, Natalia , Koeva, Svetla , Kovács, Viktória , Kuzman, Taja , Leseva, Svetlozara , Louisou, Sevi , Lynn, Teresa , Malka, Ruth , Martínez Alonso, Héctor , McCrae, John , de Medeiros Caseli, Helena , Miral, Ayşenur , Muscat, Amanda , Nivre, Joakim , Oakes, Michael , Onofrei, Mihaela , Parmentier, Yannick , Pasquer, Caroline , Pia di Buono, Maria , Priego Sanchez, Belem , Raffone, Annalisa , Ramisch, Renata , Rimkutė, Erika , Rizea, Monica-Mihaela , Simkó, Katalin , Spagnol, Michael , Stefanova, Valentina , Stymne, Sara , Sulubacak, Umut , Tabone, Nicole , Tanti, Marc , Todorova, Maria , Urešová, Zdenka , Villavicencio, Aline , and Zilio, Leonardo
Publisher:
PARSEME
Type:
text and corpus
Subject:
Multiword expressions , verbal multiword expressions , idioms , light-verb constructions , verb-particle constructions , and inherently reflexive verbs
Language:
Bulgarian , Czech , German , Modern Greek (1453-) , Spanish , Persian , French , Hebrew , Hungarian , Italian , Lithuanian , Maltese , Polish , Portuguese , Romanian , Slovenian , Swedish , and Turkish
Description:
The PARSEME shared task aims at identifying verbal multiword expressions (VMWEs) in running texts. VMWEs include idioms (let the cat out of the bag), light verb constructions (make a decision), verb-particle constructions (give up), and inherently reflexive verbs (se suicider 'to commit suicide' in French). VMWEs were annotated according to the universal guidelines in 18 languages. The corpora are provided in the parsemetsv format, inspired by the CoNLL-U format.
For most languages, paired files in the CoNLL-U format (not necessarily using UD tagsets) containing parts of speech, lemmas, morphological features and/or syntactic dependencies are also provided. Depending on the language, the information comes from treebanks (e.g., Universal Dependencies) or from automatic parsers trained on treebanks (e.g., UDPipe).
This item contains training and test data, tools and the universal guidelines file.
Rights:
PARSEME Shared Task Data (v. 1.0) Agreement , https://lindat.mff.cuni.cz/repository/xmlui/page/licence-mwe-1.0 , and PUB
Creator:
Ramisch, Carlos , Cordeiro, Silvio Ricardo , Savary, Agata , Vincze, Veronika , Barbu Mititelu, Verginica , Bhatia, Archna , Buljan, Maja , Candito, Marie , Gantar, Polona , Giouli, Voula , Güngör, Tunga , Hawwari, Abdelati , Iñurrieta, Uxoa , Kovalevskaitė, Jolanta , Krek, Simon , Lichte, Timm , Liebeskind, Chaya , Monti, Johanna , Parra Escartín, Carla , QasemiZadeh, Behrang , Ramisch, Renata , Schneider, Nathan , Stoyanova, Ivelina , Vaidya, Ashwini , Walsh, Abigail , Aceta, Cristina , Aduriz, Itziar , Antoine, Jean-Yves , Arhar Holdt, Špela , Berk, Gözde , Bielinskienė, Agnė , Blagus, Goranka , Boizou, Loic , Bonial, Claire , Caruso, Valeria , Čibej, Jaka , Constant, Matthieu , Cook, Paul , Diab, Mona , Dimitrova, Tsvetana , Ehren, Rafael , Elbadrashiny, Mohamed , Elyovich, Hevi , Erden, Berna , Estarrona, Ainara , Fotopoulou, Aggeliki , Foufi, Vassiliki , Geeraert, Kristina , van Gompel, Maarten , Gonzalez, Itziar , Gurrutxaga, Antton , Ha-Cohen Kerner, Yaakov , Ibrahim, Rehab , Ionescu, Mihaela , Jain, Kanishka , Jazbec, Ivo-Pavao , Kavčič, Teja , Klyueva, Natalia , Kocijan, Kristina , Kovács, Viktória , Kuzman, Taja , Leseva, Svetlozara , Ljubešić, Nikola , Malka, Ruth , Markantonatou, Stella , Martínez Alonso, Héctor , Matas, Ivana , McCrae, John , de Medeiros Caseli, Helena , Onofrei, Mihaela , Palka-Binkiewicz, Emilia , Papadelli, Stella , Parmentier, Yannick , Pascucci, Antonio , Pasquer, Caroline , Pia di Buono, Maria , Puri, Vandana , Raffone, Annalisa , Ratori, Shraddha , Riccio, Anna , Sangati, Federico , Shukla, Vishakha , Simkó, Katalin , Šnajder, Jan , Somers, Clarissa , Srivastava, Shubham , Stefanova, Valentina , Taslimipoor, Shiva , Theoxari, Natasa , Todorova, Maria , Urizar, Ruben , Villavicencio, Aline , and Zilio, Leonardo
Publisher:
PARSEME
Type:
text and corpus
Subject:
Multiword expressions , verbal multiword expressions , light-verb constructions , verb-particle constructions , inherently reflexive verbs , verbal idioms , and multi-verb constructions
Language:
Bulgarian , German , Modern Greek (1453-) , Spanish , Persian , French , Hebrew , Hungarian , Italian , Lithuanian , Polish , Portuguese , Romanian , Slovenian , Turkish , Hindi , Basque , English , and Croatian
Description:
This multilingual resource contains corpora in which verbal multiword expressions (VMWEs) have been manually annotated. VMWEs include idioms (let the cat out of the bag), light-verb constructions (make a decision), verb-particle constructions (give up), inherently reflexive verbs (help oneself), and multi-verb constructions (make do). VMWEs were annotated according to the universal guidelines in 19 languages. The corpora are provided in the cupt format, inspired by the CoNLL-U format. The corpora were used in the 1.1 edition of the PARSEME Shared Task (2018).
For most languages, morphological and syntactic information – not necessarily using UD tagsets – including parts of speech, lemmas, morphological features and/or syntactic dependencies is also provided. Depending on the language, the information comes from treebanks (e.g., Universal Dependencies) or from automatic parsers trained on treebanks (e.g., UDPipe).
This item contains training, development and test data, as well as the evaluation tools used in the PARSEME Shared Task 1.1 (2018).
The annotation guidelines are available online: http://parsemefr.lif.univ-mrs.fr/parseme-st-guidelines/1.1
Rights:
PARSEME Shared Task Data (v. 1.1) Agreement , https://lindat.mff.cuni.cz/repository/xmlui/page/licence-mwe-1.1 , and PUB
Creator:
Ramisch, Carlos , Guillaume, Bruno , Savary, Agata , Waszczuk, Jakub , Candito, Marie , Vaidya, Ashwini , Barbu Mititelu, Verginica , Bhatia, Archna , Iñurrieta, Uxoa , Giouli, Voula , Güngör, Tunga , Jiang, Menghan , Lichte, Timm , Liebeskind, Chaya , Monti, Johanna , Ramisch, Renata , Walsh, Abigail , Xu, Hongzhi , Palka-Binkiewicz, Emilia , Ehren, Rafael , Stymne, Sara , Constant, Matthieu , Pasquer, Caroline , Parmentier, Yannick , Antoine, Jean-Yves , Carlino, Carola , Caruso, Valeria , Di Buono, Maria Pia , Pascucci, Antonio , Raffone, Annalisa , Riccio, Anna , Sangati, Federico , Speranza, Giulia , Cordeiro, Silvio Ricardo , de Medeiros Caseli, Helena , Miranda, Isaac , Rademaker, Alexandre , Vale, Oto , Villavicencio, Aline , Wick Pedro, Gabriela , Wilkens, Rodrigo , Zilio, Leonardo , Rizea, Monica-Mihaela , Ionescu, Mihaela , Onofrei, Mihaela , Chen, Jia , Ge, Xiaomin , Hu, Fangyuan , Hu, Sha , Li, Minli , Liu, Siyuan , Qin, Zhenzhen , Sun, Ruilong , Wang, Chenweng , Xiao, Huangyang , Yan, Peiyi , Yih, Tsy , Yu, Ke , Yu, Songping , Zeng, Si , Zhang, Yongchen , Zhao, Yun , Foufi, Vassiliki , Fotopoulou, Aggeliki , Markantonatou, Stella , Papadelli, Stella , Louizou, Sevasti , Aduriz, Itziar , Estarrona, Ainara , Gonzalez, Itziar , Gurrutxaga, Antton , Uria, Larraitz , Urizar, Ruben , Foster, Jennifer , Lynn, Teresa , Elyovitch, Hevi , Ha-Cohen Kerner, Yaakov , Malka, Ruth , Jain, Kanishka , Puri, Vandana , Ratori, Shraddha , Shukla, Vishakha , Srivastava, Shubham , Berk, Gozde , Erden, Berna , and Yirmibeşoğlu, Zeynep
Publisher:
PARSEME
Type:
text and corpus
Subject:
multiword expressions , verbal multiword expressions , light verb construction , verb-particle constructions , inherently reflexive verbs , verbal idioms , and multi-verb constructions
Language:
German , Modern Greek (1453-) , Basque , French , Irish , Hebrew , Hindi , Italian , Polish , Portuguese , Romanian , Swedish , Turkish , and Chinese
Description:
This multilingual resource contains corpora in which verbal multiword expressions (VMWEs) have been manually annotated, gathered on the occasion of the 1.2 edition of the PARSEME Shared Task on semi-supervised Identification of Verbal MWEs (2020).
VMWEs include idioms (let the cat out of the bag), light-verb constructions (make a decision), verb-particle constructions (give up), inherently reflexive verbs (help oneself), and multi-verb constructions (make do).
For the 1.2 shared task edition, the data covers 14 languages, for which VMWEs were annotated according to the universal guidelines. The corpora are provided in the cupt format, inspired by the CoNLL-U format.
Morphological and syntactic information – not necessarily using UD tagsets – including parts of speech, lemmas, morphological features and/or syntactic dependencies is also provided. Depending on the language, the information comes from treebanks (e.g., Universal Dependencies) or from automatic parsers trained on treebanks (e.g., UDPipe).
This item contains training, development and test data, as well as the evaluation tools used in the PARSEME Shared Task 1.2 (2020). The annotation guidelines are available online: http://parsemefr.lif.univ-mrs.fr/parseme-st-guidelines/1.2
Rights:
PARSEME Shared Task Data (v. 1.2) Agreement , https://lindat.mff.cuni.cz/repository/xmlui/page/licence-mwe-1.2 , and PUB
Creator:
Barque, Lucie , Candito, Marie , Constant, Matthieu , Cordeiro, Silvio Ricardo , Crabbé, Benoît , Fort, Karën , Guillaume, Bruno , Haas, Pauline , Huyghe, Richard , Perrier, Guy , Ramisch, Carlos , Ribeyre, Corentin , Savary, Agata , Seddah, Djamé , Segonne, Vincent , Tribout, Delphine , Villemonte de la Clergerie, Eric , Parmentier, Yannick , Pasquer, Caroline , and Antoine, Jean-Yves
Publisher:
ANR
Type:
text and corpus
Subject:
morpho-syntactic annotations , treebank , dependency syntax , semantic tagging , multiword expressions , and named entities
Language:
French
Description:
The Sequoia corpus is a set of 3,099 linguistically annotated French sentences originating from four sources (Europarl, European Agency Reports, the French regional newspaper L'Est Républicain, and French Wikipedia).
Several types of annotations were added over the years.
The current release comprises:
- parts-of-speech (SEQUOIA ANR-08-EMER-013 project)
- syntactic dependency trees
- deep syntactic dependency graphs (Deep sequoia project)
- multi-word expressions and named entities (PARSEME COST project and PARSEME-FR ANR-14-CERA-0001 project)
- coarse semantic tags for nouns (FrSemCor project)
See the deep sequoia page for a detailed description: https://deep-sequoia.inria.fr/
Rights:
Deep Sequoia Licence , https://lindat.mff.cuni.cz/repository/xmlui/page/deep-sequoia-licence , and PUB
Creator:
Savary, Agata , Cordeiro, Silvio Ricardo , Lichte, Timm , Ramisch, Carlos , Iñurrieta, Uxoa , and Giouli, Voula
Publisher:
PARSEME
Type:
text and corpus
Subject:
verbal multiword expressions , literal occurrence , and idiomaticity rate
Language:
Basque , German , Modern Greek (1453-) , Polish , and Portuguese
Description:
The corpus contains sentences with idiomatic, literal and coincidental occurrences of verbal multiword expressions (VMWEs) in Basque, German, Greek, Polish and Portuguese. The source corpus is the PARSEME multilingual corpus of VMWEs v 1.1 (cf. http://hdl.handle.net/11372/LRT-2842). The sentences with VMWEs were extracted from the source corpus and potential co-occurrences of the same lexemes were automatically extracted from the same corpus. These candidates were then manually annotated by native experts into 6 classes, including literal and coincidental occurrences, as well as various annotation errors.
The construction of the corpus is described by the following publication:
Agata Savary, Silvio Ricardo Cordeiro, Timm Lichte, Carlos Ramisch, Uxoa Iñurrieta, Voula Giouli (forthcoming) "Literal occurrences of multiword expressions: Rare birds that cause a stir", to appear in Prague Bulletin of Mathematical Linguistics.
Rights:
License agreement for The Multilingual corpus of literal occurrences of multiword expressions , https://lindat.mff.cuni.cz/repository/xmlui/page/licence-mwe-literal , and PUB
Creator:
Estève, Louis Clément , Savary, Agata , and Lavergne, Thomas
Publisher:
Université Paris-Saclay, CNRS, Laboratoire Interdisciplinaire des Sciences du Numérique
Type:
text , computationalLexicon , and lexicalConceptualResource
Subject:
verbal multiword expressions , word embeddings , and word2vec
Language:
German , Modern Greek (1453-) , Basque , French , Irish , Hebrew , Hindi , Italian , Polish , Portuguese , Romanian , Swedish , Turkish , and Chinese
Description:
This resource is a set of 14 vector spaces for single words and Verbal Multiword Expressions (VMWEs) in different languages (German, Greek, Basque, French, Irish, Hebrew, Hindi, Italian, Polish, Brazilian Portuguese, Romanian, Swedish, Turkish, Chinese).
They were trained with the Word2Vec algorithm, in its skip-gram version, on PARSEME raw corpora automatically annotated for morpho-syntax (http://hdl.handle.net/11234/1-3367).
These corpora were annotated by Seen2Seen, a rule-based VMWE identifier, one of the leading tools of the PARSEME shared task version 1.2.
VMWE tokens were merged into single tokens.
The format of the vector space files is that of the original Word2Vec implementation by Mikolov et al. (2013), i.e. a binary format.
For compression, bzip2 was used.
Rights:
PARSEME Shared Task Raw Corpus Data (v. 1.2) Agreement , https://lindat.mff.cuni.cz/repository/xmlui/page/licence-mwe-1.2-raw , and PUB
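The record above states that the vector spaces use the binary format of the original Word2Vec implementation and are compressed with bzip2. The following is a minimal sketch of how such a file might be read with the Python standard library only. It assumes the usual layout of that format (an ASCII header line giving vocabulary size and dimensionality, then each word followed by a space and its raw 32-bit floats) and little-endian floats. The function name `read_word2vec_binary` and the merged token `give_up` are hypothetical illustrations; the actual spelling of merged VMWE tokens in the resource is not stated in the record. Libraries such as gensim can also load this format.

```python
import bz2
import io
import struct


def read_word2vec_binary(stream):
    """Read word vectors from a binary Word2Vec stream (hypothetical helper).

    Assumed layout: b"<vocab_size> <dim>\n", then for every entry the word,
    a single space, and <dim> little-endian float32 values. A newline may
    follow each vector and is skipped when the next word is read.
    """
    header = stream.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])
    vectors = {}
    for _ in range(vocab_size):
        word = bytearray()
        while True:
            ch = stream.read(1)
            if ch == b" " or ch == b"":
                break
            if ch != b"\n":  # skip the newline left over from the previous entry
                word.extend(ch)
        values = struct.unpack(f"<{dim}f", stream.read(4 * dim))
        vectors[word.decode("utf-8")] = values
    return vectors


# Build a tiny synthetic file in memory so the sketch runs without the real data.
payload = (
    b"2 3\n"
    + b"give_up " + struct.pack("<3f", 1.0, 2.0, 3.0) + b"\n"
    + b"cat " + struct.pack("<3f", 0.5, 0.25, 0.0) + b"\n"
)
compressed = bz2.compress(payload)

# bz2.open decompresses on the fly; a path to a .bz2 file works the same way.
with bz2.open(io.BytesIO(compressed), "rb") as fh:
    vecs = read_word2vec_binary(fh)

print(sorted(vecs))
```

Because VMWE tokens were merged into single tokens before training, a multiword expression is looked up in the resulting dictionary exactly like a single word.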
Creator:
Savary, Agata , Ramisch, Carlos , Guillaume, Bruno , Hawwari, Abdelati , Walsh, Abigail , Fotopoulou, Aggeliki , Bielinskienė, Agnė , Estarrona, Ainara , Gatt, Albert , Butler, Alexandra , Rademaker, Alexandre , Maldonado, Alfredo , Villavicencio, Aline , Farrugia, Alison , Muscat, Amanda , Gatt, Anabelle , Antić, Anđela , De Santis, Anna , Raffone, Annalisa , Riccio, Anna , Pascucci, Antonio , Gurrutxaga, Antton , Bhatia, Archna , Vaidya, Ashwini , Miral, Ayşenur , QasemiZadeh, Behrang , Priego Sanchez, Belem , Griciūtė, Bernadeta , Erden, Berna , Parra Escartín, Carla , Herrero, Carlos , Carlino, Carola , Pasquer, Caroline , Liebeskind, Chaya , Wang, Chenweng , Ben Khelil, Chérifa , Bonial, Claire , Somers, Clarissa , Aceta, Cristina , Krstev, Cvetana , Bejček, Eduard , Lindqvist, Ellinor , Erenmalm, Elsa , Palka-Binkiewicz, Emilia , Rimkute, Erika , Petterson, Eva , Cap, Fabienne , Hu, Fangyuan , Sangati, Federico , Wick Pedro, Gabriela , Speranza, Giulia , Jagfeld, Glorianna , Blagus, Goranka , Berk, Gözde , Attard, Greta , Eryiğit, Gülşen , Finnveden, Gustav , Martínez Alonso, Héctor , de Medeiros Caseli, Helena , Elyovich, Hevi , Xu, Hongzhi , Xiao, Huangyang , Miranda, Isaac , Jaknić, Isidora , El Maarouf, Ismail , Aduriz, Itziar , Gonzalez, Itziar , Matas, Ivana , Stoyanova, Ivelina , Jazbec, Ivo-Pavao , Busuttil, Jael , Waszczuk, Jakub , Findlay, Jamie , Bonnici, Janice , Šnajder, Jan , Antoine, Jean-Yves , Foster, Jennifer , Chen, Jia , Nivre, Joakim , Monti, Johanna , McCrae, John , Kovalevskaitė, Jolanta , Jain, Kanishka , Simkó, Katalin , Yu, Ke , Azzopardi, Kirsty , Adalı, Kübra , Uria, Larraitz , Zilio, Leonardo , Boizou, Loïc , van der Plas, Lonneke , Galea, Luke , Sarlak, Mahtab , Buljan, Maja , Cherchi, Manuela , Tanti, Marc , Di Buono, Maria Pia , Todorova, Maria , Candito, Marie , Constant, Matthieu , Shamsfard, Mehrnoush , Jiang, Menghan , Boz, Mert , Spagnol, Michael , Onofrei, Mihaela , Li, Minli , Elbadrashiny, Mohamed , Diab, Mona , Rizea, 
Monica-Mihaela , Hadj Mohamed, Najet , Theoxari, Natasa , Schneider, Nathan , Tabone, Nicole , Ljubešić, Nikola , Vale, Oto , Cook, Paul , Yan, Peiyi , Gantar, Polona , Ehren, Rafael , Fabri, Ray , Ibrahim, Rehab , Ramisch, Renata , Walles, Rinat , Wilkens, Rodrigo , Urizar, Ruben , Sun, Ruilong , Malka, Ruth , Galea, Sara Anne , Stymne, Sara , Louizou, Sevasti , Hu, Sha , Taslimipoor, Shiva , Ratori, Shraddha , Srivastava, Shubham , Cordeiro, Silvio Ricardo , Krek, Simon , Liu, Siyuan , Zeng, Si , Yu, Songping , Arhar Holdt, Špela , Markantonatou, Stella , Papadelli, Stella , Leseva, Svetlozara , Kuzman, Taja , Kavčič, Teja , Lynn, Teresa , Lichte, Timm , Pickard, Thomas , Dimitrova, Tsvetana , Yih, Tsy , Güngör, Tunga , Dinç, Tutkum , Iñurrieta, Uxoa , Tajalli, Vahide , Stefanova, Valentina , Caruso, Valeria , Puri, Vandana , Foufi, Vassiliki , Barbu Mititelu, Verginica , Vincze, Veronika , Kovács, Viktória , Shukla, Vishakha , Giouli, Voula , Ge, Xiaomin , Ha-Cohen Kerner, Yaakov , Öztürk, Yağmur , Yarandi, Yalda , Parmentier, Yannick , Zhang, Yongchen , Zhao, Yun , Urešová, Zdeňka , Yirmibeşoğlu, Zeynep , Qin, Zhenzhen , Cristescu, Mihaela , Zgreabăn, Bianca-Mădălina , Bărbulescu, Elena-Andreea , and Stanković, Ranka
Publisher:
PARSEME
Type:
text and corpus
Subject:
multiword expressions , verbal multiword expressions , light verb construction , verb-particle constructions , inherently reflexive verbs , verbal idioms , and multi-verb constructions
Language:
Arabic , Bulgarian , Czech , German , Modern Greek (1453-) , English , Spanish , Basque , Persian , French , Irish , Hebrew , Hindi , Croatian , Hungarian , Lithuanian , Italian , Maltese , Polish , Portuguese , Romanian , Slovenian , Serbian , Swedish , Turkish , and Chinese
Description:
This multilingual resource contains corpora in which verbal multiword expressions (VMWEs) have been manually annotated. VMWEs include idioms (let the cat out of the bag), light-verb constructions (make a decision), verb-particle constructions (give up), inherently reflexive verbs (help oneself), and multi-verb constructions (make do). This is the first release of the corpora without an associated shared task. The previous version (1.2) was associated with the PARSEME Shared Task on semi-supervised Identification of Verbal MWEs (2020). The data covers 26 languages, corresponding to the combination of the corpora from all three previous editions (1.0, 1.1 and 1.2). VMWEs were annotated according to the universal guidelines. The corpora are provided in the cupt format, inspired by the CoNLL-U format. Morphological and syntactic information, including parts of speech, lemmas, morphological features and/or syntactic dependencies, is also provided. Depending on the language, the information comes from treebanks (e.g., Universal Dependencies) or from automatic parsers trained on treebanks (e.g., UDPipe). All corpora are split into training, development and test data, following the splitting strategy adopted for the PARSEME Shared Task 1.2. The annotation guidelines are available online: https://parsemefr.lis-lab.fr/parseme-st-guidelines/1.3 The .cupt format is detailed here: https://multiword.sourceforge.net/cupt-format/
Rights:
PARSEME Corpora v. 1.3 - Licence Agreement , https://lindat.mff.cuni.cz/repository/xmlui/page/licence-mwe-1.3 , and PUB
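The record above says the corpora are distributed in the cupt format, inspired by CoNLL-U. As a rough illustration of what reading such a file could involve, the sketch below assumes the commonly documented layout: tab-separated CoNLL-U columns plus a final PARSEME:MWE column, where `*` marks a token outside any VMWE, `_` an unspecified annotation, and codes such as `1:LVC.full` (first token of expression 1, with its category) or `1` (a further token of expression 1) are separated by `;`. The function name `parse_cupt_sentence` and the sample sentence are hypothetical and not taken from the corpus; the linked format description remains the authoritative reference.

```python
from collections import defaultdict

# Hypothetical one-sentence sample; "\t" escapes become real tab characters.
SAMPLE = """\
# global.columns = ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC PARSEME:MWE
# text = She made a decision
1\tShe\tshe\tPRON\t_\t_\t2\tnsubj\t_\t_\t*
2\tmade\tmake\tVERB\t_\t_\t0\troot\t_\t_\t1:LVC.full
3\ta\ta\tDET\t_\t_\t4\tdet\t_\t_\t*
4\tdecision\tdecision\tNOUN\t_\t_\t2\tobj\t_\t_\t1
"""


def parse_cupt_sentence(lines):
    """Collect the VMWEs of one sentence as {mwe_id: (category, [forms])}."""
    categories = {}
    members = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/metadata lines
        cols = line.split("\t")
        form, mwe_field = cols[1], cols[-1]
        if mwe_field in ("*", "_"):
            continue  # token is not annotated as part of any VMWE
        for code in mwe_field.split(";"):
            # "1:LVC.full" -> id "1", category "LVC.full"; "1" -> id "1", no category
            mwe_id, _, category = code.partition(":")
            members[mwe_id].append(form)
            if category:
                categories[mwe_id] = category
    return {i: (categories.get(i), forms) for i, forms in members.items()}


vmwes = parse_cupt_sentence(SAMPLE.splitlines())
print(vmwes)
```

Keying on the expression identifier rather than on adjacency lets the sketch handle discontinuous expressions such as "made ... decision", where other tokens intervene between the components.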
Creator:
Zeman, Daniel , Nivre, Joakim , Abrams, Mitchell , Ackermann, Elia , Aepli, Noëmi , Aghaei, Hamid , Agić, Željko , Ahmadi, Amir , Ahrenberg, Lars , Ajede, Chika Kennedy , Akkurt, Salih Furkan , Aleksandravičiūtė, Gabrielė , Alfina, Ika , Algom, Avner , Alnajjar, Khalid , Alzetta, Chiara , Andersen, Erik , Antonsen, Lene , Aoyama, Tatsuya , Aplonova, Katya , Aquino, Angelina , Aragon, Carolina , Aranes, Glyd , Aranzabe, Maria Jesus , Arıcan, Bilge Nas , Arnardóttir, Þórunn , Arutie, Gashaw , Arwidarasti, Jessica Naraiswari , Asahara, Masayuki , Ásgeirsdóttir, Katla , Aslan, Deniz Baran , Asmazoğlu, Cengiz , Ateyah, Luma , Atmaca, Furkan , Attia, Mohammed , Atutxa, Aitziber , Augustinus, Liesbeth , Avelãs, Mariana , Badmaeva, Elena , Balasubramani, Keerthana , Ballesteros, Miguel , Banerjee, Esha , Bank, Sebastian , Barbu Mititelu, Verginica , Barkarson, Starkaður , Basile, Rodolfo , Basmov, Victoria , Batchelor, Colin , Bauer, John , Bedir, Seyyit Talha , Behzad, Shabnam , Belieni, Juan , Bengoetxea, Kepa , Benli, İbrahim , Ben Moshe, Yifat , Berk, Gözde , Bhat, Riyaz Ahmad , Biagetti, Erica , Bick, Eckhard , Bielinskienė, Agnė , Bjarnadóttir, Kristín , Blokland, Rogier , Bobicev, Victoria , Boizou, Loïc , Borges Völker, Emanuel , Börstell, Carl , Bosco, Cristina , Bouma, Gosse , Bowman, Sam , Boyd, Adriane , Braggaar, Anouck , Branco, António , Brokaitė, Kristina , Burchardt, Aljoscha , Campos, Marisa , Candito, Marie , Caron, Bernard , Caron, Gauthier , Carvalheiro, Catarina , Carvalho, Rita , Cassidy, Lauren , Castro, Maria Clara , Castro, Sérgio , Cavalcanti, Tatiana , Cebiroğlu Eryiğit, Gülşen , Cecchini, Flavio Massimiliano , Celano, Giuseppe G. A. , Čéplö, Slavomír , Cesur, Neslihan , Cetin, Savas , Çetinoğlu, Özlem , Chalub, Fabricio , Chamila, Liyanage , Chauhan, Shweta , Chi, Ethan , Chika, Taishi , Cho, Yongseok , Choi, Jinho , Chun, Jayeol , Chung, Juyeon , Cignarella, Alessandra T. 
, Cinková, Silvie , Collomb, Aurélie , Çöltekin, Çağrı , Connor, Miriam , Corbetta, Claudia , Corbetta, Daniela , Costa, Francisco , Courtin, Marine , Crabbé, Benoît , Cristescu, Mihaela , Cvetkoski, Vladimir , Dale, Ingerid Løyning , Daniel, Philemon , Davidson, Elizabeth , de Alencar, Leonel Figueiredo , Dehouck, Mathieu , de Laurentiis, Martina , de Marneffe, Marie-Catherine , de Paiva, Valeria , Derin, Mehmet Oguz , de Souza, Elvis , Diaz de Ilarraza, Arantza , Dickerson, Carly , Dinakaramani, Arawinda , Di Nuovo, Elisa , Dione, Bamba , Dirix, Peter , Dobrovoljc, Kaja , Doyle, Adrian , Dozat, Timothy , Droganova, Kira , Duran, Magali Sanches , Dwivedi, Puneet , Ebert, Christian , Eckhoff, Hanne , Eguchi, Masaki , Eiche, Sandra , Eli, Marhaba , Elkahky, Ali , Ephrem, Binyam , Erina, Olga , Erjavec, Tomaž , Essaidi, Farah , Etienne, Aline , Evelyn, Wograine , Facundes, Sidney , Farkas, Richárd , Favero, Federica , Ferdaousi, Jannatul , Fernanda, Marília , Fernandez Alcalde, Hector , Fethi, Amal , Foster, Jennifer , Fransen, Theodorus , Freitas, Cláudia , Fujita, Kazunori , Gajdošová, Katarína , Galbraith, Daniel , Gamba, Federica , Garcia, Marcos , Gärdenfors, Moa , Gerardi, Fabrício Ferraz , Gerdes, Kim , Gessler, Luke , Ginter, Filip , Godoy, Gustavo , Goenaga, Iakes , Gojenola, Koldo , Gökırmak, Memduh , Goldberg, Yoav , Gómez Guinovart, Xavier , González Saavedra, Berta , Griciūtė, Bernadeta , Grioni, Matias , Grobol, Loïc , Grūzītis, Normunds , Guillaume, Bruno , Guiller, Kirian , Guillot-Barbance, Céline , Güngör, Tunga , Habash, Nizar , Hafsteinsson, Hinrik , Hajič, Jan , Hajič jr., Jan , Hämäläinen, Mika , Hà Mỹ, Linh , Han, Na-Rae , Hanifmuti, Muhammad Yudistira , Harada, Takahiro , Hardwick, Sam , Harris, Kim , Haug, Dag , Heinecke, Johannes , Hellwig, Oliver , Hennig, Felix , Hladká, Barbora , Hlaváčová, Jaroslava , Hociung, Florinel , Hohle, Petter , Huang, Yidi , Huerta Mendez, Marivel , Hwang, Jena , Ikeda, Takumi , Ingason, Anton Karl , Ion, Radu , 
Irimia, Elena , Ishola, Ọlájídé , Islamaj, Artan , Ito, Kaoru , Jagodzińska, Sandra , Jannat, Siratun , Jelínek, Tomáš , Jha, Apoorva , Jiang, Katharine , Johannsen, Anders , Jónsdóttir, Hildur , Jørgensen, Fredrik , Juutinen, Markus , Kaşıkara, Hüner , Kabaeva, Nadezhda , Kahane, Sylvain , Kanayama, Hiroshi , Kanerva, Jenna , Kara, Neslihan , Karahóǧa, Ritván , Kåsen, Andre , Kayadelen, Tolga , Kengatharaiyer, Sarveswaran , Kettnerová, Václava , Kharatyan, Lilit , Kirchner, Jesse , Klementieva, Elena , Klyachko, Elena , Kocharov, Petr , Köhn, Arne , Köksal, Abdullatif , Kopacewicz, Kamil , Korkiakangas, Timo , Köse, Mehmet , Koshevoy, Alexey , Kotsyba, Natalia , Kovalevskaitė, Jolanta , Krek, Simon , Krishnamurthy, Parameswari , Kübler, Sandra , Kuqi, Adrian , Kuyrukçu, Oğuzhan , Kuzgun, Aslı , Kwak, Sookyoung , Kyle, Kris , Laan, Käbi , Laippala, Veronika , Lambertino, Lorenzo , Lando, Tatiana , Larasati, Septina Dian , Lavrentiev, Alexei , Lee, John , Lê Hồng, Phương , Lenci, Alessandro , Lertpradit, Saran , Leung, Herman , Levina, Maria , Levine, Lauren , Li, Cheuk Ying , Li, Josie , Li, Keying , Li, Yixuan , Li, Yuan , Lim, KyungTae , Lima Padovani, Bruna , Lin, Yi-Ju Jessica , Lindén, Krister , Liu, Yang Janet , Ljubešić, Nikola , Lobzhanidze, Irina , Loginova, Olga , Lopes, Lucelene , Lusito, Stefano , Luthfi, Andry , Luukko, Mikko , Lyashevskaya, Olga , Lynn, Teresa , Macketanz, Vivien , Mahamdi, Menel , Maillard, Jean , Makarchuk, Ilya , Makazhanov, Aibek , Mandl, Michael , Manning, Christopher , Manurung, Ruli , Marşan, Büşra , Mărănduc, Cătălina , Mareček, David , Marheinecke, Katrin , Markantonatou, Stella , Martínez Alonso, Héctor , Martín Rodríguez, Lorena , Martins, André , Martins, Cláudia , Mašek, Jan , Matsuda, Hiroshi , Matsumoto, Yuji , Mazzei, Alessandro , McDonald, Ryan , McGuinness, Sarah , Mendonça, Gustavo , Merzhevich, Tatiana , Miekka, Niko , Miller, Aaron , Mischenkova, Karina , Missilä, Anna , Mititelu, Cătălin , Mitrofan, Maria , 
Miyao, Yusuke , Mojiri Foroushani, AmirHossein , Molnár, Judit , Moloodi, Amirsaeid , Montemagni, Simonetta , More, Amir , Moreno Romero, Laura , Moretti, Giovanni , Mori, Shinsuke , Morioka, Tomohiko , Moro, Shigeki , Mortensen, Bjartur , Moskalevskyi, Bohdan , Muischnek, Kadri , Munro, Robert , Murawaki, Yugo , Müürisep, Kaili , Nainwani, Pinkey , Nakhlé, Mariam , Navarro Horñiacek, Juan Ignacio , Nedoluzhko, Anna , Nešpore-Bērzkalne, Gunta , Nevaci, Manuela , Nguyễn Thị, Lương , Nguyễn Thị Minh, Huyền , Nikaido, Yoshihiro , Nikolaev, Vitaly , Nitisaroj, Rattima , Nourian, Alireza , Nunes, Maria das Graças Volpe , Nurmi, Hanna , Ojala, Stina , Ojha, Atul Kr. , Óladóttir, Hulda , Olúòkun, Adédayọ̀ , Omura, Mai , Onwuegbuzia, Emeka , Ordan, Noam , Osenova, Petya , Östling, Robert , Øvrelid, Lilja , Özateş, Şaziye Betül , Özçelik, Merve , Özgür, Arzucan , Öztürk Başaran, Balkız , Paccosi, Teresa , Palmero Aprosio, Alessio , Panova, Anastasia , Pardo, Thiago Alexandre Salgueiro , Park, Hyunji Hayley , Partanen, Niko , Pascual, Elena , Passarotti, Marco , Patejuk, Agnieszka , Paulino-Passos, Guilherme , Pedonese, Giulia , Peljak-Łapińska, Angelika , Peng, Siyao , Peng, Siyao Logan , Pereira, Rita , Pereira, Sílvia , Perez, Cenel-Augusto , Perkova, Natalia , Perrier, Guy , Petrov, Slav , Petrova, Daria , Peverelli, Andrea , Phelan, Jason , Pierre-Louis, Claudel , Piitulainen, Jussi , Pinter, Yuval , Pinto, Clara , Pintucci, Rodrigo , Pirinen, Tommi A , Pitler, Emily , Plamada, Magdalena , Plank, Barbara , Poibeau, Thierry , Ponomareva, Larisa , Popel, Martin , Pretkalniņa, Lauma , Prévost, Sophie , Prokopidis, Prokopis , Przepiórkowski, Adam , Pugh, Robert , Puolakainen, Tiina , Pyysalo, Sampo , Qi, Peng , Querido, Andreia , Rääbis, Andriela , Rademaker, Alexandre , Rahoman, Mizanur , Rama, Taraka , Ramasamy, Loganathan , Ramisch, Carlos , Ramos, Joana , Rashel, Fam , Rasooli, Mohammad Sadegh , Ravishankar, Vinit , Real, Livy , Rebeja, Petru , Reddy, Siva , Regnault, 
Mathilde , Rehm, Georg , Riabi, Arij , Riabov, Ivan , Rießler, Michael , Rimkutė, Erika , Rinaldi, Larissa , Rituma, Laura , Rizqiyah, Putri , Rocha, Luisa , Rögnvaldsson, Eiríkur , Roksandic, Ivan , Romanenko, Mykhailo , Rosa, Rudolf , Roșca, Valentin , Rovati, Davide , Rozonoyer, Ben , Rudina, Olga , Rueter, Jack , Rúnarsson, Kristján , Sadde, Shoval , Safari, Pegah , Sahala, Aleksi , Saleh, Shadi , Salomoni, Alessio , Samardžić, Tanja , Samson, Stephanie , Sanguinetti, Manuela , Sanıyar, Ezgi , Särg, Dage , Sartor, Marta , Sasaki, Mitsuya , Saulīte, Baiba , Savary, Agata , Sawanakunanon, Yanin , Saxena, Shefali , Scannell, Kevin , Scarlata, Salvatore , Schang, Emmanuel , Schneider, Nathan , Schuster, Sebastian , Schwartz, Lane , Seddah, Djamé , Seeker, Wolfgang , Seraji, Mojgan , Shahzadi, Syeda , Shen, Mo , Shimada, Atsuko , Shirasu, Hiroyuki , Shishkina, Yana , Shohibussirri, Muh , Shvedova, Maria , Siewert, Janine , Sigurðsson, Einar Freyr , Silva, João , Silveira, Aline , Silveira, Natalia , Silveira, Sara , Simi, Maria , Simionescu, Radu , Simkó, Katalin , Šimková, Mária , Símonarson, Haukur Barri , Simov, Kiril , Sitchinava, Dmitri , Sither, Ted , Skachedubova, Maria , Smith, Aaron , Soares-Bastos, Isabela , Solberg, Per Erik , Sonnenhauser, Barbara , Sourov, Shafi , Sprugnoli, Rachele , Stamou, Vivian , Steingrímsson, Steinþór , Stella, Antonio , Stephen, Abishek , Straka, Milan , Strickland, Emmett , Strnadová, Jana , Suhr, Alane , Sulestio, Yogi Lesmana , Sulubacak, Umut , Suzuki, Shingo , Swanson, Daniel , Szántó, Zsolt , Taguchi, Chihiro , Taji, Dima , Tamburini, Fabio , Tan, Mary Ann C. 
, Tanaka, Takaaki , Tanaya, Dipta , Tavoni, Mirko , Tella, Samson , Tellier, Isabelle , Testori, Marinella , Thomas, Guillaume , Tonelli, Sara , Torga, Liisi , Toska, Marsida , Trosterud, Trond , Trukhina, Anna , Tsarfaty, Reut , Türk, Utku , Tyers, Francis , Þórðarson, Sveinbjörn , Þorsteinsson, Vilhjálmur , Uematsu, Sumire , Untilov, Roman , Urešová, Zdeňka , Uria, Larraitz , Uszkoreit, Hans , Utka, Andrius , Vagnoni, Elena , Vajjala, Sowmya , Vak, Socrates , van der Goot, Rob , Vanhove, Martine , van Niekerk, Daniel , van Noord, Gertjan , Varga, Viktor , Vedenina, Uliana , Venturi, Giulia , Villemonte de la Clergerie, Eric , Vincze, Veronika , Vlasova, Natalia , Wakasa, Aya , Wallenberg, Joel C. , Wallin, Lars , Walsh, Abigail , Washington, Jonathan North , Wendt, Maximilan , Widmer, Paul , Wigderson, Shira , Wijono, Sri Hartati , Wille, Vanessa Berwanger , Williams, Seyi , Wirén, Mats , Wittern, Christian , Woldemariam, Tsegay , Wong, Tak-sum , Wróblewska, Alina , Wu, Qishen , Yako, Mary , Yamashita, Kayo , Yamazaki, Naoki , Yan, Chunxiao , Yasuoka, Koichi , Yavrumyan, Marat M. , Yenice, Arife Betül , Yıldız, Olcay Taner , Yu, Zhuoran , Yuliawati, Arlisa , Žabokrtský, Zdeněk , Zahra, Shorouq , Zeldes, Amir , Zhou, He , Zhu, Hanzhi , Zhu, Yilun , Zhuravleva, Anna , and Ziane, Rayan
Publisher:
Universal Dependencies Consortium
Type:
text and corpus
Subject:
treebank , dependency , syntax , morphology , harmonized annotation , interset , universal tagset , and stanford dependencies
Language:
Ancient Greek (to 1453) , Arabic , Basque , Bulgarian , Croatian , Czech , Danish , Dutch , English , Estonian , Finnish , French , German , Gothic , Modern Greek (1453-) , Hebrew , Hindi , Hungarian , Indonesian , Irish , Italian , Japanese , Latin , Norwegian , Church Slavic , Persian , Polish , Portuguese , Romanian , Slovenian , Spanish , Swedish , Tamil , Catalan , Chinese , Galician , Kazakh , Latvian , Russian , Turkish , Coptic , Sanskrit , Slovak , Ukrainian , Uighur , Vietnamese , Belarusian , Korean , Lithuanian , Urdu , Russia Buriat , Northern Kurdish , Northern Sami , Upper Sorbian , Afrikaans , Yue Chinese , Marathi , Serbian , Swedish Sign Language , Telugu , Amharic , Armenian , Breton , Faroese , Komi-Zyrian , Nigerian Pidgin , Old French (842-ca. 1400) , Tagalog , Thai , Warlpiri , Yoruba , Akkadian , Bambara , Erzya , Maltese , Welsh , Wolof , Assyrian Neo-Aramaic , Literary Chinese , Old Russian , Karelian , Mbyá Guaraní , Bhojpuri , Komi-Permyak , Livvi , Moksha , Scottish Gaelic , Skolt Sami , Swiss German , Albanian , Icelandic , Akuntsu , Apurinã , Chukot , Khunsari , Manx , Mundurukú , Nayini , Old Turkish , Soi , South Levantine Arabic , Tupinambá , Beja , Western Frisian , Guajajára , Urubú-Kaapor , Kangri , K'iche' , Low German , Makuráp , Central Siberian Yupik , Western Armenian , Bengali , Javanese , Karo (Brazil) , Ligurian , Neapolitan , Tatar , Xibe , Yakut , Ancient Hebrew , Cebuano , Guarani , Hittite , Madi , Emerillon , Umbrian , Abaza , Gheg Albanian , Malayalam , Nhengatu , Sinhala , Zacatlán-Ahuacatlán-Tepetzintla Nahuatl , Xavánte , Saya , Borôro , Kirghiz , Algerian Arabic , Old Irish (to 900) , Classical Armenian , Georgian , Haitian , Highland Puebla Nahuatl , Macedonian , Middle French (ca. 1400-1600) , and Veps
Description:
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).
Rights:
Licence Universal Dependencies v2.13 , https://lindat.mff.cuni.cz/repository/xmlui/page/license-ud-2.13 , and PUB
Creator:
Zeman, Daniel , Nivre, Joakim , Abrams, Mitchell , Ackermann, Elia , Aepli, Noëmi , Aghaei, Hamid , Agić, Željko , Ahmadi, Amir , Ahrenberg, Lars , Ajede, Chika Kennedy , Akkurt, Salih Furkan , Aleksandravičiūtė, Gabrielė , Alfina, Ika , Algom, Avner , Alnajjar, Khalid , Alzetta, Chiara , Andersen, Erik , Antonsen, Lene , Aoyama, Tatsuya , Aplonova, Katya , Aquino, Angelina , Aragon, Carolina , Aranes, Glyd , Aranzabe, Maria Jesus , Arıcan, Bilge Nas , Arnardóttir, Þórunn , Arutie, Gashaw , Arwidarasti, Jessica Naraiswari , Asahara, Masayuki , Ásgeirsdóttir, Katla , Aslan, Deniz Baran , Asmazoğlu, Cengiz , Ateyah, Luma , Atmaca, Furkan , Attia, Mohammed , Atutxa, Aitziber , Augustinus, Liesbeth , Avelãs, Mariana , Badmaeva, Elena , Balasubramani, Keerthana , Ballesteros, Miguel , Banerjee, Esha , Bank, Sebastian , Barbu Mititelu, Verginica , Barkarson, Starkaður , Basile, Rodolfo , Basmov, Victoria , Batchelor, Colin , Bauer, John , Bedir, Seyyit Talha , Behzad, Shabnam , Belieni, Juan , Bengoetxea, Kepa , Benli, İbrahim , Ben Moshe, Yifat , Berg, Ansu , Berk, Gözde , Bhat, Riyaz Ahmad , Biagetti, Erica , Bick, Eckhard , Bielinskienė, Agnė , Bilgin Taşdemir, Esma Fatıma , Bjarnadóttir, Kristín , Blaschke, Verena , Blokland, Rogier , Bobicev, Victoria , Boizou, Loïc , Bonilla, Johnatan , Borges Völker, Emanuel , Börstell, Carl , Bosco, Cristina , Bouma, Gosse , Bowman, Sam , Boyd, Adriane , Braggaar, Anouck , Branco, António , Brokaitė, Kristina , Burchardt, Aljoscha , Campos, Marisa , Candito, Marie , Caron, Bernard , Caron, Gauthier , Carvalheiro, Catarina , Carvalho, Rita , Cassidy, Lauren , Castro, Maria Clara , Castro, Sérgio , Cavalcanti, Tatiana , Cebiroğlu Eryiğit, Gülşen , Cecchini, Flavio Massimiliano , Celano, Giuseppe G. A. , Čéplö, Slavomír , Cesur, Neslihan , Cetin, Savas , Çetinoğlu, Özlem , Chalub, Fabricio , Chamila, Liyanage , Chauhan, Shweta , Chen, Yifei , Chi, Ethan , Chika, Taishi , Cho, Yongseok , Choi, Jinho , Chontaeva, Bermet , Chun, Jayeol , Chung, Juyeon , Cignarella, Alessandra T. , Cinková, Silvie , Collomb, Aurélie , Çöltekin, Çağrı , Connor, Miriam , Corbetta, Claudia , Corbetta, Daniela , Costa, Francisco , Courtin, Marine , Crabbé, Benoît , Cristescu, Mihaela , Cvetkoski, Vladimir , Dale, Ingerid Løyning , Daniel, Philemon , Davidson, Elizabeth , de Alencar, Leonel Figueiredo , Dehouck, Mathieu , de Laurentiis, Martina , de Marneffe, Marie-Catherine , de Paiva, Valeria , Derin, Mehmet Oguz , de Souza, Elvis , Diaz de Ilarraza, Arantza , Díaz Hernández, Roberto Antonio , Dickerson, Carly , Dinakaramani, Arawinda , Di Nuovo, Elisa , Dione, Bamba , Dirix, Peter , Do, Hoa , Dobrovoljc, Kaja , Döhmer, Caroline , Doyle, Adrian , Dozat, Timothy , Droganova, Kira , Duran, Magali Sanches , Dwivedi, Puneet , Ebert, Christian , Eckhoff, Hanne , Eguchi, Masaki , Eiche, Sandra , Eiselen, Roald , Eli, Marhaba , Elkahky, Ali , Ephrem, Binyam , Erina, Olga , Erjavec, Tomaž , Eslami, Soudabeh , Essaidi, Farah , Etienne, Aline , Evelyn, Wograine , Facundes, Sidney , Farkas, Richárd , Favero, Federica , Ferdaousi, Jannatul , Fernanda, Marília , Fernandez Alcalde, Hector , Fethi, Amal , Foster, Jennifer , Fransen, Theodorus , Freitas, Cláudia , Fujita, Kazunori , Gajdošová, Katarína , Galbraith, Daniel , Galy, Edith , Gamba, Federica , Garcia, Marcos , Gärdenfors, Moa , Gaustad, Tanja , Genç, Efe Eren , Gerardi, Fabrício Ferraz , Gerdes, Kim , Gessler, Luke , Ginter, Filip , Godoy, Gustavo , Goenaga, Iakes , Gojenola, Koldo , Gökırmak, Memduh , Goldberg, Yoav , Gómez Guinovart, Xavier , González Saavedra, Berta , Griciūtė, Bernadeta , Grioni, Matias , Grobol, Loïc , Grūzītis, Normunds , Guillaume, Bruno , Guiller, Kirian , Guillot-Barbance, Céline , Güngör, Tunga , Habash, Nizar , Hafsteinsson, Hinrik , Hajič, Jan , Hajič jr., Jan , Hämäläinen, Mika , Hà Mỹ, Linh , Han, Na-Rae , Hanifmuti, Muhammad Yudistira , Harada, Takahiro , Hardwick, Sam , Harris, Kim , Hassert, Naïma , Haug, Dag , Heinecke, Johannes , Hellwig, Oliver , Hennig, Felix , Hladká, Barbora , Hlaváčová, Jaroslava , Hociung, Florinel , Hoefels, Diana , Hohle, Petter , Huang, Yidi , Huerta Mendez, Marivel , Hwang, Jena , Ikeda, Takumi , Iliadou, Inessa , Ingason, Anton Karl , Ion, Radu , Irimia, Elena , Ishola, Ọlájídé , Islamaj, Artan , Ito, Kaoru , Iurescia, Federica , Jagodzińska, Sandra , Jannat, Siratun , Jelínek, Tomáš , Jha, Apoorva , Jiang, Katharine , Jobanputra, Mayank , Johannsen, Anders , Jónsdóttir, Hildur , Jørgensen, Fredrik , Juutinen, Markus , Kaşıkara, Hüner , Kabaeva, Nadezhda , Kahane, Sylvain , Kanayama, Hiroshi , Kanerva, Jenna , Kara, Neslihan , Karahóǧa, Ritván , Kåsen, Andre , Kayadelen, Tolga , Kengatharaiyer, Sarveswaran , Kettnerová, Václava , Kharatyan, Lilit , Kirchner, Jesse , Klementieva, Elena , Klyachko, Elena , Kocharov, Petr , Köhn, Arne , Köksal, Abdullatif , Kopacewicz, Kamil , Korkiakangas, Timo , Köse, Mehmet , Koshevoy, Alexey , Kotsyba, Natalia , Kovačić, Barbara , Kovalevskaitė, Jolanta , Krek, Simon , Krishnamurthy, Parameswari , Kübler, Sandra , Kuqi, Adrian , Kuyrukçu, Oğuzhan , Kuzgun, Aslı , Kwak, Sookyoung , Kyle, Kris , Laan, Käbi , Laippala, Veronika , Lambertino, Lorenzo , Lando, Tatiana , Larasati, Septina Dian , Lavrentiev, Alexei , Lee, John , Lê Hồng, Phương , Lenci, Alessandro , Lertpradit, Saran , Leung, Herman , Levina, Maria , Levine, Lauren , Li, Cheuk Ying , Li, Josie , Li, Keying , Li, Yixuan , Li, Yuan , Lim, KyungTae , Lima Padovani, Bruna , Lin, Yi-Ju Jessica , Lindén, Krister , Liu, Yang Janet , Ljubešić, Nikola , Lobzhanidze, Irina , Loginova, Olga , Lopes, Lucelene , Lusito, Stefano , Lutgen, Anne-Marie , Luthfi, Andry , Luukko, Mikko , Lyashevskaya, Olga , Lynn, Teresa , Macketanz, Vivien , Mahamdi, Menel , Maillard, Jean , Makarchuk, Ilya , Makazhanov, Aibek , Mambrini, Francesco , Mandl, Michael , Manning, Christopher , Manurung, Ruli , Marşan, Büşra , Mărănduc, Cătălina , Mareček, David , Marheinecke, Katrin , Markantonatou, Stella , Martínez Alonso, Héctor , Martín Rodríguez, Lorena , Martins, André , Martins, Cláudia , Mašek, Jan , Matsuda, Hiroshi , Matsumoto, Yuji , Mazzei, Alessandro , McDonald, Ryan , McGuinness, Sarah , Mehta, Maitrey , Ménard, Pierre André , Mendonça, Gustavo , Merzhevich, Tatiana , Meurer, Paul , Miekka, Niko , Milano, Emilia , Miller, Aaron , Mischenkova, Karina , Missilä, Anna , Mititelu, Cătălin , Mitrofan, Maria , Miyao, Yusuke , Mojiri Foroushani, AmirHossein , Molnár, Judit , Moloodi, Amirsaeid , Montemagni, Simonetta , More, Amir , Moreno Romero, Laura , Moretti, Giovanni , Mori, Shinsuke , Morioka, Tomohiko , Moro, Shigeki , Mortensen, Bjartur , Moskalevskyi, Bohdan , Muischnek, Kadri , Munro, Robert , Murawaki, Yugo , Müürisep, Kaili , Nainwani, Pinkey , Nakhlé, Mariam , Navarro Horñiacek, Juan Ignacio , Nedoluzhko, Anna , Nešpore-Bērzkalne, Gunta , Nevaci, Manuela , Nguyễn Thị, Lương , Nguyễn Thị Minh, Huyền , Nikaido, Yoshihiro , Nikolaev, Vitaly , Nitisaroj, Rattima , Norrman, Victor , Nourian, Alireza , Nunes, Maria das Graças Volpe , Nurmi, Hanna , Ojala, Stina , Ojha, Atul Kr. , Óladóttir, Hulda , Olúòkun, Adédayọ̀ , Omura, Mai , Onwuegbuzia, Emeka , Ordan, Noam , Osenova, Petya , Östling, Robert , Ott, Annika , Øvrelid, Lilja , Özateş, Şaziye Betül , Özçelik, Merve , Özgür, Arzucan , Öztürk Başaran, Balkız , Paccosi, Teresa , Palmero Aprosio, Alessio , Panova, Anastasia , Pardo, Thiago Alexandre Salgueiro , Park, Hyunji Hayley , Partanen, Niko , Pascual, Elena , Passarotti, Marco , Patejuk, Agnieszka , Paulino-Passos, Guilherme , Pedonese, Giulia , Peljak-Łapińska, Angelika , Peng, Siyao , Peng, Siyao Logan , Pereira, Rita , Pereira, Sílvia , Perez, Cenel-Augusto , Perkova, Natalia , Perrier, Guy , Petrov, Slav , Petrova, Daria , Peverelli, Andrea , Phelan, Jason , Pierre-Louis, Claudel , Piitulainen, Jussi , Pinter, Yuval , Pinto, Clara , Pintucci, Rodrigo , Pirinen, Tommi A , Pitler, Emily , Plamada, Magdalena , Plank, Barbara , Plum, Alistair , Poibeau, Thierry , Ponomareva, Larisa , Popel, Martin , Pretkalniņa, Lauma , Pretorius, Rigardt , Prévost, Sophie , Prokopidis, Prokopis , Przepiórkowski, Adam , Pugh, Robert , Puolakainen, Tiina , Purschke, Christoph , Pyysalo, Sampo , Qi, Peng , Querido, Andreia , Rääbis, Andriela , Rademaker, Alexandre , Rahoman, Mizanur , Rama, Taraka , Ramasamy, Loganathan , Ramisch, Carlos , Ramos, Joana , Rashel, Fam , Rasooli, Mohammad Sadegh , Ravishankar, Vinit , Real, Livy , Rebeja, Petru , Reddy, Siva , Regnault, Mathilde , Rehm, Georg , Riabi, Arij , Riabov, Ivan , Rießler, Michael , Rimkutė, Erika , Rinaldi, Larissa , Rituma, Laura , Rizqiyah, Putri , Rocha, Luisa , Rögnvaldsson, Eiríkur , Roksandic, Ivan , Romanenko, Mykhailo , Rosa, Rudolf , Roșca, Valentin , Rovati, Davide , Rozonoyer, Ben , Rudina, Olga , Rueter, Jack , Ruffolo, Paolo , Rúnarsson, Kristján , Sadde, Shoval , Safari, Pegah , Sahala, Aleksi , Saleh, Shadi , Salomoni, Alessio , Samardžić, Tanja , Samson, Stephanie , Sánchez-Rodríguez, Xulia , Sanguinetti, Manuela , Sanıyar, Ezgi , Särg, Dage , Sartor, Marta , Sarymsakova, Albina , Sasaki, Mitsuya , Saulīte, Baiba , Savary, Agata , Sawanakunanon, Yanin , Saxena, Shefali , Scannell, Kevin , Scarlata, Salvatore , Schang, Emmanuel , Schneider, Nathan , Schuster, Sebastian , Schwartz, Lane , Seddah, Djamé , Seeker, Wolfgang , Sellmer, Sven , Seraji, Mojgan , Shahzadi, Syeda , Shen, Mo , Shimada, Atsuko , Shirasu, Hiroyuki , Shishkina, Yana , Shohibussirri, Muh , Shvedova, Maria , Siewert, Janine , Sigurðsson, Einar Freyr , Silva, João , Silveira, Aline , Silveira, Natalia , Silveira, Sara , Simi, Maria , Simionescu, Radu , Simkó, Katalin , Šimková, Mária , Símonarson, Haukur Barri , Simov, Kiril , Sitchinava, Dmitri , Sither, Ted , Smith, Aaron , Soares-Bastos, Isabela , Solberg, Per Erik , Sonnenhauser, Barbara , Sourov, Shafi , Sprugnoli, Rachele , Stamou, Vivian , Steingrímsson, Steinþór , Stella, Antonio , Stephen, Abishek , Straka, Milan , Strickland, Emmett , Strnadová, Jana , Suhr, Alane , Sulestio, Yogi Lesmana , Sulubacak, Umut , Suzuki, Shingo , Swanson, Daniel , Szántó, Zsolt , Taguchi, Chihiro , Taji, Dima , Tamburini, Fabio , Tan, Mary Ann C. , Tanaka, Takaaki , Tanaya, Dipta , Tavoni, Mirko , Tella, Samson , Tellier, Isabelle , Testori, Marinella , Thomas, Guillaume , Tıraş, Tarık Emre , Tonelli, Sara , Torga, Liisi , Toska, Marsida , Trosterud, Trond , Trukhina, Anna , Tsarfaty, Reut , Türk, Utku , Tyers, Francis , Þórðarson, Sveinbjörn , Þorsteinsson, Vilhjálmur , Uematsu, Sumire , Untilov, Roman , Urešová, Zdeňka , Uria, Larraitz , Uszkoreit, Hans , Utka, Andrius , Vagnoni, Elena , Vajjala, Sowmya , Vak, Socrates , van der Goot, Rob , Vanhove, Martine , van Niekerk, Daniel , van Noord, Gertjan , Varga, Viktor , Vedenina, Uliana , Venturi, Giulia , Villemonte de la Clergerie, Eric , Vincze, Veronika , Vissamsetty, Anishka , Vlasova, Natalia , Vligouridou, Eleni , Wakasa, Aya , Wallenberg, Joel C. , Wallin, Lars , Walsh, Abigail , Wang, John , Washington, Jonathan North , Wendt, Maximilan , Widmer, Paul , Wigderson, Shira , Wijono, Sri Hartati , Wille, Vanessa Berwanger , Williams, Seyi , Wirén, Mats , Wittern, Christian , Woldemariam, Tsegay , Wong, Tak-sum , Wróblewska, Alina , Wu, Qishen , Yako, Mary , Yamashita, Kayo , Yamazaki, Naoki , Yan, Chunxiao , Yasuoka, Koichi , Yavrumyan, Marat M. , Yenice, Arife Betül , Yılandiloğlu, Enes , Yıldız, Olcay Taner , Yu, Zhuoran , Yuliawati, Arlisa , Žabokrtský, Zdeněk , Zahra, Shorouq , Zeldes, Amir , Zhou, He , Zhu, Hanzhi , Zhu, Yilun , Zhuravleva, Anna , and Ziane, Rayan
Publisher:
Universal Dependencies Consortium
Type:
text and corpus
Subject:
treebank , dependency , syntax , morphology , harmonized annotation , interset , universal tagset , and stanford dependencies
Language:
Ancient Greek (to 1453) , Arabic , Basque , Bulgarian , Croatian , Czech , Danish , Dutch , English , Estonian , Finnish , French , German , Gothic , Modern Greek (1453-) , Hebrew , Hindi , Hungarian , Indonesian , Irish , Italian , Japanese , Latin , Norwegian , Church Slavic , Persian , Polish , Portuguese , Romanian , Slovenian , Spanish , Swedish , Tamil , Catalan , Chinese , Galician , Kazakh , Latvian , Russian , Turkish , Coptic , Sanskrit , Slovak , Ukrainian , Uighur , Vietnamese , Belarusian , Korean , Lithuanian , Urdu , Russia Buriat , Northern Kurdish , Northern Sami , Upper Sorbian , Afrikaans , Yue Chinese , Marathi , Serbian , Swedish Sign Language , Telugu , Amharic , Armenian , Breton , Faroese , Komi-Zyrian , Nigerian Pidgin , Old French (842-ca. 1400) , Tagalog , Thai , Warlpiri , Yoruba , Akkadian , Bambara , Erzya , Maltese , Welsh , Wolof , Assyrian Neo-Aramaic , Literary Chinese , Old Russian , Karelian , Mbyá Guaraní , Bhojpuri , Komi-Permyak , Livvi , Moksha , Scottish Gaelic , Skolt Sami , Swiss German , Albanian , Icelandic , Akuntsu , Apurinã , Chukot , Khunsari , Manx , Mundurukú , Nayini , Old Turkish , Soi , South Levantine Arabic , Tupinambá , Beja , Western Frisian , Guajajára , Urubú-Kaapor , Kangri , K'iche' , Low German , Makuráp , Central Siberian Yupik , Western Armenian , Bengali , Javanese , Karo (Brazil) , Ligurian , Neapolitan , Tatar , Xibe , Yakut , Ancient Hebrew , Cebuano , Guarani , Hittite , Madi , Emerillon , Umbrian , Abaza , Gheg Albanian , Malayalam , Nhengatu , Sinhala , Zacatlán-Ahuacatlán-Tepetzintla Nahuatl , Xavánte , Saya , Borôro , Kirghiz , Algerian Arabic , Old Irish (to 900) , Classical Armenian , Georgian , Haitian , Highland Puebla Nahuatl , Macedonian , Middle French (ca. 1400-1600) , Veps , Abkhazian , Azerbaijani , Bavarian , Cappadocian Greek , Egyptian (Ancient) , Gujarati , Hausa , Latgalian , Luxembourgish , Ottoman Turkish (1500-1928) , Paumarí , and Tswana
Description:
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).
Rights:
Licence Universal Dependencies v2.14 , https://lindat.mff.cuni.cz/repository/xmlui/page/license-ud-2.14 , and PUB