Universal Dependencies 2.0

Date issued
2017-03-13
Size
11814230 tokens,
12102983 words,
630518 sentences
Description
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).

This release is special in that the treebanks will be used as training/development data in the CoNLL 2017 shared task (http://universaldependencies.org/conll17/). Test data are not released, except for the few treebanks that do not take part in the shared task. 64 treebanks will be in the shared task, and they correspond to the following 45 languages: Ancient Greek, Arabic, Basque, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Gothic, Greek, Hebrew, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Kazakh, Korean, Latin, Latvian, Norwegian, Old Church Slavonic, Persian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Turkish, Ukrainian, Urdu, Uyghur and Vietnamese.

This release fixes a bug in http://hdl.handle.net/11234/1-1976. Changed files:

ud-tools-v2.0.tgz: conllu_to_text.pl, conllu_to_conllx.pl; added text_without_spaces.pl

ud-treebanks-conll2017.tgz: fi_ftb-ud-train.txt, he-ud-train.txt, it-ud-train.txt, pt_br-ud-train.txt, es-ud-train.txt

ud-treebanks-v2.0.tgz: fi_ftb-ud-train.txt, he-ud-train.txt, it-ud-train.txt, pt_br-ud-train.txt, es-ud-train.txt, ar_nyuad-ud-dev.txt, ar_nyuad-ud-test.txt, ar_nyuad-ud-train.txt, cop-ud-dev.txt, cop-ud-test.txt, cop-ud-train.txt, sa-ud-dev.txt, sa-ud-test.txt, sa-ud-train.txt
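The Size field above reports tokens, words, and sentences separately. The following is a minimal sketch of how such counts could be derived from CoNLL-U data, assuming the usual CoNLL-U conventions: comment lines start with `#`, a blank line ends a sentence, an ID such as `3-4` marks a multiword token, and an ID such as `1.1` marks an empty node. The function name `count_conllu` is hypothetical, and this is not necessarily how the figures on this page were produced.

```python
def count_conllu(lines):
    """Count surface tokens, syntactic words, and sentences in CoNLL-U lines.

    Assumption: a multiword token (ID like "3-4") counts as one token,
    and the words it spans count as words but not as extra tokens.
    """
    tokens = words = sentences = 0
    in_sentence = False
    covered_until = 0  # last word ID covered by the current multiword token

    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence, if any.
            if in_sentence:
                sentences += 1
            in_sentence = False
            covered_until = 0
            continue
        if line.startswith("#"):
            continue  # sentence-level comment or metadata

        in_sentence = True
        word_id = line.split("\t", 1)[0]
        if "-" in word_id:
            # Multiword token range such as "3-4": one surface token.
            _, end = word_id.split("-")
            tokens += 1
            covered_until = int(end)
        elif "." in word_id:
            continue  # empty node: neither a token nor a word here
        else:
            words += 1
            if int(word_id) > covered_until:
                tokens += 1

    if in_sentence:
        sentences += 1  # file did not end with a blank line
    return tokens, words, sentences
```

The function accepts any iterable of lines, so an open file handle for a CoNLL-U file extracted from one of the treebank archives can be passed directly.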
This item is Publicly Available
and licensed under:
Files in this item

Name                        Size       Format              Description   MD5
ud-documentation-v2.0.tgz   43.5 MB    application/x-gzip  gzip Archive  fbe08dd83675da3ac1e54a5ee67d1a69
ud-treebanks-v2.0.tgz       180.67 MB  application/x-gzip  gzip Archive  0fc5576ebade87a0733cc323d529d784
ud-treebanks-conll2017.tgz  174.86 MB  application/x-gzip  gzip Archive  f4869f28c376c360c740ef8caafdfffd
ud-tools-v2.0.tgz           192.19 KB  application/x-gzip  gzip Archive  68c0c53a740a87eaeae18cd528a915c2
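
The MD5 values in the file listing above can be used to check that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listing` are hypothetical, and the sketch assumes each archive was saved locally under the same file name as in the listing.

```python
import hashlib
import os

# MD5 values copied from the file listing on this page.
EXPECTED_MD5 = {
    "ud-documentation-v2.0.tgz": "fbe08dd83675da3ac1e54a5ee67d1a69",
    "ud-treebanks-v2.0.tgz": "0fc5576ebade87a0733cc323d529d784",
    "ud-treebanks-conll2017.tgz": "f4869f28c376c360c740ef8caafdfffd",
    "ud-tools-v2.0.tgz": "68c0c53a740a87eaeae18cd528a915c2",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path):
    """Compare a local file's MD5 with the value listed for its file name.

    Assumption: the local file keeps the name used in the listing;
    an unknown name raises KeyError.
    """
    name = os.path.basename(path)
    return md5_of_file(path) == EXPECTED_MD5[name]
```

Reading in chunks keeps memory use low for the larger archives. A `False` result from `matches_listing` suggests a corrupted or incomplete download, or a file from a different version of the release.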