Morpho-syntactically annotated corpora provided for the PARSEME Shared Task on Semi-Supervised Identification of Verbal Multiword Expressions (edition 1.2)
Please use the following text to cite this item:
Guillaume, Bruno; et al., 2020,
Morpho-syntactically annotated corpora provided for the PARSEME Shared Task on Semi-Supervised Identification of Verbal Multiword Expressions (edition 1.2), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-3416.
Authors
Guillaume, Bruno; et al.
Item identifier
Project URL
Date issued
2020-07-09
Size
450 GB
Description
This multilingual resource contains corpora for 14 languages, gathered on the occasion of edition 1.2 of the PARSEME Shared Task on Semi-Supervised Identification of Verbal MWEs (2020). These corpora were meant to serve as additional "raw" corpora, to help discover unseen verbal MWEs.
The corpora are provided in the CoNLL-U format (https://universaldependencies.org/format.html). They contain morphosyntactic annotations (parts of speech, lemmas, morphological features, and syntactic dependencies). Depending on the language, the information comes from treebanks (mostly Universal Dependencies v2.x) or from automatic parsers trained on UD v2.x treebanks (e.g., UDPipe).
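As an illustration of how such files might be read, here is a minimal Python sketch that splits CoNLL-U text into sentences and tokens. It assumes the ten tab-separated columns documented on the Universal Dependencies format page; the function name `read_conllu` and the sample sentence are hypothetical and are not taken from the archives.

```python
from typing import Dict, Iterable, Iterator, List

# The ten CoNLL-U columns, in the order given on the UD format page.
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]


def read_conllu(lines: Iterable[str]) -> Iterator[List[Dict[str, str]]]:
    """Yield one sentence at a time as a list of token dictionaries."""
    sentence: List[Dict[str, str]] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Sentence-level comments (e.g. "# text = ...") are skipped here.
            continue
        columns = line.split("\t")
        sentence.append(dict(zip(CONLLU_FIELDS, columns)))
    if sentence:
        # The last sentence may not be followed by a blank line.
        yield sentence


# Hypothetical sample, written by hand for this sketch.
SAMPLE = (
    "# text = She gave up\n"
    "1\tShe\tshe\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tgave\tgive\tVERB\t_\t_\t0\troot\t_\t_\n"
    "3\tup\tup\tADP\t_\t_\t2\tcompound:prt\t_\t_\n"
    "\n"
)

sentences = list(read_conllu(SAMPLE.splitlines()))
lemmas = [token["lemma"] for token in sentences[0]]
```

Because the function accepts any iterable of lines, it could in principle be fed an open file handle instead of the in-memory sample, which matters for archives of this size.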
VMWEs include idioms (let the cat out of the bag), light-verb constructions (make a decision), verb-particle constructions (give up), inherently reflexive verbs (help oneself), and multi-verb constructions (make do).
For the 1.2 shared task edition, the data covers 14 languages, for which VMWEs were annotated according to the universal guidelines. The corpora are provided in the cupt format, inspired by the CoNLL-U format.
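The description does not spell out the cupt layout. The sketch below assumes the commonly documented PARSEME convention: an extra `PARSEME:MWE` column in which `*` marks a token outside any VMWE, `_` marks an unspecified value, the first token of an expression carries `ID:CATEGORY`, later tokens carry the bare `ID`, and several annotations on one token are separated by `;`. These details, the helper names, and the sample cells are assumptions and should be checked against the annotation guidelines.

```python
from typing import Dict, List, Tuple


def parse_mwe_column(value: str) -> List[Tuple[int, str]]:
    """Decode one PARSEME:MWE cell into (mwe_id, category) pairs.

    Assumed encoding: "*" (no VMWE) and "_" (unspecified) yield no pairs;
    otherwise the cell is a ";"-separated list of "ID:CATEGORY" (first
    token of an expression) or bare "ID" (later tokens, category "").
    """
    if value in ("*", "_"):
        return []
    pairs = []
    for item in value.split(";"):
        mwe_id, _, category = item.partition(":")
        pairs.append((int(mwe_id), category))
    return pairs


def group_mwes(cells: List[str]) -> Dict[int, Dict[str, object]]:
    """Collect 0-based token positions and the category of each expression."""
    groups: Dict[int, Dict[str, object]] = {}
    for position, cell in enumerate(cells):
        for mwe_id, category in parse_mwe_column(cell):
            group = groups.setdefault(mwe_id, {"category": "", "tokens": []})
            group["tokens"].append(position)
            if category:
                group["category"] = category
    return groups


# Hypothetical cells for "She gave it up": "gave ... up" as a verb-particle construction.
cells = ["*", "1:VPC.full", "*", "1"]
groups = group_mwes(cells)
```

Grouping by identifier rather than by adjacency keeps discontinuous expressions such as "gave ... up" together.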
Morphological and syntactic information (not necessarily using UD tagsets), including parts of speech, lemmas, morphological features and/or syntactic dependencies, is also provided. Depending on the language, the information comes from treebanks (e.g., Universal Dependencies) or from automatic parsers trained on treebanks (e.g., UDPipe).
This item contains training, development and test data, as well as the evaluation tools used in the PARSEME Shared Task 1.2 (2020). The annotation guidelines are available online: http://parsemefr.lif.univ-mrs.fr/parseme-st-guidelines/1.2
Publisher
Collections
This item is Publicly Available
and licensed under:
Files in this item
- Name
- DE.tgz
- Size
- 1.34 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 3aed5aee260875f8903cc0a1543d890a

- Name
- EL.tgz
- Size
- 187.26 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c81f1052d4ff0f48a36deca267ccaf1f

- Name
- GA.tgz
- Size
- 199.6 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a06d893fdb1c69d7286a95cb601b3a2e

- Name
- FR-0.tgz
- Size
- 1.43 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c40111581b2f4d45c8d284087609254a

- Name
- HI.tgz
- Size
- 542.97 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 547b0eaa6eb0fce60af47473c72276a1

- Name
- EU.tgz
- Size
- 141.77 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a7617a30028a4189558c018612368c77

- Name
- HE.tgz
- Size
- 88.2 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 0a11b23381d024a3ebdca84e18245a68

- Name
- SV-01.tgz
- Size
- 1.01 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 9f71294e5bbe3503964b9dd77e5851f3

- Name
- FR-2.tgz
- Size
- 1.39 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- ed9ae99da315a5c4e2adcf226d3d0f3e

- Name
- FR-1.tgz
- Size
- 1.43 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 984b76faccffef4cdd50d357ee033206

- Name
- PL-12.tgz
- Size
- 1.09 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e6713b3961cf9bfb1322461d7b5ad53d

- Name
- PL-10.tgz
- Size
- 1.12 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- bbf111bf808e84e46cb04b28b2b537f9

- Name
- SV-11.tgz
- Size
- 1.05 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 247d36e4389774ae59af8aeac055091f

- Name
- SV-18.tgz
- Size
- 800.99 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 89243ae8435558ccac8b5ddfb74677f2

- Name
- SV-17.tgz
- Size
- 1 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a0c58d61c9670c8c5f954d469aa74a62

- Name
- SV-16.tgz
- Size
- 1.04 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a3ce9f178594be70a30835d4984f3f63

- Name
- IT.tgz
- Size
- 1.59 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- ce3e3baf237c011731bcbf42c16f161b

- Name
- TR.tgz
- Size
- 162.02 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 6323664589945450aaf1853b73ca99cf

- Name
- ZH.tgz
- Size
- 399.11 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- d7f39b560a8f037d76a8e421657e235a

- Name
- PL-07.tgz
- Size
- 1.13 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- eb952fd1b29b1097ae0166ac43176df0

- Name
- PL-00.tgz
- Size
- 1.03 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 7751e321a902c6435b32689538ea4df3

- Name
- SV-03.tgz
- Size
- 970.8 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 6da6df394d30d975e44daae2a8bfdc2f

- Name
- PL-01.tgz
- Size
- 1.15 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 30bfd4572fa0558a40de07d54a48bf00

- Name
- SV-13.tgz
- Size
- 1017.32 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 279fe30a8afda651c73d5744ceea6ee2

- Name
- PL-02.tgz
- Size
- 1.15 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f789d03f71464da5f3b653d40fbef351

- Name
- SV-06.tgz
- Size
- 999.95 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 8d3994e43c6442e96d5a19951017cef0

- Name
- PL-05.tgz
- Size
- 1.14 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 7650964d51f43de10ebcb11ee636db06

- Name
- PT.tgz
- Size
- 1.73 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- dcc41bbd107be5902b8795c267dace5d

- Name
- SV-14.tgz
- Size
- 946.75 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- d42b12cc53288f077c69ae821fffa16d

- Name
- SV-15.tgz
- Size
- 972 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 7696cc8931abb44cadbefe6ad402a0c8

- Name
- PL-14.tgz
- Size
- 107.29 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 676dc9cb0109ab5160d5d1d7ab15c19f

- Name
- PL-04.tgz
- Size
- 1.14 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- bfbde7fb2b55bbc117f0298a0a9a77fb

- Name
- SV-08.tgz
- Size
- 931.66 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 5694bd3086b7bf67ca0ea69ecb32f491

- Name
- PL-03.tgz
- Size
- 1.15 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- fe721405c64c7437a9ad45175984b985

- Name
- SV-02.tgz
- Size
- 1008.39 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 06584c4521f93c4924747fa28d24bac5

- Name
- PL-09.tgz
- Size
- 1016.43 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e7d5e40e9daf8301a83f0ba0f57f0e46

- Name
- SV-04.tgz
- Size
- 849.36 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f76653d2632ef2a4d458c6fbd8e16d10

- Name
- PL-06.tgz
- Size
- 1.14 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 81de90fef4982b69adb5dac7a392f23c

- Name
- SV-10.tgz
- Size
- 987.17 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 1c974dd27afd4026ce26ba95ff520171

- Name
- SV-09.tgz
- Size
- 886.31 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- bc0450f9b8ab9fe0db06def2750e55c5

- Name
- SV-12.tgz
- Size
- 1.02 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 11de25c2d239438832522e9c2c3d86e3

- Name
- PL-08.tgz
- Size
- 1.12 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 59ec3bbd04f68321fefc1ee23b566b8b

- Name
- PL-11.tgz
- Size
- 1.13 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 63c285f88dfc9e8c180033929067cbbe

- Name
- PL-13.tgz
- Size
- 1.1 GB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f0ebe5779d28738918e1fcb9a29fc96e

- Name
- RO.tgz
- Size
- 88.05 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 55a0cddc185b3c2f7faa9b7c12d0bf85

- Name
- SV-00.tgz
- Size
- 953.36 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- ad5542e3129988bfb6e4af0db1068b35

- Name
- SV-07.tgz
- Size
- 966.7 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 11f477f6b558f39c7c20031d2783892c

- Name
- SV-05.tgz
- Size
- 1019.38 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 04e7917c7cf5dc620467171b10639a73

- Name
- README_raw.md
- Size
- 10.97 KB
- Format
- application/octet-stream
- Description
- Unknown
- MD5
- f93e74d775c864bd27446d9244cf19ec
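
Since every file above is listed with an MD5 checksum, a download can be checked against the listing before it is unpacked. The following Python sketch shows one way to do this with the standard library; the function names and the local path in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Reading in fixed-size chunks avoids loading a multi-gigabyte archive into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: Path, expected_md5: str) -> bool:
    """Compare a file's digest with the value shown in the file listing."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, assuming RO.tgz has been saved to the current directory, `matches_listing(Path("RO.tgz"), "55a0cddc185b3c2f7faa9b7c12d0bf85")` should return `True` for an intact download.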


