PARSEME corpora annotated for verbal multiword expressions (version 1.3)
Please use the following text to cite this item:
Savary, Agata; et al., 2023,
PARSEME corpora annotated for verbal multiword expressions (version 1.3), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11372/LRT-5124.
Authors
Savary, Agata; et al.
Date issued
2023-05-10
Size
455629 sentences,
9264811 tokens,
127498 multiWordUnits
Description
This multilingual resource contains corpora in which verbal multiword expressions (VMWEs) have been manually annotated. VMWEs include idioms (let the cat out of the bag), light-verb constructions (make a decision), verb-particle constructions (give up), inherently reflexive verbs (help oneself), and multi-verb constructions (make do). This is the first release of the corpora without an associated shared task; the previous version (1.2) was associated with the PARSEME Shared Task on semi-supervised Identification of Verbal MWEs (2020). The data covers 26 languages, combining the corpora from all three previous editions (1.0, 1.1 and 1.2). VMWEs were annotated according to the universal guidelines. The corpora are provided in the .cupt format, which is inspired by the CoNLL-U format. Morphological and syntactic information, including parts of speech, lemmas, morphological features and/or syntactic dependencies, is also provided. Depending on the language, this information comes from treebanks (e.g., Universal Dependencies) or from automatic parsers trained on treebanks (e.g., UDPipe). All corpora are split into training, development and test data, following the splitting strategy adopted for the PARSEME Shared Task 1.2.
The annotation guidelines are available online: https://parsemefr.lis-lab.fr/parseme-st-guidelines/1.3
The .cupt format is detailed here: https://multiword.sourceforge.net/cupt-format/
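As an illustration of the .cupt format mentioned in the description, the following is a minimal Python sketch of how VMWE annotations might be read. It assumes, based on the format description linked in this record, that each file starts with a `# global.columns = ...` line naming tab-separated columns, that the last column is `PARSEME:MWE`, and that this column holds `*` (no VMWE), `_` (unspecified), or semicolon-separated codes such as `1:VPC.full` on the first token of an expression and a bare `1` on its other tokens. The sample sentence, the helper names (`_row`, `read_sentences`, `extract_vmwes`) and the simplified column values are hypothetical and do not come from the corpora; check the format specification before relying on these details.

```python
from typing import Dict, Iterator, List

HEADER = ("# global.columns = ID FORM LEMMA UPOS XPOS FEATS "
          "HEAD DEPREL DEPS MISC PARSEME:MWE")


def _row(idx: str, form: str, mwe: str) -> str:
    """Build one hypothetical token line; unused columns are left as '_'."""
    return "\t".join([idx, form, form.lower()] + ["_"] * 7 + [mwe])


# Invented sample sentence, not taken from the PARSEME data.
SAMPLE = "\n".join([
    HEADER,
    "# text = She gave up and made a decision .",
    _row("1", "She", "*"),
    _row("2", "gave", "1:VPC.full"),
    _row("3", "up", "1"),
    _row("4", "and", "*"),
    _row("5", "made", "2:LVC.full"),
    _row("6", "a", "*"),
    _row("7", "decision", "2"),
    _row("8", ".", "*"),
    "",
])


def read_sentences(text: str) -> Iterator[List[Dict[str, str]]]:
    """Yield sentences as lists of {column name: value} dictionaries."""
    columns = None
    sentence: List[Dict[str, str]] = []
    for line in text.splitlines():
        if line.startswith("# global.columns ="):
            columns = line.split("=", 1)[1].split()
        elif line.startswith("#"):
            continue  # other metadata comments are ignored in this sketch
        elif not line.strip():
            if sentence:  # a blank line closes the current sentence
                yield sentence
                sentence = []
        else:
            if columns is None:
                raise ValueError("missing '# global.columns' header line")
            sentence.append(dict(zip(columns, line.split("\t"))))
    if sentence:
        yield sentence


def extract_vmwes(sentence: List[Dict[str, str]]) -> Dict[str, dict]:
    """Group token forms by VMWE identifier within one sentence."""
    vmwes: Dict[str, dict] = {}
    for token in sentence:
        tag = token["PARSEME:MWE"]
        if tag in ("*", "_"):
            continue  # token is outside any VMWE or is unannotated
        for part in tag.split(";"):
            mwe_id, _, category = part.partition(":")
            entry = vmwes.setdefault(mwe_id, {"category": None, "forms": []})
            if category:  # assumed to appear only on the first token
                entry["category"] = category
            entry["forms"].append(token["FORM"])
    return vmwes


if __name__ == "__main__":
    for sent in read_sentences(SAMPLE):
        for mwe_id, info in extract_vmwes(sent).items():
            print(mwe_id, info["category"], " ".join(info["forms"]))
```

Keeping the column names from the header line, rather than hard-coding positions, means the sketch would still work if a file declares its columns in a different order.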
Files in this item
- Name: HI.tgz; Size: 469.3 KB; Format: application/x-gzip; Description: gzip Archive; MD5: 1d8dbf79b80326f797d517f3f993d04d
- Name: ES.tgz; Size: 2.09 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 588b050f3cd655d1dd6df000b0d702da
- Name: EL.tgz; Size: 10.6 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 125789048de3a0ee764cc3d9f34bc854
- Name: AR.tgz; Size: 10.78 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 73fe213c348928f5eb49a635a6f02a01
- Name: GA.tgz; Size: 494.12 KB; Format: application/x-gzip; Description: gzip Archive; MD5: cb2b193f7ce5bd60a77ba55efbd8232f
- Name: EU.tgz; Size: 2.02 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 5b9d3da6fcdce7e800b1c1ea07eb6ef1
- Name: FA.tgz; Size: 703.09 KB; Format: application/x-gzip; Description: gzip Archive; MD5: d0459becd9d685241b241384ec79ad57
- Name: DE.tgz; Size: 2.25 MB; Format: application/x-gzip; Description: gzip Archive; MD5: eaee4a615ce4abd74aab58ea72d5c12e
- Name: BG.tgz; Size: 6.48 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 7ccee1056d5621a9b509cf727a678525
- Name: EN.tgz; Size: 1.59 MB; Format: application/x-gzip; Description: gzip Archive; MD5: b8c356eefeb174e0984f6c7b1188dba9
- Name: CS.tgz; Size: 12.86 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 9fe9764dc970e2c646049533a81ccda6
- Name: HU.tgz; Size: 1.88 MB; Format: application/x-gzip; Description: gzip Archive; MD5: a1153a044795ee7a9151e0ad2f9e25c1
- Name: FR.tgz; Size: 6.12 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 755009c7e5ba96e74cedc14ec802eb2b
- Name: HE.tgz; Size: 5.26 MB; Format: application/x-gzip; Description: gzip Archive; MD5: f2e883e1a108a3888fb2628d769b9c3c
- Name: HR.tgz; Size: 1.98 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 951cd6b5948ee8e1aa6a9a4a8bf41336
- Name: IT.tgz; Size: 4.67 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 565fb5c73667b4ac55e8aacf20680501
- Name: LT.tgz; Size: 2.98 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 8f94517eebae1216e80ea6effc97a91a
- Name: MT.tgz; Size: 2.78 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 12ee7b2105eeac324386c859a7ef7816
- Name: PL.tgz; Size: 6.99 MB; Format: application/x-gzip; Description: gzip Archive; MD5: bdae0922e513f36c000b47360980ffc9
- Name: PT.tgz; Size: 7.59 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 2c96f436546787f976e20a2022abf516
- Name: RO.tgz; Size: 12.33 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 7efcbd0b9902d925c11f014b6ccd3c18
- Name: TR.tgz; Size: 4.55 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 1c36bfd64fba1d93f9deca35e3272ed1
- Name: SV.tgz; Size: 1.44 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 5c71eb09a2bb773b21141a13e8e40a88
- Name: ZH.tgz; Size: 9.61 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 362b4150e0fda49a0915130bc85a6712
- Name: SR.tgz; Size: 1.11 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 0ad8cad8ca462ea837445d2166bc722a
- Name: SL.tgz; Size: 8.35 MB; Format: application/x-gzip; Description: gzip Archive; MD5: 6933ab467e6bef5e52d0656075e42618
- Name: README.md; Size: 7.08 KB; Format: application/octet-stream; Description: Unknown; MD5: 5902de46b35f82c79183b20d67ab13de
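The MD5 values listed for each file can be used to check a download for corruption. The following is a minimal Python sketch using only the standard library; the function names (`md5_of_file`, `verify`), the `EXPECTED` dictionary and the assumption that the archives sit in the current working directory are illustrative choices, not part of this record. Only two of the listed checksums are copied here as examples.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file list in this record (two entries as examples).
EXPECTED = {
    "HI.tgz": "1d8dbf79b80326f797d517f3f993d04d",
    "README.md": "5902de46b35f82c79183b20d67ab13de",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected_md5.strip().lower()


if __name__ == "__main__":
    # Assumes the downloaded files are in the current directory.
    for name, expected in EXPECTED.items():
        if not Path(name).exists():
            print(f"{name}: not found, skipping")
        elif verify(name, expected):
            print(f"{name}: OK")
        else:
            print(f"{name}: MISMATCH")
```

Reading in chunks keeps memory use constant even for the larger archives. MD5 is suitable here for detecting accidental corruption, not for protection against deliberate tampering.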

