This is not the latest version of this item. The latest version can be found here.
Annotated corpora and tools of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions (edition 1.0)
Please use the following text to cite this item:
Savary, Agata; et al., 2017,
Annotated corpora and tools of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions (edition 1.0), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11372/LRT-2282.
Authors
Savary, Agata; et al.
Item identifier
Project URL
Date issued
2017-01-20
Size
274376 sentences,
5439204 tokens,
62218 multiWordUnits
Description
The PARSEME shared task aims to identify verbal multiword expressions (VMWEs) in running text. VMWEs include idioms (let the cat out of the bag), light verb constructions (make a decision), verb-particle constructions (give up), and inherently reflexive verbs (se suicider 'to suicide' in French). VMWEs were annotated in 18 languages according to the universal guidelines. The corpora are provided in the parsemetsv format, which is inspired by the CoNLL-U format.
For most languages, paired files in the CoNLL-U format (not necessarily using UD tagsets) are also provided; they contain parts of speech, lemmas, morphological features and/or syntactic dependencies. Depending on the language, this information comes from treebanks (e.g., Universal Dependencies) or from automatic parsers trained on treebanks (e.g., UDPipe).
This item contains training and test data, tools and the universal guidelines file.
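As a rough illustration of how the parsemetsv files might be read, the sketch below groups the tokens of each annotated VMWE. It assumes a layout that is not spelled out on this page: four tab-separated columns per token (rank, surface form, no-space flag, MWE codes), blank lines between sentences, `_` for unannotated tokens, and codes such as `1:LVC` on the first token of an expression and `1` on its later tokens. The function names and the sample sentence are hypothetical; check the README and guidelines in this item before relying on the layout.

```python
from collections import defaultdict


def read_parsemetsv(lines):
    """Yield sentences as lists of (rank, form, nsp, mwe_codes) tuples.

    Assumed layout: 4 tab-separated columns, blank line between sentences,
    lines starting with '#' treated as comments.
    """
    sentence = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) < 4:
            raise ValueError(f"expected 4 columns, got {len(cols)}: {line!r}")
        sentence.append(tuple(cols[:4]))
    if sentence:
        yield sentence


def extract_vmwes(sentence):
    """Return {mwe_id: (category, [forms])} for one sentence."""
    categories = {}
    members = defaultdict(list)
    for _rank, form, _nsp, codes in sentence:
        if codes in ("_", ""):
            continue
        # A token may belong to several VMWEs, separated by ';' (assumed).
        for code in codes.split(";"):
            mwe_id, _, category = code.partition(":")
            if category:  # the category is given on the first token only
                categories[mwe_id] = category
            members[mwe_id].append(form)
    return {i: (categories.get(i), forms) for i, forms in members.items()}


# Hypothetical sample sentence, not taken from the corpora.
sample = (
    "1\tShe\t_\t_\n"
    "2\tmade\t_\t1:LVC\n"
    "3\ta\t_\t_\n"
    "4\tdecision\tnsp\t1\n"
    "5\t.\t_\t_\n"
    "\n"
)
sentences = list(read_parsemetsv(sample.splitlines()))
vmwes = extract_vmwes(sentences[0])
```

Grouping by identifier rather than by adjacency keeps discontinuous expressions such as "made ... decision" together.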
Publisher
Acknowledgement
COST
Project code: IC1207
Project name: PARSEME: PARSing and Multi-word Expressions
Collections
Version History
Files in this item
- Name: Annotation_guidelines_PARSEME_Shared_Task_1.0.pdf
  Size: 608.46 KB
  Format: application/pdf
  Description: common annotation guidelines
  MD5: 7efe5547bd0d85cd3f341f0125a35a6c
- Name: Description_paper_PARSEME_Shared_Task_1.0.pdf
  Size: 278.76 KB
  Format: application/pdf
  Description: an article describing the creation of the data and its use in the PARSEME shared task
  MD5: 6947539d298d53bbcd9024437bd29939
- Name: README.md
  Size: 2.67 KB
  Format: application/octet-stream
  Description: Unknown
  MD5: 3b65e76fcb453f3dbe570240b4a0ca3a
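The MD5 values listed for the files can be used to check a download. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_stream` and `verify` are hypothetical, and the sketch assumes the files were saved locally under the names shown in the listing.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this item.
EXPECTED_MD5 = {
    "Annotation_guidelines_PARSEME_Shared_Task_1.0.pdf": "7efe5547bd0d85cd3f341f0125a35a6c",
    "Description_paper_PARSEME_Shared_Task_1.0.pdf": "6947539d298d53bbcd9024437bd29939",
    "README.md": "3b65e76fcb453f3dbe570240b4a0ca3a",
}


def md5_of_stream(stream, chunk_size=1 << 20):
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches its listed checksum."""
    path = Path(path)
    with path.open("rb") as handle:
        return md5_of_stream(handle) == expected[path.name]
```

For example, `verify("README.md")` returns `True` only if the local copy is byte-identical to the file described in the listing.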


