
Annotated corpora and tools of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions (edition 1.0)

Please use the following text to cite this item:
Savary, Agata; et al., 2017, Annotated corpora and tools of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions (edition 1.0), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11372/LRT-2282.
Date issued
2017-01-20
Size
274376 sentences,
5439204 tokens,
62218 multiword units
Description
The PARSEME shared task aims at identifying verbal MWEs (VMWEs) in running texts. Verbal MWEs include idioms (let the cat out of the bag), light verb constructions (make a decision), verb-particle constructions (give up), and inherently reflexive verbs (se suicider 'to suicide' in French). VMWEs were annotated according to the universal guidelines in 18 languages. The corpora are provided in the parsemetsv format, inspired by the CoNLL-U format. For most languages, paired files in the CoNLL-U format (not necessarily using UD tagsets) containing parts of speech, lemmas, morphological features and/or syntactic dependencies are also provided. Depending on the language, this information comes from treebanks (e.g., Universal Dependencies) or from automatic parsers trained on treebanks (e.g., UDPipe). This item contains training and test data, tools and the universal guidelines file.
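As a rough illustration of how the parsemetsv files mentioned above might be read, the sketch below assumes a layout that is not spelled out on this page: four tab-separated columns per token (rank, surface form, a no-space flag, and a VMWE field), with blank lines between sentences. It also assumes the VMWE field is `_` for unannotated tokens and otherwise holds `;`-separated codes such as `1:LVC` on the first token of an expression and `1` on its later tokens. The function names and the English sample sentence are invented for the example; check the README.md and the guidelines in this item before relying on any of this.

```python
from collections import defaultdict


def read_parsemetsv(lines):
    """Yield sentences as lists of (rank, form, nsp, mwe_field) tuples.

    Assumes four tab-separated columns and blank lines between sentences.
    """
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Skip comment lines, should a file contain any.
            continue
        cols = line.split("\t")
        cols += ["_"] * (4 - len(cols))  # pad short rows defensively
        sentence.append(tuple(cols[:4]))
    if sentence:
        yield sentence


def extract_vmwes(sentence):
    """Group token forms by VMWE identifier within one sentence."""
    vmwes = defaultdict(lambda: {"category": None, "tokens": []})
    for _rank, form, _nsp, mwe_field in sentence:
        if mwe_field in ("_", ""):
            continue
        for code in mwe_field.split(";"):
            mwe_id, _, category = code.partition(":")
            if category:
                # Assumed: the category appears on the first token only.
                vmwes[mwe_id]["category"] = category
            vmwes[mwe_id]["tokens"].append(form)
    return dict(vmwes)


# Invented sample, for illustration only (not taken from the corpora).
sample = (
    "1\tShe\t_\t_\n"
    "2\tmade\t_\t1:LVC\n"
    "3\ta\t_\t_\n"
    "4\tdecision\t_\t1\n"
    "\n"
)

sentences = list(read_parsemetsv(sample.splitlines()))
print(extract_vmwes(sentences[0]))
```

Keeping the reader and the grouping step separate means the same reader can be reused when only the token rows are needed.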

Version History

Date (version)
2023-05-10
2020-07-09
2018-04-30
2017-01-20 (version 1, selected)
This item is Publicly Available
and licensed under:
Files in this item

Name                                                Size       Format                    Description   MD5
IT.tgz                                              4.82 MB    application/x-gzip        gzip Archive  64ab3ab19e87767e9fa9764130e41046
CS.tgz                                              10.56 MB   application/x-gzip        gzip Archive  b97b0f5bed1ed94f096be4150ee68049
TR.tgz                                              4.65 MB    application/x-gzip        gzip Archive  4d34bb3f81dec21184b9877da2dcf12b
ES.tgz                                              2.16 MB    application/x-gzip        gzip Archive  e1ae9704c3608f78cf57e09bb9b165dd
SV.tgz                                              499.71 KB  application/x-gzip        gzip Archive  52db26a0ba0dbc2e283c4551795b5271
FR.tgz                                              7.89 MB    application/x-gzip        gzip Archive  69f9a65d2a6c127573f8b646cc10eeb3
EL.tgz                                              3.59 MB    application/x-gzip        gzip Archive  0956143f60ed16a0c01c4ec87e6c07f3
FA.tgz                                              593.52 KB  application/x-gzip        gzip Archive  cc0686b1f93e0b0782855e514c54823a
HE.tgz                                              804.67 KB  application/x-gzip        gzip Archive  6e20cc548086eeb18469841b0c5b1393
DE.tgz                                              2.09 MB    application/x-gzip        gzip Archive  94ea5c3f074b783a090946e9e7e208ce
Annotation_guidelines_PARSEME_Shared_Task_1.0.pdf   608.46 KB  application/pdf           Adobe PDF     7efe5547bd0d85cd3f341f0125a35a6c
Description_paper_PARSEME_Shared_Task_1.0.pdf       278.76 KB  application/pdf           Adobe PDF     6947539d298d53bbcd9024437bd29939
RO.tgz                                              8 MB       application/x-gzip        gzip Archive  83f76629b83fc380facb4e11e98f119e
PT.tgz                                              5.1 MB     application/x-gzip        gzip Archive  eec1f919ce50aad1099d69381ab1f76e
PL.tgz                                              3.45 MB    application/x-gzip        gzip Archive  4f54d970d85b325b4c3b3f621e6c192d
HU.tgz                                              1.3 MB     application/x-gzip        gzip Archive  06ce4fa53dcaeda0b1bdce90a24637ea
SL.tgz                                              2.89 MB    application/x-gzip        gzip Archive  abf463cd1bf7855d35efb581706ebf66
MT.tgz                                              2.89 MB    application/x-gzip        gzip Archive  58f1b8bc4dc99429f504df50c59d21e5
LT.tgz                                              1.18 MB    application/x-gzip        gzip Archive  4b2e19fdb954c52a1adf1b1dc05de4a0
BG.tgz                                              959.28 KB  application/x-gzip        gzip Archive  b29b32b039d4b7cfafc02569a9e90dcd
README.md                                           2.67 KB    application/octet-stream  Unknown       3b65e76fcb453f3dbe570240b4a0ca3a
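
The MD5 values listed for each file can be used to check a download for corruption. The following is a minimal sketch using Python's standard `hashlib` module; the helper names and the local path `IT.tgz` are assumptions made for the example, not part of this item.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 matches the listed checksum."""
    return md5_of_file(path) == expected_md5.strip().lower()


if __name__ == "__main__":
    # Hypothetical local path; the checksum is the one listed for IT.tgz.
    ok = verify_download("IT.tgz", "64ab3ab19e87767e9fa9764130e41046")
    print("checksum OK" if ok else "checksum mismatch")
```

MD5 is adequate for spotting accidental corruption during transfer, but it should not be treated as a guarantee against deliberate tampering.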