dc.contributor.author Guillaume, Bruno
dc.contributor.author Ramisch, Carlos
dc.contributor.author Waszczuk, Jakub
dc.contributor.author Monti, Johanna
dc.contributor.author Di Buono, Maria Pia
dc.contributor.author Sangati, Federico
dc.contributor.author Speranza, Giulia
dc.contributor.author Carlino, Carola
dc.contributor.author Güngör, Tunga
dc.contributor.author Yirmibeşoğlu, Zeynep
dc.contributor.author Sak, Haşim
dc.contributor.author Saraçlar, Murat
dc.contributor.author Giouli, Voula
dc.contributor.author Foufi, Vassiliki
dc.contributor.author Ramisch, Renata
dc.contributor.author Rademaker, Alexandre
dc.contributor.author Vale, Oto
dc.contributor.author Wilkens, Rodrigo
dc.contributor.author Candito, Marie
dc.contributor.author Crabbé, Benoît
dc.contributor.author Segonne, Vincent
dc.contributor.author Liebeskind, Chaya
dc.contributor.author Stymne, Sara
dc.contributor.author Hajič, Jan
dc.contributor.author Ginter, Filip
dc.contributor.author Luotolahti, Juhani
dc.contributor.author Straka, Milan
dc.contributor.author Zeman, Daniel
dc.contributor.author Barbu Mititelu, Verginica
dc.contributor.author Cristescu, Mihaela
dc.contributor.author Vaidya, Ashwini
dc.contributor.author Bhatia, Archna
dc.contributor.author Lichte, Timm
dc.contributor.author Ehren, Rafael
dc.contributor.author Jiang, Menghan
dc.contributor.author Xu, Hongzhi
dc.contributor.author Walsh, Abigail
dc.contributor.author Irimia, Elena
dc.contributor.author Dowling, Meghan
dc.date.accessioned 2020-11-04T13:19:21Z
dc.date.available 2020-11-04T13:19:21Z
dc.date.issued 2020-07-09
dc.identifier.uri http://hdl.handle.net/11234/1-3416
dc.description This multilingual resource contains corpora for 14 languages, gathered on the occasion of edition 1.2 of the PARSEME Shared Task on Semi-Supervised Identification of Verbal MWEs (2020). These corpora are meant to serve as additional "raw" corpora, to help discover unseen verbal MWEs. The corpora are provided in CoNLL-U format (https://universaldependencies.org/format.html). They contain morphosyntactic annotations (parts of speech, lemmas, morphological features, and syntactic dependencies). Depending on the language, the information comes from treebanks (mostly Universal Dependencies v2.x) or from automatic parsers trained on UD v2.x treebanks (e.g., UDPipe). VMWEs include idioms (let the cat out of the bag), light-verb constructions (make a decision), verb-particle constructions (give up), inherently reflexive verbs (help oneself), and multi-verb constructions (make do). For the 1.2 shared task edition, the data covers 14 languages, for which VMWEs were annotated according to the universal guidelines. The corpora are provided in the cupt format, inspired by the CoNLL-U format. Morphological and syntactic information (not necessarily using UD tagsets), including parts of speech, lemmas, morphological features and/or syntactic dependencies, is also provided. Depending on the language, the information comes from treebanks (e.g., Universal Dependencies) or from automatic parsers trained on treebanks (e.g., UDPipe). This item contains training, development and test data, as well as the evaluation tools used in the PARSEME Shared Task 1.2 (2020). The annotation guidelines are available online: http://parsemefr.lif.univ-mrs.fr/parseme-st-guidelines/1.2
dc.language.iso deu
dc.language.iso ell
dc.language.iso eus
dc.language.iso fra
dc.language.iso gle
dc.language.iso heb
dc.language.iso hin
dc.language.iso ita
dc.language.iso pol
dc.language.iso por
dc.language.iso ron
dc.language.iso swe
dc.language.iso tur
dc.language.iso zho
dc.publisher PARSEME
dc.rights PARSEME Shared Task Raw Corpus Data (v. 1.2) Agreement
dc.rights.uri https://lindat.mff.cuni.cz/repository/xmlui/page/licence-mwe-1.2-raw
dc.source.uri http://multiword.sf.net/sharedtask2020
dc.subject morphosyntactic annotation
dc.subject dependency trees
dc.subject morphological analysis
dc.title Morpho-syntactically annotated corpora provided for the PARSEME Shared Task on Semi-Supervised Identification of Verbal Multiword Expressions (edition 1.2)
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
dc.rights.label PUB
has.files yes
branding LINDAT / CLARIAH-CZ
contact.person Candito Marie marie.candito@gmail.com LLF (Université de Paris / CNRS)
contact.person Guillaume Bruno bruno.guillaume@loria.fr LORIA
size.info 450 gb
files.size 47768774083
files.count 49
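
The description above says the corpora are distributed in CoNLL-U format, and the file listing below shows them packed as xz-compressed .conllu files. A minimal sketch of reading such a file in Python follows. It assumes the standard ten tab-separated CoNLL-U columns and skips comment lines; the function names parse_conllu and read_conllu_xz are illustrative and are not part of the PARSEME tooling.

```python
import lzma
from typing import Dict, Iterable, Iterator, List

# The ten CoNLL-U columns, as defined on the Universal Dependencies format page.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def parse_conllu(lines: Iterable[str]) -> Iterator[List[Dict[str, str]]]:
    """Yield one sentence at a time as a list of token dictionaries."""
    sentence: List[Dict[str, str]] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
        elif line.startswith("#"):
            # Sentence-level comments (e.g. "# text = ...") are skipped here.
            continue
        else:
            cols = line.split("\t")
            # Keep only well-formed token lines with exactly ten columns.
            if len(cols) == len(FIELDS):
                sentence.append(dict(zip(FIELDS, cols)))
    if sentence:
        yield sentence


def read_conllu_xz(path: str) -> Iterator[List[Dict[str, str]]]:
    """Stream sentences from an xz-compressed CoNLL-U file (hypothetical helper)."""
    with lzma.open(path, "rt", encoding="utf-8") as handle:
        yield from parse_conllu(handle)
```

Streaming sentence by sentence avoids decompressing a whole file into memory, which matters for the larger archives listed below.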


 Files in this item

This item is Publicly Available and licensed under: PARSEME Shared Task Raw Corpus Data (v. 1.2) Agreement
Distributed under Creative Commons
Name
README_raw.md
Size
10.97 KB
Format
Unknown
Description
README file
MD5
f93e74d775c864bd27446d9244cf19ec
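
Each file in this listing comes with an MD5 value, which can be used to check a download for corruption. A minimal sketch using Python's standard hashlib module follows; the function names md5_of_file and matches_listed_md5 are illustrative, and the file is assumed to have been downloaded to a local path.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so multi-gigabyte archives fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path: str, expected_hex: str) -> bool:
    """Compare a local file against the MD5 value shown in the listing."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For example, matches_listed_md5("README_raw.md", "f93e74d775c864bd27446d9244cf19ec") should return True for an intact copy of the README, assuming it was saved under that name.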
Name
DE.tgz
Size
1.34 GB
Format
application/x-gzip
Description
German corpus
MD5
3aed5aee260875f8903cc0a1543d890a
File Preview
  • DE
    • raw-008.conllu.xz 137 MB
    • raw-010.conllu.xz 137 MB
    • raw-004.conllu.xz 137 MB
    • raw-003.conllu.xz 137 MB
    • files.txt 180 B
    • raw-007.conllu.xz 137 MB
    • raw-002.conllu.xz 137 MB
    • raw-006.conllu.xz 137 MB
    • raw-009.conllu.xz 137 MB
    • raw-001.conllu.xz 137 MB
    • raw-005.conllu.xz 137 MB
Name
EL.tgz
Size
187.26 MB
Format
application/x-gzip
Description
Greek corpus
MD5
c81f1052d4ff0f48a36deca267ccaf1f
File Preview
  • EL
    • raw-040.conllu.xz 1 MB
    • raw-109.conllu.xz 1 MB
    • raw-057.conllu.xz 1 MB
    • raw-064.conllu.xz 1 MB
    • raw-017.conllu.xz 1 MB
    • raw-024.conllu.xz 1 MB
    • raw-088.conllu.xz 1 MB
    • raw-104.conllu.xz 1 MB
    • files.txt 1 kB
    • raw-012.conllu.xz 1 MB
    • raw-076.conllu.xz 1 MB
    • raw-083.conllu.xz 1 MB
    • raw-029.conllu.xz 1 MB
    • raw-036.conllu.xz 1 MB
    • raw-043.conllu.xz 1 MB
    • raw-071.conllu.xz 1 MB
    • raw-031.conllu.xz 1 MB
    • raw-095.conllu.xz 1 MB
    • raw-107.conllu.xz 1 MB
    • raw-048.conllu.xz 1 MB
    • raw-055.conllu.xz 1 MB
    • raw-062.conllu.xz 1 MB
    • raw-008.conllu.xz 1 MB
    • raw-015.conllu.xz 1 MB
    • raw-090.conllu.xz 1 MB
    • raw-079.conllu.xz 1 MB
    • raw-102.conllu.xz 1 MB
    • raw-050.conllu.xz 1 MB
    • raw-039.conllu.xz 1 MB
    • raw-003.conllu.xz 1 MB
    • raw-010.conllu.xz 1 MB
    • raw-067.conllu.xz 1 MB
    • raw-074.conllu.xz 1 MB
    • raw-081.conllu.xz 1 MB
    • raw-027.conllu.xz 1 MB
    • raw-034.conllu.xz 1 MB
    • raw-098.conllu.xz 1 MB
    • raw-022.conllu.xz 1 MB
    • raw-086.conllu.xz 1 MB
    • raw-093.conllu.xz 1 MB
    • raw-105.conllu.xz 1 MB
    • raw-046.conllu.xz 1 MB
    • raw-053.conllu.xz 1 MB
    • raw-006.conllu.xz 1 MB
    • raw-100.conllu.xz 1 MB
    • raw-041.conllu.xz 1 MB
    • raw-001.conllu.xz 1 MB
    • raw-058.conllu.xz 1 MB
    • raw-065.conllu.xz 1 MB
    • raw-072.conllu.xz 1 MB
    • raw-018.conllu.xz 1 MB
    • raw-025.conllu.xz 1 MB
    • raw-089.conllu.xz 1 MB
    • raw-060.conllu.xz 1 MB
    • raw-049.conllu.xz 1 MB
    • raw-013.conllu.xz 1 MB
    • raw-020.conllu.xz 1 MB
    • raw-077.conllu.xz 1 MB
    • raw-084.conllu.xz 1 MB
    • raw-091.conllu.xz 1 MB
    • raw-037.conllu.xz 1 MB
    • raw-044.conllu.xz 1 MB
    • raw-004.conllu.xz 1 MB
    • raw-068.conllu.xz 1 MB
    • raw-032.conllu.xz 1 MB
    • raw-096.conllu.xz 1 MB
    • raw-108.conllu.xz 1 MB
    • raw-056.conllu.xz 1 MB
    • raw-063.conllu.xz 1 MB
    • raw-009.conllu.xz 2 MB
    • raw-016.conllu.xz 1 MB
    • raw-087.conllu.xz 1 MB
    • raw-103.conllu.xz 1 MB
    • raw-110.conllu.xz 1 MB
    • raw-051.conllu.xz 1 MB
    • raw-011.conllu.xz 1 MB
    • raw-075.conllu.xz 1 MB
    • raw-082.conllu.xz 1 MB
    • raw-028.conllu.xz 1 MB
    • raw-035.conllu.xz 1 MB
    • raw-099.conllu.xz 1 MB
    • raw-059.conllu.xz 1 MB
    • raw-070.conllu.xz 1 MB
    • raw-023.conllu.xz 1 MB
    • raw-030.conllu.xz 1 MB
    • raw-094.conllu.xz 1 MB
    • raw-106.conllu.xz 1 MB
    • raw-047.conllu.xz 1 MB
    • raw-054.conllu.xz 1 MB
    • raw-007.conllu.xz 1 MB
    • raw-014.conllu.xz 1 MB
    • raw-078.conllu.xz 1 MB
    • raw-101.conllu.xz 1 MB
    • raw-042.conllu.xz 1 MB
    • raw-002.conllu.xz 1 MB
    • raw-066.conllu.xz 1 MB
    • raw-073.conllu.xz 1 MB
    • raw-019.conllu.xz 1 MB
    • raw-026.conllu.xz 1 MB
    • raw-033.conllu.xz 1 MB
    • raw-097.conllu.xz 1 MB
    • raw-061.conllu.xz 1 MB
    • raw-021.conllu.xz 1 MB
    • raw-085.conllu.xz 1 MB
    • raw-092.conllu.xz 1 MB
    • raw-038.conllu.xz 1 MB
    • raw-045.conllu.xz 1 MB
    • raw-052.conllu.xz 1 MB
    • raw-005.conllu.xz 1 MB
    • raw-080.conllu.xz 1 MB
    • raw-069.conllu.xz 1 MB
Name
EU.tgz
Size
141.77 MB
Format
application/x-gzip
Description
Basque corpus
MD5
a7617a30028a4189558c018612368c77
File Preview
  • EU
    • raw-108.conllu.xz 886 kB
    • raw-017.conllu.xz 1019 kB
    • raw-136.conllu.xz 1006 kB
    • raw-067.conllu.xz 1 MB
    • raw-113.conllu.xz 1 MB
    • raw-068.conllu.xz 984 kB
    • raw-045.conllu.xz 969 kB
    • raw-022.conllu.xz 1 MB
    • raw-095.conllu.xz 290 kB
    • raw-141.conllu.xz 1 MB
    • raw-073.conllu.xz 901 kB
    • raw-050.conllu.xz 997 kB
    • raw-109.conllu.xz 1 MB
    • raw-018.conllu.xz 921 kB
    • raw-137.conllu.xz 1 MB
    • raw-114.conllu.xz 995 kB
    • raw-069.conllu.xz 932 kB
    • raw-115.conllu.xz 991 kB
    • raw-046.conllu.xz 982 kB
    • raw-023.conllu.xz 992 kB
    • raw-096.conllu.xz 998 kB
    • raw-142.conllu.xz 915 kB
    • raw-024.conllu.xz 847 kB
    • raw-097.conllu.xz 960 kB
    • raw-001.conllu.xz 886 kB
    • raw-074.conllu.xz 976 kB
    • raw-120.conllu.xz 967 kB
    • raw-051.conllu.xz 1004 kB
    • raw-052.conllu.xz 958 kB
    • raw-019.conllu.xz 990 kB
    • raw-138.conllu.xz 1 MB
    • raw-116.conllu.xz 997 kB
    • raw-047.conllu.xz 973 kB
    • raw-143.conllu.xz 818 kB
    • raw-025.conllu.xz 974 kB
    • raw-098.conllu.xz 959 kB
    • raw-144.conllu.xz 994 kB
    • raw-002.conllu.xz 1016 kB
    • raw-075.conllu.xz 959 kB
    • raw-121.conllu.xz 1021 kB
    • raw-053.conllu.xz 947 kB
    • raw-030.conllu.xz 1002 kB
    • raw-080.conllu.xz 843 kB
    • raw-081.conllu.xz 953 kB
    • raw-139.conllu.xz 1 MB
    • raw-117.conllu.xz 1 MB
    • raw-048.conllu.xz 947 kB
    • raw-049.conllu.xz 992 kB
    • raw-026.conllu.xz 967 kB
    • raw-099.conllu.xz 1000 kB
    • raw-145.conllu.xz 944 kB
    • raw-003.conllu.xz 909 kB
    • raw-076.conllu.xz 977 kB
    • raw-122.conllu.xz 964 kB
    • raw-004.conllu.xz 865 kB
    • raw-054.conllu.xz 1009 kB
    • raw-100.conllu.xz 998 kB
    • raw-031.conllu.xz 1023 kB
    • raw-150.conllu.xz 1 MB
    • raw-082.conllu.xz 840 kB
    • raw-118.conllu.xz 955 kB
    • raw-027.conllu.xz 904 kB
    • raw-146.conllu.xz 897 kB
    • raw-077.conllu.xz 846 kB
    • raw-005.conllu.xz 908 kB
    • raw-123.conllu.xz 1 MB
    • raw-078.conllu.xz 946 kB
    • raw-124.conllu.xz 1015 kB
    • raw-055.conllu.xz 988 kB
    • raw-101.conllu.xz 971 kB
    • raw-032.conllu.xz 1 MB
    • raw-033.conllu.xz 918 kB
    • raw-010.conllu.xz 945 kB
    • raw-083.conllu.xz 1 MB
    • raw-060.conllu.xz 1015 kB
    • raw-119.conllu.xz 1 MB
    • raw-028.conllu.xz 992 kB
    • raw-147.conllu.xz 862 kB
    • raw-006.conllu.xz 965 kB
    • raw-079.conllu.xz 1016 kB
    • raw-125.conllu.xz 991 kB
    • raw-056.conllu.xz 954 kB
    • raw-102.conllu.xz 995 kB
    • raw-034.conllu.xz 948 kB
    • raw-011.conllu.xz 921 kB
    • raw-084.conllu.xz 990 kB
    • raw-130.conllu.xz 1007 kB
    • raw-061.conllu.xz 941 kB
    • raw-062.conllu.xz 1 MB
    • raw-029.conllu.xz 1018 kB
    • raw-148.conllu.xz 987 kB
    • raw-007.conllu.xz 862 kB
    • raw-126.conllu.xz 1021 kB
    • raw-057.conllu.xz 952 kB
    • raw-103.conllu.xz 1023 kB
    • raw-035.conllu.xz 959 kB
    • raw-012.conllu.xz 956 kB
    • raw-085.conllu.xz 1 MB
    • raw-131.conllu.xz 988 kB
    • raw-063.conllu.xz 970 kB
    • raw-040.conllu.xz 868 kB
    • raw-090.conllu.xz 949 kB
    • raw-091.conllu.xz 946 kB
    • raw-149.conllu.xz 894 kB
    • raw-008.conllu.xz 1 MB
    • raw-127.conllu.xz 1 MB
    • raw-058.conllu.xz 1 MB
    • raw-104.conllu.xz 923 kB
    • raw-059.conllu.xz 1012 kB
    • raw-105.conllu.xz 1 MB
    • raw-036.conllu.xz 908 kB
    • raw-013.conllu.xz 1023 kB
    • raw-086.conllu.xz 897 kB
    • raw-132.conllu.xz 972 kB
    • raw-014.conllu.xz 663 kB
    • raw-087.conllu.xz 914 kB
    • raw-064.conllu.xz 1005 kB
    • raw-110.conllu.xz 935 kB
    • raw-041.conllu.xz 1016 kB
    • raw-092.conllu.xz 983 kB
    • raw-009.conllu.xz 968 kB
    • raw-128.conllu.xz 1003 kB
    • raw-106.conllu.xz 1 MB
    • raw-037.conllu.xz 1 MB
    • raw-133.conllu.xz 888 kB
    • raw-015.conllu.xz 861 kB
    • raw-088.conllu.xz 947 kB
    • raw-134.conllu.xz 934 kB
    • raw-065.conllu.xz 949 kB
    • raw-111.conllu.xz 996 kB
    • raw-042.conllu.xz 996 kB
    • files.txt 2 kB
    • raw-043.conllu.xz 840 kB
    • raw-020.conllu.xz 1010 kB
    • raw-093.conllu.xz 964 kB
    • raw-070.conllu.xz 1 MB
    • raw-129.conllu.xz 982 kB
    • raw-107.conllu.xz 1004 kB
    • raw-038.conllu.xz 992 kB
    • raw-039.conllu.xz 1007 kB
    • raw-016.conllu.xz 987 kB
    • raw-089.conllu.xz 983 kB
    • raw-135.conllu.xz 1 MB
    • raw-066.conllu.xz 1004 kB
    • raw-112.conllu.xz 955 kB
    • raw-044.conllu.xz 951 kB
    • raw-021.conllu.xz 1 MB
    • raw-094.conllu.xz 932 kB
    • raw-140.conllu.xz 967 kB
    • raw-071.conllu.xz 987 kB
    • raw-072.conllu.xz 1011 kB
Name
FR-0.tgz
Size
1.43 GB
Format
application/x-gzip
Description
French corpus 1/3
MD5
c40111581b2f4d45c8d284087609254a
File Preview
  • FR
    • raw-008.conllu.xz 157 MB
    • raw-004.conllu.xz 165 MB
    • raw-003.conllu.xz 165 MB
    • raw-007.conllu.xz 158 MB
    • raw-002.conllu.xz 167 MB
    • raw-006.conllu.xz 155 MB
    • raw-009.conllu.xz 155 MB
    • raw-001.conllu.xz 174 MB
    • raw-005.conllu.xz 158 MB
Name
FR-1.tgz
Size
1.43 GB
Format
application/x-gzip
Description
French corpus 2/3
MD5
984b76faccffef4cdd50d357ee033206
File Preview
  • FR
    • raw-010.conllu.xz 154 MB
    • raw-014.conllu.xz 143 MB
    • raw-013.conllu.xz 145 MB
    • raw-017.conllu.xz 146 MB
    • raw-012.conllu.xz 146 MB
    • raw-016.conllu.xz 146 MB
    • raw-019.conllu.xz 142 MB
    • raw-011.conllu.xz 148 MB
    • raw-015.conllu.xz 144 MB
    • raw-018.conllu.xz 147 MB
Name
FR-2.tgz
Size
1.39 GB
Format
application/x-gzip
Description
French corpus 3/3
MD5
ed9ae99da315a5c4e2adcf226d3d0f3e
File Preview
  • FR
    • raw-023.conllu.xz 142 MB
    • raw-027.conllu.xz 143 MB
    • raw-022.conllu.xz 142 MB
    • raw-026.conllu.xz 142 MB
    • raw-029.conllu.xz 141 MB
    • raw-021.conllu.xz 143 MB
    • raw-025.conllu.xz 142 MB
    • raw-028.conllu.xz 142 MB
    • raw-020.conllu.xz 141 MB
    • raw-024.conllu.xz 143 MB
Name
GA.tgz
Size
199.6 MB
Format
application/x-gzip
Description
Irish corpus
MD5
a06d893fdb1c69d7286a95cb601b3a2e
File Preview
  • GA
    • raw-040.conllu.xz 712 kB
    • raw-109.conllu.xz 1 MB
    • raw-116.conllu.xz 1 MB
    • raw-057.conllu.xz 1 MB
    • raw-064.conllu.xz 1 MB
    • raw-017.conllu.xz 1 MB
    • raw-024.conllu.xz 1 MB
    • raw-088.conllu.xz 1 MB
    • raw-104.conllu.xz 1 MB
    • raw-111.conllu.xz 1 MB
    • files.txt 2 kB
    • raw-012.conllu.xz 1 MB
    • raw-128.conllu.xz 975 kB
    • raw-135.conllu.xz 1 MB
    • raw-076.conllu.xz 1 MB
    • raw-083.conllu.xz 1 MB
    • raw-029.conllu.xz 685 kB
    • raw-036.conllu.xz 739 kB
    • raw-043.conllu.xz 808 kB
    • raw-123.conllu.xz 1 MB
    • raw-130.conllu.xz 1 MB
    • raw-071.conllu.xz 1 MB
    • raw-031.conllu.xz 1 MB
    • raw-095.conllu.xz 1 MB
    • raw-107.conllu.xz 1 MB
    • raw-048.conllu.xz 1 MB
    • raw-055.conllu.xz 1 MB
    • raw-062.conllu.xz 1 MB
    • raw-008.conllu.xz 1 MB
    • raw-015.conllu.xz 1 MB
    • raw-090.conllu.xz 1 MB
    • raw-079.conllu.xz 1 MB
    • raw-102.conllu.xz 1 MB
    • raw-050.conllu.xz 1 MB
    • raw-039.conllu.xz 889 kB
    • raw-003.conllu.xz 1 MB
    • raw-010.conllu.xz 1 MB
    • raw-119.conllu.xz 1 MB
    • raw-126.conllu.xz 1 MB
    • raw-067.conllu.xz 1 MB
    • raw-074.conllu.xz 1 MB
    • raw-081.conllu.xz 1 MB
    • raw-027.conllu.xz 1 MB
    • raw-034.conllu.xz 761 kB
    • raw-098.conllu.xz 1 MB
    • raw-114.conllu.xz 1 MB
    • raw-121.conllu.xz 1 MB
    • raw-022.conllu.xz 1 MB
    • raw-138.conllu.xz 1 MB
    • raw-086.conllu.xz 2 MB
    • raw-093.conllu.xz 1 MB
    • raw-105.conllu.xz 1 MB
    • raw-046.conllu.xz 1 MB
    • raw-053.conllu.xz 1 MB
    • raw-006.conllu.xz 1 MB
    • raw-133.conllu.xz 1 MB
    • raw-100.conllu.xz 1 MB
    • raw-041.conllu.xz 668 kB
    • raw-001.conllu.xz 1 MB
    • raw-117.conllu.xz 1006 kB
    • raw-058.conllu.xz 1 MB
    • raw-124.conllu.xz 1 MB
    • raw-065.conllu.xz 1 MB
    • raw-072.conllu.xz 1 MB
    • raw-018.conllu.xz 1 MB
    • raw-025.conllu.xz 1 MB
    • raw-089.conllu.xz 1 MB
    • raw-112.conllu.xz 1 MB
    • raw-060.conllu.xz 1 MB
    • raw-049.conllu.xz 1 MB
    • raw-013.conllu.xz 1 MB
    • raw-020.conllu.xz 1 MB
    • raw-129.conllu.xz 1 MB
    • raw-136.conllu.xz 1 MB
    • raw-077.conllu.xz 1 MB
    • raw-084.conllu.xz 1 MB
    • raw-091.conllu.xz 2 MB
    • raw-037.conllu.xz 678 kB
    • raw-044.conllu.xz 735 kB
    • raw-004.conllu.xz 1 MB
    • raw-131.conllu.xz 1 MB
    • raw-068.conllu.xz 1 MB
    • raw-032.conllu.xz 1 MB
    • raw-096.conllu.xz 1 MB
    • raw-108.conllu.xz 1 MB
    • raw-115.conllu.xz 1 MB
    • raw-056.conllu.xz 1 MB
    • raw-063.conllu.xz 1 MB
    • raw-009.conllu.xz 1 MB
    • raw-016.conllu.xz 1 MB
    • raw-087.conllu.xz 1 MB
    • raw-103.conllu.xz 1 MB
    • raw-110.conllu.xz 1 MB
    • raw-051.conllu.xz 1 MB
    • raw-011.conllu.xz 1 MB
    • raw-127.conllu.xz 1 MB
    • raw-134.conllu.xz 1 MB
    • raw-075.conllu.xz 1 MB
    • raw-082.conllu.xz 1 MB
    • raw-028.conllu.xz 1 MB
    • raw-035.conllu.xz 1 MB
    • raw-099.conllu.xz 1 MB
    • raw-122.conllu.xz 1 MB
    • raw-059.conllu.xz 1 MB
    • raw-070.conllu.xz 1 MB
    • raw-023.conllu.xz 1 MB
    • raw-030.conllu.xz 963 kB
    • raw-094.conllu.xz 1 MB
    • raw-106.conllu.xz 865 kB
    • raw-047.conllu.xz 1 MB
    • raw-054.conllu.xz 1 MB
    • raw-007.conllu.xz 1 MB
    • raw-014.conllu.xz 1 MB
    • raw-078.conllu.xz 1 MB
    • raw-101.conllu.xz 1 MB
    • raw-042.conllu.xz 851 kB
    • raw-002.conllu.xz 1 MB
    • raw-118.conllu.xz 1 MB
    • raw-125.conllu.xz 1 MB
    • raw-066.conllu.xz 1 MB
    • raw-073.conllu.xz 1 MB
    • raw-019.conllu.xz 1 MB
    • raw-026.conllu.xz 1 MB
    • raw-033.conllu.xz 1 MB
    • raw-097.conllu.xz 2 MB
    • raw-113.conllu.xz 1 MB
    • raw-120.conllu.xz 1 MB
    • raw-061.conllu.xz 1 MB
    • raw-021.conllu.xz 1 MB
    • raw-137.conllu.xz 1 MB
    • raw-085.conllu.xz 1 MB
    • raw-092.conllu.xz 1 MB
    • raw-038.conllu.xz 873 kB
    • raw-045.conllu.xz 1 MB
    • raw-052.conllu.xz 1 MB
    • raw-005.conllu.xz 1 MB
    • raw-132.conllu.xz 1 MB
    • raw-080.conllu.xz 1 MB
    • raw-069.conllu.xz 1 MB
Name
HE.tgz
Size
88.2 MB
Format
application/x-gzip
Description
Hebrew corpus
MD5
0a11b23381d024a3ebdca84e18245a68
File Preview
  • HE
    • files.txt 36 B
    • raw-001.conllu.xz 82 MB
    • raw-002.conllu.xz 5 MB
Name
HI.tgz
Size
542.97 MB
Format
application/x-gzip
Description
Hindi corpus
MD5
547b0eaa6eb0fce60af47473c72276a1
File Preview
  • HI
    • raw-040.conllu.xz 5 MB
    • raw-057.conllu.xz 5 MB
    • raw-064.conllu.xz 5 MB
    • raw-017.conllu.xz 5 MB
    • raw-024.conllu.xz 5 MB
    • raw-088.conllu.xz 5 MB
    • files.txt 1 kB
    • raw-012.conllu.xz 5 MB
    • raw-076.conllu.xz 5 MB
    • raw-083.conllu.xz 5 MB
    • raw-029.conllu.xz 5 MB
    • raw-036.conllu.xz 5 MB
    • raw-043.conllu.xz 5 MB
    • raw-071.conllu.xz 5 MB
    • raw-031.conllu.xz 5 MB
    • raw-095.conllu.xz 5 MB
    • raw-048.conllu.xz 5 MB
    • raw-055.conllu.xz 5 MB
    • raw-062.conllu.xz 5 MB
    • raw-008.conllu.xz 5 MB
    • raw-015.conllu.xz 5 MB
    • raw-090.conllu.xz 5 MB
    • raw-079.conllu.xz 5 MB
    • raw-050.conllu.xz 5 MB
    • raw-039.conllu.xz 5 MB
    • raw-003.conllu.xz 5 MB
    • raw-010.conllu.xz 5 MB
    • raw-067.conllu.xz 5 MB
    • raw-074.conllu.xz 5 MB
    • raw-081.conllu.xz 5 MB
    • raw-027.conllu.xz 5 MB
    • raw-034.conllu.xz 5 MB
    • raw-098.conllu.xz 5 MB
    • raw-022.conllu.xz 5 MB
    • raw-086.conllu.xz 5 MB
    • raw-093.conllu.xz 5 MB
    • raw-046.conllu.xz 5 MB
    • raw-053.conllu.xz 5 MB
    • raw-006.conllu.xz 5 MB
    • raw-041.conllu.xz 5 MB
    • raw-001.conllu.xz 5 MB
    • raw-058.conllu.xz 5 MB
    • raw-065.conllu.xz 5 MB
    • raw-072.conllu.xz 5 MB
    • raw-018.conllu.xz 5 MB
    • raw-025.conllu.xz 5 MB
    • raw-089.conllu.xz 5 MB
    • raw-060.conllu.xz 5 MB
    • raw-049.conllu.xz 5 MB
    • raw-013.conllu.xz 5 MB
    • raw-020.conllu.xz 5 MB
    • raw-077.conllu.xz 5 MB
    • raw-084.conllu.xz 5 MB
    • raw-091.conllu.xz 5 MB
    • raw-037.conllu.xz 5 MB
    • raw-044.conllu.xz 5 MB
    • raw-004.conllu.xz 5 MB
    • raw-068.conllu.xz 5 MB
    • raw-032.conllu.xz 5 MB
    • raw-096.conllu.xz 5 MB
    • raw-056.conllu.xz 5 MB
    • raw-063.conllu.xz 5 MB
    • raw-009.conllu.xz 5 MB
    • raw-016.conllu.xz 5 MB
    • raw-087.conllu.xz 5 MB
    • raw-051.conllu.xz 5 MB
    • raw-011.conllu.xz 5 MB
    • raw-075.conllu.xz 5 MB
    • raw-082.conllu.xz 5 MB
    • raw-028.conllu.xz 5 MB
    • raw-035.conllu.xz 5 MB
    • raw-099.conllu.xz 5 MB
    • raw-059.conllu.xz 5 MB
    • raw-070.conllu.xz 5 MB
    • raw-023.conllu.xz 5 MB
    • raw-030.conllu.xz 5 MB
    • raw-094.conllu.xz 5 MB
    • raw-047.conllu.xz 5 MB
    • raw-054.conllu.xz 5 MB
    • raw-007.conllu.xz 5 MB
    • raw-014.conllu.xz 5 MB
    • raw-078.conllu.xz 5 MB
    • raw-042.conllu.xz 5 MB
    • raw-002.conllu.xz 5 MB
    • raw-066.conllu.xz 5 MB
    • raw-073.conllu.xz 5 MB
    • raw-019.conllu.xz 5 MB
    • raw-026.conllu.xz 5 MB
    • raw-033.conllu.xz 5 MB
    • raw-097.conllu.xz 5 MB
    • raw-061.conllu.xz 5 MB
    • raw-021.conllu.xz 5 MB
    • raw-085.conllu.xz 5 MB
    • raw-092.conllu.xz 5 MB
    • raw-038.conllu.xz 5 MB
    • raw-045.conllu.xz 5 MB
    • raw-052.conllu.xz 5 MB
    • raw-005.conllu.xz 5 MB
    • raw-080.conllu.xz 5 MB
    • raw-069.conllu.xz 5 MB
Name
IT.tgz
Size
1.59 GB
Format
application/x-gzip
Description
Italian corpus
MD5
ce3e3baf237c011731bcbf42c16f161b
File Preview
  • IT
    • raw-108.conllu.xz 7 MB
    • raw-227.conllu.xz 6 MB
    • raw-158.conllu.xz 7 MB
    • raw-204.conllu.xz 6 MB
    • raw-159.conllu.xz 7 MB
    • raw-017.conllu.xz 7 MB
    • raw-136.conllu.xz 7 MB
    • raw-067.conllu.xz 2 MB
    • raw-113.conllu.xz 7 MB
    • raw-186.conllu.xz 7 MB
    • raw-068.conllu.xz 2 MB
    • raw-232.conllu.xz 6 MB
    • raw-045.conllu.xz 8 MB
    • raw-164.conllu.xz 7 MB
    • raw-210.conllu.xz 6 MB
    • raw-022.conllu.xz 4 MB
    • raw-095.conllu.xz 7 MB
    • raw-141.conllu.xz 7 MB
    • raw-191.conllu.xz 7 MB
    • raw-073.conllu.xz 8 MB
    • raw-192.conllu.xz 6 MB
    • raw-050.conllu.xz 7 MB
    • raw-109.conllu.xz 7 MB
    • raw-228.conllu.xz 6 MB
    • raw-205.conllu.xz 7 MB
    • raw-206.conllu.xz 6 MB
    • raw-018.conllu.xz 7 MB
    • raw-137.conllu.xz 7 MB
    • raw-114.conllu.xz 7 MB
    • raw-187.conllu.xz 7 MB
    • raw-069.conllu.xz 2 MB
    • raw-233.conllu.xz 6 MB
    • raw-115.conllu.xz 6 MB
    • raw-188.conllu.xz 6 MB
    • raw-046.conllu.xz 7 MB
    • raw-165.conllu.xz 7 MB
    • raw-211.conllu.xz 6 MB
    • raw-023.conllu.xz 4 MB
    • raw-096.conllu.xz 7 MB
    • raw-142.conllu.xz 7 MB
    • raw-024.conllu.xz 7 MB
    • raw-097.conllu.xz 7 MB
    • raw-001.conllu.xz 5 MB
    • raw-074.conllu.xz 5 MB
    • raw-120.conllu.xz 7 MB
    • raw-193.conllu.xz 6 MB
    • raw-051.conllu.xz 7 MB
    • raw-170.conllu.xz 7 MB
    • raw-052.conllu.xz 7 MB
    • raw-229.conllu.xz 7 MB
    • raw-207.conllu.xz 6 MB
    • raw-019.conllu.xz 7 MB
    • raw-138.conllu.xz 7 MB
    • raw-234.conllu.xz 5 MB
    • raw-116.conllu.xz 7 MB
    • raw-189.conllu.xz 6 MB
    • raw-235.conllu.xz 5 MB
    • raw-047.conllu.xz 7 MB
    • raw-166.conllu.xz 7 MB
    • raw-212.conllu.xz 6 MB
    • raw-143.conllu.xz 7 MB
    • raw-025.conllu.xz 7 MB
    • raw-098.conllu.xz 7 MB
    • raw-144.conllu.xz 7 MB
    • raw-002.conllu.xz 6 MB
    • raw-075.conllu.xz 7 MB
    • raw-121.conllu.xz 7 MB
    • raw-194.conllu.xz 6 MB
    • raw-240.conllu.xz 6 MB
    • raw-171.conllu.xz 7 MB
    • raw-053.conllu.xz 7 MB
    • raw-030.conllu.xz 5 MB
    • raw-080.conllu.xz 7 MB
    • raw-081.conllu.xz 5 MB
    • raw-208.conllu.xz 6 MB
    • raw-139.conllu.xz 7 MB
    • raw-117.conllu.xz 7 MB
    • raw-236.conllu.xz 6 MB
    • raw-048.conllu.xz 5 MB
    • raw-167.conllu.xz 7 MB
    • raw-049.conllu.xz 6 MB
    • raw-213.conllu.xz 7 MB
    • raw-026.conllu.xz 7 MB
    • raw-099.conllu.xz 7 MB
    • raw-145.conllu.xz 7 MB
    • raw-003.conllu.xz 939 kB
    • raw-076.conllu.xz 7 MB
    • raw-122.conllu.xz 7 MB
    • raw-004.conllu.xz 390 kB
    • raw-195.conllu.xz 6 MB
    • raw-241.conllu.xz 6 MB
    • raw-172.conllu.xz 6 MB
    • raw-054.conllu.xz 2 MB
    • raw-100.conllu.xz 7 MB
    • raw-173.conllu.xz 7 MB
    • raw-031.conllu.xz 7 MB
    • raw-150.conllu.xz 7 MB
    • raw-082.conllu.xz 6 MB
    • raw-209.conllu.xz 6 MB
    • raw-118.conllu.xz 7 MB
    • raw-237.conllu.xz 6 MB
    • raw-168.conllu.xz 6 MB
    • raw-214.conllu.xz 6 MB
    • raw-169.conllu.xz 7 MB
    • raw-027.conllu.xz 1 MB
    • raw-146.conllu.xz 7 MB
    • raw-077.conllu.xz 7 MB
    • raw-005.conllu.xz 419 kB
    • raw-123.conllu.xz 7 MB
    • raw-196.conllu.xz 6 MB
    • raw-078.conllu.xz 7 MB
    • raw-242.conllu.xz 9 MB
    • raw-124.conllu.xz 7 MB
    • raw-055.conllu.xz 2 MB
    • raw-101.conllu.xz 7 MB
    • raw-174.conllu.xz 6 MB
    • raw-220.conllu.xz 6 MB
    • raw-032.conllu.xz 5 MB
    • raw-151.conllu.xz 7 MB
    • raw-033.conllu.xz 5 MB
    • raw-010.conllu.xz 9 MB
    • raw-083.conllu.xz 6 MB
    • raw-060.conllu.xz 3 MB
    • raw-119.conllu.xz 7 MB
    • raw-238.conllu.xz 6 MB
    • raw-215.conllu.xz 6 MB
    • raw-216.conllu.xz 6 MB
    • raw-028.conllu.xz 334 kB
    • raw-147.conllu.xz 7 MB
    • raw-006.conllu.xz 480 kB
    • raw-197.conllu.xz 7 MB
    • raw-079.conllu.xz 7 MB
    • raw-243.conllu.xz 9 MB
    • raw-125.conllu.xz 7 MB
    • raw-198.conllu.xz 6 MB
    • raw-244.conllu.xz 7 MB
    • raw-056.conllu.xz 3 MB
    • raw-102.conllu.xz 6 MB
    • raw-175.conllu.xz 6 MB
    • raw-221.conllu.xz 6 MB
    • raw-152.conllu.xz 7 MB
    • raw-034.conllu.xz 4 MB
    • raw-153.conllu.xz 7 MB
    • raw-011.conllu.xz 8 MB
    • raw-084.conllu.xz 7 MB
    • raw-130.conllu.xz 7 MB
    • raw-061.conllu.xz 3 MB
    • raw-180.conllu.xz 6 MB
    • raw-062.conllu.xz 4 MB
    • raw-239.conllu.xz 6 MB
    • raw-217.conllu.xz 6 MB
    • raw-029.conllu.xz 353 kB
    • raw-148.conllu.xz 7 MB
    • raw-007.conllu.xz 5 MB
    • raw-126.conllu.xz 6 MB
    • raw-199.conllu.xz 6 MB
    • raw-245.conllu.xz 8 MB
    • raw-057.conllu.xz 3 MB
    • raw-103.conllu.xz 7 MB
    • raw-176.conllu.xz 6 MB
    • raw-222.conllu.xz 6 MB
    • raw-035.conllu.xz 6 MB
    • raw-154.conllu.xz 7 MB
    • raw-200.conllu.xz 6 MB
    • raw-012.conllu.xz 9 MB
    • raw-085.conllu.xz 7 MB
    • raw-131.conllu.xz 7 MB
    • raw-181.conllu.xz 6 MB
    • raw-063.conllu.xz 3 MB
    • raw-182.conllu.xz 6 MB
    • raw-040.conllu.xz 4 MB
    • raw-090.conllu.xz 7 MB
    • raw-091.conllu.xz 7 MB
    • raw-218.conllu.xz 6 MB
    • raw-149.conllu.xz 7 MB
    • raw-008.conllu.xz 9 MB
    • raw-127.conllu.xz 5 MB
    • raw-246.conllu.xz 6 MB
    • raw-058.conllu.xz 3 MB
    • raw-104.conllu.xz 7 MB
    • raw-177.conllu.xz 6 MB
    • raw-059.conllu.xz 3 MB
    • raw-223.conllu.xz 6 MB
    • raw-105.conllu.xz 7 MB
    • raw-036.conllu.xz 7 MB
    • raw-155.conllu.xz 7 MB
    • raw-201.conllu.xz 6 MB
    • raw-013.conllu.xz 9 MB
    • raw-086.conllu.xz 7 MB
    • raw-132.conllu.xz 7 MB
    • raw-014.conllu.xz 7 MB
    • raw-087.conllu.xz 7 MB
    • raw-064.conllu.xz 3 MB
    • raw-110.conllu.xz 7 MB
    • raw-183.conllu.xz 6 MB
    • raw-041.conllu.xz 5 MB
    • raw-160.conllu.xz 7 MB
    • raw-092.conllu.xz 7 MB
    • raw-219.conllu.xz 6 MB
    • raw-009.conllu.xz 8 MB
    • raw-128.conllu.xz 7 MB
    • raw-178.conllu.xz 6 MB
    • raw-224.conllu.xz 6 MB
    • raw-106.conllu.xz 7 MB
    • raw-179.conllu.xz 6 MB
    • raw-225.conllu.xz 6 MB
    • raw-037.conllu.xz 8 MB
    • raw-156.conllu.xz 7 MB
    • raw-202.conllu.xz 6 MB
    • raw-133.conllu.xz 7 MB
    • raw-015.conllu.xz 7 MB
    • raw-088.conllu.xz 7 MB
    • raw-134.conllu.xz 7 MB
    • raw-065.conllu.xz 2 MB
    • raw-111.conllu.xz 7 MB
    • raw-184.conllu.xz 6 MB
    • raw-230.conllu.xz 6 MB
    • raw-042.conllu.xz 5 MB
    • files.txt 4 kB
    • raw-161.conllu.xz 7 MB
    • raw-043.conllu.xz 8 MB
    • raw-020.conllu.xz 7 MB
    • raw-093.conllu.xz 7 MB
    • raw-070.conllu.xz 2 MB
    • raw-129.conllu.xz 7 MB
    • raw-107.conllu.xz 7 MB
    • raw-226.conllu.xz 6 MB
    • raw-038.conllu.xz 7 MB
    • raw-157.conllu.xz 7 MB
    • raw-039.conllu.xz 7 MB
    • raw-203.conllu.xz 6 MB
    • raw-016.conllu.xz 7 MB
    • raw-089.conllu.xz 7 MB
    • raw-135.conllu.xz 7 MB
    • raw-066.conllu.xz 3 MB
    • raw-112.conllu.xz 7 MB
    • raw-185.conllu.xz 6 MB
    • raw-231.conllu.xz 6 MB
    • raw-162.conllu.xz 6 MB
    • raw-044.conllu.xz 9 MB
    • raw-163.conllu.xz 7 MB
    • raw-021.conllu.xz 7 MB
    • raw-094.conllu.xz 7 MB
    • raw-140.conllu.xz 7 MB
    • raw-071.conllu.xz 2 MB
    • raw-190.conllu.xz 6 MB
    • raw-072.conllu.xz 2 MB
Name
PL-00.tgz
Size
1.03 GB
Format
application/x-gzip
Description
Polish corpus 1/15
MD5
7751e321a902c6435b32689538ea4df3
File Preview
  • PL
    • raw-008.conllu.xz 117 MB
    • raw-004.conllu.xz 117 MB
    • raw-003.conllu.xz 115 MB
    • raw-007.conllu.xz 117 MB
    • raw-002.conllu.xz 115 MB
    • raw-006.conllu.xz 117 MB
    • raw-009.conllu.xz 117 MB
    • raw-001.conllu.xz 115 MB
    • raw-005.conllu.xz 119 MB
Name
PL-01.tgz
Size
1.15 GB
Format
application/x-gzip
Description
Polish corpus 2/15
MD5
30bfd4572fa0558a40de07d54a48bf00
File Preview
  • PL
    • raw-010.conllu.xz 117 MB
    • raw-014.conllu.xz 119 MB
    • raw-013.conllu.xz 118 MB
    • raw-017.conllu.xz 118 MB
    • raw-012.conllu.xz 118 MB
    • raw-016.conllu.xz 117 MB
    • raw-019.conllu.xz 118 MB
    • raw-011.conllu.xz 117 MB
    • raw-015.conllu.xz 116 MB
    • raw-018.conllu.xz 117 MB
Name
PL-02.tgz
Size
1.15 GB
Format
application/x-gzip
Description
Polish corpus 3/15
MD5
f789d03f71464da5f3b653d40fbef351
File Preview
  • PL
    • raw-023.conllu.xz 117 MB
    • raw-027.conllu.xz 116 MB
    • raw-022.conllu.xz 118 MB
    • raw-026.conllu.xz 116 MB
    • raw-029.conllu.xz 117 MB
    • raw-021.conllu.xz 117 MB
    • raw-025.conllu.xz 118 MB
    • raw-028.conllu.xz 118 MB
    • raw-020.conllu.xz 117 MB
    • raw-024.conllu.xz 117 MB
Name
PL-03.tgz
Size
1.15 GB
Format
application/x-gzip
Description
Polish corpus 4/15
MD5
fe721405c64c7437a9ad45175984b985
File Preview
  • PL
    • raw-033.conllu.xz 116 MB
    • raw-032.conllu.xz 116 MB
    • raw-036.conllu.xz 118 MB
    • raw-031.conllu.xz 116 MB
    • raw-035.conllu.xz 117 MB
    • raw-039.conllu.xz 118 MB
    • raw-038.conllu.xz 118 MB
    • raw-030.conllu.xz 117 MB
    • raw-034.conllu.xz 118 MB
    • raw-037.conllu.xz 118 MB
Name
PL-04.tgz
Size
1.14 GB
Format
application/x-gzip
Description
Polish corpus 5/15
MD5
bfbde7fb2b55bbc117f0298a0a9a77fb
File Preview
  • PL
    • raw-043.conllu.xz 116 MB
    • raw-042.conllu.xz 116 MB
    • raw-046.conllu.xz 118 MB
    • raw-041.conllu.xz 117 MB
    • raw-045.conllu.xz 117 MB
    • raw-049.conllu.xz 116 MB
    • raw-048.conllu.xz 117 MB
    • raw-040.conllu.xz 117 MB
    • raw-044.conllu.xz 115 MB
    • raw-047.conllu.xz 116 MB
Name
PL-05.tgz
Size
1.14 GB
Format
application/x-gzip
Description
Polish corpus 6/15
MD5
7650964d51f43de10ebcb11ee636db06
File Preview
  • PL
    • raw-052.conllu.xz 116 MB
    • raw-051.conllu.xz 115 MB
    • raw-055.conllu.xz 118 MB
    • raw-059.conllu.xz 116 MB
    • raw-058.conllu.xz 116 MB
    • raw-050.conllu.xz 118 MB
    • raw-054.conllu.xz 116 MB
    • raw-057.conllu.xz 116 MB
    • raw-053.conllu.xz 116 MB
    • raw-056.conllu.xz 115 MB
Name
PL-06.tgz
Size
1.14 GB
Format
application/x-gzip
Description
Polish corpus 7/15
MD5
81de90fef4982b69adb5dac7a392f23c
 Download file  Preview
 File Preview  
  • PL
    • raw-062.conllu.xz (115 MB)
    • raw-061.conllu.xz (117 MB)
    • raw-065.conllu.xz (116 MB)
    • raw-069.conllu.xz (116 MB)
    • raw-060.conllu.xz (118 MB)
    • raw-064.conllu.xz (117 MB)
    • raw-068.conllu.xz (115 MB)
    • raw-067.conllu.xz (116 MB)
    • raw-063.conllu.xz (116 MB)
    • raw-066.conllu.xz (114 MB)
Name: PL-07.tgz
Size: 1.13 GB
Format: application/x-gzip
Description: Polish corpus 8/15
MD5: eb952fd1b29b1097ae0166ac43176df0
 File Preview  
  • PL
    • raw-075.conllu.xz (116 MB)
    • raw-079.conllu.xz (115 MB)
    • raw-070.conllu.xz (116 MB)
    • raw-074.conllu.xz (114 MB)
    • raw-078.conllu.xz (115 MB)
    • raw-077.conllu.xz (117 MB)
    • raw-073.conllu.xz (115 MB)
    • raw-076.conllu.xz (114 MB)
    • raw-072.conllu.xz (117 MB)
    • raw-071.conllu.xz (114 MB)
Name: PL-08.tgz
Size: 1.12 GB
Format: application/x-gzip
Description: Polish corpus 9/15
MD5: 59ec3bbd04f68321fefc1ee23b566b8b
 File Preview  
  • PL
    • raw-081.conllu.xz (115 MB)
    • raw-080.conllu.xz (114 MB)
    • raw-084.conllu.xz (115 MB)
    • raw-088.conllu.xz (114 MB)
    • raw-083.conllu.xz (115 MB)
    • raw-087.conllu.xz (115 MB)
    • raw-086.conllu.xz (115 MB)
    • raw-082.conllu.xz (114 MB)
    • raw-085.conllu.xz (114 MB)
    • raw-089.conllu.xz (114 MB)
Name: PL-09.tgz
Size: 1016.43 MB
Format: application/x-gzip
Description: Polish corpus 10/15
MD5: e7d5e40e9daf8301a83f0ba0f57f0e46
 File Preview  
  • PL
    • raw-090.conllu.xz (113 MB)
    • raw-094.conllu.xz (113 MB)
    • raw-098.conllu.xz (112 MB)
    • raw-093.conllu.xz (84 MB)
    • raw-097.conllu.xz (20 MB)
    • raw-096.conllu.xz (114 MB)
    • raw-092.conllu.xz (115 MB)
    • raw-095.conllu.xz (112 MB)
    • raw-099.conllu.xz (113 MB)
    • raw-091.conllu.xz (114 MB)
Name: PL-10.tgz
Size: 1.12 GB
Format: application/x-gzip
Description: Polish corpus 11/15
MD5: bbf111bf808e84e46cb04b28b2b537f9
 File Preview  
  • PL
    • raw-102.conllu.xz (117 MB)
    • raw-106.conllu.xz (114 MB)
    • raw-109.conllu.xz (115 MB)
    • raw-101.conllu.xz (113 MB)
    • raw-105.conllu.xz (115 MB)
    • raw-104.conllu.xz (114 MB)
    • raw-108.conllu.xz (114 MB)
    • raw-100.conllu.xz (114 MB)
    • raw-103.conllu.xz (114 MB)
    • raw-107.conllu.xz (115 MB)
Name: PL-11.tgz
Size: 1.13 GB
Format: application/x-gzip
Description: Polish corpus 12/15
MD5: 63c285f88dfc9e8c180033929067cbbe
 File Preview  
  • PL
    • raw-119.conllu.xz (115 MB)
    • raw-111.conllu.xz (115 MB)
    • raw-115.conllu.xz (114 MB)
    • raw-114.conllu.xz (117 MB)
    • raw-118.conllu.xz (116 MB)
    • raw-110.conllu.xz (115 MB)
    • raw-113.conllu.xz (115 MB)
    • raw-117.conllu.xz (115 MB)
    • raw-112.conllu.xz (113 MB)
    • raw-116.conllu.xz (115 MB)
Name: PL-12.tgz
Size: 1.09 GB
Format: application/x-gzip
Description: Polish corpus 13/15
MD5: e6713b3961cf9bfb1322461d7b5ad53d
 File Preview  
  • PL
    • raw-121.conllu.xz (124 MB)
    • raw-125.conllu.xz (110 MB)
    • raw-128.conllu.xz (124 MB)
    • raw-120.conllu.xz (116 MB)
    • raw-124.conllu.xz (13 MB)
    • raw-123.conllu.xz (127 MB)
    • raw-127.conllu.xz (114 MB)
    • raw-122.conllu.xz (128 MB)
    • raw-126.conllu.xz (126 MB)
    • raw-129.conllu.xz (124 MB)
Name: PL-13.tgz
Size: 1.1 GB
Format: application/x-gzip
Description: Polish corpus 14/15
MD5: f0ebe5779d28738918e1fcb9a29fc96e
 File Preview  
  • PL
    • raw-135.conllu.xz (117 MB)
    • raw-138.conllu.xz (112 MB)
    • raw-130.conllu.xz (124 MB)
    • raw-134.conllu.xz (118 MB)
    • raw-133.conllu.xz (120 MB)
    • raw-137.conllu.xz (109 MB)
    • raw-132.conllu.xz (76 MB)
    • raw-136.conllu.xz (114 MB)
    • raw-139.conllu.xz (110 MB)
    • raw-131.conllu.xz (122 MB)
Name: PL-14.tgz
Size: 107.29 MB
Format: application/x-gzip
Description: Polish corpus 15/15
MD5: 676dc9cb0109ab5160d5d1d7ab15c19f
 File Preview  
  • PL
    • raw-140.conllu.xz (107 MB)
Name: PT.tgz
Size: 1.73 GB
Format: application/x-gzip
Description: Brazilian Portuguese corpus
MD5: dcc41bbd107be5902b8795c267dace5d
 File Preview  
  • PT
    • raw-012.conllu.xz (112 MB)
    • raw-015.conllu.xz (109 MB)
    • raw-003.conllu.xz (29 MB)
    • raw-006.conllu.xz (106 MB)
    • raw-013.conllu.xz (106 MB)
    • raw-016.conllu.xz (102 MB)
    • raw-007.conllu.xz (14 MB)
    • raw-017.conllu.xz (103 MB)
    • raw-008.conllu.xz (104 MB)
    • files.txt (342 B)
    • raw-018.conllu.xz (73 MB)
    • raw-009.conllu.xz (105 MB)
    • raw-010.conllu.xz (111 MB)
    • raw-001.conllu.xz (115 MB)
    • raw-004.conllu.xz (117 MB)
    • raw-019.conllu.xz (123 MB)
    • raw-011.conllu.xz (74 MB)
    • raw-002.conllu.xz (34 MB)
    • raw-014.conllu.xz (113 MB)
    • raw-005.conllu.xz (113 MB)
Name: RO.tgz
Size: 88.05 MB
Format: application/x-gzip
Description: Romanian corpus
MD5: 55a0cddc185b3c2f7faa9b7c12d0bf85
 File Preview  
  • RO
    • raw-040.conllu.xz (867 kB)
    • raw-057.conllu.xz (860 kB)
    • raw-064.conllu.xz (929 kB)
    • raw-017.conllu.xz (961 kB)
    • raw-024.conllu.xz (945 kB)
    • raw-088.conllu.xz (956 kB)
    • files.txt (1 kB)
    • raw-012.conllu.xz (913 kB)
    • raw-076.conllu.xz (1008 kB)
    • raw-083.conllu.xz (878 kB)
    • raw-029.conllu.xz (963 kB)
    • raw-036.conllu.xz (841 kB)
    • raw-043.conllu.xz (847 kB)
    • raw-071.conllu.xz (1007 kB)
    • raw-031.conllu.xz (1 MB)
    • raw-095.conllu.xz (860 kB)
    • raw-048.conllu.xz (879 kB)
    • raw-055.conllu.xz (913 kB)
    • raw-062.conllu.xz (929 kB)
    • raw-008.conllu.xz (927 kB)
    • raw-015.conllu.xz (870 kB)
    • raw-090.conllu.xz (809 kB)
    • raw-079.conllu.xz (992 kB)
    • raw-050.conllu.xz (871 kB)
    • raw-039.conllu.xz (890 kB)
    • raw-003.conllu.xz (899 kB)
    • raw-067.conllu.xz (914 kB)
    • raw-074.conllu.xz (960 kB)
    • raw-081.conllu.xz (859 kB)
    • raw-027.conllu.xz (926 kB)
    • raw-034.conllu.xz (825 kB)
    • raw-098.conllu.xz (861 kB)
    • raw-022.conllu.xz (947 kB)
    • raw-086.conllu.xz (882 kB)
    • raw-093.conllu.xz (844 kB)
    • raw-046.conllu.xz (872 kB)
    • raw-053.conllu.xz (921 kB)
    • raw-006.conllu.xz (872 kB)
    • raw-100.conllu.xz (769 kB)
    • raw-041.conllu.xz (858 kB)
    • raw-001.conllu.xz (985 kB)
    • raw-058.conllu.xz (937 kB)
    • raw-065.conllu.xz (915 kB)
    • raw-072.conllu.xz (1000 kB)
    • raw-018.conllu.xz (970 kB)
    • raw-025.conllu.xz (977 kB)
    • raw-089.conllu.xz (865 kB)
    • raw-060.conllu.xz (941 kB)
    • raw-049.conllu.xz (891 kB)
    • raw-013.conllu.xz (862 kB)
    • raw-020.conllu.xz (809 kB)
    • raw-077.conllu.xz (981 kB)
    • raw-084.conllu.xz (932 kB)
    • raw-091.conllu.xz (869 kB)
    • raw-037.conllu.xz (836 kB)
    • raw-044.conllu.xz (882 kB)
    • raw-004.conllu.xz (894 kB)
    • raw-068.conllu.xz (936 kB)
    • raw-032.conllu.xz (1 MB)
    • raw-096.conllu.xz (875 kB)
    • raw-056.conllu.xz (868 kB)
    • raw-063.conllu.xz (899 kB)
    • raw-009.conllu.xz (963 kB)
    • raw-016.conllu.xz (949 kB)
    • raw-087.conllu.xz (924 kB)
    • raw-051.conllu.xz (867 kB)
    • raw-011.conllu.xz (996 kB)
    • raw-075.conllu.xz (922 kB)
    • raw-082.conllu.xz (867 kB)
    • raw-028.conllu.xz (950 kB)
    • raw-035.conllu.xz (880 kB)
    • raw-099.conllu.xz (893 kB)
    • raw-059.conllu.xz (883 kB)
    • raw-070.conllu.xz (963 kB)
    • raw-023.conllu.xz (1012 kB)
    • raw-030.conllu.xz (971 kB)
    • raw-094.conllu.xz (868 kB)
    • raw-047.conllu.xz (865 kB)
    • raw-054.conllu.xz (920 kB)
    • raw-007.conllu.xz (973 kB)
    • raw-014.conllu.xz (901 kB)
    • raw-078.conllu.xz (984 kB)
    • raw-042.conllu.xz (861 kB)
    • raw-002.conllu.xz (995 kB)
    • raw-066.conllu.xz (954 kB)
    • raw-073.conllu.xz (911 kB)
    • raw-019.conllu.xz (969 kB)
    • raw-026.conllu.xz (930 kB)
    • raw-033.conllu.xz (925 kB)
    • raw-097.conllu.xz (881 kB)
    • raw-061.conllu.xz (930 kB)
    • raw-021.conllu.xz (847 kB)
    • raw-085.conllu.xz (834 kB)
    • raw-092.conllu.xz (813 kB)
    • raw-038.conllu.xz (819 kB)
    • raw-045.conllu.xz (870 kB)
    • raw-052.conllu.xz (871 kB)
    • raw-005.conllu.xz (960 kB)
    • raw-080.conllu.xz (879 kB)
    • raw-069.conllu.xz (957 kB)
Name: SV-00.tgz
Size: 953.36 MB
Format: application/x-gzip
Description: Swedish corpus 1/19
MD5: ad5542e3129988bfb6e4af0db1068b35
 File Preview  
  • SV
    • raw-008.conllu.xz (106 MB)
    • raw-004.conllu.xz (106 MB)
    • raw-003.conllu.xz (105 MB)
    • raw-007.conllu.xz (106 MB)
    • raw-002.conllu.xz (105 MB)
    • raw-006.conllu.xz (105 MB)
    • raw-009.conllu.xz (105 MB)
    • raw-001.conllu.xz (105 MB)
    • raw-005.conllu.xz (106 MB)
Name: SV-01.tgz
Size: 1.01 GB
Format: application/x-gzip
Description: Swedish corpus 2/19
MD5: 9f71294e5bbe3503964b9dd77e5851f3
 File Preview  
  • SV
    • raw-010.conllu.xz (104 MB)
    • raw-014.conllu.xz (103 MB)
    • raw-013.conllu.xz (103 MB)
    • raw-017.conllu.xz (103 MB)
    • raw-012.conllu.xz (105 MB)
    • raw-016.conllu.xz (103 MB)
    • raw-019.conllu.xz (102 MB)
    • raw-011.conllu.xz (104 MB)
    • raw-015.conllu.xz (103 MB)
    • raw-018.conllu.xz (102 MB)
Name: SV-02.tgz
Size: 1008.39 MB
Format: application/x-gzip
Description: Swedish corpus 3/19
MD5: 06584c4521f93c4924747fa28d24bac5
 File Preview  
  • SV
    • raw-023.conllu.xz (101 MB)
    • raw-027.conllu.xz (99 MB)
    • raw-022.conllu.xz (101 MB)
    • raw-026.conllu.xz (100 MB)
    • raw-029.conllu.xz (98 MB)
    • raw-021.conllu.xz (102 MB)
    • raw-025.conllu.xz (100 MB)
    • raw-028.conllu.xz (99 MB)
    • raw-020.conllu.xz (101 MB)
    • raw-024.conllu.xz (102 MB)
Name: SV-03.tgz
Size: 970.8 MB
Format: application/x-gzip
Description: Swedish corpus 4/19
MD5: 6da6df394d30d975e44daae2a8bfdc2f
 File Preview  
  • SV
    • raw-033.conllu.xz (96 MB)
    • raw-032.conllu.xz (97 MB)
    • raw-036.conllu.xz (96 MB)
    • raw-031.conllu.xz (97 MB)
    • raw-035.conllu.xz (98 MB)
    • raw-039.conllu.xz (94 MB)
    • raw-038.conllu.xz (95 MB)
    • raw-030.conllu.xz (99 MB)
    • raw-034.conllu.xz (97 MB)
    • raw-037.conllu.xz (97 MB)
Name: SV-04.tgz
Size: 849.36 MB
Format: application/x-gzip
Description: Swedish corpus 5/19
MD5: f76653d2632ef2a4d458c6fbd8e16d10
 File Preview  
  • SV
    • raw-043.conllu.xz (94 MB)
    • raw-042.conllu.xz (95 MB)
    • raw-046.conllu.xz (105 MB)
    • raw-041.conllu.xz (94 MB)
    • raw-045.conllu.xz (104 MB)
    • raw-049.conllu.xz (102 MB)
    • raw-048.conllu.xz (101 MB)
    • raw-040.conllu.xz (94 MB)
    • raw-044.conllu.xz (39 MB)
    • raw-047.conllu.xz (16 MB)
Name: SV-05.tgz
Size: 1019.38 MB
Format: application/x-gzip
Description: Swedish corpus 6/19
MD5: 04e7917c7cf5dc620467171b10639a73
 File Preview  
  • SV
    • raw-052.conllu.xz (102 MB)
    • raw-051.conllu.xz (102 MB)
    • raw-055.conllu.xz (102 MB)
    • raw-059.conllu.xz (101 MB)
    • raw-058.conllu.xz (101 MB)
    • raw-050.conllu.xz (101 MB)
    • raw-054.conllu.xz (102 MB)
    • raw-057.conllu.xz (100 MB)
    • raw-053.conllu.xz (101 MB)
    • raw-056.conllu.xz (101 MB)
Name: SV-06.tgz
Size: 999.95 MB
Format: application/x-gzip
Description: Swedish corpus 7/19
MD5: 8d3994e43c6442e96d5a19951017cef0
 File Preview  
  • SV
    • raw-062.conllu.xz (101 MB)
    • raw-061.conllu.xz (100 MB)
    • raw-065.conllu.xz (100 MB)
    • raw-069.conllu.xz (98 MB)
    • raw-060.conllu.xz (101 MB)
    • raw-064.conllu.xz (99 MB)
    • raw-068.conllu.xz (98 MB)
    • raw-067.conllu.xz (99 MB)
    • raw-063.conllu.xz (100 MB)
    • raw-066.conllu.xz (99 MB)
Name: SV-07.tgz
Size: 966.7 MB
Format: application/x-gzip
Description: Swedish corpus 8/19
MD5: 11f477f6b558f39c7c20031d2783892c
 File Preview  
  • SV
    • raw-075.conllu.xz (97 MB)
    • raw-079.conllu.xz (94 MB)
    • raw-070.conllu.xz (98 MB)
    • raw-074.conllu.xz (96 MB)
    • raw-078.conllu.xz (94 MB)
    • raw-077.conllu.xz (95 MB)
    • raw-073.conllu.xz (97 MB)
    • raw-076.conllu.xz (95 MB)
    • raw-072.conllu.xz (97 MB)
    • raw-071.conllu.xz (97 MB)
Name: SV-08.tgz
Size: 931.66 MB
Format: application/x-gzip
Description: Swedish corpus 9/19
MD5: 5694bd3086b7bf67ca0ea69ecb32f491
 File Preview  
  • SV
    • raw-081.conllu.xz (94 MB)
    • raw-080.conllu.xz (95 MB)
    • raw-084.conllu.xz (93 MB)
    • raw-088.conllu.xz (92 MB)
    • raw-083.conllu.xz (92 MB)
    • raw-087.conllu.xz (92 MB)
    • raw-086.conllu.xz (92 MB)
    • raw-082.conllu.xz (93 MB)
    • raw-085.conllu.xz (92 MB)
    • raw-089.conllu.xz (91 MB)
Name: SV-09.tgz
Size: 886.31 MB
Format: application/x-gzip
Description: Swedish corpus 10/19
MD5: bc0450f9b8ab9fe0db06def2750e55c5
 File Preview  
  • SV
    • raw-090.conllu.xz (90 MB)
    • raw-094.conllu.xz (89 MB)
    • raw-098.conllu.xz (65 MB)
    • raw-093.conllu.xz (89 MB)
    • raw-097.conllu.xz (87 MB)
    • raw-096.conllu.xz (88 MB)
    • raw-092.conllu.xz (89 MB)
    • raw-095.conllu.xz (88 MB)
    • raw-099.conllu.xz (105 MB)
    • raw-091.conllu.xz (90 MB)
Name: SV-10.tgz
Size: 987.17 MB
Format: application/x-gzip
Description: Swedish corpus 11/19
MD5: 1c974dd27afd4026ce26ba95ff520171
 File Preview  
  • SV
    • raw-102.conllu.xz (106 MB)
    • raw-106.conllu.xz (109 MB)
    • raw-109.conllu.xz (107 MB)
    • raw-101.conllu.xz (13 MB)
    • raw-105.conllu.xz (108 MB)
    • raw-104.conllu.xz (109 MB)
    • raw-108.conllu.xz (107 MB)
    • raw-100.conllu.xz (106 MB)
    • raw-103.conllu.xz (108 MB)
    • raw-107.conllu.xz (108 MB)
Name: SV-11.tgz
Size: 1.05 GB
Format: application/x-gzip
Description: Swedish corpus 12/19
MD5: 247d36e4389774ae59af8aeac055091f
 File Preview  
  • SV
    • raw-119.conllu.xz (105 MB)
    • raw-111.conllu.xz (107 MB)
    • raw-115.conllu.xz (107 MB)
    • raw-114.conllu.xz (107 MB)
    • raw-118.conllu.xz (106 MB)
    • raw-110.conllu.xz (108 MB)
    • raw-113.conllu.xz (108 MB)
    • raw-117.conllu.xz (106 MB)
    • raw-112.conllu.xz (107 MB)
    • raw-116.conllu.xz (106 MB)
Name: SV-12.tgz
Size: 1.02 GB
Format: application/x-gzip
Description: Swedish corpus 13/19
MD5: 11de25c2d239438832522e9c2c3d86e3
 File Preview  
  • SV
    • raw-121.conllu.xz (105 MB)
    • raw-125.conllu.xz (104 MB)
    • raw-128.conllu.xz (103 MB)
    • raw-120.conllu.xz (105 MB)
    • raw-124.conllu.xz (105 MB)
    • raw-123.conllu.xz (104 MB)
    • raw-127.conllu.xz (104 MB)
    • raw-122.conllu.xz (105 MB)
    • raw-126.conllu.xz (105 MB)
    • raw-129.conllu.xz (103 MB)
Name: SV-13.tgz
Size: 1017.32 MB
Format: application/x-gzip
Description: Swedish corpus 14/19
MD5: 279fe30a8afda651c73d5744ceea6ee2
 File Preview  
  • SV
    • raw-135.conllu.xz (102 MB)
    • raw-138.conllu.xz (101 MB)
    • raw-130.conllu.xz (103 MB)
    • raw-134.conllu.xz (101 MB)
    • raw-133.conllu.xz (101 MB)
    • raw-137.conllu.xz (100 MB)
    • raw-132.conllu.xz (101 MB)
    • raw-136.conllu.xz (101 MB)
    • raw-139.conllu.xz (100 MB)
    • raw-131.conllu.xz (102 MB)
Name: SV-14.tgz
Size: 946.75 MB
Format: application/x-gzip
Description: Swedish corpus 15/19
MD5: d42b12cc53288f077c69ae821fffa16d
 File Preview  
  • SV
    • raw-140.conllu.xz (100 MB)
    • raw-144.conllu.xz (97 MB)
    • raw-143.conllu.xz (99 MB)
    • raw-147.conllu.xz (34 MB)
    • raw-142.conllu.xz (100 MB)
    • raw-146.conllu.xz (97 MB)
    • raw-149.conllu.xz (110 MB)
    • raw-141.conllu.xz (100 MB)
    • raw-145.conllu.xz (97 MB)
    • raw-148.conllu.xz (109 MB)
Name: SV-15.tgz
Size: 972 MB
Format: application/x-gzip
Description: Swedish corpus 16/19
MD5: 7696cc8931abb44cadbefe6ad402a0c8
 File Preview  
  • SV
    • raw-150.conllu.xz (5 MB)
    • raw-154.conllu.xz (108 MB)
    • raw-157.conllu.xz (108 MB)
    • raw-153.conllu.xz (107 MB)
    • raw-152.conllu.xz (106 MB)
    • raw-156.conllu.xz (106 MB)
    • raw-151.conllu.xz (105 MB)
    • raw-155.conllu.xz (108 MB)
    • raw-159.conllu.xz (106 MB)
    • raw-158.conllu.xz (108 MB)
Name: SV-16.tgz
Size: 1.04 GB
Format: application/x-gzip
Description: Swedish corpus 17/19
MD5: a3ce9f178594be70a30835d4984f3f63
 File Preview  
  • SV
    • raw-167.conllu.xz (105 MB)
    • raw-163.conllu.xz (105 MB)
    • raw-162.conllu.xz (107 MB)
    • raw-166.conllu.xz (105 MB)
    • raw-161.conllu.xz (106 MB)
    • raw-165.conllu.xz (105 MB)
    • raw-169.conllu.xz (105 MB)
    • raw-168.conllu.xz (105 MB)
    • raw-160.conllu.xz (107 MB)
    • raw-164.conllu.xz (105 MB)
Name: SV-17.tgz
Size: 1 GB
Format: application/x-gzip
Description: Swedish corpus 18/19
MD5: a0c58d61c9670c8c5f954d469aa74a62
 File Preview  
  • SV
    • raw-173.conllu.xz (103 MB)
    • raw-172.conllu.xz (103 MB)
    • raw-176.conllu.xz (101 MB)
    • raw-171.conllu.xz (103 MB)
    • raw-175.conllu.xz (102 MB)
    • raw-179.conllu.xz (101 MB)
    • raw-178.conllu.xz (102 MB)
    • raw-170.conllu.xz (105 MB)
    • raw-174.conllu.xz (101 MB)
    • raw-177.conllu.xz (101 MB)
Name: SV-18.tgz
Size: 800.99 MB
Format: application/x-gzip
Description: Swedish corpus 19/19
MD5: 89243ae8435558ccac8b5ddfb74677f2
 File Preview  
  • SV
    • raw-186.conllu.xz (109 MB)
    • raw-182.conllu.xz (100 MB)
    • raw-181.conllu.xz (101 MB)
    • raw-185.conllu.xz (76 MB)
    • raw-180.conllu.xz (100 MB)
    • raw-184.conllu.xz (99 MB)
    • raw-188.conllu.xz (285 kB)
    • raw-187.conllu.xz (112 MB)
    • raw-183.conllu.xz (100 MB)
Name: TR.tgz
Size: 162.02 MB
Format: application/x-gzip
Description: Turkish corpus
MD5: 6323664589945450aaf1853b73ca99cf
 File Preview  
  • TR
    • raw-030.conllu.xz (709 kB)
    • raw-029.conllu.xz (5 MB)
    • raw-021.conllu.xz (5 MB)
    • raw-024.conllu.xz (5 MB)
    • raw-012.conllu.xz (5 MB)
    • raw-015.conllu.xz (5 MB)
    • raw-003.conllu.xz (5 MB)
    • raw-006.conllu.xz (5 MB)
    • raw-022.conllu.xz (5 MB)
    • raw-025.conllu.xz (5 MB)
    • raw-013.conllu.xz (5 MB)
    • raw-016.conllu.xz (5 MB)
    • raw-007.conllu.xz (5 MB)
    • raw-023.conllu.xz (5 MB)
    • raw-026.conllu.xz (5 MB)
    • raw-017.conllu.xz (5 MB)
    • raw-008.conllu.xz (5 MB)
    • raw-027.conllu.xz (5 MB)
    • files.txt (540 B)
    • raw-018.conllu.xz (5 MB)
    • raw-009.conllu.xz (5 MB)
    • raw-010.conllu.xz (5 MB)
    • raw-001.conllu.xz (5 MB)
    • raw-004.conllu.xz (5 MB)
    • raw-028.conllu.xz (5 MB)
    • raw-020.conllu.xz (5 MB)
    • raw-019.conllu.xz (5 MB)
    • raw-011.conllu.xz (5 MB)
    • raw-002.conllu.xz (5 MB)
    • raw-014.conllu.xz (5 MB)
    • raw-005.conllu.xz (5 MB)
Name: ZH.tgz
Size: 399.11 MB
Format: application/x-gzip
Description: Chinese corpus
MD5: d7f39b560a8f037d76a8e421657e235a
 File Preview  
  • ZH
    • files.txt (72 B)
    • raw-004.conllu.xz (150 MB)
    • raw-001.conllu.xz (33 MB)
    • raw-002.conllu.xz (108 MB)
    • raw-003.conllu.xz (106 MB)
