
Verbs annotated for morphemic structure in Czech, English, German, Spanish v2

Please use the following text to cite this item:
Hledíková, Hana, 2026, Verbs annotated for morphemic structure in Czech, English, German, Spanish v2, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-6124.
Date issued
2026-03-12
Size
69107 items
Description
A sample of verb lemmas in four languages: Czech (19,040 lemmas), English (9,969 lemmas), German (27,158 lemmas), Spanish (11,768 lemmas). Each verb lemma is annotated for its morphemic structure (i.e., segmented into the prefix(es), root(s), suffix(es) and ending(s) that the lemma contains), the assignment of its root morph to a root morpheme where needed (to facilitate grouping of verbs with the same root morpheme), and its frequency in a 100 M corpus. Two versions are available for each language: one with a more coarse-grained segmentation, which captures the morphemic structure that is synchronically available, and one with a more fine-grained segmentation, which also captures the word's etymology.
This item is Publicly Available
and licensed under:
 Files in this item
Name                       Size        Format                    MD5
Czech_final_v2.tsv         1012.32 KB  application/octet-stream  3f9a4176a7d36f9b69068929d0417111
Czech_less_final_v2.tsv    1013.43 KB  application/octet-stream  a851bf254dfe589db02fa98684c7e3cd
English_final_v2.tsv       404.73 KB   application/octet-stream  ee61ba95ee534c083adfff35857e4637
English_less_final_v2.tsv  370.39 KB   application/octet-stream  be922ce218ab5c09f938ae2087e89338
German_final_v2.tsv        1.27 MB     application/octet-stream  b0ee9d3830336046d7c9be0e5a3bb71d
German_less_final_v2.tsv   1.25 MB     application/octet-stream  dbae3c643333e1358e554a8e9d368284
Spanish_final_v2.tsv       601.03 KB   application/octet-stream  02e800475374bb5e49ebb33214958d85
Spanish_less_final_v2.tsv  555.37 KB   application/octet-stream  c979e6a1847072b8b99fc131c2287543
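
The MD5 values listed for the files above can be used to check that a download arrived intact. The following is a minimal sketch in Python, assuming the eight files were saved under their listed names into a single directory. The function names md5_of_file and verify_downloads are hypothetical helpers written for this illustration and are not part of the repository.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file listing of this item.
EXPECTED_MD5 = {
    "Czech_final_v2.tsv": "3f9a4176a7d36f9b69068929d0417111",
    "Czech_less_final_v2.tsv": "a851bf254dfe589db02fa98684c7e3cd",
    "English_final_v2.tsv": "ee61ba95ee534c083adfff35857e4637",
    "English_less_final_v2.tsv": "be922ce218ab5c09f938ae2087e89338",
    "German_final_v2.tsv": "b0ee9d3830336046d7c9be0e5a3bb71d",
    "German_less_final_v2.tsv": "dbae3c643333e1358e554a8e9d368284",
    "Spanish_final_v2.tsv": "02e800475374bb5e49ebb33214958d85",
    "Spanish_less_final_v2.tsv": "c979e6a1847072b8b99fc131c2287543",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to True (match), False (mismatch) or None (missing).

    Assumption: files sit directly in `directory` under their listed names.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = md5_of_file(path) == expected if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "missing"}
    for name, status in verify_downloads().items():
        print(f"{name}: {labels[status]}")
```

Run from the download directory, the script prints one status line per file. A MISMATCH usually points to an incomplete or corrupted download, so fetching the file again is a reasonable first step.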