Verbs annotated for morphemic structure in Czech, English, German, Spanish v2
Please use the following text to cite this item:
Hledíková, Hana, 2026,
Verbs annotated for morphemic structure in Czech, English, German, Spanish v2, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-6124.
Authors
Hledíková, Hana
Item identifier
http://hdl.handle.net/11234/1-6124
Date issued
2026-03-12
Size
69107 items
Description
A sample of verb lemmas in four languages: Czech (19,040 lemmas), English (9,969 lemmas), German (27,158 lemmas), and Spanish (11,768 lemmas). Each verb lemma is annotated for its morphemic structure (i.e., segmented into the prefix(es), root(s), suffix(es), and ending(s) that it contains), for the classification of its root morph under a root morpheme where needed (to facilitate grouping verbs that share a root morpheme), and for its frequency in a 100 M corpus. Two versions are available for each language: one with a coarse-grained segmentation, which captures the morphemic structure that is synchronically available, and one with a fine-grained segmentation, which also captures the word's etymology.
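The column layout of the TSV files is not documented on this page, so a first look at the data is best taken without assuming any column names. The following is a minimal Python sketch under stated assumptions: the files are UTF-8 tab-separated text, and the helper name `peek_tsv` is hypothetical. Whether the first row is a header is not known here.

```python
import csv
from itertools import islice


def peek_tsv(path, n=5):
    """Return the first n rows of a tab-separated file as lists of strings."""
    # Assumption: UTF-8 encoding. QUOTE_NONE keeps any quote characters
    # inside word forms literal instead of treating them as field quoting.
    with open(path, encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(islice(reader, n))
```

Once a file has been downloaded, `peek_tsv("Czech_final_v2.tsv")` shows how many columns each row has and what they contain, which is enough to decide how to parse the segmentation and frequency fields.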
Acknowledgement
Charles University Grant Agency
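Project code: GAUK 246723
Project name: Morfematická komplexita slovesné slovní zásoby ve čtyřech jazycích: Kvantitativní výzkum založený na korpusových datech (Morphemic complexity of the verb vocabulary in four languages: A quantitative study based on corpus data)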
This item is Publicly Available
and licensed under:
Files in this item
- Name: Czech_final_v2.tsv
  Size: 1012.32 KB
  Format: application/octet-stream
  MD5: 3f9a4176a7d36f9b69068929d0417111
- Name: Czech_less_final_v2.tsv
  Size: 1013.43 KB
  Format: application/octet-stream
  MD5: a851bf254dfe589db02fa98684c7e3cd
- Name: English_final_v2.tsv
  Size: 404.73 KB
  Format: application/octet-stream
  MD5: ee61ba95ee534c083adfff35857e4637
- Name: English_less_final_v2.tsv
  Size: 370.39 KB
  Format: application/octet-stream
  MD5: be922ce218ab5c09f938ae2087e89338
- Name: German_final_v2.tsv
  Size: 1.27 MB
  Format: application/octet-stream
  MD5: b0ee9d3830336046d7c9be0e5a3bb71d
- Name: German_less_final_v2.tsv
  Size: 1.25 MB
  Format: application/octet-stream
  MD5: dbae3c643333e1358e554a8e9d368284
- Name: Spanish_final_v2.tsv
  Size: 601.03 KB
  Format: application/octet-stream
  MD5: 02e800475374bb5e49ebb33214958d85
- Name: Spanish_less_final_v2.tsv
  Size: 555.37 KB
  Format: application/octet-stream
  MD5: c979e6a1847072b8b99fc131c2287543
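The MD5 values listed for the files can be used to confirm that a download is intact. The following is a minimal Python sketch, assuming the files were saved under their listed names in a single directory; the names `EXPECTED_MD5`, `md5_of`, and `verify` are hypothetical helpers, not part of the dataset.

```python
import hashlib
from pathlib import Path

# File names and MD5 checksums as listed on this page.
EXPECTED_MD5 = {
    "Czech_final_v2.tsv": "3f9a4176a7d36f9b69068929d0417111",
    "Czech_less_final_v2.tsv": "a851bf254dfe589db02fa98684c7e3cd",
    "English_final_v2.tsv": "ee61ba95ee534c083adfff35857e4637",
    "English_less_final_v2.tsv": "be922ce218ab5c09f938ae2087e89338",
    "German_final_v2.tsv": "b0ee9d3830336046d7c9be0e5a3bb71d",
    "German_less_final_v2.tsv": "dbae3c643333e1358e554a8e9d368284",
    "Spanish_final_v2.tsv": "02e800475374bb5e49ebb33214958d85",
    "Spanish_less_final_v2.tsv": "c979e6a1847072b8b99fc131c2287543",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each listed file name to True/False (checksum match) or None (file missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of(path) == expected) if path.is_file() else None
    return results
```

Calling `verify("path/to/download")` returns one entry per file; any `False` value suggests a corrupted or altered download, and `None` means the file was not found in that directory.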

