Universal Segmentations 1.0 (UniSegments 1.0)

Please use the following text to cite this item:
Žabokrtský, Zdeněk; et al., 2022, Universal Segmentations 1.0 (UniSegments 1.0), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-4629.
Date issued
2022-01-17
Size
38 files
Description
Universal Segmentations (UniSegments) is a collection of lexical resources that capture morphological segmentations, harmonised into a cross-linguistically consistent annotation scheme for many languages. The annotation scheme consists of simple tab-separated columns that store a word and its morphological segmentations, together with information about the word and the segmented units, e.g., part-of-speech categories and the type of morphs/morphemes. The current public version of the collection contains 38 harmonised segmentation datasets covering 30 different languages.
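Since the datasets are described as plain tab-separated columns, they can probably be read with a few lines of Python. The following is a minimal sketch, not the official loader: the column names in ASSUMED_COLUMNS and their order are hypothetical placeholders (this record does not specify the layout), and the sample rows are invented for illustration. Check the documentation shipped inside the archive for the actual columns.

```python
import io

# ASSUMPTION: hypothetical column names and order. The record only states that
# the format is tab-separated and holds a word, its segmentation and extra
# information such as part of speech; the real layout may differ.
ASSUMED_COLUMNS = ("word", "pos", "segmentation")


def read_rows(handle, columns=ASSUMED_COLUMNS):
    """Yield one dict per non-empty line of a tab-separated file."""
    for line in handle:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        fields = line.split("\t")
        row = dict(zip(columns, fields))
        # Keep any columns beyond the assumed ones instead of dropping them.
        row["extra"] = fields[len(columns):]
        yield row


# Invented sample data, only to show the shape of the result.
sample = io.StringIO("unhappy\tADJ\tun + happy\nwalked\tVERB\twalk + ed\n")
for row in read_rows(sample):
    print(row["word"], "->", row["segmentation"])
```

For a real file, replace the StringIO object with open(path, encoding="utf-8"), where path points to one of the extracted dataset files.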
Acknowledgement
This item is Publicly Available
and licensed under:
Files in this item
Name
UniSegments-1.0-public.tar.gz
Size
130.55 MB
Format
application/x-gzip
Description
gzip Archive
MD5
d8a436b31b51e0123231213290f455fd
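The MD5 value listed above can be used to check that the download is intact. The sketch below assumes the archive was saved locally under the file name shown in this record; the helper names md5_of_file and matches_expected are illustrative, not part of any UniSegments tooling.

```python
import hashlib

# MD5 checksum as listed in this record for UniSegments-1.0-public.tar.gz.
EXPECTED_MD5 = "d8a436b31b51e0123231213290f455fd"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path).lower() == expected.lower()
```

Calling matches_expected("UniSegments-1.0-public.tar.gz") should return True for an uncorrupted download. MD5 is adequate for detecting transfer errors but is not a safeguard against deliberate tampering.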