Universal Segmentations 1.0 (UniSegments 1.0)
Please use the following text to cite this item:
Žabokrtský, Zdeněk; et al., 2022,
Universal Segmentations 1.0 (UniSegments 1.0), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-4629.
Authors
Žabokrtský, Zdeněk; et al.
Date issued
2022-01-17
Size
38 files
Description
Universal Segmentations (UniSegments) is a collection of lexical resources that capture morphological segmentations, harmonised into a cross-linguistically consistent annotation scheme. The annotation scheme consists of simple tab-separated columns that store a word and its morphological segmentations, along with information about the word and the segmented units, e.g., part-of-speech categories and types of morphs/morphemes. The current public version of the collection contains 38 harmonised segmentation datasets covering 30 different languages.
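As a rough illustration of working with tab-separated segmentation data of this kind, the Python sketch below reads rows and splits a segmentation into morphs. The sample rows, the column order (word, part of speech, segmentation), and the " + " morph separator are all assumptions made for this example and are not taken from the description above; the README.md shipped with the archive is the place to check the actual column layout.

```python
import io
from typing import Iterator, List, TextIO

# Hypothetical sample rows, NOT copied from the UniSegments data.
# Assumed column order: word, part of speech, segmentation.
SAMPLE = "cats\tNOUN\tcat + s\nunkind\tADJ\tun + kind\n"


def read_rows(stream: TextIO) -> Iterator[List[str]]:
    """Yield each non-empty line as a list of tab-separated fields."""
    for line in stream:
        line = line.rstrip("\r\n")
        if line:
            yield line.split("\t")


def split_morphs(segmentation: str, sep: str = " + ") -> List[str]:
    """Split a segmentation string into morphs (separator is an assumption)."""
    return segmentation.split(sep)


if __name__ == "__main__":
    for word, pos, seg in read_rows(io.StringIO(SAMPLE)):
        print(word, pos, split_morphs(seg))
```

To read a real file, the same `read_rows` function can be given a handle opened with `open(path, encoding="utf-8")`; rows with more columns than the three assumed here would need the unpacking in the loop adjusted.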
Acknowledgement
Czech Science Foundation (Grantová agentura České republiky)
Project code: 19-14534S
Project name: Description of the word-formation structure of Czech words based on language data
Charles University
Project code: START/HUM/010
Project name: A data-based approach to competition in word-formation: selected semantic categories across seven languages
Charles University (outside GAUK)
Project code: SVV 260 453
Project name: Specific university research
Ministry of Education, Youth and Sports of the Czech Republic
Project code: LM2015071
Project name: LINDAT/CLARIN: Institute for Analysis, Processing and Distribution of Linguistic Data
Ministry of Education, Youth and Sports of the Czech Republic
Project code: LM2018101
Project name: LINDAT/CLARIAH-CZ: Digital Research Infrastructure for Language Technologies, Arts and Humanities
Files in this item
- Name: UniSegments-1.0-public.tar.gz
- Size: 130.55 MB
- Format: application/x-gzip
- Description: unisegments-1.0
- MD5: d8a436b31b51e0123231213290f455fd
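The MD5 value listed above can be used to check that a downloaded copy of the archive is intact. The following is a minimal Python sketch of such a check; the helper names `md5_of_file` and `matches_expected` are hypothetical, and the local path in the usage note is an assumption about where the file was saved.

```python
import hashlib

# MD5 checksum as given in the file listing for UniSegments-1.0-public.tar.gz.
EXPECTED_MD5 = "d8a436b31b51e0123231213290f455fd"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Compare a file's MD5 digest with the expected value, ignoring case."""
    return md5_of_file(path).lower() == expected.lower()
```

For example, `matches_expected("UniSegments-1.0-public.tar.gz")` would return `True` for an intact download saved under that name in the current directory. MD5 is suitable here only as an integrity check against accidental corruption, not as a security guarantee.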

- UniSegments-1.0-public
- data
- por-MorphyNet
- README.TXT (633 B)
- LICENSE.TXT (19 kB)
- UniSegments-1.0-por-MorphyNet.useg (75 MB)
- fin-MorphyNet
- README.TXT (633 B)
- LICENSE.TXT (19 kB)
- UniSegments-1.0-fin-MorphyNet.useg (67 MB)
- eng-MorphoLex
- README.TXT (1004 B)
- LICENSE.TXT (15 kB)
- UniSegments-1.0-eng-MorphoLex.useg (17 MB)
- fra-Demonette
- README.TXT (399 B)
- LICENSE.TXT (19 kB)
- UniSegments-1.0-fra-Demonette.useg (16 MB)
- rus-DerivBaseRU
- README.TXT (372 B)
- LICENSE.TXT (10 kB)
- UniSegments-1.0-rus-DerivBaseRU.useg (33 MB)
- fas-PerSegLex
- README.TXT (498 B)
- UniSegments-1.0-fas-PerSegLex.useg (8 MB)
- LICENSE.TXT (15 kB)
- rus-MorphyNet
- README.TXT (633 B)
- LICENSE.TXT (19 kB)
- UniSegments-1.0-rus-MorphyNet.useg (119 MB)
- fra-Echantinom
- README.TXT (203 B)
- LICENSE.TXT (13 kB)
- UniSegments-1.0-fra-Echantinom.useg (1 MB)
- myv-Uniparser
- README.TXT (522 B)
- LICENSE.TXT (1 kB)
- UniSegments-1.0-myv-Uniparser.useg (52 MB)
- mon-MorphyNet
- README.TXT (633 B)
- LICENSE.TXT (19 kB)
- UniSegments-1.0-mon-MorphyNet.useg (5 MB)
- fra-MorphyNet
- README.TXT (633 B)
- UniSegments-1.0-fra-MorphyNet.useg (61 MB)
- LICENSE.TXT (19 kB)
- hin-KCIS
- README.TXT (1 kB)
- LICENSE.TXT (13 kB)
- UniSegments-1.0-hin-KCIS.useg (2 MB)
- hye-Uniparser
- README.TXT (522 B)
- LICENSE.TXT (1 kB)
- UniSegments-1.0-hye-Uniparser.useg (189 MB)
- spa-MorphyNet
- README.TXT (633 B)
- LICENSE.TXT (19 kB)
- UniSegments-1.0-spa-MorphyNet.useg (90 MB)
- mhr-Uniparser
- README.TXT (522 B)
- LICENSE.TXT (1 kB)
- UniSegments-1.0-mhr-Uniparser.useg (83 MB)
- cat-MorphyNet
- README.TXT (633 B)
- UniSegments-1.0-cat-MorphyNet.useg (86 MB)
- LICENSE.TXT (19 kB)
- fra-MorphoLex
- README.TXT (1004 B)
- LICENSE.TXT (15 kB)
- UniSegments-1.0-fra-MorphoLex.useg (2 MB)
- ita-MorphyNet
- README.TXT (633 B)
- LICENSE.TXT (19 kB)
- UniSegments-1.0-ita-MorphyNet.useg (100 MB)
- pol-MorphyNet
- README.TXT (633 B)
- UniSegments-1.0-pol-MorphyNet.useg (85 MB)
- LICENSE.TXT (19 kB)
- kan-KCIS
- README.TXT (1 kB)
- LICENSE.TXT (13 kB)
- UniSegments-1.0-kan-KCIS.useg (10 MB)
- udm-Uniparser
- README.TXT (522 B)
- LICENSE.TXT (1 kB)
- UniSegments-1.0-udm-Uniparser.useg (134 MB)
- deu-MorphyNet
- README.TXT (633 B)
- LICENSE.TXT (19 kB)
- UniSegments-1.0-deu-MorphyNet.useg (4 MB)
- tgk-Uniparser
- README.TXT (522 B)
- LICENSE.TXT (1 kB)
- UniSegments-1.0-tgk-Uniparser.useg (78 MB)
- mal-KCIS
- README.TXT (1 kB)
- UniSegments-1.0-mal-KCIS.useg (18 MB)
- LICENSE.TXT (13 kB)
- ita-DerIvaTario
- README.TXT (438 B)
- LICENSE.TXT (14 kB)
- UniSegments-1.0-ita-DerIvaTario.useg (4 MB)
- mar-KCIS
- README.TXT (1 kB)
- LICENSE.TXT (13 kB)
- UniSegments-1.0-mar-KCIS.useg (13 MB)
- lat-WordFormationLatin
- README.TXT (336 B)
- LICENSE.TXT (15 kB)
- UniSegments-1.0-lat-WordFormationLatin.useg (8 MB)
- ben-KCIS
- README.TXT (1 kB)
- LICENSE.TXT (13 kB)
- UniSegments-1.0-ben-KCIS.useg (950 kB)
- ces-DeriNet
- README.TXT (643 B)
- LICENSE.TXT (19 kB)
- UniSegments-1.0-ces-DeriNet.useg (291 MB)
- swe-MorphyNet
- README.TXT (633 B)
- UniSegments-1.0-swe-MorphyNet.useg (73 MB)
- LICENSE.TXT (19 kB)
- hrv-CroDeriV
- README.TXT (506 B)
- LICENSE.TXT (19 kB)
- UniSegments-1.0-hrv-CroDeriV.useg (3 MB)
- mdf-Uniparser
- README.TXT (522 B)
- UniSegments-1.0-mdf-Uniparser.useg (32 MB)
- LICENSE.TXT (1 kB)
- hbs-MorphyNet
- README.TXT (633 B)
- LICENSE.TXT (19 kB)
- UniSegments-1.0-hbs-MorphyNet.useg (5 MB)
- kpv-Uniparser
- README.TXT (522 B)
- LICENSE.TXT (1 kB)
- UniSegments-1.0-kpv-Uniparser.useg (63 MB)
- hun-MorphyNet
- README.TXT (633 B)
- LICENSE.TXT (19 kB)
- UniSegments-1.0-hun-MorphyNet.useg (72 MB)
- deu-DerivBaseDE
- README.TXT (496 B)
- LICENSE.TXT (19 kB)
- UniSegments-1.0-deu-DerivBaseDE.useg (10 MB)
- ces-MorphyNet
- README.TXT (633 B)
- LICENSE.TXT (19 kB)
- UniSegments-1.0-ces-MorphyNet.useg (11 MB)
- eng-MorphyNet
- README.TXT (633 B)
- LICENSE.TXT (19 kB)
- UniSegments-1.0-eng-MorphyNet.useg (49 MB)
- doc
- LICENSE (2 kB)
- README.md (3 kB)
- Towards-Universal-Segmentations-Survey-of-Existing-Morphosegmentation-Resources.pdf (441 kB)

