dc.contributor.author Žabokrtský, Zdeněk
dc.contributor.author Bafna, Nyati
dc.contributor.author Bodnár, Jan
dc.contributor.author Kyjánek, Lukáš
dc.contributor.author Svoboda, Emil
dc.contributor.author Ševčíková, Magda
dc.contributor.author Vidra, Jonáš
dc.contributor.author Angle, Sachi
dc.contributor.author Ansari, Ebrahim
dc.contributor.author Arkhangelskiy, Timofey
dc.contributor.author Batsuren, Khuyagbaatar
dc.contributor.author Bella, Gábor
dc.contributor.author Bertinetto, Pier Marco
dc.contributor.author Bonami, Olivier
dc.contributor.author Celata, Chiara
dc.contributor.author Daniel, Michael
dc.contributor.author Fedorenko, Alexei
dc.contributor.author Filko, Matea
dc.contributor.author Giunchiglia, Fausto
dc.contributor.author Haghdoost, Hamid
dc.contributor.author Hathout, Nabil
dc.contributor.author Khomchenkova, Irina
dc.contributor.author Khurshudyan, Victoria
dc.contributor.author Levonian, Dmitri
dc.contributor.author Litta, Eleonora
dc.contributor.author Medvedeva, Maria
dc.contributor.author Muralikrishna, S. N.
dc.contributor.author Namer, Fiammetta
dc.contributor.author Nikravesh, Mahshid
dc.contributor.author Padó, Sebastian
dc.contributor.author Passarotti, Marco
dc.contributor.author Plungian, Vladimir
dc.contributor.author Polyakov, Alexey
dc.contributor.author Potapov, Mihail
dc.contributor.author Pruthwik, Mishra
dc.contributor.author Rao B, Ashwath
dc.contributor.author Rubakov, Sergei
dc.contributor.author Samar, Husain
dc.contributor.author Sharma, Dipti Misra
dc.contributor.author Šnajder, Jan
dc.contributor.author Šojat, Krešimir
dc.contributor.author Štefanec, Vanja
dc.contributor.author Talamo, Luigi
dc.contributor.author Tribout, Delphine
dc.contributor.author Vodolazsky, Daniil
dc.contributor.author Vydrin, Arseniy
dc.contributor.author Zakirova, Aigul
dc.contributor.author Zeller, Britta
dc.date.accessioned 2022-01-24T15:25:57Z
dc.date.available 2022-01-24T15:25:57Z
dc.date.issued 2022-01-17
dc.identifier.uri http://hdl.handle.net/11234/1-4629
dc.description Universal Segmentations (UniSegments) is a collection of lexical resources capturing morphological segmentations, harmonised into a cross-linguistically consistent annotation scheme for many languages. The annotation scheme consists of simple tab-separated columns that store a word and its morphological segmentations, together with information about the word and the segmented units, e.g., part-of-speech categories and the types of morphs/morphemes. The current public version of the collection contains 38 harmonised segmentation datasets covering 30 different languages.
dc.language.iso ces
dc.language.iso cat
dc.language.iso deu
dc.language.iso eng
dc.language.iso fas
dc.language.iso fin
dc.language.iso fra
dc.language.iso hbs
dc.language.iso hrv
dc.language.iso hun
dc.language.iso ita
dc.language.iso kpv
dc.language.iso lat
dc.language.iso mdf
dc.language.iso chm
dc.language.iso mon
dc.language.iso myv
dc.language.iso pol
dc.language.iso por
dc.language.iso rus
dc.language.iso spa
dc.language.iso swe
dc.language.iso tgk
dc.language.iso udm
dc.language.iso hye
dc.language.iso ben
dc.language.iso hin
dc.language.iso mal
dc.language.iso mar
dc.language.iso kan
dc.publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dc.relation.isreferencedby https://ufal.mff.cuni.cz/techrep/tr69.pdf
dc.rights Universal Segmentations 1.0 License Terms
dc.rights.uri https://lindat.mff.cuni.cz/repository/xmlui/page/licence-unisegs-1.0
dc.source.uri https://ufal.mff.cuni.cz/universal-segmentations
dc.subject universal segmentations
dc.subject morphological segmentation
dc.subject word segmentation
dc.subject segmentation
dc.subject morphology
dc.subject morphemes
dc.subject morphological dictionary
dc.subject unisegments
dc.subject morph
dc.subject multilingual
dc.title Universal Segmentations 1.0 (UniSegments 1.0)
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContentInfo.mediaType text
metashare.ResourceInfo#ContentInfo.detailedType lexicon
dc.rights.label PUB
has.files yes
branding LINDAT / CLARIAH-CZ
contact.person Jonáš Vidra vidra@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
contact.person Zdeněk Žabokrtský zabokrtsky@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
sponsor Grantová agentura České Republiky 19-14534S Popis slovotvorné struktury českých slov na základě jazykových dat nationalFunds
sponsor Charles University START/HUM/010 A data-based approach to competition in word-formation: selected semantic categories across seven languages nationalFunds
sponsor Univerzita Karlova (mimo GAUK) SVV 260 453 Specifický vysokoškolský výzkum nationalFunds
sponsor Ministerstvo školství, mládeže a tělovýchovy České republiky LM2015071 LINDAT/CLARIN: Institut pro analýzu, zpracování a distribuci lingvistických dat nationalFunds
sponsor Ministerstvo školství, mládeže a tělovýchovy České republiky LM2018101 LINDAT/CLARIAH-CZ: Digitální výzkumná infrastruktura pro jazykové technologie, umění a humanitní vědy nationalFunds
size.info 38 files
files.size 136889577
files.count 1


 Files in this item

This item is Publicly Available and licensed under: Universal Segmentations 1.0 License Terms
The MIT License Distributed under Creative Commons
Name: UniSegments-1.0-public.tar.gz
Size: 130.55 MB
Format: application/x-gzip
Description: unisegments-1.0
MD5: d8a436b31b51e0123231213290f455fd
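
The MD5 checksum and the byte count in this record (files.size 136889577) can be used to check a downloaded copy of the archive before unpacking it. The following is a minimal Python sketch, not an official tool; the function names md5_of_file and verify_archive are hypothetical, and only the expected values come from this record.

```python
import hashlib
from pathlib import Path

# Values copied from this record (MD5 and files.size).
EXPECTED_MD5 = "d8a436b31b51e0123231213290f455fd"
EXPECTED_SIZE = 136889577  # bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True if the file has the expected size and MD5 digest."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5
```

Assuming the archive was saved under its listed name, verify_archive("UniSegments-1.0-public.tar.gz") should return True for an intact download. The size is checked first because it is cheap and catches truncated downloads without hashing the whole file.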
 File Preview  
  • UniSegments-1.0-public
    • data
      • por-MorphyNet
        • README.TXT (633 B)
        • LICENSE.TXT (19 kB)
        • UniSegments-1.0-por-MorphyNet.useg (75 MB)
      • fin-MorphyNet
        • README.TXT (633 B)
        • LICENSE.TXT (19 kB)
        • UniSegments-1.0-fin-MorphyNet.useg (67 MB)
      • fra-Demonette
        • README.TXT (399 B)
        • LICENSE.TXT (19 kB)
        • UniSegments-1.0-fra-Demonette.useg (16 MB)
      • eng-MorphoLex
        • README.TXT (1004 B)
        • UniSegments-1.0-eng-MorphoLex.useg (17 MB)
        • LICENSE.TXT (15 kB)
      • rus-DerivBaseRU
        • README.TXT (372 B)
        • LICENSE.TXT (10 kB)
        • UniSegments-1.0-rus-DerivBaseRU.useg (33 MB)
      • fas-PerSegLex
        • README.TXT (498 B)
        • UniSegments-1.0-fas-PerSegLex.useg (8 MB)
        • LICENSE.TXT (15 kB)
      • rus-MorphyNet
        • README.TXT (633 B)
        • LICENSE.TXT (19 kB)
        • UniSegments-1.0-rus-MorphyNet.useg (119 MB)
      • fra-Echantinom
        • README.TXT (203 B)
        • LICENSE.TXT (13 kB)
        • UniSegments-1.0-fra-Echantinom.useg (1 MB)
      • myv-Uniparser
        • README.TXT (522 B)
        • LICENSE.TXT (1 kB)
        • UniSegments-1.0-myv-Uniparser.useg (52 MB)
      • mon-MorphyNet
        • README.TXT (633 B)
        • LICENSE.TXT (19 kB)
        • UniSegments-1.0-mon-MorphyNet.useg (5 MB)
      • fra-MorphyNet
        • README.TXT (633 B)
        • UniSegments-1.0-fra-MorphyNet.useg (61 MB)
        • LICENSE.TXT (19 kB)
      • hin-KCIS
        • README.TXT (1 kB)
        • LICENSE.TXT (13 kB)
        • UniSegments-1.0-hin-KCIS.useg (2 MB)
      • hye-Uniparser
        • README.TXT (522 B)
        • LICENSE.TXT (1 kB)
        • UniSegments-1.0-hye-Uniparser.useg (189 MB)
      • spa-MorphyNet
        • README.TXT (633 B)
        • LICENSE.TXT (19 kB)
        • UniSegments-1.0-spa-MorphyNet.useg (90 MB)
      • mhr-Uniparser
        • README.TXT (522 B)
        • LICENSE.TXT (1 kB)
        • UniSegments-1.0-mhr-Uniparser.useg (83 MB)
      • cat-MorphyNet
        • README.TXT (633 B)
        • UniSegments-1.0-cat-MorphyNet.useg (86 MB)
        • LICENSE.TXT (19 kB)
      • fra-MorphoLex
        • README.TXT (1004 B)
        • LICENSE.TXT (15 kB)
        • UniSegments-1.0-fra-MorphoLex.useg (2 MB)
      • ita-MorphyNet
        • README.TXT (633 B)
        • UniSegments-1.0-ita-MorphyNet.useg (100 MB)
        • LICENSE.TXT (19 kB)
      • pol-MorphyNet
        • README.TXT (633 B)
        • UniSegments-1.0-pol-MorphyNet.useg (85 MB)
        • LICENSE.TXT (19 kB)
      • kan-KCIS
        • README.TXT (1 kB)
        • LICENSE.TXT (13 kB)
        • UniSegments-1.0-kan-KCIS.useg (10 MB)
      • udm-Uniparser
        • README.TXT (522 B)
        • LICENSE.TXT (1 kB)
        • UniSegments-1.0-udm-Uniparser.useg (134 MB)
      • deu-MorphyNet
        • README.TXT (633 B)
        • LICENSE.TXT (19 kB)
        • UniSegments-1.0-deu-MorphyNet.useg (4 MB)
      • tgk-Uniparser
        • README.TXT (522 B)
        • LICENSE.TXT (1 kB)
        • UniSegments-1.0-tgk-Uniparser.useg (78 MB)
      • mal-KCIS
        • README.TXT (1 kB)
        • UniSegments-1.0-mal-KCIS.useg (18 MB)
        • LICENSE.TXT (13 kB)
      • ita-DerIvaTario
        • README.TXT (438 B)
        • LICENSE.TXT (14 kB)
        • UniSegments-1.0-ita-DerIvaTario.useg (4 MB)
      • mar-KCIS
        • README.TXT (1 kB)
        • LICENSE.TXT (13 kB)
        • UniSegments-1.0-mar-KCIS.useg (13 MB)
      • lat-WordFormationLatin
        • README.TXT (336 B)
        • LICENSE.TXT (15 kB)
        • UniSegments-1.0-lat-WordFormationLatin.useg (8 MB)
      • ben-KCIS
        • README.TXT (1 kB)
        • LICENSE.TXT (13 kB)
        • UniSegments-1.0-ben-KCIS.useg (950 kB)
      • ces-DeriNet
        • README.TXT (643 B)
        • LICENSE.TXT (19 kB)
        • UniSegments-1.0-ces-DeriNet.useg (291 MB)
      • swe-MorphyNet
        • README.TXT (633 B)
        • UniSegments-1.0-swe-MorphyNet.useg (73 MB)
        • LICENSE.TXT (19 kB)
      • hrv-CroDeriV
        • README.TXT (506 B)
        • LICENSE.TXT (19 kB)
        • UniSegments-1.0-hrv-CroDeriV.useg (3 MB)
      • mdf-Uniparser
        • UniSegments-1.0-mdf-Uniparser.useg (32 MB)
        • README.TXT (522 B)
        • LICENSE.TXT (1 kB)
      • hbs-MorphyNet
        • README.TXT (633 B)
        • LICENSE.TXT (19 kB)
        • UniSegments-1.0-hbs-MorphyNet.useg (5 MB)
      • kpv-Uniparser
        • README.TXT (522 B)
        • LICENSE.TXT (1 kB)
        • UniSegments-1.0-kpv-Uniparser.useg (63 MB)
      • hun-MorphyNet
        • README.TXT (633 B)
        • LICENSE.TXT (19 kB)
        • UniSegments-1.0-hun-MorphyNet.useg (72 MB)
      • deu-DerivBaseDE
        • README.TXT (496 B)
        • LICENSE.TXT (19 kB)
        • UniSegments-1.0-deu-DerivBaseDE.useg (10 MB)
      • ces-MorphyNet
        • README.TXT (633 B)
        • LICENSE.TXT (19 kB)
        • UniSegments-1.0-ces-MorphyNet.useg (11 MB)
      • eng-MorphyNet
        • README.TXT (633 B)
        • LICENSE.TXT (19 kB)
        • UniSegments-1.0-eng-MorphyNet.useg (49 MB)
    • doc
      • LICENSE (2 kB)
      • README.md (3 kB)
      • Towards-Universal-Segmentations-Survey-of-Existing-Morphosegmentation-Resources.pdf (441 kB)
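
The description in this record says that each .useg file uses simple tab-separated columns. Below is a minimal Python sketch for iterating over such a file. It assumes UTF-8 encoding and one entry per line. Since this record does not specify what each column contains, the rows are kept as plain lists of strings; the README.md in the doc directory should be consulted for the actual column layout. The function name read_useg is hypothetical.

```python
def read_useg(path, encoding="utf-8"):
    """Yield each non-empty line of a .useg file as a list of tab-separated fields.

    Assumption: one entry per line, columns separated by tabs, UTF-8 text.
    The meaning of each column is not interpreted here.
    """
    with open(path, encoding=encoding) as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip blank lines
            yield line.split("\t")
```

For example, after unpacking the archive, `for fields in read_useg("UniSegments-1.0-public/data/eng-MorphoLex/UniSegments-1.0-eng-MorphoLex.useg"): ...` would stream the entries one by one. Because the function is a generator, the largest files in the listing (some are several hundred MB) do not have to be loaded into memory at once.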