dc.contributor.author Straka, Milan
dc.contributor.author Straková, Jana
dc.date.accessioned 2017-08-02T18:43:53Z
dc.date.available 2017-08-02T18:43:53Z
dc.date.issued 2017-08-01
dc.identifier.uri http://hdl.handle.net/11234/1-2364
dc.description Tokenizer, POS tagger, lemmatizer and parser models for all 50 languages of the Universal Dependencies 2.0 treebanks, created solely using UD 2.0 data (http://hdl.handle.net/11234/1-1983). The model documentation, including performance figures, can be found at http://ufal.mff.cuni.cz/udpipe/users-manual#universal_dependencies_20_models . To use these models, you need the UDPipe binary, version 1.2 or later, which you can download from http://ufal.mff.cuni.cz/udpipe . In addition to the models themselves, all additional data and the hyperparameter values used for training are available in the second archive, allowing reproducible training.
dc.language.iso eng
dc.publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dc.relation.isreplacedby http://hdl.handle.net/11234/1-2898
dc.rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.source.uri http://ufal.mff.cuni.cz/udpipe
dc.subject tokenizer
dc.subject POS tagger
dc.subject lemmatization
dc.subject tagger
dc.subject parser
dc.subject dependency parser
dc.title Universal Dependencies 2.0 Models for UDPipe (2017-08-01)
dc.type toolService
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent true
metashare.ResourceInfo#ContentInfo.detailedType tool
dc.rights.label PUB
has.files yes
branding LINDAT / CLARIAH-CZ
demo.uri http://lindat.mff.cuni.cz/services/udpipe/
contact.person Milan Straka straka@ufal.mff.cuni.cz Charles University, UFAL
sponsor Ministerstvo školství, mládeže a tělovýchovy České republiky LM2015071 LINDAT/CLARIN: Institut pro analýzu, zpracování a distribuci lingvistických dat nationalFunds
sponsor Univerzita Karlova (mimo GAUK) SVV 260 453 Specifický vysokoškolský výzkum nationalFunds
sponsor Ministerstvo školství, mládeže a tělovýchovy České republiky CZ.02.1.01/0.0/0.0/16_013/0001781 LINDAT/CLARIN - Výzkumná infrastruktura pro jazykové technologie - rozšíření repozitáře a výpočetní kapacity nationalFunds
sponsor Ministerstvo školství, mládeže a tělovýchovy České republiky CZ.02.2.69/0.0/0.0/16_018/0002373 Modernizace oboru Matematická lingvistika nationalFunds
files.size 1053800066
files.count 2
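
The description states that the models require a UDPipe binary of version 1.2 or later. As a minimal sketch of what a run might look like, the Python snippet below builds a UDPipe 1 command line that tokenizes, tags and parses a plain-text file with one of the models. It assumes the binary is installed under the name `udpipe` and is on the PATH; the input file name `input.txt` and both helper functions are hypothetical and serve only as illustration.

```python
import shutil
import subprocess


def build_udpipe_command(model_path, input_path, binary="udpipe"):
    """Return the argument list for a tokenize + tag + parse run."""
    return [
        binary,
        "--tokenize",
        "--tag",
        "--parse",
        str(model_path),
        str(input_path),
    ]


def run_udpipe(model_path, input_path, binary="udpipe"):
    """Run UDPipe and return its standard output as text.

    Returns None when the binary cannot be found on the PATH
    (assumption: the executable is called `udpipe`).
    """
    if shutil.which(binary) is None:
        return None
    result = subprocess.run(
        build_udpipe_command(model_path, input_path, binary),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, `run_udpipe("english-ud-2.0-170801.udpipe", "input.txt")` would return the annotated text, or `None` if no `udpipe` executable is available.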


 Files in this item

 Download all files in item (1004.98 MB)
Name udpipe-ud-2.0-170801.zip
Size 896.89 MB
Format application/zip
Description Universal Dependencies 2.0 Models for UDPipe
MD5 44eb11fbb19daa4902675186f2858626
 File Preview  
  • udpipe-ud-2.0-170801
    • LICENSE (20 kB)
    • swedish-ud-2.0-170801.udpipe (7 MB)
    • gothic-ud-2.0-170801.udpipe (6 MB)
    • bulgarian-ud-2.0-170801.udpipe (13 MB)
    • english-ud-2.0-170801.udpipe (15 MB)
    • portuguese-ud-2.0-170801.udpipe (18 MB)
    • ancient_greek-ud-2.0-170801.udpipe (17 MB)
    • vietnamese-ud-2.0-170801.udpipe (4 MB)
    • japanese-ud-2.0-170801.udpipe (12 MB)
    • greek-ud-2.0-170801.udpipe (7 MB)
    • persian-ud-2.0-170801.udpipe (11 MB)
    • slovenian-sst-ud-2.0-170801.udpipe (4 MB)
    • french-partut-ud-2.0-170801.udpipe (3 MB)
    • ancient_greek-proiel-ud-2.0-170801.udpipe (22 MB)
    • latin-ittb-ud-2.0-170801.udpipe (17 MB)
    • swedish-lines-ud-2.0-170801.udpipe (8 MB)
    • indonesian-ud-2.0-170801.udpipe (9 MB)
    • danish-ud-2.0-170801.udpipe (9 MB)
    • english-partut-ud-2.0-170801.udpipe (5 MB)
    • galician-ud-2.0-170801.udpipe (8 MB)
    • coptic-ud-2.0-170801.udpipe (2 MB)
    • finnish-ftb-ud-2.0-170801.udpipe (19 MB)
    • hebrew-ud-2.0-170801.udpipe (14 MB)
    • sanskrit-ud-2.0-170801.udpipe (2 MB)
    • kazakh-ud-2.0-170801.udpipe (1 MB)
    • finnish-ud-2.0-170801.udpipe (23 MB)
    • norwegian-bokmaal-ud-2.0-170801.udpipe (19 MB)
    • czech-cltt-ud-2.0-170801.udpipe (3 MB)
    • tamil-ud-2.0-170801.udpipe (2 MB)
    • ukrainian-ud-2.0-170801.udpipe (4 MB)
    • czech-cac-ud-2.0-170801.udpipe (30 MB)
    • old_church_slavonic-ud-2.0-170801.udpipe (7 MB)
    • README (28 kB)
    • norwegian-nynorsk-ud-2.0-170801.udpipe (19 MB)
    • irish-ud-2.0-170801.udpipe (3 MB)
    • dutch-lassysmall-ud-2.0-170801.udpipe (8 MB)
    • chinese-ud-2.0-170801.udpipe (13 MB)
    • italian-ud-2.0-170801.udpipe (14 MB)
    • latin-ud-2.0-170801.udpipe (6 MB)
    • russian-syntagrus-ud-2.0-170801.udpipe (42 MB)
    • arabic-ud-2.0-170801.udpipe (17 MB)
    • spanish-ud-2.0-170801.udpipe (27 MB)
    • french-ud-2.0-170801.udpipe (20 MB)
    • urdu-ud-2.0-170801.udpipe (15 MB)
    • portuguese-br-ud-2.0-170801.udpipe (13 MB)
    • turkish-ud-2.0-170801.udpipe (9 MB)
    • estonian-ud-2.0-170801.udpipe (7 MB)
    • hindi-ud-2.0-170801.udpipe (24 MB)
    • romanian-ud-2.0-170801.udpipe (14 MB)
    • polish-ud-2.0-170801.udpipe (12 MB)
    • russian-ud-2.0-170801.udpipe (12 MB)
    • croatian-ud-2.0-170801.udpipe (20 MB)
    • german-ud-2.0-170801.udpipe (25 MB)
    • galician-treegal-ud-2.0-170801.udpipe (3 MB)
    • slovak-ud-2.0-170801.udpipe (17 MB)
    • slovenian-ud-2.0-170801.udpipe (16 MB)
    • korean-ud-2.0-170801.udpipe (13 MB)
    • dutch-ud-2.0-170801.udpipe (19 MB)
    • latin-proiel-ud-2.0-170801.udpipe (18 MB)
    • french-sequoia-ud-2.0-170801.udpipe (5 MB)
    • czech-ud-2.0-170801.udpipe (53 MB)
    • catalan-ud-2.0-170801.udpipe (18 MB)
    • lithuanian-ud-2.0-170801.udpipe (2 MB)
    • basque-ud-2.0-170801.udpipe (13 MB)
    • spanish-ancora-ud-2.0-170801.udpipe (19 MB)
    • hungarian-ud-2.0-170801.udpipe (7 MB)
    • uyghur-ud-2.0-170801.udpipe (1 MB)
    • latvian-ud-2.0-170801.udpipe (9 MB)
    • english-lines-ud-2.0-170801.udpipe (6 MB)
    • README.html (57 kB)
    • belarusian-ud-2.0-170801.udpipe (2 MB)
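
The model file names in this archive appear to follow a pattern of language, optional treebank variant, UD version and a six-digit release stamp. The sketch below splits a file name into those parts so that models can be selected programmatically. The pattern is inferred from the listing rather than taken from the UDPipe documentation, and the names `ModelName` and `parse_model_name` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class ModelName(NamedTuple):
    language: str            # e.g. "czech" or "ancient_greek"
    variant: Optional[str]   # e.g. "cac", or None when absent
    ud_version: str          # e.g. "2.0"
    release: str             # six-digit stamp, e.g. "170801"


# Assumed pattern: <language>[-<variant>]-ud-<version>-<release>.udpipe
_MODEL_RE = re.compile(
    r"^(?P<language>[a-z_]+)"
    r"(?:-(?P<variant>[a-z]+))?"
    r"-ud-(?P<ud_version>\d+\.\d+)"
    r"-(?P<release>\d{6})\.udpipe$"
)


def parse_model_name(filename: str) -> Optional[ModelName]:
    """Return the parts of a model file name, or None if it does not match."""
    match = _MODEL_RE.match(filename)
    if match is None:
        return None
    return ModelName(
        language=match.group("language"),
        variant=match.group("variant"),
        ud_version=match.group("ud_version"),
        release=match.group("release"),
    )
```

Non-model entries such as `README` or `LICENSE` do not match the pattern, so `parse_model_name` returns `None` for them.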
Name udpipe-ud-2.0-170801-reproducible_training.zip
Size 108.09 MB
Format application/zip
Description Scripts, embeddings and other data required for reproducible training of UD 2.0 Models for UDPipe
MD5 24affd2bf9569fc1743ccf9687b17dcc
 File Preview  
  • udpipe-ud-2.0-170801-reproducible_training
    • ud-2.0-raw-texts
      • da.txt (488 kB)
      • README (135 B)
      • sl_sst.txt (488 kB)
      • fi_ftb.txt (488 kB)
      • grc_proiel.README (83 B)
      • la_proiel.txt (488 kB)
      • grc_proiel.txt (488 kB)
    • README.txt (629 B)
    • models-ud-2.0
      • params_tagger (30 kB)
      • params_parser (24 kB)
      • train.sh (747 B)
      • binaries.sh (65 B)
      • train_all.sh (193 B)
      • params_tokenizer (8 kB)
    • ud-2.0-embeddings
      • eu.skip.forms.50.vectors (3 MB)
      • ar.skip.forms.50.vectors (5 MB)
      • cop.skip.forms.50.vectors (211 kB)
      • vi.skip.forms.50.vectors (984 kB)
      • pt_br.skip.forms.50.vectors (5 MB)
      • pl.skip.forms.50.vectors (2 MB)
      • hu.skip.forms.50.vectors (1 MB)
      • kk.skip.forms.50.vectors (25 kB)
      • gen_all.sh (80 B)
      • en.skip.forms.50.vectors (4 MB)
      • grc_proiel.skip.forms.50.vectors (5 MB)
      • sl.skip.forms.50.vectors (4 MB)
      • da.skip.forms.50.vectors (2 MB)
      • bg.skip.forms.50.vectors (4 MB)
      • es.skip.forms.50.vectors (8 MB)
      • ru.skip.forms.50.vectors (3 MB)
      • en_partut.skip.forms.50.vectors (1 MB)
      • hi.skip.forms.50.vectors (4 MB)
      • uk.skip.forms.50.vectors (449 kB)
      • fi_ftb.skip.forms.50.vectors (4 MB)
      • nl_lassysmall.skip.forms.50.vectors (2 MB)
      • sv.skip.forms.50.vectors (2 MB)
      • gen.sh (339 B)
      • tr.skip.forms.50.vectors (1 MB)
      • ga.skip.forms.50.vectors (446 kB)
      • la_ittb.skip.forms.50.vectors (3 MB)
      • es_ancora.skip.forms.50.vectors (8 MB)
      • el.skip.forms.50.vectors (1 MB)
      • pt.skip.forms.50.vectors (5 MB)
      • ta.skip.forms.50.vectors (452 kB)
      • sl_sst.skip.forms.50.vectors (556 kB)
      • be.skip.forms.50.vectors (255 kB)
      • fr_partut.skip.forms.50.vectors (639 kB)
      • ja.skip.forms.50.vectors (4 MB)
      • binaries.sh (137 B)
      • zh.skip.forms.50.vectors (3 MB)
      • lv.skip.forms.50.vectors (1 MB)
      • cs_cltt.skip.forms.50.vectors (870 kB)
      • la_proiel.skip.forms.50.vectors (4 MB)
      • it.skip.forms.50.vectors (6 MB)
      • gl_treegal.skip.forms.50.vectors (529 kB)
      • ca.skip.forms.50.vectors (7 MB)
      • sv_lines.skip.forms.50.vectors (1 MB)
      • cs_cac.skip.forms.50.vectors (13 MB)
      • fr.skip.forms.50.vectors (7 MB)
      • ug.skip.forms.50.vectors (96 kB)
      • lt.skip.forms.50.vectors (175 kB)
      • got.skip.forms.50.vectors (1 MB)
      • fa.skip.forms.50.vectors (3 MB)
      • en_lines.skip.forms.50.vectors (1 MB)
      • sa.skip.forms.50.vectors (50 kB)
      • cu.skip.forms.50.vectors (1 MB)
      • he.skip.forms.50.vectors (3 MB)
      • et.skip.forms.50.vectors (1 MB)
      • no_bokmaal.skip.forms.50.vectors (5 MB)
      • ro.skip.forms.50.vectors (5 MB)
      • grc.skip.forms.50.vectors (5 MB)
      • nl.skip.forms.50.vectors (4 MB)
      • sk.skip.forms.50.vectors (3 MB)
      • de.skip.forms.50.vectors (7 MB)
      • la.skip.forms.50.vectors (740 kB)
      • ru_syntagrus.skip.forms.50.vectors (22 MB)
      • cs.skip.forms.50.vectors (27 MB)
      • ko.skip.forms.50.vectors (2 MB)
      • fi.skip.forms.50.vectors (6 MB)
      • gl.skip.forms.50.vectors (2 MB)
      • id.skip.forms.50.vectors (3 MB)
      • ar_nyuad.skip.forms.50.vectors (960 B)
      • no_nynorsk.skip.forms.50.vectors (5 MB)
      • hr.skip.forms.50.vectors (6 MB)
      • fr_sequoia.skip.forms.50.vectors (1 MB)
      • ur.skip.forms.50.vectors (2 MB)
    • ud-2.0
      • conllu_resplit.pl (1 kB)
      • get.sh (1 kB)
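
The MD5 checksums listed for the two archives can be used to confirm that a download completed without corruption. A minimal sketch using Python's standard `hashlib` module follows. The checksum values are copied from this record; the helper names `md5_of_file` and `verify_download` are hypothetical, and the sketch assumes the archives keep their original file names after download.

```python
import hashlib
from pathlib import Path

# Checksums as listed in this record, keyed by archive file name.
EXPECTED_MD5 = {
    "udpipe-ud-2.0-170801.zip": "44eb11fbb19daa4902675186f2858626",
    "udpipe-ud-2.0-170801-reproducible_training.zip": "24affd2bf9569fc1743ccf9687b17dcc",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file's MD5 matches the value listed for its name.

    Raises KeyError when the file name is not one of the two archives.
    """
    path = Path(path)
    expected = EXPECTED_MD5[path.name]
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use low, which matters for the larger archive of roughly 900 MB.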
