This is not the latest version of this item. The latest version can be found here.
Universal Dependencies 2.0 Models for UDPipe (2017-08-01)
Please use the following text to cite this item:
Straka, Milan and Straková, Jana, 2017,
Universal Dependencies 2.0 Models for UDPipe (2017-08-01), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-2364.
Date issued
2017-08-01
Description
Tokenizer, POS tagger, lemmatizer and parser models for all 50 languages of the Universal Dependencies 2.0 treebanks, created solely using UD 2.0 data (http://hdl.handle.net/11234/1-1983). The model documentation at http://ufal.mff.cuni.cz/udpipe/users-manual#universal_dependencies_20_models includes performance figures.
Using these models requires the UDPipe binary, version 1.2 or later, which can be downloaded from http://ufal.mff.cuni.cz/udpipe and is not included in this item.
In addition to the models themselves, all additional data and the hyperparameter values used for training are available in the second archive, allowing reproducible training.
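As a rough illustration of how one of these models might be applied, the sketch below builds a UDPipe command line that tokenizes, tags and parses a plain-text file. It assumes a `udpipe` binary (version 1.2 or later) is on the `PATH`; the helper names `build_udpipe_command` and `run_udpipe` and the file name `input.txt` are hypothetical and not part of this item.

```python
import shutil
import subprocess
from pathlib import Path


def build_udpipe_command(model_path, input_path, binary="udpipe"):
    """Return the argv list for tokenizing, tagging and parsing one text file."""
    return [binary, "--tokenize", "--tag", "--parse", str(model_path), str(input_path)]


def run_udpipe(model_path, input_path, binary="udpipe"):
    """Run UDPipe and return its output, or None if the binary is not installed."""
    if shutil.which(binary) is None:
        # Assumption: the binary is looked up on PATH under the given name.
        return None
    result = subprocess.run(
        build_udpipe_command(model_path, input_path, binary),
        capture_output=True,
        encoding="utf-8",
        check=True,
    )
    return result.stdout


# Hypothetical input file; the model name is taken from the file list of this item.
command = build_udpipe_command(Path("english-ud-2.0-170801.udpipe"), Path("input.txt"))
print(" ".join(command))
```

The command is built separately from its execution so that it can be inspected or logged before anything is run.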
Acknowledgement
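Ministerstvo školství, mládeže a tělovýchovy České republiky (Ministry of Education, Youth and Sports of the Czech Republic)
Project code: LM2015071
Project name: LINDAT/CLARIN: Institut pro analýzu, zpracování a distribuci lingvistických dat (LINDAT/CLARIN: Institute for Analysis, Processing and Distribution of Linguistic Data)
Univerzita Karlova (mimo GAUK) (Charles University, outside GAUK)
Project code: SVV 260 453
Project name: Specifický vysokoškolský výzkum (Specific University Research)
Ministerstvo školství, mládeže a tělovýchovy České republiky (Ministry of Education, Youth and Sports of the Czech Republic)
Project code: CZ.02.1.01/0.0/0.0/16_013/0001781
Project name: LINDAT/CLARIN - Výzkumná infrastruktura pro jazykové technologie - rozšíření repozitáře a výpočetní kapacity (LINDAT/CLARIN - Research Infrastructure for Language Technologies - Extension of the Repository and Computing Capacity)
Ministerstvo školství, mládeže a tělovýchovy České republiky (Ministry of Education, Youth and Sports of the Czech Republic)
Project code: CZ.02.2.69/0.0/0.0/16_018/0002373
Project name: Modernizace oboru Matematická lingvistika (Modernization of the Mathematical Linguistics field of study)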
This item is Publicly Available
and licensed under:
Files in this item
- Name: udpipe-ud-2.0-170801-reproducible_training.zip
- Size: 108.09 MB
- Format: application/zip
- Description: Scripts, embeddings and other data required for reproducible training of UD 2.0 Models for UDPipe
- MD5: 24affd2bf9569fc1743ccf9687b17dcc

- udpipe-ud-2.0-170801-reproducible_training
- ud-2.0-raw-texts
- da.txt (488 kB)
- sl_sst.txt (488 kB)
- README (135 B)
- fi_ftb.txt (488 kB)
- grc_proiel.README (83 B)
- la_proiel.txt (488 kB)
- grc_proiel.txt (488 kB)
- README.txt (629 B)
- ud-2.0-embeddings
- eu.skip.forms.50.vectors (3 MB)
- ar.skip.forms.50.vectors (5 MB)
- cop.skip.forms.50.vectors (211 kB)
- vi.skip.forms.50.vectors (984 kB)
- pt_br.skip.forms.50.vectors (5 MB)
- pl.skip.forms.50.vectors (2 MB)
- hu.skip.forms.50.vectors (1 MB)
- kk.skip.forms.50.vectors (25 kB)
- gen_all.sh (80 B)
- en.skip.forms.50.vectors (4 MB)
- grc_proiel.skip.forms.50.vectors (5 MB)
- sl.skip.forms.50.vectors (4 MB)
- da.skip.forms.50.vectors (2 MB)
- bg.skip.forms.50.vectors (4 MB)
- es.skip.forms.50.vectors (8 MB)
- ru.skip.forms.50.vectors (3 MB)
- en_partut.skip.forms.50.vectors (1 MB)
- hi.skip.forms.50.vectors (4 MB)
- uk.skip.forms.50.vectors (449 kB)
- fi_ftb.skip.forms.50.vectors (4 MB)
- nl_lassysmall.skip.forms.50.vectors (2 MB)
- sv.skip.forms.50.vectors (2 MB)
- gen.sh (339 B)
- tr.skip.forms.50.vectors (1 MB)
- ga.skip.forms.50.vectors (446 kB)
- la_ittb.skip.forms.50.vectors (3 MB)
- es_ancora.skip.forms.50.vectors (8 MB)
- el.skip.forms.50.vectors (1 MB)
- pt.skip.forms.50.vectors (5 MB)
- ta.skip.forms.50.vectors (452 kB)
- sl_sst.skip.forms.50.vectors (556 kB)
- be.skip.forms.50.vectors (255 kB)
- fr_partut.skip.forms.50.vectors (639 kB)
- ja.skip.forms.50.vectors (4 MB)
- binaries.sh (137 B)
- lv.skip.forms.50.vectors (1 MB)
- zh.skip.forms.50.vectors (3 MB)
- cs_cltt.skip.forms.50.vectors (870 kB)
- la_proiel.skip.forms.50.vectors (4 MB)
- it.skip.forms.50.vectors (6 MB)
- gl_treegal.skip.forms.50.vectors (529 kB)
- ca.skip.forms.50.vectors (7 MB)
- sv_lines.skip.forms.50.vectors (1 MB)
- cs_cac.skip.forms.50.vectors (13 MB)
- fr.skip.forms.50.vectors (7 MB)
- ug.skip.forms.50.vectors (96 kB)
- lt.skip.forms.50.vectors (175 kB)
- got.skip.forms.50.vectors (1 MB)
- fa.skip.forms.50.vectors (3 MB)
- en_lines.skip.forms.50.vectors (1 MB)
- sa.skip.forms.50.vectors (50 kB)
- cu.skip.forms.50.vectors (1 MB)
- he.skip.forms.50.vectors (3 MB)
- et.skip.forms.50.vectors (1 MB)
- no_bokmaal.skip.forms.50.vectors (5 MB)
- ro.skip.forms.50.vectors (5 MB)
- grc.skip.forms.50.vectors (5 MB)
- nl.skip.forms.50.vectors (4 MB)
- sk.skip.forms.50.vectors (3 MB)
- de.skip.forms.50.vectors (7 MB)
- la.skip.forms.50.vectors (740 kB)
- ru_syntagrus.skip.forms.50.vectors (22 MB)
- cs.skip.forms.50.vectors (27 MB)
- ko.skip.forms.50.vectors (2 MB)
- fi.skip.forms.50.vectors (6 MB)
- gl.skip.forms.50.vectors (2 MB)
- id.skip.forms.50.vectors (3 MB)
- ar_nyuad.skip.forms.50.vectors (960 B)
- no_nynorsk.skip.forms.50.vectors (5 MB)
- hr.skip.forms.50.vectors (6 MB)
- fr_sequoia.skip.forms.50.vectors (1 MB)
- ur.skip.forms.50.vectors (2 MB)
- models-ud-2.0
- params_parser (24 kB)
- params_tagger (30 kB)
- train.sh (747 B)
- binaries.sh (65 B)
- train_all.sh (193 B)
- params_tokenizer (8 kB)
- ud-2.0
- conllu_resplit.pl (1 kB)
- get.sh (1 kB)
- ud-2.0-raw-texts
- Name: udpipe-ud-2.0-170801.zip
- Size: 896.89 MB
- Format: application/zip
- Description: Universal Dependencies 2.0 Models for UDPipe
- MD5: 44eb11fbb19daa4902675186f2858626
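The MD5 values listed for the two archives can be used to check a download for corruption. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `verify_archive` are hypothetical, and the archives are assumed to be saved under their original file names.

```python
import hashlib
from pathlib import Path

# MD5 checksums as listed for the two archives of this item.
EXPECTED_MD5 = {
    "udpipe-ud-2.0-170801-reproducible_training.zip": "24affd2bf9569fc1743ccf9687b17dcc",
    "udpipe-ud-2.0-170801.zip": "44eb11fbb19daa4902675186f2858626",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path):
    """Return True if the file's MD5 matches the value listed for its name."""
    path = Path(path)
    # Raises KeyError for file names that are not one of the two archives.
    return md5_of_file(path) == EXPECTED_MD5[path.name]
```

Reading in chunks keeps memory use low, which matters for the larger archive of roughly 900 MB.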

- udpipe-ud-2.0-170801
- LICENSE (20 kB)
- swedish-ud-2.0-170801.udpipe (7 MB)
- gothic-ud-2.0-170801.udpipe (6 MB)
- bulgarian-ud-2.0-170801.udpipe (13 MB)
- english-ud-2.0-170801.udpipe (15 MB)
- portuguese-ud-2.0-170801.udpipe (18 MB)
- ancient_greek-ud-2.0-170801.udpipe (17 MB)
- vietnamese-ud-2.0-170801.udpipe (4 MB)
- japanese-ud-2.0-170801.udpipe (12 MB)
- greek-ud-2.0-170801.udpipe (7 MB)
- persian-ud-2.0-170801.udpipe (11 MB)
- slovenian-sst-ud-2.0-170801.udpipe (4 MB)
- french-partut-ud-2.0-170801.udpipe (3 MB)
- ancient_greek-proiel-ud-2.0-170801.udpipe (22 MB)
- indonesian-ud-2.0-170801.udpipe (9 MB)
- latin-ittb-ud-2.0-170801.udpipe (17 MB)
- swedish-lines-ud-2.0-170801.udpipe (8 MB)
- danish-ud-2.0-170801.udpipe (9 MB)
- coptic-ud-2.0-170801.udpipe (2 MB)
- galician-ud-2.0-170801.udpipe (8 MB)
- english-partut-ud-2.0-170801.udpipe (5 MB)
- finnish-ftb-ud-2.0-170801.udpipe (19 MB)
- hebrew-ud-2.0-170801.udpipe (14 MB)
- sanskrit-ud-2.0-170801.udpipe (2 MB)
- kazakh-ud-2.0-170801.udpipe (1 MB)
- finnish-ud-2.0-170801.udpipe (23 MB)
- norwegian-bokmaal-ud-2.0-170801.udpipe (19 MB)
- czech-cltt-ud-2.0-170801.udpipe (3 MB)
- tamil-ud-2.0-170801.udpipe (2 MB)
- ukrainian-ud-2.0-170801.udpipe (4 MB)
- czech-cac-ud-2.0-170801.udpipe (30 MB)
- README (28 kB)
- old_church_slavonic-ud-2.0-170801.udpipe (7 MB)
- norwegian-nynorsk-ud-2.0-170801.udpipe (19 MB)
- irish-ud-2.0-170801.udpipe (3 MB)
- dutch-lassysmall-ud-2.0-170801.udpipe (8 MB)
- chinese-ud-2.0-170801.udpipe (13 MB)
- russian-syntagrus-ud-2.0-170801.udpipe (42 MB)
- latin-ud-2.0-170801.udpipe (6 MB)
- italian-ud-2.0-170801.udpipe (14 MB)
- arabic-ud-2.0-170801.udpipe (17 MB)
- spanish-ud-2.0-170801.udpipe (27 MB)
- french-ud-2.0-170801.udpipe (20 MB)
- urdu-ud-2.0-170801.udpipe (15 MB)
- portuguese-br-ud-2.0-170801.udpipe (13 MB)
- turkish-ud-2.0-170801.udpipe (9 MB)
- estonian-ud-2.0-170801.udpipe (7 MB)
- hindi-ud-2.0-170801.udpipe (24 MB)
- romanian-ud-2.0-170801.udpipe (14 MB)
- polish-ud-2.0-170801.udpipe (12 MB)
- russian-ud-2.0-170801.udpipe (12 MB)
- croatian-ud-2.0-170801.udpipe (20 MB)
- german-ud-2.0-170801.udpipe (25 MB)
- galician-treegal-ud-2.0-170801.udpipe (3 MB)
- slovak-ud-2.0-170801.udpipe (17 MB)
- slovenian-ud-2.0-170801.udpipe (16 MB)
- korean-ud-2.0-170801.udpipe (13 MB)
- dutch-ud-2.0-170801.udpipe (19 MB)
- french-sequoia-ud-2.0-170801.udpipe (5 MB)
- latin-proiel-ud-2.0-170801.udpipe (18 MB)
- czech-ud-2.0-170801.udpipe (53 MB)
- catalan-ud-2.0-170801.udpipe (18 MB)
- lithuanian-ud-2.0-170801.udpipe (2 MB)
- basque-ud-2.0-170801.udpipe (13 MB)
- spanish-ancora-ud-2.0-170801.udpipe (19 MB)
- hungarian-ud-2.0-170801.udpipe (7 MB)
- uyghur-ud-2.0-170801.udpipe (1 MB)
- latvian-ud-2.0-170801.udpipe (9 MB)
- english-lines-ud-2.0-170801.udpipe (6 MB)
- README.html (57 kB)
- belarusian-ud-2.0-170801.udpipe (2 MB)
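The model file names in the listing above appear to follow the pattern `<language>[-<treebank>]-ud-2.0-170801.udpipe`. This pattern is inferred from the listing only and is not a documented convention. Under that assumption, a small hypothetical helper (`parse_model_name`) can split a file name into its language and optional treebank parts:

```python
import re

# Assumed pattern, inferred from the file listing only:
# <language>[-<treebank>]-ud-2.0-170801.udpipe
MODEL_NAME = re.compile(
    r"(?P<language>[a-z_]+)(?:-(?P<treebank>[a-z]+))?-ud-2\.0-170801\.udpipe"
)


def parse_model_name(filename):
    """Return (language, treebank) for a model file name, or None if it does not match.

    The treebank is None for a language's default model.
    """
    match = MODEL_NAME.fullmatch(filename)
    if match is None:
        return None
    return match.group("language"), match.group("treebank")


for name in ["czech-ud-2.0-170801.udpipe", "ancient_greek-proiel-ud-2.0-170801.udpipe", "README"]:
    print(name, "->", parse_model_name(name))
```

Non-model files in the archive, such as `README` and `LICENSE`, do not match the pattern and yield `None`.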

