dc.contributor.author | Straka, Milan |
dc.date.accessioned | 2018-09-05T11:07:23Z |
dc.date.available | 2018-09-05T11:07:23Z |
dc.date.issued | 2018-05-02 |
dc.identifier.uri | http://hdl.handle.net/11234/1-2859 |
dc.description | Baseline UDPipe models for the CoNLL 2018 Shared Task in UD Parsing, and supplementary material. The models require UDPipe version 1.2 or later and are evaluated using the official evaluation script. For treebanks that provide no development data, the models were trained using a custom train-dev split. We also trained an additional "Mixed" model, which uses 200 sentences from each training set. All information needed to replicate the model training (hyperparameters, the modified train-dev split, and pre-computed word embeddings for the parser) is included in the archive. Additionally, we provide the UD 2.2 CoNLL 2018 training data with automatically predicted morphology. We apply the baseline models to the development data and perform 10-fold jack-knifing on the training data (each fold is predicted with a model trained on the remaining folds). |
dc.language.iso | mul |
dc.publisher | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
dc.rights | Licence Universal Dependencies v2.2 |
dc.rights.uri | https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.2 |
dc.source.uri | http://ufal.mff.cuni.cz/udpipe |
dc.subject | CoNLL 2018 |
dc.subject | tokenizer |
dc.subject | POS tagger |
dc.subject | lemmatization |
dc.subject | tagger |
dc.subject | parser |
dc.subject | dependency parser |
dc.subject | morphology |
dc.subject | treebank |
dc.title | CoNLL 2018 Shared Task - UDPipe Baseline Models and Supplementary Materials |
dc.type | languageDescription |
metashare.ResourceInfo#ContentInfo.mediaType | text |
metashare.ResourceInfo#ContentInfo.detailedType | mlmodel |
dc.rights.label | PUB |
has.files | yes |
branding | LINDAT / CLARIAH-CZ |
contact.person | Milan Straka straka@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
sponsor | Ministry of Education, Youth and Sports of the Czech Republic LM2015071 LINDAT/CLARIN: Institute for Analysis, Processing and Distribution of Linguistic Data nationalFunds |
files.size | 1391416896 |
files.count | 2 |
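The 10-fold jack-knifing mentioned in the description can be illustrated with a minimal sketch. It is not the authors' script: the round-robin fold assignment, the `jackknife` function, and the `train` callback are all hypothetical names and choices made for illustration, and the record does not say how the folds were actually formed.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

S = TypeVar("S")  # one sentence, in whatever representation the caller uses
P = TypeVar("P")  # the prediction produced for one sentence


def jackknife(
    sentences: Sequence[S],
    train: Callable[[List[S]], Callable[[S], P]],
    folds: int = 10,
) -> List[Optional[P]]:
    """Predict every sentence with a model trained only on the other folds.

    Assumption: sentence i belongs to fold i % folds (round-robin).
    """
    predictions: List[Optional[P]] = [None] * len(sentences)
    for k in range(folds):
        held_out = range(k, len(sentences), folds)  # indices of fold k
        held = set(held_out)
        # Training data for this fold is everything outside the fold.
        rest = [s for i, s in enumerate(sentences) if i not in held]
        model = train(rest)
        for i in held_out:
            predictions[i] = model(sentences[i])
    return predictions
```

Here `train` stands for any procedure that takes a list of training sentences and returns a prediction function. In the setting described above it would train a tagger on nine folds and annotate the tenth, so that no sentence is annotated by a model that saw it during training.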
Files in this item
- Name
- ud-2.2-conll18-baseline-models.tar.xz
- Size
- 1.2 GB
- Format
- application/x-xz
- Description
- Baseline UDPipe models for CoNLL 2018 Shared Task in UD Parsing.
- MD5
- 9db1c4e4eacb0ee5dcb259487c721a42
- training
- params_tagger (33 kB)
- params_parser (25 kB)
- README.txt (1 kB)
- embeddings
- ga_idt.skip.forms.50.vectors (485 kB)
- sr_set.skip.forms.50.vectors (2 MB)
- ru_taiga.skip.forms.50.vectors (364 kB)
- nl_alpino.skip.forms.50.vectors (4 MB)
- hsb_ufal.skip.forms.50.vectors (17 kB)
- ar_padt.skip.forms.50.vectors (5 MB)
- mix.skip.forms.50.vectors (10 MB)
- gl_ctg.skip.forms.50.vectors (2 MB)
- sv_talbanken.skip.forms.50.vectors (2 MB)
- sme_giella.skip.forms.50.vectors (703 kB)
- ja_gsd.skip.forms.50.vectors (4 MB)
- grc_proiel.skip.forms.50.vectors (5 MB)
- pl_sz.skip.forms.50.vectors (2 MB)
- fi_tdt.skip.forms.50.vectors (6 MB)
- no_nynorsklia.skip.forms.50.vectors (122 kB)
- fi_ftb.skip.forms.50.vectors (4 MB)
- nl_lassysmall.skip.forms.50.vectors (2 MB)
- de_gsd.skip.forms.50.vectors (7 MB)
- gen.sh (346 B)
- sl_ssj.skip.forms.50.vectors (4 MB)
- la_ittb.skip.forms.50.vectors (3 MB)
- es_ancora.skip.forms.50.vectors (8 MB)
- sl_sst.skip.forms.50.vectors (603 kB)
- ca_ancora.skip.forms.50.vectors (7 MB)
- fr_gsd.skip.forms.50.vectors (7 MB)
- fa_seraji.skip.forms.50.vectors (3 MB)
- grc_perseus.skip.forms.50.vectors (5 MB)
- got_proiel.skip.forms.50.vectors (1 MB)
- uk_iu.skip.forms.50.vectors (3 MB)
- pl_lfg.skip.forms.50.vectors (3 MB)
- la_proiel.skip.forms.50.vectors (5 MB)
- gl_treegal.skip.forms.50.vectors (566 kB)
- ur_udtb.skip.forms.50.vectors (2 MB)
- sv_lines.skip.forms.50.vectors (1 MB)
- eu_bdt.skip.forms.50.vectors (3 MB)
- cs_cac.skip.forms.50.vectors (13 MB)
- bg_btb.skip.forms.50.vectors (4 MB)
- ko_kaist.skip.forms.50.vectors (12 MB)
- la_perseus.skip.forms.50.vectors (809 kB)
- fro_srcmf.skip.forms.50.vectors (3 MB)
- ro_rrt.skip.forms.50.vectors (5 MB)
- zh_gsd.skip.forms.50.vectors (3 MB)
- hu_szeged.skip.forms.50.vectors (897 kB)
- cs_fictree.skip.forms.50.vectors (4 MB)
- af_afribooms.skip.forms.50.vectors (1 MB)
- vi_vtb.skip.forms.50.vectors (826 kB)
- pt_bosque.skip.forms.50.vectors (5 MB)
- sk_snk.skip.forms.50.vectors (3 MB)
- ko_gsd.skip.forms.50.vectors (2 MB)
- tr_imst.skip.forms.50.vectors (1 MB)
- hy_armtdp.skip.forms.50.vectors (28 kB)
- en_lines.skip.forms.50.vectors (1 MB)
- it_postwita.skip.forms.50.vectors (2 MB)
- cs_pdt.skip.forms.50.vectors (27 MB)
- id_gsd.skip.forms.50.vectors (3 MB)
- no_bokmaal.skip.forms.50.vectors (5 MB)
- he_htb.skip.forms.50.vectors (3 MB)
- bxr_bdt.skip.forms.50.vectors (7 kB)
- ug_udt.skip.forms.50.vectors (1007 kB)
- en_ewt.skip.forms.50.vectors (4 MB)
- el_gdt.skip.forms.50.vectors (1 MB)
- cu_proiel.skip.forms.50.vectors (1 MB)
- it_isdt.skip.forms.50.vectors (6 MB)
- ru_syntagrus.skip.forms.50.vectors (22 MB)
- lv_lvtb.skip.forms.50.vectors (3 MB)
- en_gum.skip.forms.50.vectors (1 MB)
- hr_set.skip.forms.50.vectors (5 MB)
- kmr_mg.skip.forms.50.vectors (14 kB)
- no_nynorsk.skip.forms.50.vectors (5 MB)
- fr_sequoia.skip.forms.50.vectors (1 MB)
- fr_spoken.skip.forms.50.vectors (510 kB)
- hi_hdtb.skip.forms.50.vectors (4 MB)
- et_edt.skip.forms.50.vectors (10 MB)
- kk_ktb.skip.forms.50.vectors (26 kB)
- da_ddt.skip.forms.50.vectors (2 MB)
- params_tokenizer (8 kB)
- training_data
- nl_alpino
- nl_alpino-ud-train.conllu (13 MB)
- nl_alpino-ud-dev.txt (62 kB)
- nl_alpino-ud-train.txt (1017 kB)
- nl_alpino-ud-dev.conllu (882 kB)
- fro_srcmf
- fro_srcmf-ud-dev.conllu (896 kB)
- fro_srcmf-ud-dev.txt (87 kB)
- fro_srcmf-ud-train.txt (681 kB)
- fro_srcmf-ud-train.conllu (6 MB)
- fi_ftb
- fi_ftb-ud-dev.txt (111 kB)
- fi_ftb-ud-train.txt (904 kB)
- fi_ftb-ud-train.conllu (9 MB)
- fi_ftb-ud-dev.conllu (1 MB)
- he_htb
- he_htb-ud-dev.txt (71 kB)
- he_htb-ud-train.conllu (9 MB)
- he_htb-ud-train.txt (833 kB)
- he_htb-ud-dev.conllu (838 kB)
- vi_vtb
- vi_vtb-ud-dev.conllu (522 kB)
- vi_vtb-ud-train.conllu (939 kB)
- vi_vtb-ud-train.txt (127 kB)
- vi_vtb-ud-dev.txt (69 kB)
- uk_iu
- uk_iu-ud-train.txt (739 kB)
- uk_iu-ud-dev.txt (106 kB)
- uk_iu-ud-dev.conllu (995 kB)
- uk_iu-ud-train.conllu (6 MB)
- fi_tdt
- fi_tdt-ud-train.conllu (11 MB)
- fi_tdt-ud-dev.txt (136 kB)
- fi_tdt-ud-train.txt (1 MB)
- fi_tdt-ud-dev.conllu (1 MB)
- pl_lfg
- pl_lfg-ud-dev.txt (74 kB)
- pl_lfg-ud-train.txt (596 kB)
- pl_lfg-ud-train.conllu (11 MB)
- pl_lfg-ud-dev.conllu (1 MB)
- ga_idt
- ga_idt-ud-train.conllu (792 kB)
- ga_idt-ud-dev.conllu (90 kB)
- ga_idt-ud-train.txt (64 kB)
- ga_idt-ud-dev.txt (7 kB)
- hi_hdtb
- hi_hdtb-ud-dev.txt (427 kB)
- hi_hdtb-ud-train.conllu (39 MB)
- hi_hdtb-ud-train.txt (3 MB)
- hi_hdtb-ud-dev.conllu (4 MB)
- no_nynorsk
- no_nynorsk-ud-train.conllu (14 MB)
- no_nynorsk-ud-train.txt (1 MB)
- no_nynorsk-ud-dev.txt (166 kB)
- no_nynorsk-ud-dev.conllu (1 MB)
- sl_sst
- sl_sst-ud-dev.txt (11 kB)
- sl_sst-ud-train.conllu (1 MB)
- sl_sst-ud-train.txt (91 kB)
- sl_sst-ud-dev.conllu (191 kB)
- tr_imst
- tr_imst-ud-dev.conllu (787 kB)
- tr_imst-ud-train.txt (246 kB)
- tr_imst-ud-train.conllu (2 MB)
- tr_imst-ud-dev.txt (63 kB)
- fa_seraji
- fa_seraji-ud-train.txt (995 kB)
- fa_seraji-ud-dev.conllu (989 kB)
- fa_seraji-ud-dev.txt (133 kB)
- fa_seraji-ud-train.conllu (7 MB)
- sk_snk
- sk_snk-ud-train.txt (447 kB)
- sk_snk-ud-train.conllu (6 MB)
- sk_snk-ud-dev.conllu (1 MB)
- sk_snk-ud-dev.txt (77 kB)
- sl_ssj
- sl_ssj-ud-train.txt (617 kB)
- sl_ssj-ud-dev.conllu (1 MB)
- sl_ssj-ud-dev.txt (79 kB)
- sl_ssj-ud-train.conllu (9 MB)
- grc_perseus
- grc_perseus-ud-train.conllu (14 MB)
- grc_perseus-ud-dev.txt (250 kB)
- grc_perseus-ud-train.txt (1 MB)
- grc_perseus-ud-dev.conllu (1 MB)
- sv_talbanken
- sv_talbanken-ud-train.conllu (5 MB)
- sv_talbanken-ud-train.txt (402 kB)
- sv_talbanken-ud-dev.txt (58 kB)
- sv_talbanken-ud-dev.conllu (834 kB)
- got_proiel
- got_proiel-ud-dev.txt (62 kB)
- got_proiel-ud-train.conllu (3 MB)
- got_proiel-ud-dev.conllu (928 kB)
- got_proiel-ud-train.txt (221 kB)
- eu_bdt
- eu_bdt-ud-train.txt (459 kB)
- eu_bdt-ud-dev.conllu (1 MB)
- eu_bdt-ud-train.conllu (4 MB)
- eu_bdt-ud-dev.txt (151 kB)
- bxr_bdt
- bxr_bdt-ud-dev.conllu (1 kB)
- bxr_bdt-ud-train.conllu (9 kB)
- bxr_bdt-ud-train.txt (1 kB)
- bxr_bdt-ud-dev.txt (192 B)
- ru_taiga
- ru_taiga-ud-dev.txt (13 kB)
- ru_taiga-ud-dev.conllu (113 kB)
- ru_taiga-ud-train.txt (82 kB)
- ru_taiga-ud-train.conllu (712 kB)
- ca_ancora
- ca_ancora-ud-train.conllu (26 MB)
- ca_ancora-ud-dev.conllu (3 MB)
- ca_ancora-ud-dev.txt (289 kB)
- ca_ancora-ud-train.txt (2 MB)
- hy_armtdp
- hy_armtdp-ud-dev.txt (1 kB)
- hy_armtdp-ud-dev.conllu (10 kB)
- hy_armtdp-ud-train.txt (7 kB)
- hy_armtdp-ud-train.conllu (67 kB)
- en_lines
- en_lines-ud-train.txt (239 kB)
- en_lines-ud-dev.txt (82 kB)
- en_lines-ud-dev.conllu (954 kB)
- en_lines-ud-train.conllu (2 MB)
- kk_ktb
- kk_ktb-ud-dev.conllu (6 kB)
- kk_ktb-ud-train.conllu (33 kB)
- kk_ktb-ud-train.txt (4 kB)
- kk_ktb-ud-dev.txt (995 B)
- ja_gsd
- ja_gsd-ud-dev.conllu (612 kB)
- ja_gsd-ud-dev.txt (57 kB)
- ja_gsd-ud-train.txt (802 kB)
- ja_gsd-ud-train.conllu (8 MB)
- conllu_to_text.pl (8 kB)
- sr_set
- sr_set-ud-dev.txt (58 kB)
- sr_set-ud-train.txt (384 kB)
- sr_set-ud-train.conllu (4 MB)
- sr_set-ud-dev.conllu (677 kB)
- et_edt
- et_edt-ud-dev.conllu (2 MB)
- et_edt-ud-dev.txt (237 kB)
- et_edt-ud-train.txt (1 MB)
- et_edt-ud-train.conllu (18 MB)
- sv_lines
- sv_lines-ud-dev.txt (91 kB)
- sv_lines-ud-train.txt (265 kB)
- sv_lines-ud-train.conllu (3 MB)
- sv_lines-ud-dev.conllu (1 MB)
- it_postwita
- it_postwita-ud-train.conllu (5 MB)
- it_postwita-ud-train.txt (541 kB)
- it_postwita-ud-dev.conllu (754 kB)
- it_postwita-ud-dev.txt (66 kB)
- en_ewt
- en_ewt-ud-train.txt (985 kB)
- en_ewt-ud-dev.txt (123 kB)
- en_ewt-ud-dev.conllu (1 MB)
- en_ewt-ud-train.conllu (12 MB)
- get.sh (780 B)
- la_perseus
- la_perseus-ud-train.conllu (1 MB)
- la_perseus-ud-dev.txt (10 kB)
- la_perseus-ud-dev.conllu (157 kB)
- la_perseus-ud-train.txt (94 kB)
- pl_sz
- pl_sz-ud-train.conllu (5 MB)
- pl_sz-ud-train.txt (383 kB)
- pl_sz-ud-dev.conllu (942 kB)
- pl_sz-ud-dev.txt (62 kB)
- fr_spoken
- fr_spoken-ud-dev.txt (50 kB)
- fr_spoken-ud-dev.conllu (413 kB)
- fr_spoken-ud-train.conllu (615 kB)
- fr_spoken-ud-train.txt (77 kB)
- af_afribooms
- af_afribooms-ud-dev.conllu (321 kB)
- af_afribooms-ud-train.txt (195 kB)
- af_afribooms-ud-dev.txt (30 kB)
- af_afribooms-ud-train.conllu (2 MB)
- fr_gsd
- fr_gsd-ud-train.txt (1 MB)
- fr_gsd-ud-dev.conllu (2 MB)
- fr_gsd-ud-dev.txt (184 kB)
- fr_gsd-ud-train.conllu (21 MB)
- zh_gsd
- zh_gsd-ud-train.conllu (5 MB)
- zh_gsd-ud-dev.txt (53 kB)
- zh_gsd-ud-dev.conllu (673 kB)
- zh_gsd-ud-train.txt (411 kB)
- nl_lassysmall
- nl_lassysmall-ud-dev.conllu (837 kB)
- nl_lassysmall-ud-train.conllu (5 MB)
- nl_lassysmall-ud-train.txt (419 kB)
- nl_lassysmall-ud-dev.txt (61 kB)
- gl_treegal
- gl_treegal-ud-dev.txt (7 kB)
- gl_treegal-ud-train.conllu (931 kB)
- gl_treegal-ud-train.txt (69 kB)
- gl_treegal-ud-dev.conllu (98 kB)
- sme_giella
- sme_giella-ud-dev.txt (17 kB)
- sme_giella-ud-dev.conllu (172 kB)
- sme_giella-ud-train.txt (89 kB)
- sme_giella-ud-train.conllu (978 kB)
- la_proiel
- la_proiel-ud-dev.conllu (1 MB)
- la_proiel-ud-train.conllu (15 MB)
- la_proiel-ud-dev.txt (87 kB)
- la_proiel-ud-train.txt (1 MB)
- id_gsd
- id_gsd-ud-dev.conllu (956 kB)
- id_gsd-ud-dev.txt (74 kB)
- id_gsd-ud-train.conllu (7 MB)
- id_gsd-ud-train.txt (575 kB)
- es_ancora
- es_ancora-ud-dev.conllu (3 MB)
- es_ancora-ud-dev.txt (275 kB)
- es_ancora-ud-train.txt (2 MB)
- es_ancora-ud-train.conllu (28 MB)
- hr_set
- hr_set-ud-dev.conllu (1 MB)
- hr_set-ud-train.txt (904 kB)
- hr_set-ud-dev.txt (115 kB)
- hr_set-ud-train.conllu (10 MB)
- de_gsd
- de_gsd-ud-dev.txt (72 kB)
- de_gsd-ud-train.conllu (18 MB)
- de_gsd-ud-train.txt (1 MB)
- de_gsd-ud-dev.conllu (862 kB)
- ur_udtb
- ur_udtb-ud-dev.conllu (1 MB)
- ur_udtb-ud-train.conllu (11 MB)
- ur_udtb-ud-train.txt (853 kB)
- ur_udtb-ud-dev.txt (115 kB)
- ar_padt
- ar_padt-ud-dev.conllu (5 MB)
- ar_padt-ud-train.txt (1 MB)
- ar_padt-ud-dev.txt (241 kB)
- ar_padt-ud-train.conllu (38 MB)
- ro_rrt
- ro_rrt-ud-train.txt (1 MB)
- ro_rrt-ud-dev.txt (98 kB)
- ro_rrt-ud-dev.conllu (1 MB)
- ro_rrt-ud-train.conllu (13 MB)
- cu_proiel
- cu_proiel-ud-train.conllu (3 MB)
- cu_proiel-ud-dev.conllu (1 MB)
- cu_proiel-ud-train.txt (369 kB)
- cu_proiel-ud-dev.txt (97 kB)
- cs_fictree
- cs_fictree-ud-dev.conllu (1 MB)
- cs_fictree-ud-train.conllu (13 MB)
- cs_fictree-ud-dev.txt (86 kB)
- cs_fictree-ud-train.txt (696 kB)
- mix.sh (618 B)
- fr_sequoia
- fr_sequoia-ud-train.txt (267 kB)
- fr_sequoia-ud-train.conllu (3 MB)
- fr_sequoia-ud-dev.txt (52 kB)
- fr_sequoia-ud-dev.conllu (614 kB)
- ko_kaist
- ko_kaist-ud-dev.txt (235 kB)
- ko_kaist-ud-dev.conllu (1 MB)
- ko_kaist-ud-train.conllu (17 MB)
- ko_kaist-ud-train.txt (2 MB)
- en_gum
- en_gum-ud-dev.txt (65 kB)
- en_gum-ud-dev.conllu (734 kB)
- en_gum-ud-train.txt (267 kB)
- en_gum-ud-train.conllu (2 MB)
- no_bokmaal
- no_bokmaal-ud-train.conllu (14 MB)
- no_bokmaal-ud-dev.txt (195 kB)
- no_bokmaal-ud-dev.conllu (2 MB)
- no_bokmaal-ud-train.txt (1 MB)
- gl_ctg
- gl_ctg-ud-dev.conllu (1 MB)
- gl_ctg-ud-dev.txt (155 kB)
- gl_ctg-ud-train.conllu (4 MB)
- gl_ctg-ud-train.txt (413 kB)
- la_ittb
- la_ittb-ud-dev.conllu (949 kB)
- la_ittb-ud-train.txt (1 MB)
- la_ittb-ud-train.conllu (23 MB)
- la_ittb-ud-dev.txt (59 kB)
- ru_syntagrus
- ru_syntagrus-ud-dev.conllu (10 MB)
- ru_syntagrus-ud-train.txt (9 MB)
- ru_syntagrus-ud-train.conllu (77 MB)
- ru_syntagrus-ud-dev.txt (1 MB)
- hu_szeged
- hu_szeged-ud-dev.conllu (983 kB)
- hu_szeged-ud-train.conllu (1 MB)
- hu_szeged-ud-dev.txt (83 kB)
- hu_szeged-ud-train.txt (137 kB)
- it_isdt
- it_isdt-ud-dev.conllu (737 kB)
- it_isdt-ud-dev.txt (59 kB)
- it_isdt-ud-train.conllu (16 MB)
- it_isdt-ud-train.txt (1 MB)
- el_gdt
- el_gdt-ud-dev.conllu (932 kB)
- el_gdt-ud-train.conllu (3 MB)
- el_gdt-ud-train.txt (440 kB)
- el_gdt-ud-dev.txt (107 kB)
- grc_proiel
- grc_proiel-ud-dev.conllu (1 MB)
- grc_proiel-ud-train.conllu (19 MB)
- grc_proiel-ud-train.txt (2 MB)
- grc_proiel-ud-dev.txt (159 kB)
- da_ddt
- da_ddt-ud-train.conllu (4 MB)
- da_ddt-ud-dev.conllu (643 kB)
- da_ddt-ud-train.txt (423 kB)
- da_ddt-ud-dev.txt (54 kB)
- ug_udt
- ug_udt-ud-train.conllu (1 MB)
- ug_udt-ud-dev.txt (119 kB)
- ug_udt-ud-train.txt (219 kB)
- ug_udt-ud-dev.conllu (886 kB)
- ko_gsd
- ko_gsd-ud-train.txt (485 kB)
- ko_gsd-ud-train.conllu (3 MB)
- ko_gsd-ud-dev.conllu (709 kB)
- ko_gsd-ud-dev.txt (103 kB)
- cs_cac
- cs_cac-ud-dev.txt (72 kB)
- cs_cac-ud-dev.conllu (1 MB)
- cs_cac-ud-train.conllu (50 MB)
- cs_cac-ud-train.txt (2 MB)
- cs_pdt
- cs_pdt-ud-train.txt (7 MB)
- cs_pdt-ud-dev.txt (989 kB)
- cs_pdt-ud-train.conllu (125 MB)
- cs_pdt-ud-dev.conllu (17 MB)
- mix
- mix-ud-train.conllu (17 MB)
- mix-ud-dev.conllu (2 MB)
- mix-ud-dev.txt (181 kB)
- mix-ud-train.txt (1 MB)
- conllu_split.pl (993 B)
- kmr_mg
- kmr_mg-ud-dev.conllu (919 B)
- kmr_mg-ud-train.conllu (16 kB)
- kmr_mg-ud-dev.txt (52 B)
- kmr_mg-ud-train.txt (1 kB)
- hsb_ufal
- hsb_ufal-ud-dev.txt (452 B)
- hsb_ufal-ud-train.conllu (26 kB)
- hsb_ufal-ud-dev.conllu (5 kB)
- hsb_ufal-ud-train.txt (2 kB)
- bg_btb
- bg_btb-ud-dev.txt (155 kB)
- bg_btb-ud-dev.conllu (1 MB)
- bg_btb-ud-train.txt (1 MB)
- bg_btb-ud-train.conllu (10 MB)
- iso_names.txt (1 kB)
- lv_lvtb
- lv_lvtb-ud-train.txt (499 kB)
- lv_lvtb-ud-dev.txt (86 kB)
- lv_lvtb-ud-train.conllu (8 MB)
- lv_lvtb-ud-dev.conllu (1 MB)
- no_nynorsklia
- no_nynorsklia-ud-train.txt (12 kB)
- no_nynorsklia-ud-dev.conllu (34 kB)
- no_nynorsklia-ud-train.conllu (166 kB)
- no_nynorsklia-ud-dev.txt (2 kB)
- pt_bosque
- pt_bosque-ud-train.conllu (13 MB)
- pt_bosque-ud-train.txt (1020 kB)
- pt_bosque-ud-dev.txt (52 kB)
- pt_bosque-ud-dev.conllu (717 kB)
- nl_alpino
- models
- arabic-padt-ud-2.2-conll18-180430.udpipe (18 MB)
- latin-perseus-ud-2.2-conll18-180430.udpipe (4 MB)
- spanish-ancora-ud-2.2-conll18-180430.udpipe (20 MB)
- czech-fictree-ud-2.2-conll18-180430.udpipe (15 MB)
- old_church_slavonic-proiel-ud-2.2-conll18-180430.udpipe (6 MB)
- kazakh-ktb-ud-2.2-conll18-180430.udpipe (1 MB)
- italian-postwita-ud-2.2-conll18-180430.udpipe (11 MB)
- greek-gdt-ud-2.2-conll18-180430.udpipe (6 MB)
- hungarian-szeged-ud-2.2-conll18-180430.udpipe (5 MB)
- swedish-talbanken-ud-2.2-conll18-180430.udpipe (8 MB)
- polish-lfg-ud-2.2-conll18-180430.udpipe (15 MB)
- czech-pdt-ud-2.2-conll18-180430.udpipe (53 MB)
- kurmanji-mg-ud-2.2-conll18-180430.udpipe (1 MB)
- swedish-lines-ud-2.2-conll18-180430.udpipe (6 MB)
- norwegian-nynorsk-ud-2.2-conll18-180430.udpipe (16 MB)
- ancient_greek-perseus-ud-2.2-conll18-180430.udpipe (17 MB)
- latin-ittb-ud-2.2-conll18-180430.udpipe (17 MB)
- upper_sorbian-ufal-ud-2.2-conll18-180430.udpipe (1 MB)
- croatian-set-ud-2.2-conll18-180430.udpipe (19 MB)
- english-gum-ud-2.2-conll18-180430.udpipe (6 MB)
- polish-sz-ud-2.2-conll18-180430.udpipe (11 MB)
- north_sami-giella-ud-2.2-conll18-180430.udpipe (4 MB)
- estonian-edt-ud-2.2-conll18-180430.udpipe (30 MB)
- indonesian-gsd-ud-2.2-conll18-180430.udpipe (13 MB)
- vietnamese-vtb-ud-2.2-conll18-180430.udpipe (4 MB)
- gothic-proiel-ud-2.2-conll18-180430.udpipe (6 MB)
- english-ewt-ud-2.2-conll18-180430.udpipe (16 MB)
- bulgarian-btb-ud-2.2-conll18-180430.udpipe (14 MB)
- old_french-srcmf-ud-2.2-conll18-180430.udpipe (10 MB)
- persian-seraji-ud-2.2-conll18-180430.udpipe (10 MB)
- norwegian-nynorsklia-ud-2.2-conll18-180430.udpipe (1 MB)
- japanese-gsd-ud-2.2-conll18-180430.udpipe (11 MB)
- portuguese-bosque-ud-2.2-conll18-180430.udpipe (16 MB)
- russian-taiga-ud-2.2-conll18-180430.udpipe (3 MB)
- uyghur-udt-ud-2.2-conll18-180430.udpipe (5 MB)
- urdu-udtb-ud-2.2-conll18-180430.udpipe (15 MB)
- english-lines-ud-2.2-conll18-180430.udpipe (6 MB)
- french-sequoia-ud-2.2-conll18-180430.udpipe (5 MB)
- slovak-snk-ud-2.2-conll18-180430.udpipe (13 MB)
- irish-idt-ud-2.2-conll18-180430.udpipe (4 MB)
- norwegian-bokmaal-ud-2.2-conll18-180430.udpipe (17 MB)
- turkish-imst-ud-2.2-conll18-180430.udpipe (8 MB)
- galician-treegal-ud-2.2-conll18-180430.udpipe (3 MB)
- czech-cac-ud-2.2-conll18-180430.udpipe (27 MB)
- basque-bdt-ud-2.2-conll18-180430.udpipe (12 MB)
- ukrainian-iu-ud-2.2-conll18-180430.udpipe (14 MB)
- slovenian-sst-ud-2.2-conll18-180430.udpipe (4 MB)
- latin-proiel-ud-2.2-conll18-180430.udpipe (22 MB)
- latvian-lvtb-ud-2.2-conll18-180430.udpipe (14 MB)
- hindi-hdtb-ud-2.2-conll18-180430.udpipe (20 MB)
- finnish-tdt-ud-2.2-conll18-180430.udpipe (21 MB)
- finnish-ftb-ud-2.2-conll18-180430.udpipe (19 MB)
- italian-isdt-ud-2.2-conll18-180430.udpipe (16 MB)
- danish-ddt-ud-2.2-conll18-180430.udpipe (9 MB)
- romanian-rrt-ud-2.2-conll18-180430.udpipe (14 MB)
- catalan-ancora-ud-2.2-conll18-180430.udpipe (18 MB)
- mixed-ud-ud-2.2-conll18-180430.udpipe (47 MB)
- serbian-set-ud-2.2-conll18-180430.udpipe (8 MB)
- dutch-alpino-ud-2.2-conll18-180430.udpipe (15 MB)
- afrikaans-afribooms-ud-2.2-conll18-180430.udpipe (4 MB)
- armenian-armtdp-ud-2.2-conll18-180430.udpipe (1 MB)
- chinese-gsd-ud-2.2-conll18-180430.udpipe (13 MB)
- buryat-bdt-ud-2.2-conll18-180430.udpipe (1 MB)
- hebrew-htb-ud-2.2-conll18-180430.udpipe (14 MB)
- dutch-lassysmall-ud-2.2-conll18-180430.udpipe (8 MB)
- galician-ctg-ud-2.2-conll18-180430.udpipe (7 MB)
- korean-kaist-ud-2.2-conll18-180430.udpipe (39 MB)
- korean-gsd-ud-2.2-conll18-180430.udpipe (15 MB)
- french-gsd-ud-2.2-conll18-180430.udpipe (20 MB)
- russian-syntagrus-ud-2.2-conll18-180430.udpipe (42 MB)
- ancient_greek-proiel-ud-2.2-conll18-180430.udpipe (21 MB)
- slovenian-ssj-ud-2.2-conll18-180430.udpipe (16 MB)
- german-gsd-ud-2.2-conll18-180430.udpipe (23 MB)
- french-spoken-ud-2.2-conll18-180430.udpipe (2 MB)
- README.txt (31 kB)
- conll18_ud_eval.py (26 kB)
- Name
- ud-2.2-conll18-crossfold-morphology.tar.xz
- Size
- 95.25 MB
- Format
- application/x-xz
- Description
- UD 2.2 CoNLL 2018 training data with automatically predicted morphology by UDPipe.
- MD5
- 2576d55fcaab880f9f03e2f5d8eea5c9
- UD_English-PUD
- README.md (6 kB)
- LICENSE.txt (19 kB)
- UD_Finnish-PUD
- README.txt (2 kB)
- LICENSE.txt (202 B)
- UD_Swedish-Talbanken
- sv_talbanken-ud-train.conllu (5 MB)
- EVALUATION.txt (1 kB)
- README.md (7 kB)
- LICENSE.txt (20 kB)
- sv_talbanken-ud-dev.conllu (835 kB)
- UD_Romanian-RRT
- EVALUATION.txt (1 kB)
- README.md (3 kB)
- LICENSE.txt (66 B)
- ro_rrt-ud-dev.conllu (1 MB)
- ro_rrt-ud-train.conllu (13 MB)
- UD_Gothic-PROIEL
- EVALUATION.txt (1 kB)
- README.md (2 kB)
- got_proiel-ud-train.conllu (3 MB)
- LICENSE.txt (279 B)
- got_proiel-ud-dev.conllu (926 kB)
- UD_Czech-PUD
- README.md (2 kB)
- LICENSE.txt (202 B)
- UD_French-Sequoia
- EVALUATION.txt (1 kB)
- README.md (3 kB)
- fr_sequoia-ud-train.conllu (3 MB)
- LICENSE.txt (4 kB)
- fr_sequoia-ud-dev.conllu (614 kB)
- UD_Swedish-LinES
- EVALUATION.txt (1 kB)
- README.txt (6 kB)
- sv_lines-ud-train.conllu (3 MB)
- sv_lines-ud-dev.conllu (1 MB)
- LICENSE.txt (18 kB)
- UD_German-GSD
- EVALUATION.txt (1 kB)
- README.md (12 kB)
- LICENSE.txt (17 kB)
- de_gsd-ud-train.conllu (18 MB)
- de_gsd-ud-dev.conllu (871 kB)
- UD_Old_French-SRCMF
- EVALUATION.txt (1 kB)
- README.md (7 kB)
- fro_srcmf-ud-dev.conllu (896 kB)
- LICENSE.txt (202 B)
- fro_srcmf-ud-train.conllu (6 MB)
- UD_English-LinES
- EVALUATION.txt (1 kB)
- README.txt (6 kB)
- en_lines-ud-dev.conllu (954 kB)
- en_lines-ud-train.conllu (2 MB)
- LICENSE.txt (18 kB)
- UD_Buryat-BDT
- EVALUATION.txt (880 B)
- README.txt (1 kB)
- bxr_bdt-ud-train.conllu (10 kB)
- LICENSE.txt (202 B)
- UD_Slovenian-SST
- EVALUATION.txt (879 B)
- README.txt (4 kB)
- sl_sst-ud-train.conllu (1 MB)
- LICENSE.txt (441 B)
- UD_Latin-PROIEL
- EVALUATION.txt (1 kB)
- README.md (2 kB)
- la_proiel-ud-dev.conllu (1 MB)
- la_proiel-ud-train.conllu (15 MB)
- LICENSE.txt (279 B)
- UD_Turkish-IMST
- tr_imst-ud-dev.conllu (785 kB)
- EVALUATION.txt (1 kB)
- README.txt (1 kB)
- tr_imst-ud-train.conllu (2 MB)
- LICENSE.txt (20 kB)
- UD_Norwegian-Bokmaal
- no_bokmaal-ud-train.conllu (14 MB)
- EVALUATION.txt (1 kB)
- README.md (6 kB)
- no_bokmaal-ud-dev.conllu (2 MB)
- LICENSE.txt (68 B)
- UD_Galician-CTG
- EVALUATION.txt (1 kB)
- README.txt (2 kB)
- gl_ctg-ud-dev.conllu (1 MB)
- LICENSE.txt (173 B)
- gl_ctg-ud-train.conllu (4 MB)
- UD_Slovenian-SSJ
- EVALUATION.txt (1 kB)
- sl_ssj-ud-dev.conllu (1 MB)
- README.txt (4 kB)
- LICENSE.txt (543 B)
- sl_ssj-ud-train.conllu (9 MB)
- UD_Russian-SynTagRus
- ru_syntagrus-ud-dev.conllu (10 MB)
- EVALUATION.txt (1 kB)
- README.txt (3 kB)
- ru_syntagrus-ud-train.conllu (77 MB)
- LICENSE.txt (188 B)
- UD_English-GUM
- EVALUATION.txt (1 kB)
- README.md (3 kB)
- en_gum-ud-dev.conllu (734 kB)
- LICENSE.txt (1 kB)
- en_gum-ud-train.conllu (2 MB)
- UD_Indonesian-GSD
- id_gsd-ud-dev.conllu (957 kB)
- EVALUATION.txt (1 kB)
- README.md (8 kB)
- id_gsd-ud-train.conllu (7 MB)
- LICENSE.txt (17 kB)
- UD_Korean-GSD
- EVALUATION.txt (1 kB)
- README.md (1 kB)
- ko_gsd-ud-dev.conllu (703 kB)
- ko_gsd-ud-train.conllu (3 MB)
- LICENSE.txt (15 kB)
- UD_Ancient_Greek-Perseus
- grc_perseus-ud-train.conllu (14 MB)
- EVALUATION.txt (1 kB)
- README.md (3 kB)
- grc_perseus-ud-dev.conllu (1 MB)
- LICENSE.txt (279 B)
- UD_Hindi-HDTB
- EVALUATION.txt (1 kB)
- README.md (2 kB)
- hi_hdtb-ud-train.conllu (39 MB)
- LICENSE.txt (249 B)
- hi_hdtb-ud-dev.conllu (4 MB)
- UD_Polish-LFG
- EVALUATION.txt (1 kB)
- pl_lfg-ud-train.conllu (11 MB)
- README.md (6 kB)
- pl_lfg-ud-dev.conllu (1 MB)
- LICENSE.txt (34 kB)
- UD_French-Spoken
- EVALUATION.txt (1 kB)
- README.txt (469 B)
- LICENSE.txt (202 B)
- fr_spoken-ud-dev.conllu (410 kB)
- fr_spoken-ud-train.conllu (614 kB)
- UD_Hungarian-Szeged
- hu_szeged-ud-dev.conllu (1 MB)
- EVALUATION.txt (1 kB)
- README.txt (3 kB)
- hu_szeged-ud-train.conllu (1 MB)
- LICENSE.txt (30 B)
- UD_Dutch-Alpino
- nl_alpino-ud-train.conllu (13 MB)
- EVALUATION.txt (1 kB)
- README.txt (5 kB)
- LICENSE.txt (19 kB)
- nl_alpino-ud-dev.conllu (880 kB)
- UD_Urdu-UDTB
- ur_udtb-ud-dev.conllu (1 MB)
- EVALUATION.txt (1 kB)
- README.md (2 kB)
- ur_udtb-ud-train.conllu (11 MB)
- LICENSE.txt (247 B)
- UD_Estonian-EDT
- et_edt-ud-dev.conllu (2 MB)
- EVALUATION.txt (1 kB)
- README.md (3 kB)
- LICENSE.txt (279 B)
- et_edt-ud-train.conllu (18 MB)
- UD_Polish-SZ
- pl_sz-ud-train.conllu (5 MB)
- EVALUATION.txt (1 kB)
- README.md (2 kB)
- pl_sz-ud-dev.conllu (941 kB)
- LICENSE.txt (68 B)
- UD_Finnish-FTB
- EVALUATION.txt (1 kB)
- fi_ftb-ud-train.conllu (9 MB)
- README.txt (2 kB)
- LICENSE.txt (1 kB)
- fi_ftb-ud-dev.conllu (1 MB)
- UD_Galician-TreeGal
- EVALUATION.txt (883 B)
- README.md (5 kB)
- gl_treegal-ud-train.conllu (1 MB)
- LICENSE.txt (14 kB)
- UD_Thai-PUD
- README.md (5 kB)
- LICENSE.txt (19 kB)
- UD_Latin-Perseus
- EVALUATION.txt (883 B)
- la_perseus-ud-train.conllu (1 MB)
- README.md (3 kB)
- LICENSE.txt (279 B)
- UD_Czech-FicTree
- cs_fictree-ud-dev.conllu (1 MB)
- EVALUATION.txt (1 kB)
- README.md (2 kB)
- cs_fictree-ud-train.conllu (13 MB)
- LICENSE.txt (219 B)
- UD_Latvian-LVTB
- EVALUATION.txt (1 kB)
- README.md (4 kB)
- lv_lvtb-ud-train.conllu (8 MB)
- LICENSE.txt (20 kB)
- lv_lvtb-ud-dev.conllu (1 MB)
- UD_Italian-PoSTWITA
- EVALUATION.txt (1 kB)
- README.md (3 kB)
- it_postwita-ud-train.conllu (5 MB)
- LICENSE.txt (18 kB)
- it_postwita-ud-dev.conllu (755 kB)
- UD_Breton-KEB
- README.md (1 kB)
- LICENSE.txt (202 B)
- UD_Finnish-TDT
- EVALUATION.txt (1 kB)
- README.txt (3 kB)
- fi_tdt-ud-train.conllu (11 MB)
- LICENSE.txt (24 kB)
- fi_tdt-ud-dev.conllu (1 MB)
- UD_Kazakh-KTB
- EVALUATION.txt (879 B)
- kk_ktb-ud-train.conllu (39 kB)
- README.txt (2 kB)
- LICENSE.txt (206 B)
- UD_Swedish-PUD
- README.md (3 kB)
- LICENSE.txt (202 B)
- UD_North_Sami-Giella
- EVALUATION.txt (883 B)
- README.md (2 kB)
- sme_giella-ud-train.conllu (1 MB)
- LICENSE.txt (202 B)
- UD_Croatian-SET
- EVALUATION.txt (1 kB)
- README.md (4 kB)
- hr_set-ud-dev.conllu (1 MB)
- LICENSE.txt (233 B)
- hr_set-ud-train.conllu (10 MB)
- UD_Korean-Kaist
- EVALUATION.txt (1 kB)
- README.md (1 kB)
- LICENSE.txt (202 B)
- ko_kaist-ud-dev.conllu (1 MB)
- ko_kaist-ud-train.conllu (17 MB)
- UD_Ukrainian-IU
- EVALUATION.txt (1 kB)
- README.md (3 kB)
- uk_iu-ud-dev.conllu (992 kB)
- uk_iu-ud-train.conllu (6 MB)
- LICENSE.txt (172 B)
- UD_Persian-Seraji
- EVALUATION.txt (1 kB)
- README.md (5 kB)
- fa_seraji-ud-dev.conllu (1000 kB)
- LICENSE.txt (110 B)
- fa_seraji-ud-train.conllu (7 MB)
- UD_Norwegian-Nynorsk
- no_nynorsk-ud-train.conllu (14 MB)
- EVALUATION.txt (1 kB)
- README.md (4 kB)
- LICENSE.txt (68 B)
- no_nynorsk-ud-dev.conllu (1 MB)
- UD_Naija-NSC
- README.md (3 kB)
- LICENSE.txt (202 B)
- UD_Norwegian-NynorskLIA
- EVALUATION.txt (886 B)
- no_nynorsklia-ud-train.conllu (200 kB)
- README.txt (1 kB)
- LICENSE.txt (202 B)
- UD_Bulgarian-BTB
- bg_btb-ud-dev.conllu (1 MB)
- EVALUATION.txt (1 kB)
- README.txt (5 kB)
- LICENSE.txt (327 B)
- bg_btb-ud-train.conllu (10 MB)
- UD_Serbian-SET
- EVALUATION.txt (1 kB)
- README.md (1 kB)
- LICENSE.txt (230 B)
- sr_set-ud-train.conllu (4 MB)
- sr_set-ud-dev.conllu (675 kB)
- UD_Basque-BDT
- EVALUATION.txt (1 kB)
- README.txt (3 kB)
- eu_bdt-ud-dev.conllu (1 MB)
- eu_bdt-ud-train.conllu (4 MB)
- LICENSE.txt (171 B)
- UD_Slovak-SNK
- EVALUATION.txt (1 kB)
- README.md (3 kB)
- LICENSE.txt (202 B)
- sk_snk-ud-train.conllu (6 MB)
- sk_snk-ud-dev.conllu (1 MB)
- UD_Afrikaans-AfriBooms
- af_afribooms-ud-dev.conllu (322 kB)
- EVALUATION.txt (1 kB)
- README.txt (1 kB)
- af_afribooms-ud-train.conllu (2 MB)
- LICENSE.txt (202 B)
- UD_Japanese-GSD
- EVALUATION.txt (1 kB)
- README.txt (2 kB)
- ja_gsd-ud-dev.conllu (612 kB)
- LICENSE.txt (17 kB)
- ja_gsd-ud-train.conllu (8 MB)
- UD_Czech-CAC
- cs_cac-ud-dev.conllu (1 MB)
- cs_cac-ud-train.conllu (50 MB)
- EVALUATION.txt (1 kB)
- README.md (4 kB)
- LICENSE.txt (265 B)
- UD_Arabic-PADT
- EVALUATION.txt (1 kB)
- ar_padt-ud-dev.conllu (5 MB)
- README.md (4 kB)
- LICENSE.txt (19 kB)
- ar_padt-ud-train.conllu (38 MB)
- UD_Faroese-OFT
- README.md (1 kB)
- LICENSE.txt (822 B)
- UD_Upper_Sorbian-UFAL
- EVALUATION.txt (881 B)
- README.md (919 B)
- hsb_ufal-ud-train.conllu (31 kB)
- LICENSE.txt (202 B)
- UD_Ancient_Greek-PROIEL
- grc_proiel-ud-dev.conllu (1 MB)
- grc_proiel-ud-train.conllu (19 MB)
- EVALUATION.txt (1 kB)
- README.md (2 kB)
- LICENSE.txt (279 B)
- UD_Czech-PDT
- EVALUATION.txt (1 kB)
- README.md (7 kB)
- LICENSE.txt (19 kB)
- cs_pdt-ud-train.conllu (125 MB)
- cs_pdt-ud-dev.conllu (17 MB)
- UD_Chinese-GSD
- zh_gsd-ud-train.conllu (5 MB)
- EVALUATION.txt (1 kB)
- README.md (903 B)
- zh_gsd-ud-dev.conllu (673 kB)
- LICENSE.txt (282 B)
- UD_Catalan-AnCora
- EVALUATION.txt (1 kB)
- README.md (743 B)
- ca_ancora-ud-train.conllu (26 MB)
- ca_ancora-ud-dev.conllu (3 MB)
- LICENSE.txt (68 B)
- UD_Old_Church_Slavonic-PROIEL
- EVALUATION.txt (1 kB)
- README.md (2 kB)
- cu_proiel-ud-train.conllu (3 MB)
- cu_proiel-ud-dev.conllu (1 MB)
- LICENSE.txt (279 B)
- UD_Spanish-AnCora
- es_ancora-ud-dev.conllu (3 MB)
- EVALUATION.txt (1 kB)
- README.md (648 B)
- LICENSE.txt (68 B)
- es_ancora-ud-train.conllu (28 MB)
- UD_Dutch-LassySmall
- nl_lassysmall-ud-dev.conllu (836 kB)
- EVALUATION.txt (1 kB)
- README.txt (2 kB)
- LICENSE.txt (392 B)
- nl_lassysmall-ud-train.conllu (5 MB)
- UD_Danish-DDT
- EVALUATION.txt (1 kB)
- da_ddt-ud-train.conllu (4 MB)
- da_ddt-ud-dev.conllu (644 kB)
- README.md (5 kB)
- LICENSE.txt (19 kB)
- UD_French-GSD
- EVALUATION.txt (1 kB)
- README.md (11 kB)
- fr_gsd-ud-dev.conllu (2 MB)
- LICENSE.txt (17 kB)
- fr_gsd-ud-train.conllu (21 MB)
- UD_Portuguese-Bosque
- EVALUATION.txt (1 kB)
- pt_bosque-ud-train.conllu (13 MB)
- README.md (6 kB)
- LICENSE.txt (269 B)
- pt_bosque-ud-dev.conllu (718 kB)
- UD_Irish-IDT
- ga_idt-ud-train.conllu (882 kB)
- EVALUATION.txt (879 B)
- README.txt (5 kB)
- LICENSE.txt (13 B)
- UD_Kurmanji-MG
- EVALUATION.txt (879 B)
- kmr_mg-ud-train.conllu (17 kB)
- README.txt (1 kB)
- LICENSE.txt (202 B)
- UD_Uyghur-UDT
- EVALUATION.txt (1 kB)
- README.md (1 kB)
- ug_udt-ud-train.conllu (1 MB)
- LICENSE.txt (202 B)
- ug_udt-ud-dev.conllu (890 kB)
- UD_Greek-GDT
- EVALUATION.txt (1 kB)
- README.md (3 kB)
- el_gdt-ud-dev.conllu (933 kB)
- LICENSE.txt (398 B)
- el_gdt-ud-train.conllu (3 MB)
- UD_English-EWT
- EVALUATION.txt (1 kB)
- en_ewt-ud-dev.conllu (1 MB)
- README.md (7 kB)
- LICENSE.txt (19 kB)
- en_ewt-ud-train.conllu (12 MB)
- UD_Latin-ITTB
- EVALUATION.txt (1 kB)
- README.md (3 kB)
- la_ittb-ud-dev.conllu (948 kB)
- LICENSE.txt (19 kB)
- la_ittb-ud-train.conllu (23 MB)
- UD_Armenian-ArmTDP
- EVALUATION.txt (882 B)
- README.md (3 kB)
- hy_armtdp-ud-train.conllu (77 kB)
- LICENSE.txt (202 B)
- UD_Vietnamese-VTB
- vi_vtb-ud-dev.conllu (522 kB)
- EVALUATION.txt (1 kB)
- vi_vtb-ud-train.conllu (939 kB)
- README.txt (636 B)
- LICENSE.txt (19 kB)
- UD_Japanese-Modern
- README.txt (3 kB)
- LICENSE.txt (17 kB)
- UD_Italian-ISDT
- EVALUATION.txt (1 kB)
- it_isdt-ud-dev.conllu (737 kB)
- README.md (10 kB)
- it_isdt-ud-train.conllu (16 MB)
- LICENSE.txt (22 kB)
- UD_Hebrew-HTB
- EVALUATION.txt (1 kB)
- README.txt (3 kB)
- he_htb-ud-train.conllu (9 MB)
- LICENSE.txt (249 B)
- he_htb-ud-dev.conllu (836 kB)
- README.txt (1 kB)
- conll18_ud_eval.py (26 kB)
- UD_Russian-Taiga
- EVALUATION.txt (881 B)
- README.md (3 kB)
- LICENSE.txt (202 B)
- ru_taiga-ud-train.conllu (824 kB)
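
The MD5 checksums listed for the two archives can be used to check a download before unpacking it. The following is a minimal sketch using only the Python standard library. The helper names `md5_of` and `verify` are hypothetical, and the sketch assumes the archives were saved under their listed file names.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing in this record.
EXPECTED_MD5 = {
    "ud-2.2-conll18-baseline-models.tar.xz": "9db1c4e4eacb0ee5dcb259487c721a42",
    "ud-2.2-conll18-crossfold-morphology.tar.xz": "2576d55fcaab880f9f03e2f5d8eea5c9",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path):
    """Return True if the file's MD5 matches the value listed for its file name."""
    expected = EXPECTED_MD5[Path(path).name]
    return md5_of(path) == expected


if __name__ == "__main__":
    # Check whichever of the two archives is present in the current directory.
    for name in EXPECTED_MD5:
        if Path(name).exists():
            print(name, "OK" if verify(name) else "MISMATCH")
```

Reading in chunks matters here because the larger archive is about 1.2 GB, so loading it into memory in one call would be wasteful.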