dc.contributor.author Galuščáková, Petra
dc.contributor.author Garabík, Radovan
dc.contributor.author Bojar, Ondřej
dc.date.accessioned 2012-05-15T15:54:40Z
dc.date.available 2012-05-15T15:54:40Z
dc.date.issued 2012-05-15
dc.identifier.uri http://hdl.handle.net/11858/00-097C-0000-0006-AADF-0
dc.description Czech-Slovak parallel corpus consisting of several freely available corpora (Acquis [1], Europarl [2], Official Journal of the European Union [3] and part of the OPUS corpus [4]: EMEA, EUConst, KDE4 and PHP) and a downloaded copy of the European Commission website [5]. The corpus is published both in plaintext format and with automatic morphological annotation. References: [1] http://langtech.jrc.it/JRC-Acquis.html/ [2] http://www.statmt.org/europarl/ [3] http://apertium.eu/data [4] http://opus.lingfil.uu.se/ [5] http://ec.europa.eu/
dc.description.sponsorship This work has been supported by the grant EuroMatrixPlus (FP7-ICT-2007-3-231720 of the EU and 7E09003 of the Czech Republic)
dc.language.iso slk
dc.language.iso ces
dc.publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dc.relation info:eu-repo/grantAgreement/EC/FP7/231720
dc.rights Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/
dc.subject parallel corpus
dc.subject Czech-Slovak corpus
dc.title Czech-Slovak Parallel Corpus
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
dc.rights.label PUB
has.files yes
branding LINDAT / CLARIAH-CZ
contact.person Petra Galuščáková galuscakova@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
sponsor European Union FP7-ICT-2007-3-231720 EuroMatrix Plus euFunds info:eu-repo/grantAgreement/EC/FP7/231720
sponsor Ministerstvo školství, mládeže a tělovýchovy České republiky 7E09003 EuroMatrixPlus – Bringing Machine Translation for European Languages to the User nationalFunds
size.info 5700000 sentences
files.size 1192222551
files.count 2
featuredService.kontext Czech-Slovak|http://lindat.mff.cuni.cz/services/kontext/run.cgi/first_form?corpname=czeslo_cs_m
featuredService.kontext Slovak-Czech|http://lindat.mff.cuni.cz/services/kontext/run.cgi/first_form?corpname=czeslo_sk_m


 Files in this item

This item is Publicly Available and licensed under: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)
Name: corpora-cs-sk-plaintext.tar.gz
Size: 342 MB
Format: application/x-gzip
Description: Corpus in plaintext format
MD5: f68136ee855aec4f18f29af557591819
 File Preview  
  • corpora-cs-sk-plaintext
    • OPUS
      • EMEA.cs-sk.cs.gz (18 MB)
      • KDE4.cs-sk.sk.gz (889 kB)
      • README (52 B)
      • PHP.cs-sk.cs.gz (56 kB)
      • KDE4.cs-sk.cs.gz (923 kB)
      • EUconst.cs-sk.sk.gz (247 kB)
      • EMEA.cs-sk.sk.gz (18 MB)
      • EUconst.cs-sk.cs.gz (231 kB)
      • PHP.cs-sk.sk.gz (44 kB)
    • acquis
      • README (271 B)
      • acquis-train_cs.txt.gz (37 MB)
      • acquis-train_sk.txt.gz (38 MB)
    • ec-europa
      • corpora-ec-europa.sk.gz (936 kB)
      • corpora-ec-europa.cs.gz (939 kB)
      • README (37 B)
    • journal
      • eu-journal.sk.gz (91 MB)
      • eu-journal.cs.gz (90 MB)
      • README (215 B)
    • europarl
      • europarl-v6.sk-cs.sk.gz (21 MB)
      • europarl-v6.sk-cs.cs.gz (21 MB)
      • README (45 B)
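The plaintext archive stores each sub-corpus as a pair of gzip files, one per language (for example OPUS/EMEA.cs-sk.cs.gz and OPUS/EMEA.cs-sk.sk.gz). The following is a minimal sketch for iterating over such a pair. It assumes the two files are line-aligned (line N of the Czech file corresponds to line N of the Slovak file) and UTF-8 encoded. Neither is stated in this record, so check the README files in the archive first. The function name `read_parallel` is hypothetical.

```python
import gzip
from itertools import zip_longest
from typing import Iterator, Tuple

# Sentinel used to detect that one file ended before the other.
_MISSING = object()


def read_parallel(cs_path: str, sk_path: str,
                  encoding: str = "utf-8") -> Iterator[Tuple[str, str]]:
    """Yield (czech, slovak) line pairs from two gzip-compressed text files.

    Assumption (not confirmed by the record): the files are line-aligned.
    A ValueError is raised if the line counts differ.
    """
    with gzip.open(cs_path, "rt", encoding=encoding) as cs_file, \
         gzip.open(sk_path, "rt", encoding=encoding) as sk_file:
        pairs = zip_longest(cs_file, sk_file, fillvalue=_MISSING)
        for line_no, (cs_line, sk_line) in enumerate(pairs, start=1):
            if cs_line is _MISSING or sk_line is _MISSING:
                raise ValueError(f"files differ in length at line {line_no}")
            yield cs_line.rstrip("\n"), sk_line.rstrip("\n")
```

After unpacking the archive, a call such as `read_parallel("corpora-cs-sk-plaintext/OPUS/PHP.cs-sk.cs.gz", "corpora-cs-sk-plaintext/OPUS/PHP.cs-sk.sk.gz")` would stream the pairs lazily, so the larger pairs (for example the journal files at about 90 MB each compressed) are never loaded into memory at once.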
Name: corpora-cs-sk-export-format.tar.gz
Size: 794.99 MB
Format: application/x-gzip
Description: Corpus with morphological information
MD5: 0efa8e221b25cf89e27660a421f4dbbf
 File Preview  
  • corpora-cs-sk-export-format
    • OPUS
      • PHP.cs-sk-tagged.cs.gz (130 kB)
      • EMEA.cs-sk-tagged.sk.gz (49 MB)
      • KDE4.cs-sk-tagged.sk.gz (1 MB)
      • EMEA.cs-sk-tagged.cs.gz (57 MB)
      • KDE4.cs-sk-tagged.cs.gz (1 MB)
      • EUconst.cs-sk-tagged.sk.gz (494 kB)
      • EUconst.cs-sk-tagged.cs.gz (536 kB)
      • PHP.cs-sk-tagged.sk.gz (84 kB)
    • acquis
      • acquis-train-tagged.sk.gz (79 MB)
      • acquis-train-tagged.cs.gz (89 MB)
    • eu-journal
      • eu-journal-tagged.sk.gz (190 MB)
      • eu-journal-tagged.cs.gz (223 MB)
    • ec-europa
      • corpora-ec-europa-tagged.sk.gz (1 MB)
      • corpora-ec-europa-tagged.cs.gz (2 MB)
    • europarl
      • europarl-v6.sk-cs-tagged.sk.gz (45 MB)
      • europarl-v6.sk-cs-tagged.cs.gz (52 MB)
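Both archives are listed with an MD5 checksum, so a download can be checked before unpacking. The sketch below hashes a file in chunks and compares the result with the values given in this record. The helper names `md5_of_file` and `verify_download` are hypothetical, and the file is assumed to have been saved under its original name.

```python
import hashlib
import os

# MD5 checksums as listed for the two files in this item.
EXPECTED_MD5 = {
    "corpora-cs-sk-plaintext.tar.gz": "f68136ee855aec4f18f29af557591819",
    "corpora-cs-sk-export-format.tar.gz": "0efa8e221b25cf89e27660a421f4dbbf",
}


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str) -> bool:
    """Compare a downloaded archive against the checksum listed above.

    Assumption: the file keeps its original name, which is used as the
    lookup key. A KeyError is raised for any other file name.
    """
    expected = EXPECTED_MD5[os.path.basename(path)]
    return md5_of_file(path) == expected
```

Calling `verify_download("corpora-cs-sk-plaintext.tar.gz")` returns `True` only if the local file matches the listed digest. MD5 is adequate here for detecting a truncated or corrupted download, but it is not a defence against deliberate tampering.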
