dc.contributor.author | Galuščáková, Petra |
dc.contributor.author | Garabík, Radovan |
dc.contributor.author | Bojar, Ondřej |
dc.date.accessioned | 2012-05-15T15:54:40Z |
dc.date.available | 2012-05-15T15:54:40Z |
dc.date.issued | 2012-05-15 |
dc.identifier.uri | http://hdl.handle.net/11858/00-097C-0000-0006-AADF-0 |
dc.description | Czech-Slovak parallel corpus consisting of several freely available corpora (Acquis [1], Europarl [2], Official Journal of the European Union [3], and part of the OPUS corpus [4]: EMEA, EUConst, KDE4, and PHP) and a downloaded copy of the European Commission website [5]. The corpus is published both in plaintext format and with automatic morphological annotation. References: [1] http://langtech.jrc.it/JRC-Acquis.html/ [2] http://www.statmt.org/europarl/ [3] http://apertium.eu/data [4] http://opus.lingfil.uu.se/ [5] http://ec.europa.eu/ |
dc.description.sponsorship | This work has been supported by the grant EuroMatrixPlus (FP7-ICT-2007-3-231720 of the EU and 7E09003 of the Czech Republic) |
dc.language.iso | slk |
dc.language.iso | ces |
dc.publisher | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
dc.relation | info:eu-repo/grantAgreement/EC/FP7/231720 |
dc.rights | Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/3.0/ |
dc.subject | parallel corpus |
dc.subject | Czech-Slovak corpus |
dc.title | Czech-Slovak Parallel Corpus |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
dc.rights.label | PUB |
has.files | yes |
branding | LINDAT / CLARIAH-CZ |
contact.person | Petra Galuščáková galuscakova@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
sponsor | European Union FP7-ICT-2007-3-231720 EuroMatrix Plus euFunds info:eu-repo/grantAgreement/EC/FP7/231720 |
sponsor | Ministerstvo školství, mládeže a tělovýchovy České republiky 7E09003 EuroMatrixPlus – Bringing Machine Translation for European Languages to the User nationalFunds |
size.info | 5700000 sentences |
files.size | 1192222551 |
files.count | 2 |
featuredService.kontext | Czech-Slovak|http://lindat.mff.cuni.cz/services/kontext/run.cgi/first_form?corpname=czeslo_cs_m |
featuredService.kontext | Slovak-Czech|http://lindat.mff.cuni.cz/services/kontext/run.cgi/first_form?corpname=czeslo_sk_m |
Files in this item
This item is Publicly Available and licensed under: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)
- Name: corpora-cs-sk-plaintext.tar.gz
- Size: 342 MB
- Format: application/x-gzip
- Description: Corpus in plaintext format
- MD5: f68136ee855aec4f18f29af557591819
- corpora-cs-sk-plaintext
- OPUS
  - EMEA.cs-sk.cs.gz (18 MB)
  - KDE4.cs-sk.sk.gz (889 kB)
  - README (52 B)
  - PHP.cs-sk.cs.gz (56 kB)
  - KDE4.cs-sk.cs.gz (923 kB)
  - EUconst.cs-sk.sk.gz (247 kB)
  - EMEA.cs-sk.sk.gz (18 MB)
  - EUconst.cs-sk.cs.gz (231 kB)
  - PHP.cs-sk.sk.gz (44 kB)
- acquis
  - README (271 B)
  - acquis-train_cs.txt.gz (37 MB)
  - acquis-train_sk.txt.gz (38 MB)
- ec-europa
  - corpora-ec-europa.sk.gz (936 kB)
  - corpora-ec-europa.cs.gz (939 kB)
  - README (37 B)
- journal
  - eu-journal.sk.gz (91 MB)
  - eu-journal.cs.gz (90 MB)
  - README (215 B)
- europarl
  - europarl-v6.sk-cs.sk.gz (21 MB)
  - europarl-v6.sk-cs.cs.gz (21 MB)
  - README (45 B)
- Name: corpora-cs-sk-export-format.tar.gz
- Size: 794.99 MB
- Format: application/x-gzip
- Description: Corpus with morphological information
- MD5: 0efa8e221b25cf89e27660a421f4dbbf
- corpora-cs-sk-export-format
- OPUS
  - PHP.cs-sk-tagged.cs.gz (130 kB)
  - EMEA.cs-sk-tagged.sk.gz (49 MB)
  - KDE4.cs-sk-tagged.sk.gz (1 MB)
  - EMEA.cs-sk-tagged.cs.gz (57 MB)
  - KDE4.cs-sk-tagged.cs.gz (1 MB)
  - EUconst.cs-sk-tagged.sk.gz (494 kB)
  - EUconst.cs-sk-tagged.cs.gz (536 kB)
  - PHP.cs-sk-tagged.sk.gz (84 kB)
- acquis
  - acquis-train-tagged.sk.gz (79 MB)
  - acquis-train-tagged.cs.gz (89 MB)
- eu-journal
  - eu-journal-tagged.sk.gz (190 MB)
  - eu-journal-tagged.cs.gz (223 MB)
- ec-europa
  - corpora-ec-europa-tagged.sk.gz (1 MB)
  - corpora-ec-europa-tagged.cs.gz (2 MB)
- europarl
  - europarl-v6.sk-cs-tagged.sk.gz (45 MB)
  - europarl-v6.sk-cs-tagged.cs.gz (52 MB)
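The MD5 values listed for the two archives can be used to check a download for corruption. The following is a minimal sketch, not part of the original record: the function names `md5_of_file` and `verify_download` are hypothetical, and it assumes the archives were saved under the file names shown in the listing.

```python
import hashlib
import os

# MD5 checksums copied from the file listing of this item.
EXPECTED_MD5 = {
    "corpora-cs-sk-plaintext.tar.gz": "f68136ee855aec4f18f29af557591819",
    "corpora-cs-sk-export-format.tar.gz": "0efa8e221b25cf89e27660a421f4dbbf",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reading keeps memory use flat for the ~800 MB archive.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Compare a downloaded archive against the checksum listed above.

    Hypothetical helper: assumes the file kept its original name.
    Raises KeyError for a file name that is not in the listing.
    """
    expected = EXPECTED_MD5[os.path.basename(path)]
    return md5_of_file(path) == expected
```

Calling `verify_download("corpora-cs-sk-plaintext.tar.gz")` returns `True` only when the local file matches the listed checksum; a `False` result suggests an incomplete or altered download.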