dc.contributor.author | Galuščáková, Petra |
dc.contributor.author | Garabík, Radovan |
dc.contributor.author | Bojar, Ondřej |
dc.date.accessioned | 2012-05-15T16:11:21Z |
dc.date.available | 2012-05-15T16:11:21Z |
dc.date.issued | 2012-05-15 |
dc.identifier.uri | http://hdl.handle.net/11858/00-097C-0000-0006-AAE0-A |
dc.description | English-Slovak parallel corpus consisting of several freely available corpora (Acquis [1], Europarl [2], Official Journal of the European Union [3] and part of the OPUS corpus [4]: EMEA, EUConst, KDE4 and PHP) and a downloaded copy of the European Commission website [5]. The corpus is published both in plaintext format and with automatic morphological annotation. References: [1] http://langtech.jrc.it/JRC-Acquis.html/ [2] http://www.statmt.org/europarl/ [3] http://apertium.eu/data [4] http://opus.lingfil.uu.se/ [5] http://ec.europa.eu/ |
dc.description.sponsorship | This work has been supported by the grant EuroMatrixPlus (FP7-ICT-2007-3-231720 of the EU and 7E09003 of the Czech Republic) |
dc.language.iso | slk |
dc.language.iso | eng |
dc.publisher | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
dc.relation | info:eu-repo/grantAgreement/EC/FP7/231720 |
dc.rights | Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/3.0/ |
dc.subject | parallel corpus |
dc.subject | English-Slovak corpus |
dc.title | English-Slovak Parallel Corpus |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
dc.rights.label | PUB |
has.files | yes |
branding | LINDAT / CLARIAH-CZ |
contact.person | Petra Galuščáková galuscakova@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
sponsor | European Union FP7-ICT-2007-3-231720 EuroMatrixPlus euFunds info:eu-repo/grantAgreement/EC/FP7/231720 |
sponsor | Ministerstvo školství, mládeže a tělovýchovy České republiky 7E09003 EuroMatrixPlus – Bringing Machine Translation for European Languages to the User nationalFunds |
files.size | 1172350203 |
files.count | 2 |
Files in this item
License category:
License: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)
Publicly Available
- Name
- corpus-en-sk-plaintext.tar.gz
- Size
- 431.69 MB
- Format
- application/x-gzip
- Description
- Corpus in plaintext format
- MD5
- 6d6885672e9d40c4d4c31f51796f1aa0
- corpus-en-sk-plaintext
  - OPUS
    - KDE4.en-sk.en.gz (952 kB)
    - EUconst.en-sk.sk.gz (236 kB)
    - PHP.en-sk.sk.gz (74 kB)
    - EMEA.en-sk.sk.gz (18 MB)
    - README (53 B)
    - EUconst.en-sk.en.gz (225 kB)
    - PHP.en-sk.en.gz (84 kB)
    - EMEA.en-sk.en.gz (15 MB)
    - KDE4.en-sk.sk.gz (1 MB)
  - acquis
    - ac-ensk-train-tagged.sk.gz (96 MB)
    - README (271 B)
    - ac-ensk-train-tagged.en.gz (74 MB)
  - ec-europa
    - eceuropa.en-sk.en.gz (455 kB)
    - README (38 B)
    - eceuropa.en-sk.sk.gz (498 kB)
  - journal
    - eu-journal.sk.gz (97 MB)
    - README (215 B)
    - eu-journal.en.gz (85 MB)
  - europarl
    - europarl-v6.sk-en.en.gz (20 MB)
    - README (45 B)
    - europarl-v6.sk-en.sk.gz (22 MB)
- Name
- corpus-en-sk-export-format.tar.gz
- Size
- 686.35 MB
- Format
- application/x-gzip
- Description
- Corpus with morphological information
- MD5
- e6b3cd54b1485893fbc352d3c3becfc8
- corpus-en-sk-export-format
  - EU-journal
    - eu-journal-ensk-tagged.en.gz (153 MB)
    - eu-journal-ensk-tagged.sk.gz (199 MB)
  - OPUS
    - KDE4.en-sk-tagged.sk.gz (1 MB)
    - EMEA.en-sk-tagged.sk.gz (49 MB)
    - EUconst.en-sk-tagged.en.gz (388 kB)
    - PHP.en-sk-tagged.en.gz (149 kB)
    - EUconst.en-sk-tagged.sk.gz (470 kB)
    - PHP.en-sk-tagged.sk.gz (142 kB)
    - KDE4.en-sk-tagged.en.gz (1 MB)
    - EMEA.en-sk-tagged.en.gz (37 MB)
  - acquis
    - acquis-tagged.en.gz (69 MB)
    - acquis-tagged.sk.gz (90 MB)
  - ec-europa
    - eceuropa.en-sk-tagged.en.gz (811 kB)
    - eceuropa.en-sk-tagged.sk.gz (1 MB)
  - europarl
    - europarl-v6.sk-en-tagged.sk.gz (46 MB)
    - europarl-v6.sk-en-tagged.en.gz (35 MB)