This is not the latest version of this item. The latest version can be found here.
Khresmoi Summary Translation Test Data 1.1
Please use the following text to cite this item:
Dušek, Ondřej; et al., 2014,
Khresmoi Summary Translation Test Data 1.1, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11858/00-097C-0000-0023-866E-1.
Authors
Dušek, Ondřej ; et al.
Date issued
2014-04-28
Size
1500 sentences
Description
This package contains data sets for development and testing of machine translation of sentences from summaries of medical articles between Czech, English, French, and German.
Acknowledgement
European Union
Project code: FP7-ICT-2010-6-257528
Project name: Khresmoi
Ministry of Education, Youth and Sports of the Czech Republic
Project code: LM2010013
Project name: LINDAT/CLARIN: Institute for Analysis, Processing and Distribution of Linguistic Data
This item is publicly available and licensed under the Creative Commons Attribution-Noncommercial (CC-BY-NC) 3.0 Unported license.
Files in this item
Name: khresmoi-summary-test-set.tgz
Size: 637.26 KB
Format: application/x-gzip
Description: gzip Archive
MD5: c0d333e3f6d8f2db1cc281821d5bcbe8

Name: README.TXT
Size: 10.34 KB
Format: text/plain
Description: Text
MD5: e1f594eb6743282cccfadd155e297ca9
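
The MD5 values listed for the two files can be used to check that a download arrived intact. The following is a minimal sketch, assuming both files were saved under their listed names in one directory; the helper names `md5_of_file` and `verify_download` are hypothetical and not part of the package.

```python
import hashlib
from pathlib import Path

# MD5 values as listed for the files in this item.
EXPECTED_MD5 = {
    "khresmoi-summary-test-set.tgz": "c0d333e3f6d8f2db1cc281821d5bcbe8",
    "README.TXT": "e1f594eb6743282cccfadd155e297ca9",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(directory="."):
    """Map each listed file name to True (match), False (mismatch) or None (missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = md5_of_file(path) == expected if path.is_file() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, outcome in verify_download().items():
        print(f"{name}: {labels[outcome]}")
```

Run from the download directory, the script prints one status line per file. MD5 is adequate for detecting accidental corruption, but it is not a guarantee against deliberate tampering.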

Khresmoi Summary Translation Test Data for the Medical Domain version 1.1
Apr 28, 2014
Pavel Pecina <pecina@ufal.mff.cuni.cz>
1. Description
This package contains data sets for development (Section dev) and testing
(Section test) of machine translation of sentences from summaries of
medical articles between Czech, English, French, and German.
Version 1.1 of this data set differs from version 1.0 in punctuation, which
was normalized using the attached script normalize-punctuation.pl.
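
To give a rough idea of what punctuation normalization can involve, here is a minimal sketch in Python. It is not the attached normalize-punctuation.pl script, whose rules are not reproduced here; the character mapping and the function name `normalize_punctuation` are assumptions chosen for illustration only.

```python
# Hypothetical mapping for illustration; the rules actually applied by
# normalize-punctuation.pl may differ.
PUNCTUATION_MAP = str.maketrans({
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u201e": '"',    # low double quotation mark
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u2013": "-",    # en dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # no-break space
})


def normalize_punctuation(text):
    """Replace typographic punctuation with plain ASCII equivalents."""
    return text.translate(PUNCTUATION_MAP)


if __name__ == "__main__":
    print(normalize_punctuation("\u201cType 2 diabetes\u201d \u2013 an overview\u2026"))
```

Mapping variants of the same mark to one form keeps the source and reference sentences consistent, so that differences in quotation style are not counted as translation errors during evaluation.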
2. Preamble
2.1 Source
The original sentences are sampled from summaries of English medical
documents crawled from the web in 2012 and identified to be relevant
to 50 medical topics.
The translations were carried out by Charles University in Prague.
2.2 License
The Khresmoi Summary Test Set is made available under the terms of the
Creative Commons Attribution-Noncommercial (CC-BY-NC) license, version
3.0 unported. You may use it for academic research and all non-
commercial purposes as long as the authors (cf. Section 4) are properly
credited and sources acknowledged (cf. Sections 6 and 7). See
http://creativecommons.org/licenses/by-nc/3.0/ for a full description
and explanation of the licensing terms.
4. Authors
Ondřej Dušek <odusek@ufal.mff.cuni.cz>,
Jan Hajič <hajic@ufal.mff.cuni.cz>,
Jaroslava Hlaváčová <hlavacova@ufal.mff.cuni.cz>,
Pavel Pecina <pecina@ufal.mff.cuni.cz>,
Aleš Tamchyna <tamchyna@ufal.mff.cuni.cz>,
Zdeňka Urešová <uresova@ufal.mff.cuni.cz>
Charles University in Prague
Faculty of Mathematics and Physics
Institute of Formal and Applied Linguistics
Malostranské nám. 25
118 00 Prague 1
Czech Republic
3. Data
3.1 Description
The original sentences in English were rando . . .
