Creator: Kloudová, Věra - LINDAT/CLARIAH-CZ Catalog Search Results

1. CzeDLex 0.7

Creator:: Poláková, Lucie, Mírovský, Jiří, Synková, Pavlína, Kloudová, Věra, and Rysová, Magdaléna
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, lexicon, and lexicalConceptualResource
Subject:: lexicon and discourse annotation
Language:: Czech
Description:: CzeDLex 0.7 is the third development version of the Lexicon of Czech discourse connectives. The lexicon contains connectives partially automatically extracted from the Prague Discourse Treebank 2.0 (PDiT 2.0) and, as a supplementary resource, the Czech part of the Prague Czech–English Dependency Treebank with discourse annotation projected from the Penn Discourse Treebank 3.0. The most frequent entries in the lexicon (131 out of a total of 218 entries, covering more than 95% of discourse relations annotated in PDiT 2.0) have been manually checked, translated to English and supplemented with additional linguistic information.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

2. CzeDLex 1.0

Creator:: Mírovský, Jiří, Synková, Pavlína, Poláková, Lucie, Kloudová, Věra, and Rysová, Magdaléna
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, lexicon, and lexicalConceptualResource
Subject:: lexicon and discourse
Language:: Czech
Description:: CzeDLex 1.0 is the first production version (the fourth development version) of the Lexicon of Czech discourse connectives. The lexicon contains connectives partially automatically extracted from resources annotated manually with discourse relations: the Prague Discourse Treebank 2.0 (PDiT 2.0) as the primary resource, and two supplementary resources: (i) the Czech part of the Prague Czech–English Dependency Treebank with discourse annotation projected from the Penn Discourse Treebank 3.0, and (ii) a thousand sentences selected from various fiction novels and transcriptions of public speeches. All 200 entries in the lexicon have been manually checked, translated to English and supplemented with additional linguistic information.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

3. EdUKate translation software 1

Creator:: Popel, Martin, Novák, Michal, Balhar, Jiří, Košarko, Ondřej, Mayer, Jiří, Poláková, Lucie, Kloudová, Věra, and Anisimova, Mariia
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: tool and toolService
Subject:: machine translation and Ukrainian
Language:: Ukrainian and Czech
Description:: This software package includes three tools: a web frontend for machine translation featuring phonetic transcription of Ukrainian suitable for Czech speakers, an API server, and a tool for translating documents with markup (html, docx, odt, pptx, odp,...). These tools are used in the Charles Translator service (https://translator.cuni.cz). The software was developed within the EdUKate project, which aims to help reduce the language barriers that non-Czech-speaking children in the Czech Republic face in the Czech school system. The project focuses on the development and dissemination of multilingual digital learning materials for students in primary and secondary schools.
Rights:: BSD 2-Clause "Simplified" or "FreeBSD" license, http://opensource.org/licenses/BSD-2-Clause, and PUB

4. EMMT (Eyetracked Multi-Modal Translation)

Creator:: Bhattacharya, Sunit, Kloudová, Věra, Zouhar, Vilém, and Bojar, Ondřej
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: audio and corpus
Subject:: sight translation and multi-modal
Language:: English and Czech
Description:: Eyetracked Multi-Modal Translation (EMMT) is a simultaneous eye-tracking, 4-electrode EEG and audio corpus for multi-modal reading and translation scenarios. It contains monocular eye movement recordings, audio data and 4-electrode wearable electroencephalogram (EEG) data from 43 participants engaged in sight translation supported by an image. Details about the experiment and the dataset can be found in the README file.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB

5. Optimal reference translation of English-Czech WMT2020

Creator:: Kloudová, Věra, Mraček, David, Bojar, Ondřej, and Popel, Martin
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: translational equivalence, reference translation, optimal reference translation, and WMT
Language:: Czech and English
Description:: We define "optimal reference translation" as a translation thought to be the best possible that can be achieved by a team of human translators. Optimal reference translations can be used in assessments of excellent machine translations. We selected 50 documents (online news articles, with 579 paragraphs in total) from the 130 English documents included in the WMT2020 news test (http://www.statmt.org/wmt20/), aiming to preserve the diversity (style, genre, etc.) of the selection. In addition to the official Czech reference translation provided by the WMT organizers (P1), we hired two additional translators (P2 and P3, native Czech speakers) via a professional translation agency, resulting in three independent translations. The main contribution of this dataset is two additional translations (i.e. optimal reference translations N1 and N2), produced jointly by two translators-cum-theoreticians with extreme care for various aspects of translation quality, while taking into account the translations P1-P3. We also publish internal comments (in Czech) for some of the segments. Translation N1 should be closer to the English original (with regard to meaning and linguistic structure), and female surnames use the Czech feminine suffix (e.g. "Mai" is translated as "Maiová"). Translation N2 is freer, trying to be more creative, idiomatic and entertaining for readers and following the typical style used in Czech media, while still preserving the rules of functional equivalence. Translation N2 is missing for the segments where it was not deemed necessary to provide two alternative translations. For applications or analyses that need a translation of every segment, this should be interpreted as N2 being the same as N1 for the given segment. We provide the dataset in two formats: OpenDocument spreadsheet (odt) and plain text (one file for each translation and the English original).
Some words were highlighted using different colors during the creation of optimal reference translations; this highlighting and the comments are present only in the odt format (some comments refer to row numbers in the odt file). Documents are separated by empty lines and each document starts with a special line containing the document name (e.g. "# upi.205735"), which allows alignment with the original WMT2020 news test. For the segments where N2 translations are missing in the odt format, the respective N1 segments are used instead in the plain-text format.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
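
The plain-text layout described for this dataset (documents separated by empty lines, each opening with a name line such as "# upi.205735") could be read with a short Python sketch like the one below. It is only an illustration under stated assumptions: that the layout applies to the plain-text files, that each non-empty line after the name line holds one segment, and that the files are UTF-8. The function names `parse_documents` and `align_by_document` are hypothetical and are not part of the dataset.

```python
from typing import Dict, List, Tuple


def parse_documents(text: str) -> Dict[str, List[str]]:
    """Split plain-text content into {document name: [segments]}.

    Assumption: the first line after a blank line (or the start of the
    file) is a name line of the form "# <name>"; every following
    non-empty line is treated as one segment of that document.
    """
    docs: Dict[str, List[str]] = {}
    current = None  # name of the document being read, if any
    for line in text.splitlines():
        if not line.strip():
            current = None  # an empty line closes the current document
            continue
        if current is None:
            if not line.startswith("# "):
                raise ValueError(f"expected a document name line, got: {line!r}")
            current = line[2:].strip()
            docs[current] = []
        else:
            docs[current].append(line)
    return docs


def align_by_document(
    source: Dict[str, List[str]], target: Dict[str, List[str]]
) -> List[Tuple[str, str, str]]:
    """Pair segments of two parsed files by document name and position."""
    pairs: List[Tuple[str, str, str]] = []
    for name, src_segments in source.items():
        tgt_segments = target[name]
        if len(src_segments) != len(tgt_segments):
            raise ValueError(f"segment count mismatch in document {name}")
        for src, tgt in zip(src_segments, tgt_segments):
            pairs.append((name, src, tgt))
    return pairs
```

To use it, read two of the plain-text files (for example the English original and one translation) with `open(path, encoding="utf-8").read()`, pass each string to `parse_documents`, and hand both results to `align_by_document`. Because the plain-text files already substitute N1 for missing N2 segments, no extra fallback step should be needed for N2 under these assumptions.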

6. Optimal Reference Translations from English to Czech

Creator:: Zouhar, Vilém, Kloudová, Věra, Popel, Martin, and Bojar, Ondřej
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: translation, evaluation, and optimal reference translation
Language:: English and Czech
Description:: This corpus contains annotations of translation quality from English to Czech in seven categories at both the segment and document level. There are 20 documents in total, each with 4 translations (evaluated by each annotator in parallel) of 8 segments (a segment can be longer than one sentence). Apart from the evaluation, the annotators also proposed their own improved versions of the translations. There were 11 annotators in total, with expertise levels ranging from non-experts to professional translators.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB
