A view of the house no. 20 on Karlova Street, Prague. Sculptor Emanuel Kodet working in his studio. A view of the nearby Astronomical Tower at the Clementinum. His brother, sculptor Jan Kodet, working in his studio in a segment from Československý filmový týdeník (Czechoslovak Film Weekly Newsreel) 1958, issue no. 35. Jan Kodet with an unidentified man.
Historian Emanuel Svoboda at a gathering to mark the anniversary of the birth of Mikoláš Aleš in a fragmented segment from Československé filmové noviny (Czechoslovak Film News) 1952, issue no. 46. Svoboda on Bohumil Veselý's balcony. Svoboda with his wife Maryna (née Alšová) and an unidentified man on a park bench.
Emil František Burian during a guest performance in Zlín in a segment from Československý zvukový týdeník Aktualita (Czechoslovak Aktualita Sound Newsreel) 1939, issue no. 41B. Burian at his desk in a segment from Československý filmový týdeník (Czechoslovak Film Weekly Newsreel) 1954, issue no. 25.
Athlete Emil Zátopek wins the 5,000-metre race at the 1952 Summer Olympics in Helsinki in a segment from Československé filmové noviny (Czechoslovak Film News) 1952, issue no. 37. In 1952, he also wins the 5,000-metre race in Opava in a segment from Československé filmové noviny (Czechoslovak Film News) 1952, issue no. 42. Zátopek with his wife Dana Zátopková at Strahov Stadium in Prague. Zátopek accepting the Order of the Republic from the hands of Minister of Defence Alexej Čepička in 1952 in a segment from Československé filmové noviny (Czechoslovak Film News) 1952, issue no. 42.
Eyetracked Multi-Modal Translation (EMMT) is a simultaneous eye-tracking, 4-electrode EEG and audio corpus for multi-modal reading and translation scenarios. It contains monocular eye movement recordings, audio data and 4-electrode wearable electroencephalogram (EEG) data recorded from 43 participants as they performed sight translation supported by an image.
The details about the experiment and the dataset can be found in the README file.
Sentence-parallel corpus built from the English and Czech Wikipedias, based on articles translated from English into Czech.
The work is described in the paper: ŠTROMAJEROVÁ, Adéla, Vít BAISA and Marek BLAHUŠ. Between Comparable and Parallel: English-Czech Corpus from Wikipedia. In RASLAN 2016 Recent Advances in Slavonic Natural Language Processing. Brno: Tribun EU, 2016. pp. 3-8, 6 pp. ISBN 978-80-263-1095-2.
Enriched discourse annotation of a subset of the Prague Discourse Treebank, adding implicit relations, entity-based relations, question-answer relations and other discourse-structuring phenomena.
ESIC (Europarl Simultaneous Interpreting Corpus) is a corpus of 370 speeches (10 hours) in English, with manual transcripts, transcribed simultaneous interpreting into Czech and German, and parallel translations.
The corpus contains the source English video and audio recordings. The interpreters' voices are not published within the corpus, but a tool is provided that downloads them from the European Parliament website, where they are publicly available.
The transcripts are punctuated and carry metadata (disfluencies, mixing voices and languages, read or spontaneous speech, etc.) and word-level timestamps.
The speeches in the corpus come from the European Parliament plenary sessions held from 2008 to 2011. Most of the speakers are Members of the European Parliament (MEPs), both native and non-native speakers of English. The corpus contains metadata about the speakers (name, surname, id, fraction) and about the speech (date, topic, read or spontaneous).
The current version of ESIC is v1.0. It has validation and evaluation parts.