Dictionaries with different representations for various languages. Representations include Brown clusters of different sizes and morphological dictionaries extracted using different morphological analyzers. All representations cover the 250,000 most frequent word types in the Wikipedia edition of the respective language.
Analyzers used: MAGYARLANC (Hungarian, Zsibrita et al. (2013)), FREELING (English and Spanish, Padro and Stanilovsky (2012)), SMOR (German, Schmid et al. (2004)), an MA from Charles University (Czech, Hajic (2001)) and LATMOR (Latin, Springmann et al. (2014)).
Czech translation of WordSim353. Czech translations of the English WordSim353 word pairs were obtained from four translators. All translation variants were scored by 25 Czech annotators according to the lexical similarity/relatedness annotation instructions for WordSim353 annotators. The resulting data set consists of two annotation files: "WordSim353-cs.csv" and "WordSim-cs-Multi.csv". Both files are encoded in UTF-8 and have a header; text is enclosed in double quotes, and columns are separated by commas. The rows are numbered. The WordSim-cs-Multi data set has rows numbered from 1 to 634, whereas the row indices in the WordSim353-cs data set reflect the corresponding row numbers in the WordSim-cs-Multi data set.
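The file format described above (UTF-8, header line, double-quoted text, comma-separated columns) can be read with a standard CSV parser. The following is a minimal sketch only: the column names "row", "cs_pair" and "cs_mean" and the toy values are hypothetical stand-ins, not the actual header or contents of the distributed files.

```python
import csv
import io

def read_annotation_rows(handle):
    """Return all rows of a comma-separated, double-quoted file with a header line."""
    return list(csv.DictReader(handle, delimiter=",", quotechar='"'))

# Toy stand-in for one of the files; header names and values are invented.
sample = io.StringIO(
    '"row","cs_pair","cs_mean"\n'
    '"1","variant A","6.5"\n'
    '"3","variant B","2.0"\n'
)
rows = read_annotation_rows(sample)
row_numbers = [int(r["row"]) for r in rows]  # row indices need not be contiguous
```

For a real file, the handle would typically be opened with open(path, encoding="utf-8", newline="").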
The WordSim353-cs file contains a one-to-one mapping selection of 353 Czech equivalent pairs whose judgments have proven to be most similar to the judgments of their corresponding English originals (compared by the absolute value of the difference between the means over all annotators in each language counterpart). In one case ("psychology-cognition"), two Czech equivalent pairs had identical means as well as confidence intervals, so we randomly selected one.
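The selection criterion described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: the dictionary keys ("en_pair", "cs_pair", "en_mean", "cs_scores") and the toy values are hypothetical, and ties are resolved by keeping the first variant seen rather than by random choice.

```python
from statistics import mean

def select_closest_variants(variants):
    """For each English pair, keep the Czech variant whose mean score
    is closest (in absolute difference) to the English mean."""
    best = {}
    for v in variants:
        diff = abs(mean(v["cs_scores"]) - v["en_mean"])
        current = best.get(v["en_pair"])
        # Strict '<' keeps the first variant on a tie (an assumption of this sketch).
        if current is None or diff < current[0]:
            best[v["en_pair"]] = (diff, v["cs_pair"])
    return {en_pair: cs_pair for en_pair, (_, cs_pair) in best.items()}

# Invented toy data: two English pairs with two Czech variants each.
toy_variants = [
    {"en_pair": "pair-1", "cs_pair": "variant A", "en_mean": 7.0, "cs_scores": [6.0, 7.0, 8.0]},
    {"en_pair": "pair-1", "cs_pair": "variant B", "en_mean": 7.0, "cs_scores": [4.0, 5.0, 6.0]},
    {"en_pair": "pair-2", "cs_pair": "variant C", "en_mean": 3.0, "cs_scores": [1.0, 1.0, 1.0]},
    {"en_pair": "pair-2", "cs_pair": "variant D", "en_mean": 3.0, "cs_scores": [2.0, 3.0, 2.5]},
]
selected = select_closest_variants(toy_variants)
```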
The "WordSim-cs-Multi.csv" file contains human judgments for all translation variants.
In both data sets, we preserved all 25 individual scores. In the WordSim353-cs data set, we added a column with their Czech means as well as a column containing the original English means, with the 95% confidence intervals in separate columns for each mean (computed by the CI function in the Rmisc R package). The WordSim-cs-Multi data set contains only the Czech means and confidence intervals. To make lexical search convenient, both data sets provide separate columns with the respective Czech and English single words, the entire word pairs, and an English-Czech quadruple.
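A mean with a 95% confidence interval over the 25 individual scores can be approximated as in the sketch below. It assumes a t-based interval (mean plus or minus the t critical value times the standard error), which is how the Rmisc CI function is understood to work; this is an assumption rather than something stated in the data set description. The critical value is hard-coded for 24 degrees of freedom, so the default is valid only for exactly 25 scores, and the input values are invented.

```python
import math
from statistics import mean, stdev

# Two-sided 95% t critical value for 24 degrees of freedom (n = 25 annotators).
T_975_DF24 = 2.0639

def mean_ci(scores, t_crit=T_975_DF24):
    """Return (lower, mean, upper) of a t-based confidence interval.

    The default t_crit only matches samples of exactly 25 scores;
    pass a different critical value for other sample sizes.
    """
    m = mean(scores)
    half_width = t_crit * stdev(scores) / math.sqrt(len(scores))
    return m - half_width, m, m + half_width

# Invented toy scores standing in for the 25 annotator judgments of one pair.
toy_scores = [float(i) for i in range(25)]
lower, centre, upper = mean_ci(toy_scores)
```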
The data set also contains an xls table with the four translations and a preliminary selection of the best variants performed by an adjudicator.
Segment from Československý zvukový týdeník Aktualita (Czechoslovak Aktualita Sound Newsreel) 1942, issue no. 32B, reports on a workers' holiday organized by the Reinhard Heydrich Foundation for Workers' Recuperation in Český Šternberk. A view of the exterior of the health resort. Holidaymakers are sunbathing on the terrace. A waiter is carrying plates full of food in the dining room. People are eating. A close-up of a man drinking beer from a beer mug. Holidaymakers playing volleyball. A fisherman is sitting on the bank of the Sázava River. People are bathing in the river and in the weir. Český Šternberk Castle can be seen in the background.
Segment from Československý zvukový týdeník Aktualita (Czechoslovak Aktualita Sound Newsreel) 1942, issue no. 24A, reports on a workers' holiday organized by the Reinhard Heydrich Foundation for Workers' Recuperation in Luhačovice. Footage of a train arriving at the railway station and the welcoming of the holidaymakers. Lunch is ready for visitors at a local restaurant. Holidaymakers rest on the hotel terrace, some play volleyball or skittles. Others explore the surrounding countryside. Footage of a walk to the Luhačovice Dam. Girls sit on the grass, weaving flower wreaths. Holidaymakers taste the local mineral water.
Segment from Československý zvukový týdeník Aktualita (Czechoslovak Aktualita Sound Newsreel) 1942, issue no. 32A, reports on a workers' holiday organized by the Reinhard Heydrich Foundation for Workers' Recuperation in the village of Věšín u Blatné. Holidaymakers walk through the health resort's gate. Morning exercise in the courtyard. A waiter carries plates full of food across the outdoor dining room, where people are eating. Footage of holidaymakers enjoying leisure activities: an improvised boxing match, swimming in the pool, and water sports. A view of an entrance arch with a sign saying "Welcome to the Workers' Health Resort".
Segment from Český zvukový týdeník Aktualita (Czech Aktualita Sound Newsreel) issue no. 51B from 1943 depicts the Youth Basketball Championship organised by the Board of Trustees for the Education of Youth and held in the Great Hall of Lucerna Palace in Prague from 10 to 12 December. The boys' final was won by the Central Bohemia I team, who beat the Brno Region I team 27:13. The girls' final was won by the Brno Region I team, who beat the team from Polabí 17:5.
Segment from Český zvukový týdeník Aktualita (Czech Aktualita Sound Newsreel) issue no. 7B from 1944 was shot at the Youth Ice Sports Championship, which culminated with the Ice Sports Week organised by the Board of Trustees for the Education of Youth at Štvanice Ice Arena in Prague from 1 to 6 February. The team LTC Prague beat the team SSC Říčany 4:0 to win the youth ice hockey final. General Secretary of the Board František Teuner presented diplomas to the winners of Ice Sports Week.
Segment from Český zvukový týdeník Aktualita (Czech Aktualita Sound Newsreel) issue no. 7A from 1944 contains footage from the Youth Ice Sports Championship, which culminated with the Ice Sports Week organised by the Board of Trustees for the Education of Youth and held at Štvanice Ice Arena in Prague from 1 to 6 February. The programme included a singles figure skating performance.
Segment from Český zvukový týdeník Aktualita (Czech Aktualita Sound Newsreel) issue no. 20A from 1944 was shot during the Youth Spring Day event organised by the Board of Trustees for the Education of Youth and held across the Protectorate on 7 May 1944. The purpose of the event was to renew folk customs and traditions. In Prague's Stromovka, girls in folk costumes danced a quadrille dance called "Česká beseda" under a maypole. Boys celebrated the day with races.