Datasets and R scripts for modelling Czech translation counterparts of Romance causative constructions
Please use the following text to cite this item:
Štichauer, Pavel and Čermák, Petr, 2026,
Datasets and R scripts for modelling Czech translation counterparts of Romance causative constructions, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-5841.
Authors
Štichauer, Pavel; Čermák, Petr
Item identifier
http://hdl.handle.net/11234/1-5841
Date issued
2026-03-16
Size
1.4 MB
Description
This repository contains the datasets and code used in the study “Predicting translation counterparts in causative constructions.”
The datasets consist of annotated examples of Italian and Spanish causative constructions and their Czech translation counterparts. The repository includes (i) full annotated datasets for Italian and Spanish, (ii) revised datasets used for statistical modelling, and (iii) the R script used to estimate Bayesian multinomial regression models using the brms package (Stan backend).
The models estimate the probability of selecting a Czech translation counterpart (TYPE) as a function of verb valency (VALENCY) and complement class (COMP_CLASS), with random effects for VERB and TRANSLATOR.
The repository also contains summaries of the fitted models.
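As a reading aid for the model description above, the following minimal sketch shows how linear predictors in a categorical (multinomial logit) model translate into category probabilities. It assumes the usual parameterisation in which the reference category has its linear predictor fixed at 0; the labels B, C, D, E, F and X follow the fitted-model summary in this item, while the label `reference` and all numeric values are hypothetical placeholders, not estimates from the study.

```python
import math


def category_probabilities(linear_predictors, reference="reference"):
    """Convert per-category linear predictors into probabilities.

    The reference category is given a linear predictor of 0, and the
    remaining categories are compared against it (softmax).
    """
    scores = {reference: 0.0, **linear_predictors}
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores.values())
    exp_scores = {label: math.exp(value - top) for label, value in scores.items()}
    total = sum(exp_scores.values())
    return {label: value / total for label, value in exp_scores.items()}


# Hypothetical linear predictor values, NOT estimates from the fitted models.
example_mu = {"B": 0.5, "C": -1.0, "D": -2.0, "E": -2.5, "F": -1.5, "X": -0.5}
probs = category_probabilities(example_mu)
for label, p in probs.items():
    print(f"{label}: {p:.3f}")
```

In the fitted models each linear predictor would combine the population-level effects of VALENCY and COMP_CLASS with the group-level terms for VERB and TRANSLATOR; the sketch only covers the final step from linear predictors to probabilities.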
Acknowledgement
Ministry of Education, Youth and Sports of the Czech Republic
Project code: CZ.02.01.01/00/22_008/0004595
Project name: Beyond Security: Role of Conflict in Resilience-Building
This item is Publicly Available
and licensed under:
Files in this item
- Name: Causatives_table_full_Spanish.xlsx
- Size: 752.69 KB
- Format: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
- MD5: 48cb5346c5d691d09723099e97d4428c

- Name: Causatives_table_full_Italian.xlsx
- Size: 397.49 KB
- Format: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
- MD5: 250e3367f4c728567123f50a2f8aa8f4

- Name: README_causatives_repository.txt
- Size: 1.41 KB
- Format: text/plain
- MD5: d84644028ea2f287973b3f9a37bf65dc

README – Supplementary materials

This repository contains the data and code used in the article:
“Predicting translation counterparts in causative constructions”

FILES INCLUDED

DATA
- Causatives_table_full_Spanish.xlsx
  Full annotated dataset for Spanish causative constructions and their Czech translation counterparts.
- Causatives_table_full_Italian.xlsx
  Full annotated dataset for Italian causative constructions and their Czech translation counterparts.
- causatives_es_revised.csv
  Revised Spanish dataset used for statistical modelling, including the variable COMP_CLASS.
- causatives_it_revised.csv
  Revised Italian dataset used for statistical modelling, including the variable COMP_CLASS.

CODE
- brms_causatives_models.R
  R script used to estimate the Bayesian multinomial regression models reported in the paper. The models were estimated using the brms package (Stan backend).

MODEL OUTPUT
- brms_summary_causatives.txt
  Summaries of the fitted Bayesian multinomial models for both datasets.

MODEL SPECIFICATION
The models estimate the probability of selecting a Czech translation counterpart (TYPE) as a function of:
- VALENCY (valency of the base verb)
- COMP_CLASS (class of the complement)
with random effects for:
- VERB (random intercepts and slopes for VALENCY)
- TRANSLATOR (random intercepts)

The models were fitted using Hamiltonian Monte Carlo as implemented in Stan via the brms package in R.
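Before modelling, it can help to inspect how the outcome TYPE is distributed across the levels of a predictor such as VALENCY. The sketch below is one possible way to do this with the Python standard library. It assumes that the revised CSV files use a comma delimiter and carry column headers named exactly like the variables in the README (TYPE, VALENCY, COMP_CLASS, VERB, TRANSLATOR); neither assumption is confirmed by this page. The rows used here are toy data for illustration only and are not taken from the datasets.

```python
import csv
import io
from collections import Counter


def crosstab(rows, row_key="VALENCY", col_key="TYPE"):
    """Count how often each (row_key, col_key) combination occurs."""
    counts = Counter()
    for row in rows:
        counts[(row[row_key], row[col_key])] += 1
    return counts


# Toy rows with placeholder values (c1, v1, t1, ...); NOT real data.
toy_csv = io.StringIO(
    "TYPE,VALENCY,COMP_CLASS,VERB,TRANSLATOR\n"
    "B,transitive,c1,v1,t1\n"
    "B,transitive,c2,v2,t1\n"
    "C,reflexive,c1,v3,t2\n"
    "X,intransitive,c1,v1,t2\n"
    "X,transitive,c2,v2,t3\n"
)
rows = list(csv.DictReader(toy_csv))
counts = crosstab(rows)

for (valency, type_label), n in sorted(counts.items()):
    print(f"{valency:<14}{type_label:<4}{n}")
```

To run the same tabulation on a downloaded file, the `io.StringIO` object could be replaced by `open("causatives_it_revised.csv", newline="", encoding="utf-8")`, with the encoding and delimiter adjusted if the file turns out to differ.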
- Name: brms_causatives_models.R
- Size: 1.08 KB
- Format: application/octet-stream
- MD5: 704d59f4a2ec2dc48f083eccef06d331

- Name: brms_summary_causatives.txt
- Size: 20.13 KB
- Format: text/plain
- MD5: d225ad3c0e22d05bcc5034cdea76dbe6

> summary(brms_it_noanim)
 Family: categorical
  Links: muB = logit; muC = logit; muD = logit; muE = logit; muF = logit; muX = logit
Formula: TYPE ~ VALENCY + COMP_CLASS + (1 + VALENCY | VERB) + (1 | TRANSLATOR)
   Data: it (Number of observations: 1394)
  Draws: 4 chains, each with iter = 4000; warmup = 1000; thin = 1;
         total post-warmup draws = 12000

Multilevel Hyperparameters:
~TRANSLATOR (Number of levels: 15)
                  Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(muB_Intercept)     0.41      0.25     0.03     0.97 1.00     2192     2556
sd(muC_Intercept)     0.76      0.29     0.32     1.44 1.00     4258     5629
sd(muD_Intercept)     0.51      0.35     0.03     1.33 1.00     3006     4035
sd(muE_Intercept)     0.87      0.51     0.10     2.11 1.00     2721     2697
sd(muF_Intercept)     0.39      0.26     0.02     1.00 1.00     2508     3392
sd(muX_Intercept)     0.86      0.33     0.34     1.66 1.00     3778     4805

~VERB (Number of levels: 63)
                          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(muB_Intercept)             1.43      0.22     1.05     1.91 1.00     3231     5719
sd(muB_VALENCYreflexive)      1.47      0.97     0.08     3.74 1.00     3007     3738
sd(muB_VALENCYtransitive)     0.73      0.52     0.03     1.93 1.01     1353     2253
sd(muC_Intercept)             1.87      0.32     1.29     2.57 1.00     2879     5073
sd(muC_VALENCYreflexive)      0.85      0.72     0.04     2.65 1.00     2437     3604
sd(muC_VALENCYtransitive)     0.67      0.53     0.03     1.99 1.00     1439     1685
sd(muD_Intercept)             1.97      0.46     1.20     3.05 1.00     3270     5731
sd(muD_VALENCYreflexive)      1.83      1.39     0.08     5.17 1.00     3972     5000
sd(muD_VALENC . . .

- Name: causatives_es_revised.csv
- Size: 98.96 KB
- Format: text/csv
- MD5: 4f3bba1e0c12890b58d4dd3de7088d98

- Name: causatives_it_revised.csv
- Size: 93.68 KB
- Format: text/csv
- MD5: 7a319ccd13a3269c26da517408cf7bde
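
The MD5 values listed for the files in this item can be used to check that a download is complete and unmodified. The following is a minimal sketch of such a check, assuming the files have been saved under their original names in a single local directory; the function names `md5_of_file` and `verify` are illustrative and not part of the repository.

```python
import hashlib
from pathlib import Path

# File names and MD5 checksums as listed for this item.
EXPECTED_MD5 = {
    "Causatives_table_full_Spanish.xlsx": "48cb5346c5d691d09723099e97d4428c",
    "Causatives_table_full_Italian.xlsx": "250e3367f4c728567123f50a2f8aa8f4",
    "README_causatives_repository.txt": "d84644028ea2f287973b3f9a37bf65dc",
    "brms_causatives_models.R": "704d59f4a2ec2dc48f083eccef06d331",
    "brms_summary_causatives.txt": "d225ad3c0e22d05bcc5034cdea76dbe6",
    "causatives_es_revised.csv": "4f3bba1e0c12890b58d4dd3de7088d98",
    "causatives_it_revised.csv": "7a319ccd13a3269c26da517408cf7bde",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each expected file name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results
```

Calling `verify("path/to/downloads")` returns a dictionary of file names and booleans; a `False` entry means the file is missing from that directory or its checksum differs from the listed value.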


