Diadem Speech-Cognitive Dataset (DSCD-CZ)

Please use the following text to cite this item:
Šmídl, Luboš; et al., 2025, Diadem Speech-Cognitive Dataset (DSCD-CZ), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-5912.
Date issued
2025-05-29
Size
268 entries
Language(s)
Description
The dataset was created to investigate the speech and cognitive performance of people with varying degrees of cognitive impairment, primarily dementia. It comprises the results of standardized neuropsychological tests (RBANS, ALBA, POBAV, MASTCZ); speech tasks focused on comprehension, memory, naming, and repetition; and demographic data (age, gender, education). Participants were divided into four groups based on clinical assessment: healthy individuals, healthy individuals with possible mild cognitive impairment, patients with mild cognitive impairment, and patients with dementia. All recordings and examinations were carried out as part of routine clinical practice in the neurological outpatient clinic (Memory Disorders Advisory Unit) of the Neurological Clinic of the Faculty Hospital Královské Vinohrady. The dataset, which contains 268 examinations, was divided into training and test parts using stratification by clinical group, age, gender, and level of education, so that these key characteristics are evenly distributed across both parts. The aim of the dataset is to support the development of methods for automated detection of cognitive disorders based on speech analysis and cognitive performance. The data are suitable for research in clinical neuropsychology, computational linguistics, and machine learning. The dataset is intended for non-commercial research purposes.
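The record does not state how the stratified split was computed. Purely as an illustration of the general technique, the sketch below groups records into strata and samples each stratum separately. The field names (`group`, `age`, `gender`, `education`), the 10-year age bands, and the test fraction are all assumptions made for this example; they are not taken from the dataset files.

```python
import random
from collections import defaultdict


def age_band(age, width=10):
    """Bin a continuous age into a coarse band (assumed width) usable as a stratum."""
    return (age // width) * width


def stratified_split(records, test_fraction=0.2, seed=0):
    """Split records into (train, test), sampling every stratum separately.

    Each record is a dict with the hypothetical keys
    'group', 'age', 'gender' and 'education'.
    """
    strata = defaultdict(list)
    for rec in records:
        key = (rec["group"], age_band(rec["age"]), rec["gender"], rec["education"])
        strata[key].append(rec)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for key in sorted(strata):  # sorted for a deterministic stratum order
        members = strata[key]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test
```

With many stratification variables and few records, some strata hold only one or two members and may contribute nothing to the test part after rounding; coarser bands or fewer variables reduce that effect.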
Acknowledgement
 Files in this item
Name
DiademSpeechCognitiveDataset.md
Size
16.97 KB
Format
application/octet-stream
Description
Readme
MD5
3f499281751e97066a23c736a69c3876
Name
DiademSpeechCognitiveDataset.zip
Size
797.27 KB
Format
application/zip
Description
Data
MD5
b597b55576e9be2f4e191cb5fb378b92
Preview
    • transcriptions_wav2vec_lm-extra04_20250430.json (1 MB)
    • transcriptions_zipformer_20250430.json (1 MB)
    • metadata_20250430.json (777 kB)
    • transcriptions_wav2vec_nolm_20250430.json (1 MB)
    • recordings_20250430.json (1 MB)
    • train_20250430.json (6 kB)
    • transcriptions_annotation_20250430.json (243 kB)
    • sessions_20250430.json (26 kB)
    • ddd.yaml (428 B)
    • test_20250430.json (1 kB)
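
The MD5 value listed above can be used to check that the archive downloaded intact before it is unpacked. The following is a minimal sketch using only the Python standard library. The checksum is copied from this record; the helper names (`md5_of_file`, `list_archive`, `check_download`) and the local path in the usage note are hypothetical. Nothing is assumed here about the internal structure of the JSON files.

```python
import hashlib
import zipfile

# MD5 of DiademSpeechCognitiveDataset.zip as listed in this record.
EXPECTED_MD5 = "b597b55576e9be2f4e191cb5fb378b92"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def list_archive(path):
    """Return (member name, uncompressed size in bytes) for each zip member."""
    with zipfile.ZipFile(path) as zf:
        return [(info.filename, info.file_size) for info in zf.infolist()]


def check_download(path, expected_md5=EXPECTED_MD5):
    """Raise ValueError on a checksum mismatch, otherwise list the archive."""
    actual = md5_of_file(path)
    if actual != expected_md5:
        raise ValueError(f"MD5 mismatch: expected {expected_md5}, got {actual}")
    return list_archive(path)
```

Assuming the archive was saved under its original name in the working directory, `check_download("DiademSpeechCognitiveDataset.zip")` would return the member list for comparison with the preview above. The preview sizes are rounded, so they will only approximately match the byte counts.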