This is not the latest version of this item. The latest version can be found here.
Diadem Speech-Cognitive Dataset (DSCD-CZ)
Please use the following text to cite this item or export to a predefined format:
Šmídl, Luboš; et al., 2025,
Diadem Speech-Cognitive Dataset (DSCD-CZ), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-5912.
Authors
Šmídl, Luboš ; et al.
Item identifier
Project URL
Date issued
2025-05-29
Size
268 entries
Language(s)
Description
The dataset was created to investigate the speech and cognitive performance of people with varying degrees of cognitive impairment, primarily dementia. The dataset contains a comprehensive set of data including the results of standardized neuropsychological tests (RBANS, ALBA, POBAV, MASTCZ), speech tasks focused on comprehension, memory, naming, and repetition, and demographic data (age, gender, education).
Participants were divided into four groups based on clinical assessment: healthy individuals, healthy individuals with possible mild cognitive impairment, patients with mild cognitive impairment, and patients with dementia. All recordings and examinations were managed as part of routine clinical practice in the neurological outpatient clinic – Memory Disorders Advisory Unit, at the Neurological Clinic of the Faculty Hospital Královské Vinohrady. The dataset, which contains 268 examinations, was divided into training and test parts using stratification by clinical group, age, gender, and level of education to ensure an even distribution of these key characteristics in both parts of the data.
The aim of the dataset is to support the development of methods for automated detection of cognitive disorders based on speech analysis and cognitive performance. The data are suitable for research in the areas of clinical neuropsychology, computational linguistics, and machine learning. The dataset is intended for non-commercial research purposes.
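The stratified division into training and test parts mentioned in the description can be illustrated with a minimal sketch. It is not the authors' procedure: the field names (`group`, `age_band`, `gender`, `education`), the test fraction, and the idea of binning age into bands are all assumptions made for illustration only.

```python
import random
from collections import defaultdict


def stratified_split(records, keys, test_fraction=0.2, seed=0):
    """Split records so that every combination of `keys` values is
    represented in both parts in roughly the same proportion.

    `records` is a list of dicts; `keys` names the fields to stratify on.
    Field names and the default test fraction are hypothetical.
    """
    # Group records by their combination of stratification values.
    strata = defaultdict(list)
    for record in records:
        strata[tuple(record[k] for k in keys)].append(record)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for stratum_key in sorted(strata):
        members = strata[stratum_key]
        rng.shuffle(members)
        # Take the same fraction from each stratum for the test part.
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test
```

With many stratification variables and only 268 examinations, some strata will hold very few records, so a real split would likely need coarser bins or a fallback for small strata; the sketch does not handle that case.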
Acknowledgement
Technology Agency of the Czech Republic
Project code: TQ01000332
Project name: Telemedicine self-examination of speech and memory for rapid detection of cognitive impairments using machine learning methods
This item is Publicly Available
and licensed under:
Files in this item
- Name: DiademSpeechCognitiveDataset.md
- Size: 16.97 KB
- Format: application/octet-stream
- Description: Readme
- MD5: 3f499281751e97066a23c736a69c3876

- Name: DiademSpeechCognitiveDataset.zip
- Size: 797.27 KB
- Format: application/zip
- Description: Data
- MD5: b597b55576e9be2f4e191cb5fb378b92
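The MD5 values listed for the two files can be used to check a download for corruption. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and it assumes the files were saved under the names shown on this page.

```python
import hashlib
from pathlib import Path

# MD5 values as listed on this page for the two downloadable files.
EXPECTED_MD5 = {
    "DiademSpeechCognitiveDataset.md": "3f499281751e97066a23c736a69c3876",
    "DiademSpeechCognitiveDataset.zip": "b597b55576e9be2f4e191cb5fb378b92",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file's MD5 matches the value listed for its name.

    Raises KeyError if the file name is not one of the two listed files.
    """
    expected = EXPECTED_MD5[Path(path).name]
    return md5_of_file(path) == expected
```

MD5 is adequate for detecting accidental corruption during transfer, which is how it is used here; it is not a safeguard against deliberate tampering.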

- transcriptions_wav2vec_lm-extra04_20250430.json (1 MB)
- transcriptions_zipformer_20250430.json (1 MB)
- metadata_20250430.json (777 kB)
- transcriptions_wav2vec_nolm_20250430.json (1 MB)
- recordings_20250430.json (1 MB)
- train_20250430.json (6 kB)
- transcriptions_annotation_20250430.json (243 kB)
- sessions_20250430.json (26 kB)
- ddd.yaml (428 B)
- test_20250430.json (1 kB)
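Since the internal structure of the JSON files is documented in the Readme rather than on this page, a first step after downloading might be to inspect each file's top-level shape. The sketch below does only that and makes no claim about the schema; it assumes each `.json` member is a single JSON document (not JSON Lines), and the function name is illustrative.

```python
import json
import zipfile


def summarize_json_members(zip_path):
    """Map each .json member of the archive to a tuple of
    (top-level type name, number of top-level items)."""
    summary = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if not name.endswith(".json"):
                continue  # non-JSON members such as ddd.yaml are skipped
            with archive.open(name) as member:
                data = json.load(member)
            # Lists and dicts report their length; scalars count as one item.
            size = len(data) if isinstance(data, (list, dict)) else 1
            summary[name] = (type(data).__name__, size)
    return summary
```

Calling `summarize_json_members("DiademSpeechCognitiveDataset.zip")` on the downloaded archive would return one entry per JSON file, which can then be compared against the field descriptions in the Readme.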

