Migrant Stories
Please use the following text to cite this item:
Hájek, Martin; Mírovský, Jiří and Hladká, Barbora, 2022,
Migrant Stories, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-4818.
Date issued
2022-10-22
Size
1017 entries
Description
Migrant Stories is a corpus of 1017 short biographic narratives of migrants, supplemented with meta information about the countries of origin and destination, the migrant's gender, GDP per capita of the respective countries, etc. The corpus has been compiled as teaching material for data analysis.
Acknowledgement
4EU+ European University Alliance
Project code: 2021_F3_10
Project name: @SWitCH: Crash Course on Data Analytics for Students of Social Studies and Humanities
This item is Publicly Available
and licensed under:
Files in this item
- Name: Migrant_Stories.zip
- Size: 761.94 KB
- Format: application/zip
- Description: Migrant Stories distribution
- MD5: d0b899ae9bc071f01f4c077b9fcb119f

- Migrant_Stories
  - README.TXT (2 kB)
  - data
    - migrant_stories.tsv (2 MB)
- Name: README.TXT
- Size: 2.95 KB
- Format: text/plain
- Description: Migrant Stories description
- MD5: b1f9ac74f0b13c1f4296a21f6ab111b7
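
The MD5 values listed for the two files can be used to check that a download arrived intact. The following is a minimal sketch in Python using only the standard library; the function names `md5_of_file` and `matches_expected` are hypothetical helpers introduced for illustration, and the local file path is whatever the download was saved as.

```python
import hashlib

# MD5 values as published in the file listing of this item.
EXPECTED_MD5 = {
    "Migrant_Stories.zip": "d0b899ae9bc071f01f4c077b9fcb119f",
    "README.TXT": "b1f9ac74f0b13c1f4296a21f6ab111b7",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Return True if the file's MD5 digest equals `expected_md5`."""
    return md5_of_file(path) == expected_md5.lower()
```

For example, `matches_expected("Migrant_Stories.zip", EXPECTED_MD5["Migrant_Stories.zip"])` should return `True` for an unmodified copy of the archive, assuming it was saved under that name in the current directory.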

===============
Migrant Stories
===============

Authors
=======

Martin Hájek (martin.hajek@fsv.cuni.cz)
Jiří Mírovský (mirovsky@ufal.mff.cuni.cz)
Barbora Hladká (hladka@ufal.mff.cuni.cz)

Introduction
============

Migrant Stories is a corpus of 1017 short biographic narratives of migrants originally published on https://iamamigrant.org/stories/. For the original site, the narratives had been adapted by people or organizations submitting the particular story and eventually selected for publication by The International Organization for Migration (IOM, the UN organization providing help for migrants). It is a very heterogeneous sample of migrants' stories and cannot be taken as a representative or unbiased sample of migrant experiences over the world.

In the Migrant Stories corpus, the narratives have been supplemented with meta information about countries of origin/destination, the migrant gender, GDP per capita of the respective countries etc., see below for details.

The Migrant Stories corpus was compiled for students in the course NPFL134 (Data Analytics for Students of Social Studies and Humanities) at the Institute of Formal and Applied Linguistics in the summer semester of 2022 (https://ufal.mff.cuni.cz/courses/npfl134), as a teaching material for data analysis.

Data Format
===========

The data are distributed in a single TSV (tab-separated values) file. Each story is represented by a single row containing the following fields (columns):

- id_story (a numerical id from 1 to 1017)
- name (the name of the migrant)
- country_or (the country of origin)
- country_de (the destination country)
- conti_or (the original (part of) continent; A for Africa, E for Europe, I for Asia, LA for Latin America, M for Middle East, NA for North America, O for other)
- conti_de (the destination (part of) continent)
- distance (classification of the distance from the origin to the destination into two classes: close, far)
- country_or_gdp (GDP per capita of the original . . .
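
As an illustration of the data format described in the README, here is a minimal sketch in Python that reads the TSV file and counts stories per origin continent. It rests on several assumptions not confirmed by the text above: that the first row of the file is a header carrying the field names, that the file is UTF-8 encoded, and that it sits at `Migrant_Stories/data/migrant_stories.tsv` after unpacking. The function names `read_stories` and `count_by_origin_continent` are hypothetical, and only fields named in the README are used.

```python
import csv
from collections import Counter

# Continent codes as documented for the conti_or and conti_de fields.
CONTINENT_CODES = {
    "A": "Africa",
    "E": "Europe",
    "I": "Asia",
    "LA": "Latin America",
    "M": "Middle East",
    "NA": "North America",
    "O": "other",
}


def read_stories(path):
    """Read the TSV file into a list of dicts, one per story.

    Assumes the first row is a header with the field names
    (id_story, name, country_or, ...).
    """
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def count_by_origin_continent(rows):
    """Count stories per origin continent, using full continent names."""
    counts = Counter()
    for row in rows:
        code = row.get("conti_or", "")
        # Fall back to the raw code if it is not one of the documented ones.
        counts[CONTINENT_CODES.get(code, code)] += 1
    return counts
```

A typical call would be `count_by_origin_continent(read_stories("Migrant_Stories/data/migrant_stories.tsv"))`. If the story texts contain unbalanced double quotes, passing `quoting=csv.QUOTE_NONE` to `csv.DictReader` may be necessary; whether it is needed depends on how the file was written.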

