
 
dc.contributor.author Bučková, Aneta
dc.contributor.author Nekula, Marek
dc.contributor.author Lukeš, David
dc.contributor.author Woźniak, Michał
dc.contributor.author Wastl, Michael
dc.contributor.author Polowy, Louisa
dc.date.accessioned 2023-02-24T17:10:50Z
dc.date.available 2023-02-24T17:10:50Z
dc.date.issued 2023-01-04
dc.identifier.uri http://hdl.handle.net/11372/LRT-4777
dc.description LANGUAGES IN MIGRATION is a corpus of authentic spoken Czech and German as used in informal speech (private setting, spontaneous, unprepared, etc.) by Czech-German bilingual speakers who were born in Czechoslovakia around 1955 and left for Germany after the age of 12. The corpus consists of language-biography interviews conducted between 2018 and 2020 with 20 speakers, narrated in Czech and in German. 10 interviews were recorded with late (German) repatriates and 10 with Czech migrants. The corpus includes transcripts of ca. 14 hours of Czech recordings and ca. 13.5 hours of German recordings. It contains 217,650 orthographic words (286,533 tokens in total, including punctuation). The metadata include basic sociolinguistically relevant speaker categories (gender, year of birth and of migration, level of education, region of childhood, and region of present residence). The transcription is linked to the corresponding audio track. It was carried out on an orthographic tier and supplemented by an additional metalanguage tier. The corpus is lemmatized and morphologically tagged, with different tag formats for Czech and German (the Stuttgart-Tübingen-Tagset for German). Deviations from the norm of the Czech and German spoken in the homeland, which are understood as the result of language contact and language isolation, are tagged on a further tier in both the Czech and the German subcorpora. The (anonymized) corpus is provided in the form of transcripts in EAF format, which can be viewed with the freely available ELAN program, and in a (semi-XML) vertical format used as input to the Manatee query engine. The data thus correspond to the corpus available via the KonText query engine to registered users of the CNC at http://www.korpus.cz.
dc.language.iso deu
dc.language.iso ces
dc.publisher Faculty of Arts, Institute of the Czech National Corpus, Charles University in Prague
dc.publisher Universität Regensburg
dc.relation.isreferencedby https://bruecken.ff.cuni.cz/magazin/2-28-2021/
dc.rights Czech National Corpus (Shuffled Corpus Data)
dc.rights.uri https://lindat.mff.cuni.cz/repository/xmlui/page/license-cnc
dc.subject spoken language
dc.subject bilingual
dc.subject syntactic annotation
dc.subject migrant language
dc.subject narrative interviews
dc.subject language biography
dc.title Languages in Migration
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
dc.rights.label ACA
has.files yes
branding LRT + Open Submissions
demo.uri https://www.korpus.cz/kontext/query?corpname=jazyky_v_migraci
contact.person Marek Nekula marek.nekula@ur.de Universität Regensburg
sponsor Deutsche Forschungsgemeinschaft HA 2659/9-1 Language across generations: contact induced change in morphosyntax in German-Polish bilingual speech nationalFunds
files.size 5240851
files.count 6
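
The description above mentions a (semi-XML) vertical format for the Manatee query engine. In such files each token generally sits on its own line with tab-separated attributes, while structures appear as XML-like tag lines. The following is a minimal reading sketch under stated assumptions: the column order (word, lemma, tag), the sample lines, and the document id are hypothetical and are not taken from vrt.zip; the manual in the item should be consulted for the actual attribute layout.

```python
from typing import Iterable, Iterator, List, Tuple


def read_vertical(lines: Iterable[str]) -> Iterator[Tuple[str, List[str]]]:
    """Yield ("tag", [raw_line]) for structure lines and
    ("token", columns) for token lines of a vertical file."""
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip empty lines
        # Heuristic: a line that starts with "<" and ends with ">" is
        # treated as a structure tag, anything else as a token line.
        if line.startswith("<") and line.endswith(">"):
            yield "tag", [line]
        else:
            yield "token", line.split("\t")


# Hypothetical sample; the real attribute columns may differ.
sample = [
    '<doc id="hypothetical_01">\n',
    "<s>\n",
    "to\tten\tP\n",
    "bylo\tbýt\tV\n",
    "</s>\n",
    "</doc>\n",
]

events = list(read_vertical(sample))
tokens = [cols for kind, cols in events if kind == "token"]
```

For a real file, the same function could be fed an open text handle (UTF-8 is an assumption) instead of the in-memory list.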


 Files in this item

This item is for Academic Use and licensed under: Czech National Corpus (Shuffled Corpus Data) (Attribution Required, Noncommercial)
Name: file2speaker.csv
Size: 10.4 KB
Format: Unknown
Description: Metadata: linking speakers to files
MD5: 4ddb65907fd495b33721d40d3624381a

Name: files.csv
Size: 16.56 KB
Format: Unknown
Description: Metadata: information about files
MD5: c622f8f5d80f7c6fd2009c526b3df6ac

Name: speakers.csv
Size: 3 KB
Format: Unknown
Description: Metadata: information about speakers
MD5: 899f87a68fe0e890aad1121366bb6019

Name: manual_repozitar_Jazyky_v_migraci.pdf
Size: 141.61 KB
Format: PDF
Description: Manual Jazyky v migraci / Languages in Migration
MD5: 49090f4045da9aa9d057c078cf3e3318

Name: eaf.zip
Size: 2.26 MB
Format: application/zip
Description: transcripts in ELAN
MD5: d53d4805033ce4a0f7cc75458ce6e60d

Name: vrt.zip
Size: 2.57 MB
Format: application/zip
Description: transcripts in vertical format
MD5: 892d9166bcec5de0e91f65b6eaa36ebb
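
The MD5 values listed for the six files can be used to check that a download arrived intact. The sketch below shows one way to do this with Python's standard library; the file names and checksums are copied from this record, while the function names and the download directory are illustrative assumptions.

```python
import hashlib
from pathlib import Path

# File names and MD5 checksums as listed in this item record.
EXPECTED_MD5 = {
    "file2speaker.csv": "4ddb65907fd495b33721d40d3624381a",
    "files.csv": "c622f8f5d80f7c6fd2009c526b3df6ac",
    "speakers.csv": "899f87a68fe0e890aad1121366bb6019",
    "manual_repozitar_Jazyky_v_migraci.pdf": "49090f4045da9aa9d057c078cf3e3318",
    "eaf.zip": "d53d4805033ce4a0f7cc75458ce6e60d",
    "vrt.zip": "892d9166bcec5de0e91f65b6eaa36ebb",
}


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory: Path) -> dict:
    """Map each expected file name to True if it exists in `directory`
    and its MD5 matches the listed value, otherwise False."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = directory / name
        results[name] = path.is_file() and md5_of(path) == expected
    return results
```

Calling `verify(Path("downloads"))`, where `downloads` is a hypothetical folder holding the files, returns a dictionary that flags any missing or altered file with `False`.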
