This is not the latest version of this item. The latest version can be found here.
Nottinghamer Korpus Deutscher YouTube-Sprache (The NottDeuYTSch Corpus)
Please use the following text to cite this item:
Cotgrove, Louis Alexander, 2018,
Nottinghamer Korpus Deutscher YouTube-Sprache (The NottDeuYTSch Corpus), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11372/LRT-4779.
Authors
Cotgrove, Louis Alexander
Item identifier
http://hdl.handle.net/11372/LRT-4779
Date issued
2018
Size
33,760,494 tokens
32,549,462 words
Language(s)
German
Description
The NottDeuYTSch corpus contains over 33 million words taken from approximately 3 million YouTube comments on videos published between 2008 and 2018 and targeted at a young, German-speaking demographic. It represents an authentic snapshot of the language of young German speakers. For optimal representativeness and balance, the corpus was sampled proportionally by video category and year from a database of 112 popular German-language YouTube channels in the DACH region (Germany, Austria, Switzerland). Each comment carries a considerable amount of associated metadata, which enables further longitudinal and cross-sectional analyses.
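The description says the corpus was sampled proportionally by video category and year but does not spell out the procedure. The following is only a minimal sketch of one common way to do proportional (stratified) sampling, not the author's actual method. The function name `proportional_sample` and the record fields `category` and `year` are hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict


def proportional_sample(comments, total, seed=0):
    """Draw roughly `total` comments, giving each (category, year) stratum
    a share proportional to its size in the full pool.

    `comments` is a list of dicts with hypothetical keys "category" and
    "year". Because each share is rounded, the result may differ from
    `total` by a few items.
    """
    rng = random.Random(seed)

    # Group comments into strata keyed by (category, year).
    strata = defaultdict(list)
    for comment in comments:
        strata[(comment["category"], comment["year"])].append(comment)

    pool_size = len(comments)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        # Proportional allocation, capped at the stratum size.
        share = min(len(members), round(total * len(members) / pool_size))
        sample.extend(rng.sample(members, share))
    return sample
```

With a pool in which one stratum holds 60% of the comments, about 60% of the sample is drawn from that stratum, so the sample mirrors the category and year distribution of the pool.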
Publisher
Collections
This item is Publicly Available
and licensed under:
Files in this item
- Name: NottDeuYTSch_Corpus.rda
- Size: 280.29 MB
- Format: application/octet-stream
- Description: Unknown
- MD5: e66260b11688917660e5ca511de4d066
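The MD5 value listed above can be used to check that a downloaded copy of the file is intact. The sketch below uses only Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_expected` are hypothetical, and the local file path is assumed to be wherever the download was saved.

```python
import hashlib

# MD5 checksum as given in the file listing for NottDeuYTSch_Corpus.rda.
EXPECTED_MD5 = "e66260b11688917660e5ca511de4d066"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that a file of a few hundred megabytes
    does not have to fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals `expected`."""
    return md5_of_file(path) == expected.lower()
```

After downloading, `matches_expected("NottDeuYTSch_Corpus.rda")` should return `True` for an uncorrupted copy, assuming the file was saved under that name in the working directory.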



