Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2016 – VERSION 1)
Please use the following text to cite this item:
Rüdiger, Jan Oliver, 2024,
Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2016 – VERSION 1), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11372/LRT-5790.
Authors
Rüdiger, Jan Oliver
Item identifier
http://hdl.handle.net/11372/LRT-5790
Date issued
2024-11-12
Size
3357178 texts,
3872097668 tokens
Language(s)
German
Description
The ‘Ancillary Monitor Corpus: Common Crawl - german web’ was designed to enable broad-based linguistic analysis of the German-language (visible) internet over time, and to be comparable with DeReKo (the ‘German Reference Corpus’ of the Leibniz Institute for the German Language; 57 billion tokens as of DeReKo Release 2024-I). The corpus is split by year (here: 2016) and versioned (here: version 1). Across all years (2013-2024), version 1 comprises 97.45 billion tokens.
The corpus is based on the data dumps from CommonCrawl (https://commoncrawl.org/). CommonCrawl is a non-profit organisation that provides copies of the visible Internet free of charge for research purposes.
The CommonCrawl WET raw data was first filtered by TLD (top-level domain). Only pages whose domains end in one of the following TLDs were taken into account: ‘.at; .bayern; .berlin; .ch; .cologne; .de; .gmbh; .hamburg; .koeln; .nrw; .ruhr; .saarland; .swiss; .tirol; .wien; .zuerich’. These are the exclusively German-language TLDs according to ICANN (https://data.iana.org/TLD/tlds-alpha-by-domain.txt) as of 1 June 2024; TLDs with a purely corporate reference (e.g. ‘.edeka; .bmw; .ford’) were excluded. The language of each document (URL) was then estimated with NTextCat (https://github.com/ivanakcheurov/ntextcat), using its CORE14 profile. Only documents/URLs for which German was the most likely language were processed further, e.g. to exclude foreign-language material such as individual subpages. The third step filtered by manual selectors and removed 1:1 duplicates (within one year).
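The filtering steps described above can be illustrated with a minimal Python sketch. It is not the code used to build the corpus (that work was done with CorpusExplorer and supplementary scripts). The function names are hypothetical, `is_german` is a placeholder for the NTextCat language check, the use of SHA-1 for duplicate detection is an assumption, and the manual selectors are not modelled.

```python
import hashlib
from urllib.parse import urlsplit

# TLD allowlist as given in the corpus description.
ALLOWED_TLDS = {
    "at", "bayern", "berlin", "ch", "cologne", "de", "gmbh", "hamburg",
    "koeln", "nrw", "ruhr", "saarland", "swiss", "tirol", "wien", "zuerich",
}


def has_allowed_tld(url):
    """Return True if the URL's host name ends in one of the allowed TLDs."""
    host = urlsplit(url).hostname or ""
    return host.rsplit(".", 1)[-1].lower() in ALLOWED_TLDS


def filter_documents(docs, is_german):
    """Yield (url, text) pairs that pass the TLD, language and duplicate checks.

    docs      -- iterable of (url, text) pairs belonging to one year
    is_german -- hypothetical callable standing in for the language check
    """
    seen = set()  # digests of texts already emitted for this year
    for url, text in docs:
        if not has_allowed_tld(url):
            continue
        if not is_german(text):
            continue
        # Assumption: exact (1:1) duplicates are detected via a hash of the text.
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        yield url, text
```

Keeping the duplicate set per call mirrors the statement that duplicates were removed within one year, not across years.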
The filtering and subsequent processing were carried out using CorpusExplorer (http://hdl.handle.net/11234/1-2634) and our own (supplementary) scripts, and the TreeTagger (http://hdl.handle.net/11372/LRT-323) was used for automatic annotation. The corpus was processed on the HELIX HPC cluster. The author would like to take this opportunity to thank the state of Baden-Württemberg and the German Research Foundation (DFG) for the opportunity to use the bwHPC/HELIX HPC cluster - funding code HPC cluster: INST 35/1597-1 FUGG.
Data content:
- Tokens and sentence boundaries
- Automatic lemma and POS annotation (using TreeTagger)
- Metadata:
- GUID - Unique identifier of the document
- YEAR - Year of capture (please use this information for data slices)
- Url - Full URL
- Tld - Top-Level Domain
- Domain - Domain without TLD (but with sub-domains if applicable)
- DomainFull - Complete domain (incl. TLD)
- Datum - (System Information): Date of the CorpusExplorer (date of capture by CommonCrawl - not date of creation/modification of the document).
- Hash - (System Information): SHA1 hash of the CommonCrawl
- Pfad - (System Information): Path of the cluster (raw data) - is supplied by the system.
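The relationship between the Url, Tld, Domain and DomainFull fields can be sketched as follows. This is an illustration of the field definitions above, not the CorpusExplorer implementation. The function name is hypothetical, and whether the corpus stores the TLD with a leading dot is not stated, so the sketch assumes it does not.

```python
from urllib.parse import urlsplit


def domain_fields(url):
    """Derive Tld, Domain and DomainFull from a full URL."""
    host = urlsplit(url).hostname or ""
    # Split at the last dot: everything before it is the domain
    # (sub-domains included), everything after it is the TLD.
    domain, _, tld = host.rpartition(".")
    return {"Url": url, "Tld": tld, "Domain": domain, "DomainFull": host}
```

For a URL such as `https://blog.example.de/post/1`, this yields `Tld` = `de`, `Domain` = `blog.example` and `DomainFull` = `blog.example.de`.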
Please note that the files are saved as *.cec6.gz. These are binary files of the CorpusExplorer (see above). These files ensure efficient archiving. You can use both CorpusExplorer and the ‘CEC6-Converter’ (available for Linux, MacOS and Windows - see: https://lindat.mff.cuni.cz/repository/xmlui/handle/11372/LRT-5705) to convert the data. The data can be exported in the following formats:
- CATMA v6
- CoNLL
- CSV
- CSV (only meta-data)
- DTA TCF-XML
- DWDS TEI-XML
- HTML
- IDS I5-XML
- IDS KorAP XML
- IMS Open Corpus Workbench
- JSON
- OPUS Corpus Collection XCES
- Plaintext
- SaltXML
- SlashA XML
- SketchEngine VERT
- SPEEDy/CODEX (JSON)
- TLV-XML
- TreeTagger
- TXM
- WebLicht
- XML
Please note that exporting considerably increases the storage space required. The ‘CorpusExplorerConsole’ (https://github.com/notesjor/CorpusExplorer.Terminal.Console - available for Linux, MacOS and Windows) also offers a simple solution for editing and analysis. If you have any questions, please contact the author.
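Each file in the item listing carries an MD5 checksum, so a downloaded archive can be checked before it is converted or exported. The following is a minimal sketch using only the Python standard library; the function names are hypothetical and not part of CorpusExplorer or the CEC6-Converter.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so files of ~200 MB need not fit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def matches_listing(path, expected_md5):
    """Return True if the file's MD5 equals the checksum from the file listing."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `matches_listing("2016_0015.cec6.gz", "774c03f4e25b9cc00671a47cd96904cb")` should return True for an intact download of that file, assuming it was saved under that name in the working directory.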
Legal information
The data was downloaded on 01.11.2024. Use, processing and distribution are subject to §60d UrhG (German copyright law), which authorises use for non-commercial purposes in research and teaching. LINDAT/CLARIN is responsible for long-term archiving in accordance with §69d para. 5 and ensures that only authorised persons can access the data. The data has been checked on a random basis to the best of our knowledge and belief. Should you nevertheless find legal violations (e.g. right to be forgotten, personal rights, etc.), please write an e-mail to the author (amc_report@jan-oliver-ruediger.de) with the following information: 1) why this content is undesirable (please outline only briefly) and 2) how the content can be identified, e.g. file name, URL or domain. The author will endeavour to identify and remove the content and to re-upload the modified data as a new version within two weeks. If you have any further questions, please contact CLARIN.
This item is Publicly Available.
Files in this item
- Name
- 2016_0015.cec6.gz
- Size
- 197.46 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 774c03f4e25b9cc00671a47cd96904cb

- Name
- 2016_0001.cec6.gz
- Size
- 197.17 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c9c7de0cbcc5904fcf1397fc2abdefda

- Name
- 2016_0035.cec6.gz
- Size
- 200.11 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 5352a1a25ec5cd3303b3a0b717e0445a

- Name
- 2016_0070.cec6.gz
- Size
- 197.49 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 209e83a4d856b3adc76b25f64060c20e

- Name
- 2016_0084.cec6.gz
- Size
- 201.97 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 348010c0de4df47a17870d5a6b7e3b82

- Name
- 2016_0068.cec6.gz
- Size
- 200.47 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 62c440ea7a47a35c7a82e010201ef69b

- Name
- 2016_0003.cec6.gz
- Size
- 197.89 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f1372a63652786f2aa909c2fe7013fba

- Name
- 2016_0031.cec6.gz
- Size
- 196.68 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- eacabac74c09e0a7a22b89562826c7c3

- Name
- 2016_0004.cec6.gz
- Size
- 199.48 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 6e32b4d101daa18b7129712bfa5ba170

- Name
- 2016_0083.cec6.gz
- Size
- 199.91 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 47289adda5be2bc82ec0b65850daa30e

- Name
- 2016_0023.cec6.gz
- Size
- 195.23 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 301ab519e9e2b0f432ac37f14d57c71f

- Name
- 2016_0005.cec6.gz
- Size
- 199.93 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 9b3fba1469be8c90f5ea0baedc17faa5

- Name
- 2016_0025.cec6.gz
- Size
- 197.78 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 299e7fa51f9dfa67b662c7ecaacb55e0

- Name
- 2016_0080.cec6.gz
- Size
- 198.03 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e8273ea98427f8bc0af4fc844a7518e4

- Name
- 2016_0047.cec6.gz
- Size
- 198.07 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f6f2987530046d75402449e91500009c

- Name
- 2016_0063.cec6.gz
- Size
- 200.55 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- b18964447f1c6e019912c28e246f0775

- Name
- 2016_0045.cec6.gz
- Size
- 199.37 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- ff7f3b5eb838764cd4169deb2533365e

- Name
- 2016_0066.cec6.gz
- Size
- 197.43 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 393ef48d8a1de106bc376874aa1f6abe

- Name
- 2016_0058.cec6.gz
- Size
- 200.13 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 0f9914224b65823597a7c12df50f1276

- Name
- 2016_0033.cec6.gz
- Size
- 203.93 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 92e247ef7ed79b6ed0e6c1176d63b0fb

- Name
- 2016_0052.cec6.gz
- Size
- 201.88 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 2ec531fbf279cbb4ba8e3f7b99228aa2

- Name
- 2016_0056.cec6.gz
- Size
- 195.89 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 55c939c4d8f0302748f84182efce8a2c

- Name
- 2016_0036.cec6.gz
- Size
- 202.06 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- d3013307ef9e26b097a9ff62c65a3cbb

- Name
- 2016_0092.cec6.gz
- Size
- 196.55 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f382909115ece48cf21ec5d1358f03c5

- Name
- 2016_0021.cec6.gz
- Size
- 199.06 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 874a36945bb44bd08c337b0641115a82

- Name
- 2016_0030.cec6.gz
- Size
- 199.64 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 0fd3fabd61a1bef3314e3c33140acdf9

- Name
- 2016_0017.cec6.gz
- Size
- 200 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 333c7261a0b725fb9ea22aaff2964986

- Name
- 2016_0040.cec6.gz
- Size
- 202.16 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 2d8f1c62480cf2d49e49abc9e984ca69

- Name
- 2016_0002.cec6.gz
- Size
- 202.2 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 1c228d91b91610e8c87c05107b925b1a

- Name
- 2016_0042.cec6.gz
- Size
- 196.99 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 929c965637d1f1eb7849fde5933c1c87

- Name
- 2016_0041.cec6.gz
- Size
- 198.35 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 364452a95f50cd3f4ba90ed6afed5297

- Name
- 2016_0008.cec6.gz
- Size
- 199.2 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e2dd4fe39dff6558cf4b4d1181e65373

- Name
- 2016_0038.cec6.gz
- Size
- 197.38 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- ff660d54cae09d45b2e668f6a07f56d1

- Name
- 2016_0010.cec6.gz
- Size
- 197.07 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- b737f39f2996d59fbb7b0274cb19997f

- Name
- 2016_0081.cec6.gz
- Size
- 196.83 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 0da38585384b8d2543198ea0e9591c4f

- Name
- 2016_0050.cec6.gz
- Size
- 201.42 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c1b43409ba4121831003462decafe778

- Name
- 2016_0085.cec6.gz
- Size
- 198.7 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 5801b3b10706c74cdffa2df38fba229c

- Name
- 2016_0029.cec6.gz
- Size
- 196.52 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 316e8b458e272e7e17bde10362ec51c2

- Name
- 2016_0049.cec6.gz
- Size
- 197.57 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- cb5cce8a1682d08d5dd119fe79dcc83a

- Name
- 2016_0016.cec6.gz
- Size
- 199.07 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 122960d5a4ba548e532ce98615c9224f

- Name
- 2016_0055.cec6.gz
- Size
- 197.28 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 7ccfe5ea8f08b1a2ffab58bcb0f844dc

- Name
- 2016_0057.cec6.gz
- Size
- 199.13 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a37e5dc2c70d035add4c7b47bcb0ffe8

- Name
- 2016_0065.cec6.gz
- Size
- 200.78 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- af3f2d7d4f35011340725c3834b03206

- Name
- 2016_0037.cec6.gz
- Size
- 199.23 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 7561f269ce090600a6f27c2385e843a9

- Name
- 2016_0046.cec6.gz
- Size
- 198.4 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 648f5a55b63dbf56f6fc49a6ca5f919e

- Name
- 2016_0088.cec6.gz
- Size
- 198.82 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 81412e2a60c6cb40c9001bcf11367898

- Name
- 2016_0059.cec6.gz
- Size
- 200.74 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 87c997c74943abb4f10d10f3485a1184

- Name
- 2016_0027.cec6.gz
- Size
- 200.13 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f643bce7da272f217fd818edff5557ca

- Name
- 2016_0014.cec6.gz
- Size
- 199.33 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 19fd5584b64da2c91a6346e1bffb6490

- Name
- 2016_0086.cec6.gz
- Size
- 198.52 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 3303b6bffbd91272f3496d2b44c9f494

- Name
- 2016_0082.cec6.gz
- Size
- 200.03 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 2056ab1a99dc2d6b6e71ba06400c1764

- Name
- 2016_0048.cec6.gz
- Size
- 199.54 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- d5dff396cf31501587151c7e102c3d77

- Name
- 2016_0039.cec6.gz
- Size
- 201.23 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 695c883a68268767f96ff4f6a6893d13

- Name
- 2016_0013.cec6.gz
- Size
- 197.55 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c61d22cd7e6fbfc869f28d8cd65e4d8c

- Name
- 2016_0061.cec6.gz
- Size
- 199.03 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- ec4db882f2d8652a43280c1757e0a3ef

- Name
- 2016_0074.cec6.gz
- Size
- 200.01 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c33522fb6ec5317849f810c32b654848

- Name
- 2016_0076.cec6.gz
- Size
- 196.93 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e2ff7657f113e7c99627b192ad649051

- Name
- 2016_0077.cec6.gz
- Size
- 202.17 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 1662c6d4ba641c59cb279faf891eaeed

- Name
- 2016_0012.cec6.gz
- Size
- 195.5 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 1b6d573da0131abc6668eddf232f48eb

- Name
- 2016_0071.cec6.gz
- Size
- 195.83 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 0a2534cc75646f6ab86aa9aa0b9b18fc

- Name
- 2016_0051.cec6.gz
- Size
- 198.73 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 283d85d12d71b0c2a67652c8b8f89053

- Name
- 2016_0079.cec6.gz
- Size
- 197.81 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 04b2a83ea4f9de87ca194fccba4d3f96

- Name
- 2016_0053.cec6.gz
- Size
- 199.85 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 98f1f1fbcdb12683dbed8224bf8863af

- Name
- 2016_0064.cec6.gz
- Size
- 200.3 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- eeb62042244964f74c9bee7ab716cf61

- Name
- 2016_0095.cec6.gz
- Size
- 116.74 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 77be5645134b3f86bda1f23367d213b6

- Name
- 2016_0019.cec6.gz
- Size
- 197.16 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a3fe80328be10711217c7b39fdb5b2d3

- Name
- 2016_0089.cec6.gz
- Size
- 198.65 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c5f921e7855fbc65f6b9a8d016719d9d

- Name
- 2016_0011.cec6.gz
- Size
- 197.22 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 3d6f3aaecb465d06499a6c2cbbeb360c

- Name
- 2016_0006.cec6.gz
- Size
- 200.2 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a568a461aade0ce48544911656e6588a

- Name
- 2016_0009.cec6.gz
- Size
- 199.27 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e99837b9cadb9aa253ee46f3999422bf

- Name
- 2016_0091.cec6.gz
- Size
- 200.04 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e65ab6ff6b302ab13c41796a66b39ec7

- Name
- 2016_0022.cec6.gz
- Size
- 197.32 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 0c431619c8d414ffc14c1c76e61cc994

- Name
- 2016_0020.cec6.gz
- Size
- 197.85 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- cfffff0f1b474681762f677140c3c6e7

- Name
- 2016_0060.cec6.gz
- Size
- 200.1 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 3c7619c5431bde3e0d5148be8433417d

- Name
- 2016_0062.cec6.gz
- Size
- 199.69 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 12ecd17cea7838d22c24fca8ec881f68

- Name
- 2016_0078.cec6.gz
- Size
- 199.55 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 329aa3304258cfa5521e214ecb41b5bc

- Name
- 2016_0026.cec6.gz
- Size
- 198.29 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- eabe8cd8207b6c086b53ffe4d9518c14

- Name
- 2016_0043.cec6.gz
- Size
- 200.03 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 8cad6fb1072f70766480dec32eac3a56

- Name
- 2016_0094.cec6.gz
- Size
- 198.76 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 455c44ac65a022c5df8e1756b3f2e22d

- Name
- 2016_0044.cec6.gz
- Size
- 197.75 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 6dba04e72a077ff5f0fedd189ecee1c3

- Name
- 2016_0072.cec6.gz
- Size
- 195.72 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 304e8b8610fc296dd461c974386f4ec0

- Name
- 2016_0073.cec6.gz
- Size
- 198.29 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 6fc4aa03d8bac6832c1d13e6931c704e

- Name
- 2016_0067.cec6.gz
- Size
- 198.85 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 85272570313f580b751b5c7f50de803f

- Name
- 2016_0024.cec6.gz
- Size
- 199.54 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 9444884dac5dd50719c94638b3b1ce04

- Name
- 2016_0007.cec6.gz
- Size
- 197.28 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 2e481252537d069b8a6b354e5724d58b

- Name
- 2016_0075.cec6.gz
- Size
- 201.07 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- d07258e384a5c1db7d7b9d6e9025518f

- Name
- 2016_0069.cec6.gz
- Size
- 201.81 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 164e426e631d7a3a72cd33c738e3ec72

- Name
- 2016_0034.cec6.gz
- Size
- 197.51 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 6b5d1ad7d55858748b1612c32b71f6d5

- Name
- 2016_0054.cec6.gz
- Size
- 200.84 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f3f204dcf36446ae6a88634437d76fcd

- Name
- 2016_0093.cec6.gz
- Size
- 197.13 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 575fd37cab2f3a7e690b28621d02e2b7

- Name
- 2016_0032.cec6.gz
- Size
- 198.98 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 64f4bc5d80e2b181abba409779b11418

- Name
- 2016_0090.cec6.gz
- Size
- 200.86 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 9b0549d62ae0df1a5702c273c16d9986

- Name
- 2016_0087.cec6.gz
- Size
- 197.69 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 8ac4ea75144a591a98ebb8052720bb4d

- Name
- 2016_0028.cec6.gz
- Size
- 199.52 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e9ef6e2c5207c84191d72b63255fa02b

- Name
- 2016_0018.cec6.gz
- Size
- 195.14 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f0e9378c12e641f058e8b1142826a4fc
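
Each file entry above lists an MD5 value, which can be used to check that a downloaded archive arrived intact. The following is a minimal sketch of such a check, not part of the corpus tooling: the function names `md5_of_file` and `verify_download` are hypothetical, and the local file path in the usage note is an assumption about where the download was saved.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that ~200 MB archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 matches the listed value (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, assuming `2016_0018.cec6.gz` was saved to the current directory, `verify_download("2016_0018.cec6.gz", "f0e9378c12e641f058e8b1142826a4fc")` should return `True` for an intact download. A mismatch usually indicates an incomplete or corrupted transfer, in which case the file should be downloaded again.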


