Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2018 – VERSION 1)
Please use the following text to cite this item or export to a predefined format:
Rüdiger, Jan Oliver, 2024,
Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2018 – VERSION 1), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11372/LRT-5792.
Authors
Item identifier
Date issued
2024-11-12
Size
3841824 texts,
3803258610 tokens
Language(s)
Description
The ‘Ancillary Monitor Corpus: Common Crawl - german web’ was designed to enable broad-based linguistic analysis of the German-language (visible) internet over time, and to be comparable with DeReKo (the ‘German Reference Corpus’ of the Leibniz Institute for the German Language; 57 billion tokens as of DeReKo Release 2024-I). The corpus is split by year (here: 2018) and versioned (here: version 1). Across all years (2013-2024), version 1 comprises 97.45 billion tokens.
The corpus is based on the data dumps from CommonCrawl (https://commoncrawl.org/). CommonCrawl is a non-profit organisation that provides copies of the visible Internet free of charge for research purposes.
The CommonCrawl WET raw data was first filtered by TLD (top-level domain). Only pages under the following TLDs were taken into account: ‘.at; .bayern; .berlin; .ch; .cologne; .de; .gmbh; .hamburg; .koeln; .nrw; .ruhr; .saarland; .swiss; .tirol; .wien; .zuerich’. These are the exclusively German-language TLDs according to ICANN (https://data.iana.org/TLD/tlds-alpha-by-domain.txt) as of 1 June 2024; TLDs that refer purely to a company (e.g. ‘.edeka; .bmw; .ford’) were excluded. In the second step, the language of each document (URL) was estimated with NTextCat (https://github.com/ivanakcheurov/ntextcat), using its CORE14 profile. Only documents/URLs for which German was the most likely language were processed further, for example to exclude foreign-language material such as individual subpages. The third step applied manually defined selectors as filters and removed 1:1 duplicates (within one year).
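The first and third filtering steps can be illustrated with a minimal Python sketch. It is not the original pipeline, which used CorpusExplorer and supplementary scripts. The function names, the `(url, text)` input shape and the use of a SHA-1 digest of the text to detect exact duplicates are assumptions made for illustration. The language identification step (NTextCat) is left out.

```python
import hashlib
from urllib.parse import urlsplit

# TLD list as given in the description above.
GERMAN_TLDS = {
    "at", "bayern", "berlin", "ch", "cologne", "de", "gmbh", "hamburg",
    "koeln", "nrw", "ruhr", "saarland", "swiss", "tirol", "wien", "zuerich",
}


def has_german_tld(url: str) -> bool:
    """Return True if the URL's host name ends in one of the listed TLDs."""
    host = urlsplit(url).hostname or ""
    return host.rsplit(".", 1)[-1].lower() in GERMAN_TLDS


def filter_documents(docs):
    """Yield (url, text) pairs that pass the TLD filter and are not exact duplicates.

    `docs` is a hypothetical iterable of (url, text) pairs for one year.
    Language identification is intentionally omitted in this sketch.
    """
    seen = set()
    for url, text in docs:
        if not has_german_tld(url):
            continue
        # Assumption: an exact (1:1) duplicate is a document with identical text.
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        yield url, text
```

Keeping only a digest per document, rather than the full text, keeps the memory needed for duplicate detection small even for millions of documents.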
Filtering and subsequent processing were carried out using CorpusExplorer (http://hdl.handle.net/11234/1-2634) and our own (supplementary) scripts; TreeTagger (http://hdl.handle.net/11372/LRT-323) was used for automatic annotation. The corpus was processed on the HELIX HPC cluster. The author would like to thank the state of Baden-Württemberg and the German Research Foundation (DFG) for the opportunity to use the bwHPC/HELIX HPC cluster (funding code of the HPC cluster: INST 35/1597-1 FUGG).
Data content:
- Tokens and sentence boundaries
- Automatic lemma and POS annotation (using TreeTagger)
- Metadata:
- GUID - Unique identifier of the document
- YEAR - Year of capture (please use this information for data slices)
- Url - Full URL
- Tld - Top-Level Domain
- Domain - Domain without TLD (but with sub-domains if applicable)
- DomainFull - Complete domain (incl. TLD)
- Datum - (System Information): Date of the CorpusExplorer (date of capture by CommonCrawl - not date of creation/modification of the document).
- Hash - (System Information): SHA1 hash of the CommonCrawl
- Pfad - (System Information): Path of the cluster (raw data) - is supplied by the system.
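How the three domain-related fields relate to the full URL can be sketched as follows. This is an illustrative reading of the field descriptions above, not the code used to build the corpus. The class name `DomainFields` and the function `split_url` are hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class DomainFields:
    url: str          # Url: full URL
    tld: str          # Tld: top-level domain
    domain: str       # Domain: host name without the TLD (sub-domains kept)
    domain_full: str  # DomainFull: complete host name including the TLD


def split_url(url: str) -> DomainFields:
    """Derive the domain-related fields from a full URL (illustrative only)."""
    host = urlsplit(url).hostname or ""
    # Split at the last dot: everything before it is the domain, the rest the TLD.
    domain, _, tld = host.rpartition(".")
    return DomainFields(url=url, tld=tld, domain=domain, domain_full=host)
```

Under this reading, `split_url("https://www.example.de/path")` gives `tld="de"`, `domain="www.example"` and `domain_full="www.example.de"`.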
Please note that the files are saved as *.cec6.gz. These are binary CorpusExplorer files (see above), a format that ensures efficient archiving. You can use either CorpusExplorer or the ‘CEC6-Converter’ (available for Linux, macOS and Windows; see: https://lindat.mff.cuni.cz/repository/xmlui/handle/11372/LRT-5705) to convert the data. The data can be exported in the following formats:
- CATMA v6
- CoNLL
- CSV
- CSV (only meta-data)
- DTA TCF-XML
- DWDS TEI-XML
- HTML
- IDS I5-XML
- IDS KorAP XML
- IMS Open Corpus Workbench
- JSON
- OPUS Corpus Collection XCES
- Plaintext
- SaltXML
- SlashA XML
- SketchEngine VERT
- SPEEDy/CODEX (JSON)
- TLV-XML
- TreeTagger
- TXM
- WebLicht
- XML
Please note that exporting considerably increases the storage space required. The ‘CorpusExplorerConsole’ (https://github.com/notesjor/CorpusExplorer.Terminal.Console, available for Linux, macOS and Windows) also offers a simple way to edit and analyse the data. If you have any questions, please contact the author.
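Each *.cec6.gz file is listed under ‘Files in this item’ together with an MD5 checksum, so a download can be checked before conversion. The following Python sketch shows one way to do this with the standard library. The function names and the local file path are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 digest matches the published value."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `verify_download("2018_0011.cec6.gz", "d7efbbac61f4eafed6f1daf406dadab4")` compares a local copy of that file (the path is hypothetical) against the checksum given in the file list. Reading in chunks avoids loading a roughly 200 MB archive into memory at once.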
Legal information
The data was downloaded on 01.11.2024. Use, processing and distribution are subject to §60d UrhG (German copyright law), which authorises use for non-commercial purposes in research and teaching. LINDAT/CLARIN is responsible for long-term archiving in accordance with §69d para. 5 and ensures that only authorised persons can access the data. The data has been spot-checked to the best of our knowledge and belief. Should you nevertheless find legal violations (e.g. right to be forgotten, personal rights, etc.), please write an e-mail to the author (amc_report@jan-oliver-ruediger.de) with the following information: 1) why this content is undesirable (please outline only briefly) and 2) how the content can be identified, e.g. file name, URL or domain. The author will endeavour to identify and remove the content and to re-upload the modified data as a new version within two weeks. If you have any further questions, please contact CLARIN.
Publisher
Collections
This item is Publicly Available
and licensed under:
Files in this item
- Name
- 2018_0011.cec6.gz
- Size
- 202.66 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- d7efbbac61f4eafed6f1daf406dadab4

- Name
- 2018_0016.cec6.gz
- Size
- 205.91 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f82bbfc33ec93a347314dbf0ee871167

- Name
- 2018_0013.cec6.gz
- Size
- 203.85 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 9a5451c653b98a8b2d086af8edc0d62b

- Name
- 2018_0017.cec6.gz
- Size
- 203.66 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 9ab3bc1ce29c82e230124d6044a728a4

- Name
- 2018_0018.cec6.gz
- Size
- 206.71 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 19c84f97467e1f89f5d1c9c4c0ed0ff8

- Name
- 2018_0012.cec6.gz
- Size
- 205.83 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 18325c2351a47a3ef0fd0f7fd7e75ee1

- Name
- 2018_0014.cec6.gz
- Size
- 200.1 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 2eda3b9564b89b992b161d14387292f1

- Name
- 2018_0020.cec6.gz
- Size
- 199.71 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 93e829b18f9fd5c02b8383215ed98a81

- Name
- 2018_0023.cec6.gz
- Size
- 198.78 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 306e244addd23dd4ebe08fcc407238a7

- Name
- 2018_0024.cec6.gz
- Size
- 205.38 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- ae8a0759ad85be678660e21dc515217f

- Name
- 2018_0027.cec6.gz
- Size
- 200.56 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 1bc0cee4857f070f72aebd2e1fb18298

- Name
- 2018_0026.cec6.gz
- Size
- 202.95 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- b4cda1a56c0fe160a8199b6a49ac9748

- Name
- 2018_0025.cec6.gz
- Size
- 202.87 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 12824019e88207bd2d7dbb2a2c8f59bb

- Name
- 2018_0028.cec6.gz
- Size
- 204.19 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a73dd0d0347e460b8742a0987356ec89

- Name
- 2018_0031.cec6.gz
- Size
- 199.06 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 3ffa50c08c3f086dff69e153b00025cc

- Name
- 2018_0034.cec6.gz
- Size
- 202.91 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 0f1239fbb85f19a26b5300d36146757d

- Name
- 2018_0032.cec6.gz
- Size
- 207.09 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- dacbe484ed1e86f8f912b2b0bb6a226b

- Name
- 2018_0015.cec6.gz
- Size
- 200.69 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 558ea09d17840ec2b6fdccc983fcc4cd

- Name
- 2018_0021.cec6.gz
- Size
- 206.62 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- cd4c90267a303d7556ee6dd488e27f07

- Name
- 2018_0022.cec6.gz
- Size
- 201.61 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 0a2e178e8eac06a2f06d6b0e586cf706

- Name
- 2018_0019.cec6.gz
- Size
- 205.83 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 78e29d1a98d4270aa14b2059504e5eef

- Name
- 2018_0029.cec6.gz
- Size
- 205.54 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 409cb37e4177ad3ed2c3ae971d00b60f

- Name
- 2018_0030.cec6.gz
- Size
- 204.73 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 0b64eda8963a5a708671fc15e4b41b5e

- Name
- 2018_0035.cec6.gz
- Size
- 203.42 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c1bc7a1037c2fba5a24b94b7c2801436

- Name
- 2018_0033.cec6.gz
- Size
- 199.96 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 4f79c22174f5dab3f0314f0aa918072b

- Name
- 2018_0037.cec6.gz
- Size
- 203.29 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- bf55b677031f629cd901c6d1e7917a2b

- Name
- 2018_0036.cec6.gz
- Size
- 199.37 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 402831852671e3d50e277aae84a38c72

- Name
- 2018_0039.cec6.gz
- Size
- 203.86 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 22339ac2b381f1f04632f131f7a00580

- Name
- 2018_0042.cec6.gz
- Size
- 203.79 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 44affa5a433f5bb464f6bcd01a672aa1

- Name
- 2018_0038.cec6.gz
- Size
- 202.74 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 97cd7ccacb904c83684051c09f77b839

- Name
- 2018_0040.cec6.gz
- Size
- 206.69 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 032134e04a26060e02ccd6795bc41a4f

- Name
- 2018_0044.cec6.gz
- Size
- 201.99 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- dd93395d61e750e1356d60b5860c2161

- Name
- 2018_0045.cec6.gz
- Size
- 203.62 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 58633e1913f2e52eb85916ff8a5e9bfa

- Name
- 2018_0046.cec6.gz
- Size
- 202.63 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 54229175c9446ba49b1293b7d599232a

- Name
- 2018_0049.cec6.gz
- Size
- 200.43 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- d0686d78797f1021dbb39c85d4733d74

- Name
- 2018_0050.cec6.gz
- Size
- 205.2 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 396cbd7bb9aeb29c4951b530f1e0e0d1

- Name
- 2018_0048.cec6.gz
- Size
- 206.22 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- bf7126e8437abfbc342f249203fd19e4

- Name
- 2018_0051.cec6.gz
- Size
- 202.58 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 26696898d87dd0eb97710d44701ff0cd

- Name
- 2018_0055.cec6.gz
- Size
- 202.12 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f4683609cc7c85841ac02ec3b3d2a777

- Name
- 2018_0053.cec6.gz
- Size
- 204.24 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 93c6e74d510bccaa660923229d1450cd

- Name
- 2018_0056.cec6.gz
- Size
- 203.33 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 5a62200338c5010bf3c435e5f05bdd90

- Name
- 2018_0059.cec6.gz
- Size
- 201.4 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 9123854e65cba41a40f5951074ef9ca8

- Name
- 2018_0041.cec6.gz
- Size
- 201.91 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c9a61d63305285047630e33a58db9e72

- Name
- 2018_0043.cec6.gz
- Size
- 203.53 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f30e34225a9274b790011072f0a6cedd

- Name
- 2018_0047.cec6.gz
- Size
- 202.85 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 12318d44bd2c2908f1d6edbf2627f309

- Name
- 2018_0052.cec6.gz
- Size
- 201.73 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e4071a4a3759069cbd41c4ee16e26a65

- Name
- 2018_0054.cec6.gz
- Size
- 200.67 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 2a8a2ac9c15400de19a2d3042e15a784

- Name
- 2018_0057.cec6.gz
- Size
- 203.92 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 7a8fac27813aef37f20aa874d16c88b2

- Name
- 2018_0058.cec6.gz
- Size
- 202.47 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 146c9e2591743fad3a57fd2aa13ab2e3

- Name
- 2018_0060.cec6.gz
- Size
- 203.4 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e7b5f6dbaea0b2581b322c56efd6976d

- Name
- 2018_0061.cec6.gz
- Size
- 200.64 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 914ed9117ba5e0a76affb9d013520b74

- Name
- 2018_0063.cec6.gz
- Size
- 203.48 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 171db1ed541bf9c04f713ecb6c919188

- Name
- 2018_0062.cec6.gz
- Size
- 203.58 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e429a12d16252f1fa8880651c28623fb

- Name
- 2018_0064.cec6.gz
- Size
- 205.18 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 85a43378d87ca36d4473b9ac41d28d15

- Name
- 2018_0065.cec6.gz
- Size
- 202.86 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 677a6be4877700ee0d6821e94f143b9a

- Name
- 2018_0066.cec6.gz
- Size
- 201.14 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c48012593c70c91aa116b428beded38c

- Name
- 2018_0068.cec6.gz
- Size
- 197.99 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 2d4c6c5885a93307c98817472dae28f0

- Name
- 2018_0070.cec6.gz
- Size
- 197.21 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- aef9824a290e2c9441715c4cc447cbf3

- Name
- 2018_0069.cec6.gz
- Size
- 205 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 291e4fa4f6d667cf1e29eb7c6854f723

- Name
- 2018_0072.cec6.gz
- Size
- 202.57 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 88cee45ea68deaa2b0c42b763c5ffe77

- Name
- 2018_0073.cec6.gz
- Size
- 200.16 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 085412ff6d738ff81fbd2cda5e8d1490

- Name
- 2018_0074.cec6.gz
- Size
- 204.91 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- d2f2dc4d4fed22ec9b499483781a4b45

- Name
- 2018_0075.cec6.gz
- Size
- 197.44 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 1f524cdca04c8c16d4cc66e67b07c4fa

- Name
- 2018_0077.cec6.gz
- Size
- 203.59 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c76678bc1e532a309060f04f4514cfa0

- Name
- 2018_0082.cec6.gz
- Size
- 196.19 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 6ebf41818ef4173456796b39d4332e61

- Name
- 2018_0079.cec6.gz
- Size
- 203.67 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- b4619b64a5a17c0a556771ec72e663ae

- Name
- 2018_0090.cec6.gz
- Size
- 202.48 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 08d0696d6c1831979ed4faceebc62f0a

- Name
- 2018_0094.cec6.gz
- Size
- 202.2 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 1a3844dad8a7149c3bb2c5e7b8a34938

- Name
- 2018_0067.cec6.gz
- Size
- 201.24 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- b34a6797a38153138596d2115cbd02ff

- Name
- 2018_0071.cec6.gz
- Size
- 200.44 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- b833b357c03d21557374d7d936e4d0f1

- Name
- 2018_0076.cec6.gz
- Size
- 202.43 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 8985ca8fb84a6cb544e08fb97df522f8

- Name
- 2018_0078.cec6.gz
- Size
- 201.9 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- b6b65c58ee5a6c803245a6ad62c44377

- Name
- 2018_0080.cec6.gz
- Size
- 198.33 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 6b211a9d013ad30363f9b0e40d7a9739

- Name
- 2018_0085.cec6.gz
- Size
- 203.23 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 1715fd3641b7cf310998723ac2250ceb

- Name
- 2018_0081.cec6.gz
- Size
- 204.12 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 66658c92ba5cc4683170d2c6494ae91c

- Name
- 2018_0092.cec6.gz
- Size
- 198.56 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c3332e0422eba057006a922291302ea2

- Name
- 2018_0091.cec6.gz
- Size
- 203.74 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- acffa164ee5793839740fd5086970d94

- Name
- 2018_0089.cec6.gz
- Size
- 196.59 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 817f1136f35bb962c2a258bcfa17e456

- Name
- 2018_0093.cec6.gz
- Size
- 198.97 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 78cc0013c1229b8d21cfa096cfc770a3

- Name
- 2018_0095.cec6.gz
- Size
- 199.32 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- fde03858eaf1c18bc70bdcb496dd9772

- Name
- 2018_0097.cec6.gz
- Size
- 64.9 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- b22188b33326fe805e9801b7f344b855

- Name
- 2018_0096.cec6.gz
- Size
- 197.87 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- b705c71da71f985a7dee909c8f34792a

- Name
- 2018_0002.cec6.gz
- Size
- 199.99 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a0770324662223b7f2f3a2347993a6c6

- Name
- 2018_0003.cec6.gz
- Size
- 200.97 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a222441c19ea26094cff9559f8229b6c

- Name
- 2018_0001.cec6.gz
- Size
- 201.4 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 99ba45e1963237e0d2f61a1d834e1648

- Name
- 2018_0005.cec6.gz
- Size
- 200.14 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- ecb015d41d969c80df34c4279d452952

- Name
- 2018_0004.cec6.gz
- Size
- 202.35 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- ebf64978b6d774fe564f8a2be2b3b7a3

- Name
- 2018_0006.cec6.gz
- Size
- 204.98 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 3b7efbdf7340fa2917ff8042071e572a

- Name
- 2018_0007.cec6.gz
- Size
- 202.95 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 1fb421660b3238d14c1a2f96c71f0ead

- Name
- 2018_0008.cec6.gz
- Size
- 203.4 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 09413dcab68ad4ebca2d8337bfe23d91

- Name
- 2018_0009.cec6.gz
- Size
- 202.22 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a5db730391db89e832c2d0c48c00c26e

- Name
- 2018_0010.cec6.gz
- Size
- 201.36 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 2d3dad886648726a48205315c2346cbd

- Name
- 2018_0088.cec6.gz
- Size
- 197.87 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 0a2c3311ce6039332d961f03c77ac87f

- Name
- 2018_0087.cec6.gz
- Size
- 201.79 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 59b4500a5e298bef823433a01d5373f2

- Name
- 2018_0084.cec6.gz
- Size
- 199.74 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 6448856a192ab98e77b029a13b752617

- Name
- 2018_0086.cec6.gz
- Size
- 200.68 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 93e921dfe8507a354a04117edb7b1341

- Name
- 2018_0083.cec6.gz
- Size
- 206.72 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 207568b35c734f372ea6be20509d0522
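
The MD5 values listed with each file can be used to check that a download arrived intact. The following is a minimal sketch in Python, not part of the original record: the function names `md5_of_file` and `verify_download` are hypothetical, and it assumes the archive has been saved locally under the name shown in the listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Compare a file's MD5 digest with the value published in the listing."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Checksums copied from the listing above (two entries shown as examples).
EXPECTED_MD5 = {
    "2018_0083.cec6.gz": "207568b35c734f372ea6be20509d0522",
    "2018_0097.cec6.gz": "b22188b33326fe805e9801b7f344b855",
}
```

For example, `verify_download("2018_0083.cec6.gz", EXPECTED_MD5["2018_0083.cec6.gz"])` should return `True` for an undamaged copy, assuming the file sits in the current working directory. MD5 is adequate here for detecting transfer errors, but it is not a safeguard against deliberate tampering.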


