Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2014 – VERSION 1)
Please use the following text to cite this item or export to a predefined format:
Rüdiger, Jan Oliver, 2024,
Ancillary Monitor Corpus: Common Crawl - german web (YEAR 2014 – VERSION 1), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11372/LRT-5788.
Authors
Rüdiger, Jan Oliver
Item identifier
http://hdl.handle.net/11372/LRT-5788
Date issued
2024-11-13
Size
3,756,830,572 tokens, 6,124,710 articles
Language(s)
Description
The ‘Ancillary Monitor Corpus: Common Crawl - german web’ was designed to enable broad-based linguistic analysis of the German-language (visible) internet over time, with the aim of achieving comparability with DeReKo (the ‘German Reference Corpus’ of the Leibniz Institute for the German Language; 57 billion tokens as of DeReKo Release 2024-I). The corpus is split by year (here: 2014) and versioned (here: version 1). Across all years (2013-2024), version 1 comprises 97.45 billion tokens.
The corpus is based on the data dumps from CommonCrawl (https://commoncrawl.org/). CommonCrawl is a non-profit organisation that provides copies of the visible Internet free of charge for research purposes.
The CommonCrawl WET raw data was first filtered by TLD (top-level domain). Only pages whose domains end in one of the following TLDs were included: ‘.at; .bayern; .berlin; .ch; .cologne; .de; .gmbh; .hamburg; .koeln; .nrw; .ruhr; .saarland; .swiss; .tirol; .wien; .zuerich’. These are the exclusively German-language TLDs according to ICANN (https://data.iana.org/TLD/tlds-alpha-by-domain.txt) as of 1 June 2024; TLDs with a purely corporate reference (e.g. ‘.edeka; .bmw; .ford’) were excluded. In a second step, the language of each document (URL) was estimated with NTextCat (https://github.com/ivanakcheurov/ntextcat), using its CORE14 profile. Only documents/URLs for which German was the most likely language were processed further, so that foreign-language material (e.g. individual subpages in other languages) is excluded as far as possible. The third step filtered by manually defined selectors and removed 1:1 duplicates (within one year).
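The TLD filter and the duplicate removal described above could be sketched roughly as follows. This is an illustration only, not the CorpusExplorer pipeline: all function names are hypothetical, the language-identification and manual-selector steps are left out (NTextCat is a .NET library), and detecting 1:1 duplicates via a SHA-1 digest of the document text is an assumption.

```python
import hashlib
from urllib.parse import urlsplit

# TLD list copied from the description above.
ALLOWED_TLDS = {
    "at", "bayern", "berlin", "ch", "cologne", "de", "gmbh", "hamburg",
    "koeln", "nrw", "ruhr", "saarland", "swiss", "tirol", "wien", "zuerich",
}


def has_allowed_tld(url: str) -> bool:
    """Return True if the URL's host name ends in one of the listed TLDs."""
    host = urlsplit(url).hostname
    if not host or "." not in host:
        return False
    return host.rsplit(".", 1)[1].lower() in ALLOWED_TLDS


def drop_exact_duplicates(records):
    """Yield (url, text) pairs, skipping texts that were already seen."""
    seen = set()
    for url, text in records:
        # Assumption: two documents are 1:1 duplicates if their texts are identical.
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        yield url, text


def filter_records(records):
    """Apply the TLD filter first, then remove exact duplicates."""
    kept = ((url, text) for url, text in records if has_allowed_tld(url))
    return list(drop_exact_duplicates(kept))
```

In this sketch the first occurrence of a duplicated text is kept; which copy the original processing retained is not stated in the description.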
The filtering and subsequent processing were carried out using CorpusExplorer (http://hdl.handle.net/11234/1-2634) and our own (supplementary) scripts; TreeTagger (http://hdl.handle.net/11372/LRT-323) was used for automatic annotation. The corpus was processed on the HELIX HPC cluster. The author would like to thank the state of Baden-Württemberg and the German Research Foundation (DFG) for the opportunity to use the bwHPC/HELIX HPC cluster (funding code HPC cluster: INST 35/1597-1 FUGG).
Data content:
- Tokens and sentence boundaries
- Automatic lemma and POS annotation (using TreeTagger)
- Metadata:
  - GUID - Unique identifier of the document
  - YEAR - Year of capture (please use this field to slice the data)
  - Url - Full URL
  - Tld - Top-Level Domain
  - Domain - Domain without TLD (but with sub-domains if applicable)
  - DomainFull - Complete domain (incl. TLD)
  - Datum - (System Information): Date recorded by CorpusExplorer (the date of capture by CommonCrawl, not the date the document was created or modified).
  - Hash - (System Information): SHA1 hash of the CommonCrawl
  - Pfad - (System Information): Path on the cluster (raw data); supplied by the system.
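To illustrate how the metadata fields Tld, Domain and DomainFull relate to Url, here is a minimal sketch. The class and function names are hypothetical, and it assumes that the TLD is stored without a leading dot and that sub-domains remain part of Domain; the actual values in the corpus may be formatted differently.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class DomainFields:
    url: str          # Url: full URL
    tld: str          # Tld: top-level domain (assumed without leading dot)
    domain: str       # Domain: host name without the TLD, sub-domains kept
    domain_full: str  # DomainFull: complete host name including the TLD


def domain_fields(url: str) -> DomainFields:
    """Split the host name of a URL into the three domain-related fields."""
    host = urlsplit(url).hostname or ""
    # rpartition splits at the last dot: "www.example.de" -> ("www.example", ".", "de")
    domain, _, tld = host.rpartition(".")
    return DomainFields(url=url, tld=tld, domain=domain, domain_full=host)
```

For example, `domain_fields("https://www.example.de/pfad")` would give Tld `de`, Domain `www.example` and DomainFull `www.example.de` under these assumptions.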
Please note that the files are stored as *.cec6.gz. These are gzip-compressed binary files in a CorpusExplorer format (see above), which allows efficient archiving. To convert the data, you can use either CorpusExplorer or the ‘CEC6-Converter’ (available for Linux, macOS and Windows; see: https://lindat.mff.cuni.cz/repository/xmlui/handle/11372/LRT-5705). The data can be exported in the following formats:
- CATMA v6
- CoNLL
- CSV
- CSV (only meta-data)
- DTA TCF-XML
- DWDS TEI-XML
- HTML
- IDS I5-XML
- IDS KorAP XML
- IMS Open Corpus Workbench
- JSON
- OPUS Corpus Collection XCES
- Plaintext
- SaltXML
- SlashA XML
- SketchEngine VERT
- SPEEDy/CODEX (JSON)
- TLV-XML
- TreeTagger
- TXM
- WebLicht
- XML
Please note that an export increases the storage space requirement considerably. The ‘CorpusExplorerConsole’ (https://github.com/notesjor/CorpusExplorer.Terminal.Console, available for Linux, macOS and Windows) also offers a simple way to process and analyse the data. If you have any questions, please contact the author.
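Before converting a downloaded *.cec6.gz file, it can be worth checking it against the MD5 checksum given for it under ‘Files in this item’. The following is a minimal sketch using only the Python standard library; the function names are hypothetical and not part of CorpusExplorer or the CEC6-Converter.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Compare a file's MD5 digest with the value listed on the item page."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

A call such as `verify_download("2014_0018.cec6.gz", "07fef81135f82afb046d610276d66a12")` would return True only if the local file matches the listed checksum.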
Legal information
The data was downloaded on 1 November 2024. Its use, processing and distribution are subject to §60d UrhG (German copyright law), which authorises use for non-commercial purposes in research and teaching. LINDAT/CLARIN is responsible for long-term archiving in accordance with §69d para. 5 and ensures that only authorised persons can access the data. The data has been spot-checked to the best of our knowledge and belief. Should you nevertheless find legal violations (e.g. concerning the right to be forgotten or personal rights), please send an e-mail to the author (amc_report@jan-oliver-ruediger.de) stating 1) why the content is undesirable (a brief outline is sufficient) and 2) how the content can be identified, e.g. by file name, URL or domain. The author will endeavour to identify and remove the content and to re-upload the modified data as a new version within two weeks. If you have any further questions, please contact CLARIN.
Publisher
Collections
This item is Publicly Available
and licensed under:
Files in this item
- Name
- 2014_0018.cec6.gz
- Size
- 208.14 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 07fef81135f82afb046d610276d66a12

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0020.cec6.gz
- Size
- 207.08 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 59ed7b3f22c6c53ee8780da123f38370

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0021.cec6.gz
- Size
- 206.17 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 375159d308d898a6f4051340c802b672

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0003.cec6.gz
- Size
- 208.76 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a655f0944edb336aff86291e1ae67ac5

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0004.cec6.gz
- Size
- 203.53 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 9102d11b8e3107986c60a8d5279512be

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0002.cec6.gz
- Size
- 208.57 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e65bc718bfae2e8c1a7e8830ce6b263c

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0005.cec6.gz
- Size
- 207.98 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- d900538a03ec3a514cb88e8b253fe7f0

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0007.cec6.gz
- Size
- 207.34 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 172979363460894cd9afa8b0fc6cfa80

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0006.cec6.gz
- Size
- 208.35 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 3de99d2377fac5d22eba83820953975a

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0008.cec6.gz
- Size
- 207.36 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 3963c15658fd2f0413df87ec57d8d8b8

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0010.cec6.gz
- Size
- 205.8 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 83eed47b80d599f616dd579baa1db603

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0012.cec6.gz
- Size
- 209.32 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 675c1fe6540ec047291b93f7f4c664e4

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0011.cec6.gz
- Size
- 206.61 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- eb5efd0299c3b6ac173c9e48b6c83936

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0013.cec6.gz
- Size
- 207.69 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- d7fe039c7280cf188c705b9184b8a9ea

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0019.cec6.gz
- Size
- 206.55 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- b03c7337f0c16e4b1105e56e2fea10eb

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0023.cec6.gz
- Size
- 207.67 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 7f14cae1e0ddb8389d8ed37a04c9466d

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0022.cec6.gz
- Size
- 207.9 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 818a38fb1b3605a6803a25c9479ae58c

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0024.cec6.gz
- Size
- 210.24 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f0b5ed988e169501b0c86cae4aaee2f1

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0025.cec6.gz
- Size
- 202.68 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 71dc5df87c5ed95d2b01db02cd0e626e

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0001.cec6.gz
- Size
- 205.4 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a103dbed178e5458765f54154bf350a2

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0009.cec6.gz
- Size
- 207.84 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c3035019ed9a80d6fa3699bc53a1c795

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0014.cec6.gz
- Size
- 207.08 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 2b69abae4d2c4645d7d9a2c8bf62779d

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0016.cec6.gz
- Size
- 206.68 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c85b94a9fc4fb2b88b8e3d1a4940d823

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0015.cec6.gz
- Size
- 206.06 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 288998937cbd6026a7a0a9a1070c9998

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0017.cec6.gz
- Size
- 207.09 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- b2465f0dbdf09675c8097742917dba96

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0048.cec6.gz
- Size
- 206.27 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c834b127390c52032918c67593516388

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0049.cec6.gz
- Size
- 206.85 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 3ced41e7c78f5d82cc1633e5d0b264aa

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0047.cec6.gz
- Size
- 208.31 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 2ba814945c400129d32902b18d687371

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0050.cec6.gz
- Size
- 207.4 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 1b22e3dacaab6d89480215b5cd4ba3fc

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0046.cec6.gz
- Size
- 206.73 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- db57cbcf39dd5a369af7d6e19bd722f1

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0026.cec6.gz
- Size
- 208.27 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 28dab031bbfa650d4e9cf7a4555a3113

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0030.cec6.gz
- Size
- 207.01 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 4ba570a77517ae1d44dfe5676c660d9a

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0031.cec6.gz
- Size
- 206.29 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 5ee697cfe3b40f4bdd8eba67e033cdda

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0029.cec6.gz
- Size
- 208.77 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- ae700efc3b68c8ed709e2fa754eba37d

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0032.cec6.gz
- Size
- 207.81 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a3ead93e85fba841b220ff2187f6b58b

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0035.cec6.gz
- Size
- 206.37 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 1b4dccbc7411aeb9653e5744d2f43982

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0037.cec6.gz
- Size
- 205.82 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 53ad0b508e9e1039ee823b83bb2c3a48

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0036.cec6.gz
- Size
- 207.98 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 183a4eb57e131f1ba8c299415bf1b03d

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0038.cec6.gz
- Size
- 206.86 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 9e0a90960df3c99cc343602a3c8794c4

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0028.cec6.gz
- Size
- 208.32 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- af534b98bf73a751a28c64b74502dae3

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0027.cec6.gz
- Size
- 204.14 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 6af86751bf6c59e66abf4efc80f81e01

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0033.cec6.gz
- Size
- 206.36 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- daf6073532b3173d251fdfb1a2ac017b

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0034.cec6.gz
- Size
- 207.78 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 18201f43ee2e849c0feb2d5d3835c181

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0040.cec6.gz
- Size
- 208.25 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 6afa6c59a66811df5208468ef65b4afd

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0039.cec6.gz
- Size
- 205.07 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 2af67b191b7445098a48693df06b85a0

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0041.cec6.gz
- Size
- 208.18 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- d7983b0e23a3c7a1a5aaf215709e2919

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0042.cec6.gz
- Size
- 206.75 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 4a5ca51035be2afdf40a8168282d3c70

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0043.cec6.gz
- Size
- 207.37 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 7ff1f302df394e3109f77ea26b4df9d1

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0044.cec6.gz
- Size
- 206.19 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 50494a110a0127ecae02c81cc2264acb

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0045.cec6.gz
- Size
- 206.84 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- ddeacafdfdf4c6dd017f04f8a1bfe597

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0093.cec6.gz
- Size
- 204.51 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 4d8afc3e1e5ec26e3e024221c1e64d92

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0092.cec6.gz
- Size
- 208.95 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 0b212045acaeac4b361f35e476eaa0a7

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0094.cec6.gz
- Size
- 207.44 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- d3d6708b7c7c45f954a7a45a7fa087a5

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0095.cec6.gz
- Size
- 205.41 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 964443ce4c6353938904d5ea72547473

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0091.cec6.gz
- Size
- 207.04 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- b57627e38fcba21e56f62a6300f1cc26

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0053.cec6.gz
- Size
- 206.05 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 38dda98324a53b048e4662dd7b41fb9c

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0055.cec6.gz
- Size
- 208.29 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 4454426077b7c8a8fe20c629b4403184

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0056.cec6.gz
- Size
- 208.35 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e5f40fd89bc057824a2cbf766abe619e

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0057.cec6.gz
- Size
- 209.45 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 81efc831839be3ca8b1e8d6a2e87e907

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0058.cec6.gz
- Size
- 207.11 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e503eb69dafc741f1e573d1a9b2bfc30

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0059.cec6.gz
- Size
- 207.58 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- bd35ab29b4fc2ca7ccf4b95dffdf2be2

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0060.cec6.gz
- Size
- 205.59 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 1f02f394dd0cbbff75e5b2a1a5c6dd46

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0064.cec6.gz
- Size
- 203.72 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f116728909310fc6cad6bc900817e6cd

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0070.cec6.gz
- Size
- 206.63 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 4d40f130eaeebc7af652529588b01b04

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0071.cec6.gz
- Size
- 202.45 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e96223ec3f0feface5ce66198863adad

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0090.cec6.gz
- Size
- 206.57 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 89f2bc5f76359cbe96d4d289dbffeb85

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0054.cec6.gz
- Size
- 208.41 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a53f5f053dcae646e35eca25d70b6c54

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0061.cec6.gz
- Size
- 208.63 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 678c1dd02662a3495f63845be59f4bd9

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0062.cec6.gz
- Size
- 207.59 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 59bd6160a4872ddd1316ceee06d4d2ba

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0063.cec6.gz
- Size
- 206.72 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 7fbbfd9f53d04e01d2cddd8f4d44e642

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0072.cec6.gz
- Size
- 207.7 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 545708c440b7b508a5733c87592bb64a

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0073.cec6.gz
- Size
- 206.5 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- cc917aaecce4b0ee0a84cdb6fa2f177d

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0075.cec6.gz
- Size
- 206.82 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- faa1a3ab580f1d42f3a09edc5cc573b3

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0076.cec6.gz
- Size
- 205.55 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f84b746131712efc62a715cadb3a7a72

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0077.cec6.gz
- Size
- 206.47 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 5abf171ce394a5267b135d434779eef9

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0078.cec6.gz
- Size
- 206.26 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a0e1e626f478d076a58b735853393776

The file preview has not been generated yet. Please try again later or contact the system administrator lindat-help@ufal.mff.cuni.cz
- Name
- 2014_0079.cec6.gz
- Size
- 207.23 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 5d154193255779864e656af05d609fbb

- Name
- 2014_0080.cec6.gz
- Size
- 208.04 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 95db5116e7da02e743f590016dd8e207

- Name
- 2014_0081.cec6.gz
- Size
- 206.64 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a1c1db6a3a45a145bc49205d388db3e1

- Name
- 2014_0096.cec6.gz
- Size
- 207.67 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- c5904673973dead4b6244b70493a8aef

- Name
- 2014_0099.cec6.gz
- Size
- 132.67 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a4c75d576bd2f65b518c0241cb9c8a11

- Name
- 2014_0097.cec6.gz
- Size
- 208.41 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 5a00b8b5b89def45bdf1c70f0f96bd6c

- Name
- 2014_0098.cec6.gz
- Size
- 206.88 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 5bf13a4354153790f2b126c27cad50b3

- Name
- 2014_0051.cec6.gz
- Size
- 206.14 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e353c27a712f8f5df60dc141e49950f5

- Name
- 2014_0052.cec6.gz
- Size
- 207.41 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 7d18a1f02042ed82a00885227a42be0f

- Name
- 2014_0065.cec6.gz
- Size
- 206.92 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 7804d6993510a1a31304690c1d20c06a

- Name
- 2014_0066.cec6.gz
- Size
- 201.31 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- dd4141676412fc50c8a529dc93c139dc

- Name
- 2014_0067.cec6.gz
- Size
- 207.56 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 4e21e3322a0af98406bbeb0e786f8de3

- Name
- 2014_0068.cec6.gz
- Size
- 205.35 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f61eef97bfdfe2e09967bd0357e2dc75

- Name
- 2014_0069.cec6.gz
- Size
- 207.8 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 99dd135489104fa758872858309dc961

- Name
- 2014_0074.cec6.gz
- Size
- 208.76 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 9f4c585b423e3f85b6f45df4998c7170

- Name
- 2014_0082.cec6.gz
- Size
- 204.87 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a8178a060fc2e78182f759e7ec2232a5

- Name
- 2014_0084.cec6.gz
- Size
- 206.46 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- aae5c8d300886b6ea22a2ddf423d873a

- Name
- 2014_0085.cec6.gz
- Size
- 205.28 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- e1bc6c6cd474d2fd37baedac6f3c8d10

- Name
- 2014_0086.cec6.gz
- Size
- 205.69 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- f1bb23b6c24235fb75ba082470fc042f

- Name
- 2014_0087.cec6.gz
- Size
- 208.15 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- fcc2da9819ba4738d84c53aaff2ef094

- Name
- 2014_0089.cec6.gz
- Size
- 206.67 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- d7ac246a91b7e376f0a3944b839fae90

- Name
- 2014_0083.cec6.gz
- Size
- 207.13 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- 2a093a7b5b6ff44c193854ddfb0b6922

- Name
- 2014_0088.cec6.gz
- Size
- 210.16 MB
- Format
- application/x-gzip
- Description
- gzip Archive
- MD5
- a930de6b21dcecc873bf0c76b025dc21
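
Each file in the listing above carries an MD5 value, which can be used to confirm that a download arrived intact. The following is a minimal sketch of such a check, not part of the corpus tooling: the helper names (md5_of_file, verify_download, check_directory) and the EXPECTED_MD5 table are illustrative assumptions, and only two of the listed checksums are copied in as examples. MD5 is suitable here for detecting transfer errors only, not for security purposes.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so ~200 MB archives never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Compare a file's MD5 with the value published in the listing."""
    return md5_of_file(path) == expected_md5.strip().lower()


# Hypothetical lookup table: two checksums copied from the listing above.
EXPECTED_MD5 = {
    "2014_0063.cec6.gz": "7fbbfd9f53d04e01d2cddd8f4d44e642",
    "2014_0088.cec6.gz": "a930de6b21dcecc873bf0c76b025dc21",
}


def check_directory(directory, expected=EXPECTED_MD5):
    """Map each expected file name to True (present and matching) or False."""
    results = {}
    for name, checksum in expected.items():
        candidate = Path(directory) / name
        results[name] = candidate.is_file() and verify_download(candidate, checksum)
    return results
```

Calling check_directory on the folder that holds the downloaded archives returns a dictionary of file names and booleans; a False entry means the file is missing or its checksum differs, in which case downloading it again is the usual remedy.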


