Files in this item

Total size of all files in item: 118.53 MB
This item is publicly available and licensed under the Mozilla Public License 2.0.
Name: sumeczech-1.0-update-230225.zip
Size: 59.27 MB
Format: application/zip
Description: Updated release of the SumeCzech download script, including the original RougeRAW evaluation metric. The download script was modified to use the updated CommonCrawl download URL and to support Python 3.10 and Python 3.11.
MD5: 54e2c8215d8a5a4bc1733823b8e270f3
File preview:
    • downloader.py (5 kB)
    • LICENSE (16 kB)
    • README.md (4 kB)
    • sumeczech-1.0-index.jsonl.xz (59 MB)
    • rouge_raw.py (4 kB)
    • downloader_extractor.py (5 kB)
    • requirements.txt (55 B)
    • downloader_extractor_utils.py (12 kB)
Name: sumeczech-1.0-obsolete-180213.zip
Size: 59.27 MB
Format: application/zip
Description: SumeCzech dataset and RougeRAW evaluation metric. NOTE: the download script in this archive **no longer works** because it uses an obsolete CommonCrawl download URL; it also does not support Python 3.10+.
MD5: 832119c1236a5007b66d4b07676913b8
File preview:
    • downloader.py (4 kB)
    • LICENSE (16 kB)
    • README.md (4 kB)
    • sumeczech-1.0-index.jsonl.xz (59 MB)
    • rouge_raw.py (4 kB)
    • downloader_extractor.py (5 kB)
    • requirements.txt (55 B)
    • downloader_extractor_utils.py (12 kB)
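
The MD5 values listed for the two archives can be used to check a downloaded copy before unpacking it. The following is a minimal sketch using only Python's standard `hashlib` module. The helper names `md5_of_file` and `verify_archive` are hypothetical and are not part of the SumeCzech scripts, and the sketch assumes the archive is saved locally under its published file name.

```python
import hashlib
from pathlib import Path

# MD5 checksums as published in the file listing for this item.
EXPECTED_MD5 = {
    "sumeczech-1.0-update-230225.zip": "54e2c8215d8a5a4bc1733823b8e270f3",
    "sumeczech-1.0-obsolete-180213.zip": "832119c1236a5007b66d4b07676913b8",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Hypothetical helper: chunked reading avoids loading the whole
    archive (about 59 MB) into memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path):
    """Return True if the file's MD5 matches the published value.

    Assumes the local file keeps its published name; a file with any
    other name raises KeyError.
    """
    expected = EXPECTED_MD5[Path(path).name]
    return md5_of_file(path) == expected
```

With the archive in the current directory, `verify_archive("sumeczech-1.0-update-230225.zip")` would return `True` for an intact download. MD5 is suitable here only as an integrity check against transfer errors, not as protection against deliberate tampering.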