Natural Language Inference (NLI) datasets in Czech
Please use the following text to cite this item:
Javorský, Dávid and Popel, Martin, 2023,
Natural Language Inference (NLI) datasets in Czech, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-5227.
Authors
Javorský, Dávid; Popel, Martin
Item identifier
http://hdl.handle.net/11234/1-5227
Date issued
2023-10-01
Size
1047485 sentences
Description
The goal of the Natural Language Inference (NLI) task is to determine whether a "hypothesis" is true (entailment), false (contradiction), or undetermined (neutral) given a "premise".
The repository contains three NLI datasets: snli (https://huggingface.co/datasets/snli), multi_nli (https://huggingface.co/datasets/multi_nli), and qnli (https://huggingface.co/datasets/glue/viewer/qnli/train). Each dataset is provided in two versions: the original English version and a Czech translation that we produced with CUBBITT, the Charles University Block-Backtranslation-Improved Transformer Translation model (https://lindat.mff.cuni.cz/services/translation/). The record also includes target labels for the Czech datasets; note, however, that some labels may no longer be correct for the Czech translation because of errors made by the translation model.
The licence of this record (CC BY-SA) holds for the translated part of the dataset. For the original English datasets, follow their respective licence descriptions.
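To make the three labels from the task description concrete, here is a minimal Python sketch. The class names (`Label`, `NLIExample`) and the example sentences are invented for illustration. They are not taken from the datasets, and this record does not describe the file layout inside the archives, so the sketch should not be read as the datasets' actual schema.

```python
from dataclasses import dataclass
from enum import Enum


class Label(Enum):
    """The three NLI outcomes named in the task description."""
    ENTAILMENT = "entailment"        # hypothesis is true given the premise
    CONTRADICTION = "contradiction"  # hypothesis is false given the premise
    NEUTRAL = "neutral"              # truth of the hypothesis is undetermined


@dataclass(frozen=True)
class NLIExample:
    """Hypothetical container for one premise/hypothesis pair."""
    premise: str
    hypothesis: str
    label: Label


# Invented examples sharing one premise, one per label.
PREMISE = "A man is playing a guitar on stage."
EXAMPLES = [
    NLIExample(PREMISE, "A person is making music.", Label.ENTAILMENT),
    NLIExample(PREMISE, "The stage is empty.", Label.CONTRADICTION),
    NLIExample(PREMISE, "The man is a famous musician.", Label.NEUTRAL),
]

if __name__ == "__main__":
    for example in EXAMPLES:
        print(f"{example.label.value}: {example.hypothesis}")
```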
Files in this item
- Name: multi_nli.zip
- Size: 58.34 MB
- Format: application/zip
- Description: Zip
- MD5: bf4c2805fdb2579c14d71fb4334ed163

- Name: qnli.zip
- Size: 21.33 MB
- Format: application/zip
- Description: Zip
- MD5: 0611ed45d3bc9818d21662b415986c20

- Name: snli.zip
- Size: 21.22 MB
- Format: application/zip
- Description: Zip
- MD5: 3a46a1a2083f2ba6014c64df8660ee6a
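
The MD5 values listed for the three archives can be used to check that a download is complete. The following is a minimal sketch, assuming the archives were saved under their listed names in a single directory. The function names `md5_of_file` and `verify_downloads` are hypothetical helpers and not part of any tool shipped with this record.

```python
import hashlib
from pathlib import Path

# MD5 values copied from the file listing in this record.
EXPECTED_MD5 = {
    "multi_nli.zip": "bf4c2805fdb2579c14d71fb4334ed163",
    "qnli.zip": "0611ed45d3bc9818d21662b415986c20",
    "snli.zip": "3a46a1a2083f2ba6014c64df8660ee6a",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each archive name to True (match), False (mismatch) or None (absent)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = md5_of_file(path) == expected if path.exists() else None
    return results


if __name__ == "__main__":
    # Assumes the archives sit in the current working directory.
    for name, ok in verify_downloads().items():
        status = {True: "OK", False: "MISMATCH", None: "not found"}[ok]
        print(f"{name}: {status}")
```

A mismatch usually indicates a truncated or corrupted download, in which case the archive should be fetched again.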


