Czech restaurant information dataset for NLG

Please use the following text to cite this item:
Dušek, Ondřej; et al., 2017, Czech restaurant information dataset for NLG, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-2123.
Date issued
2017-01-13
Size
5192 sentences
Language(s)
Czech
Description
This is a dataset for natural language generation (NLG) in task-oriented spoken dialogue systems with Czech as the target language. It originated as a translation of the English San Francisco Restaurants dataset by Wen et al. (2015). It includes input dialogue acts and the corresponding output natural language paraphrases in Czech. Since the dataset is intended for recurrent neural network based NLG systems using delexicalization, inflection tables for all slot values appearing verbatim in the text are provided.
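The role of the inflection tables in delexicalization can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the table layout (slot, then canonical value, then a list of inflected forms), the `X-<slot>` placeholder convention, the function name `delexicalize`, and the sample sentence and word forms. The actual structure of surface_forms.json and dataset.json is described in README.md and may differ.

```python
import re

# Hypothetical inflection table: slot -> canonical value -> inflected surface forms.
# Illustrative only; the real surface_forms.json may be organised differently.
SURFACE_FORMS = {
    "name": {"Ananta": ["Ananta", "Ananty", "Anantě", "Anantu", "Anantou"]},
    "food": {"Indian": ["indická", "indickou", "indické"]},
}


def delexicalize(text, slots, surface_forms):
    """Replace every listed inflected form of each slot value with X-<slot>."""
    for slot, value in slots.items():
        # Fall back to the raw value when no inflected forms are listed.
        forms = surface_forms.get(slot, {}).get(value, [value])
        # Longer forms come first so a short form never wins over a longer one.
        alternatives = "|".join(
            re.escape(form) for form in sorted(forms, key=len, reverse=True)
        )
        # Lookarounds keep matches on whole words (Unicode-aware in Python 3).
        pattern = rf"(?<!\w)(?:{alternatives})(?!\w)"
        text = re.sub(pattern, f"X-{slot}", text, flags=re.IGNORECASE)
    return text


sentence = "Ananta podává indickou kuchyni."
print(delexicalize(sentence, {"name": "Ananta", "food": "Indian"}, SURFACE_FORMS))
# prints: X-name podává X-food kuchyni.
```

Because Czech inflects slot values, a plain string match on the canonical value would miss forms such as the accusative "indickou"; matching against all listed forms is what makes the placeholder substitution possible.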
 Files in this item
Name
surface_forms.json
Size
121.85 KB
Format
application/octet-stream
Description
Additional morphology data
MD5
5fe93b1d9160e58ec25e1f67d2d9f8a4
Name
dataset.csv
Size
1.02 MB
Format
application/octet-stream
Description
The dataset, in CSV format
MD5
c30bdcc9c470e6041b8567873da0f394
Name
dataset.json
Size
1.5 MB
Format
application/octet-stream
Description
The dataset, in JSON format
MD5
4986a46d539bd4d91bbb56aca4aa4104
Name
README.md
Size
6.04 KB
Format
application/octet-stream
Description
Dataset description
MD5
03bc79c1b916020e07b0848484f05909
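The MD5 checksums listed for the files can be used to check downloaded copies. The following Python sketch assumes the four files were saved under their listed names in a single directory; the helper names `md5_of` and `verify` are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksums as listed for each file in this item.
EXPECTED_MD5 = {
    "surface_forms.json": "5fe93b1d9160e58ec25e1f67d2d9f8a4",
    "dataset.csv": "c30bdcc9c470e6041b8567873da0f394",
    "dataset.json": "4986a46d539bd4d91bbb56aca4aa4104",
    "README.md": "03bc79c1b916020e07b0848484f05909",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each file name to True (match), False (mismatch) or None (missing)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = (md5_of(path) == expected) if path.is_file() else None
    return results


if __name__ == "__main__":
    for name, status in verify().items():
        label = {True: "OK", False: "MISMATCH", None: "not found"}[status]
        print(f"{name}: {label}")
```

A mismatch usually points to an incomplete or altered download rather than to a problem with the dataset itself.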