dc.contributor.author | Dušek, Ondřej |
dc.contributor.author | Jurčíček, Filip |
dc.contributor.author | Dvořák, Josef |
dc.contributor.author | Grycová, Petra |
dc.contributor.author | Hejda, Matěj |
dc.contributor.author | Olivová, Jana |
dc.contributor.author | Starý, Michal |
dc.contributor.author | Štichová, Eva |
dc.date.accessioned | 2017-04-04T07:43:20Z |
dc.date.available | 2017-04-04T07:43:20Z |
dc.date.issued | 2017-01-13 |
dc.identifier.uri | http://hdl.handle.net/11234/1-2123 |
dc.description | This is a dataset for natural language generation (NLG) in task-oriented spoken dialogue systems with Czech as the target language. It originated as a translation of the English San Francisco Restaurants dataset by Wen et al. (2015). It includes input dialogue acts and the corresponding output natural language paraphrases in Czech. Since the dataset is intended for recurrent neural network based NLG systems using delexicalization, inflection tables for all slot values appearing verbatim in the text are provided. |
dc.language.iso | ces |
dc.publisher | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
dc.rights | Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) |
dc.rights.uri | http://creativecommons.org/licenses/by-sa/4.0/ |
dc.source.uri | https://github.com/UFAL-DSG/cs_restaurant_dataset |
dc.subject | natural language generation |
dc.subject | dialogue system |
dc.subject | morphological generation |
dc.title | Czech restaurant information dataset for NLG |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
dc.rights.label | PUB |
has.files | yes |
branding | LINDAT / CLARIAH-CZ |
contact.person | Ondřej Dušek; odusek@ufal.mff.cuni.cz; Charles University in Prague, UFAL |
sponsor | Grant Agency of Charles University in Prague; GAUK 2058214; Adaptive natural language generator; nationalFunds |
sponsor | Ministry of Education, Youth and Sports of the Czech Republic; LK11221; Development of methods for designing statistical spoken dialogue systems; nationalFunds |
sponsor | Charles University (outside GAUK); SVV 260 333; Specific university research; nationalFunds |
size.info | 5192 sentences |
files.size | 2773522 |
files.count | 4 |
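The description above says the inflection tables exist to support delexicalization. The sketch below illustrates why they are needed in Czech: a slot value often appears in the text in an inflected form, so a plain string match on the base value would miss it. Everything in the example is an assumption made for illustration: the function name `delexicalize`, the `X-slot` placeholder convention, the shape of the `surface_forms` mapping, and the sample sentence are hypothetical and are not taken from `dataset.json` or `surface_forms.json`.

```python
def delexicalize(text, slots, surface_forms):
    """Replace slot values found in `text` with placeholders such as X-food.

    slots:         {slot_name: base_value}, e.g. taken from a dialogue act
    surface_forms: {slot_name: {base_value: [inflected forms]}}
                   (hypothetical layout, not the real file schema)
    """
    for slot, value in slots.items():
        # Fall back to the base value when no inflection table entry exists.
        forms = surface_forms.get(slot, {}).get(value, [value])
        # Try longer forms first so a short form never clips a longer one.
        for form in sorted(forms, key=len, reverse=True):
            if form in text:
                text = text.replace(form, f"X-{slot}")
                break
    return text


# Invented example data: "Indian" cuisine with three Czech adjective forms.
EXAMPLE_FORMS = {"food": {"Indian": ["indická", "indické", "indickou"]}}
EXAMPLE_SLOTS = {"name": "Ananta", "food": "Indian"}
EXAMPLE_TEXT = "Restaurace Ananta nabízí indickou kuchyni."

if __name__ == "__main__":
    print(delexicalize(EXAMPLE_TEXT, EXAMPLE_SLOTS, EXAMPLE_FORMS))
    # prints: Restaurace X-name nabízí X-food kuchyni.
```

Here the accusative form "indickou" is matched through the table, whereas a lookup on the base value "Indian" alone would leave the sentence unchanged.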
Files in this item
Name | Size | Format | Description | MD5 |
README.md | 6.04 KB | Unknown | Dataset description | 03bc79c1b916020e07b0848484f05909 |
dataset.json | 1.5 MB | Unknown | The dataset, in JSON format | 4986a46d539bd4d91bbb56aca4aa4104 |
dataset.csv | 1.02 MB | Unknown | The dataset, in CSV format | c30bdcc9c470e6041b8567873da0f394 |
surface_forms.json | 121.85 KB | Unknown | Additional morphology data | 5fe93b1d9160e58ec25e1f67d2d9f8a4 |
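
The MD5 values listed for the four files can be used to check a download for corruption. The following is a minimal sketch using only the Python standard library; the checksums are copied from this record, while the helper names `md5_of_file` and `verify_downloads` and the assumption that the files sit together in one local directory are illustrative choices, not part of the record.

```python
import hashlib
from pathlib import Path

# MD5 checksums as listed for the files in this record.
EXPECTED_MD5 = {
    "README.md": "03bc79c1b916020e07b0848484f05909",
    "dataset.json": "4986a46d539bd4d91bbb56aca4aa4104",
    "dataset.csv": "c30bdcc9c470e6041b8567873da0f394",
    "surface_forms.json": "5fe93b1d9160e58ec25e1f67d2d9f8a4",
}


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Return {filename: status}; status is 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # Assumes the files were downloaded into the current directory.
    for name, status in verify_downloads().items():
        print(f"{name}: {status}")
```

MD5 is adequate here for detecting accidental corruption during transfer; it is not a safeguard against deliberate tampering.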