Alex Context NLG Dataset
Please use the following text to cite this item:
Dušek, Ondřej and Jurčíček, Filip, 2016,
Alex Context NLG Dataset, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-1675.
Authors
Ondřej Dušek, Filip Jurčíček
Item identifier
http://hdl.handle.net/11234/1-1675
Date issued
2016-04-05
Size
1859 entries, 5577 sentences
Language(s)
English
Description
A dataset intended for fully trainable natural language generation (NLG) systems in task-oriented spoken dialogue systems (SDS), covering the English public transport information domain. Each data instance (a pair of a source meaning representation and a target natural language paraphrase to be generated) is accompanied by its preceding context (the user utterance).
Taking the form of the previous user utterance into account when generating the system response allows NLG systems trained on this dataset to entrain (adapt) to the preceding utterance, i.e., to reuse its wording and syntactic structure. This is expected to improve the perceived naturalness of the output and may even lead to a higher task success rate.
Crowdsourcing was used to obtain both the natural context user utterances and the natural system responses to be generated.
Acknowledgement
Charles University Grant Agency (GAUK)
Project code: GAUK 2058214
Project name: Adaptive natural language generator
Ministry of Education, Youth and Sports of the Czech Republic
Project code: LK11221
Project name: Development of methods for designing statistical spoken dialogue systems
Charles University in Prague (outside GAUK)
Project code: SVV 260 333
Project name: Specific university research
This item is Publicly Available
and licensed under:
Files in this item
- Name: README.md
- Size: 11.38 KB
- Format: application/octet-stream
- Description: Dataset description and documentation
- MD5: 2e8449b98200579becfd6d46b401741c

- Name: dataset.csv
- Size: 1.12 MB
- Format: application/octet-stream
- Description: The dataset in CSV format
- MD5: 81f85c82b5ca7f5e23605face62fd5fd

- Name: dataset.json
- Size: 1.77 MB
- Format: application/octet-stream
- Description: The dataset in JSON format
- MD5: 552ea0396c3184a74588b2c151b73bef

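The MD5 values listed for each file can be used to check that a download is intact. The following is a minimal Python sketch, assuming the three files were saved under their listed names in a single directory. The function names `md5_of_file` and `verify_downloads` are illustrative and not part of the dataset or the repository.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file listing on this page.
EXPECTED_MD5 = {
    "README.md": "2e8449b98200579becfd6d46b401741c",
    "dataset.csv": "81f85c82b5ca7f5e23605face62fd5fd",
    "dataset.json": "552ea0396c3184a74588b2c151b73bef",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to True (match), False (mismatch),
    or None (file not found in the given directory)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = (md5_of_file(path) == expected)
        else:
            results[name] = None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "missing"}
    for name, ok in verify_downloads().items():
        print(f"{name}: {labels[ok]}")
```

Run from the download directory, the script prints one status line per file. A mismatch usually indicates an incomplete or altered download.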

