dc.contributor.author Dušek, Ondřej
dc.contributor.author Jurčíček, Filip
dc.date.accessioned 2016-04-05T12:02:13Z
dc.date.available 2016-04-05T12:02:13Z
dc.date.issued 2016-04-05
dc.identifier.uri http://hdl.handle.net/11234/1-1675
dc.description A dataset intended for fully trainable natural language generation (NLG) systems in task-oriented spoken dialogue systems (SDS), covering the English public transport information domain. It includes preceding context (user utterance) along with each data instance (pair of source meaning representation and target natural language paraphrase to be generated). Taking the form of the previous user utterance into account for generating the system response allows NLG systems trained on this dataset to entrain (adapt) to the preceding utterance, i.e., reuse wording and syntactic structure. This should presumably improve the perceived naturalness of the output, and may even lead to a higher task success rate. Crowdsourcing has been used to obtain natural context user utterances as well as natural system responses to be generated.
dc.language.iso eng
dc.publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dc.rights Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.uri http://creativecommons.org/licenses/by-sa/4.0/
dc.source.uri https://github.com/UFAL-DSG/alex_context_nlg_dataset
dc.subject dialogue system
dc.subject natural language generation
dc.subject dialogue alignment
dc.subject entrainment
dc.title Alex Context NLG Dataset
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
dc.rights.label PUB
has.files yes
branding LINDAT / CLARIAH-CZ
contact.person Ondřej Dušek odusek@ufal.mff.cuni.cz Charles University in Prague, UFAL
sponsor Grant Agency of Charles University in Prague (GAUK) 2058214 Adaptive natural language generator nationalFunds
sponsor Ministry of Education, Youth and Sports of the Czech Republic LK11221 Development of methods for the design of statistical spoken dialogue systems nationalFunds
sponsor Charles University in Prague (outside GAUK) SVV 260 333 Specific university research nationalFunds
size.info 1859 entries
size.info 5577 sentences
files.size 3042834
files.count 3


Files in this item

This item is publicly available and licensed under Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0).
Name          Size      Format   Description                            MD5
README.md     11.38 KB  Unknown  Dataset description and documentation  2e8449b98200579becfd6d46b401741c
dataset.csv   1.12 MB   Unknown  The dataset in CSV format              81f85c82b5ca7f5e23605face62fd5fd
dataset.json  1.77 MB   Unknown  The dataset in JSON format             552ea0396c3184a74588b2c151b73bef
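
The MD5 values listed for the three files can be used to check that a download arrived intact. The following is a minimal sketch, assuming the files have been saved under their listed names in a single local directory. The helper names `md5_of` and `verify` are hypothetical and are not part of the dataset or its repository.

```python
import hashlib
from pathlib import Path

# File names and MD5 checksums copied from this record.
EXPECTED_MD5 = {
    "README.md": "2e8449b98200579becfd6d46b401741c",
    "dataset.csv": "81f85c82b5ca7f5e23605face62fd5fd",
    "dataset.json": "552ea0396c3184a74588b2c151b73bef",
}


def md5_of(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, expected=EXPECTED_MD5):
    """Map each expected file name to True (match), False (mismatch),
    or None (file not found in `directory`)."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        results[name] = (md5_of(path) == checksum) if path.is_file() else None
    return results


if __name__ == "__main__":
    # Assumption: the downloaded files are in the current directory.
    for name, status in verify(".").items():
        label = {True: "OK", False: "MISMATCH", None: "missing"}[status]
        print(f"{name}: {label}")
```

A mismatch usually points to an incomplete or altered download. MD5 is suitable for this kind of integrity check, but not as protection against deliberate tampering.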
