Alex Context NLG Dataset

Name: Alex Context NLG Dataset
License: http://creativecommons.org/licenses/by-sa/4.0/

Dušek, Ondřej; Jurčíček, Filip

dc.contributor.author	Dušek, Ondřej
dc.contributor.author	Jurčíček, Filip
dc.date.accessioned	2016-04-05T12:02:13Z
dc.date.available	2016-04-05T12:02:13Z
dc.date.issued	2016-04-05
dc.identifier.uri	http://hdl.handle.net/11234/1-1675
dc.description	A dataset for fully trainable natural language generation (NLG) systems in task-oriented spoken dialogue systems (SDS), covering the English public transport information domain. Each data instance (a pair of source meaning representation and target natural language paraphrase to be generated) is accompanied by its preceding context, i.e., the user utterance it responds to. Taking the form of the previous user utterance into account when generating the system response allows NLG systems trained on this dataset to entrain (adapt) to that utterance, i.e., to reuse its wording and syntactic structure. This is expected to improve the perceived naturalness of the output and may also lead to a higher task success rate. Crowdsourcing was used to obtain both the natural context user utterances and the natural system responses to be generated.
dc.language.iso	eng
dc.publisher	Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dc.rights	Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
dc.rights.uri	http://creativecommons.org/licenses/by-sa/4.0/
dc.source.uri	https://github.com/UFAL-DSG/alex_context_nlg_dataset
dc.subject	dialogue system
dc.subject	natural language generation
dc.subject	dialogue alignment
dc.subject	entrainment
dc.title	Alex Context NLG Dataset
dc.type	corpus
metashare.ResourceInfo#ContentInfo.mediaType	text
dc.rights.label	PUB
has.files	yes
branding	LINDAT / CLARIAH-CZ
contact.person	Ondřej Dušek odusek@ufal.mff.cuni.cz Charles University in Prague, UFAL
sponsor	Grant Agency of Charles University in Prague GAUK 2058214 Adaptive natural language generator nationalFunds
sponsor	Ministry of Education, Youth and Sports of the Czech Republic LK11221 Development of methods for the design of statistical spoken dialogue systems nationalFunds
sponsor	Charles University in Prague (outside GAUK) SVV 260 333 Specific university research nationalFunds
size.info	1859 entries
size.info	5577 sentences
files.size	3042834
files.count	3
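
The instance structure outlined in dc.description (preceding user utterance, source meaning representation, target response) might be modelled roughly as in the sketch below. The class name, field names, meaning-representation notation, and example values are all hypothetical and made up for illustration; they are not taken from the dataset files. The overlap function is only a crude word-level proxy for entrainment, not a measure used by the dataset's authors.

```python
from dataclasses import dataclass


@dataclass
class ContextInstance:
    # All field names are hypothetical; the real files may use other keys.
    context_utterance: str       # preceding user utterance
    meaning_representation: str  # source meaning representation of the response
    target_response: str         # natural language paraphrase to be generated


def lexical_overlap(instance: ContextInstance) -> float:
    """Fraction of distinct response words that also occur in the context.

    A crude stand-in for lexical entrainment: case-insensitive,
    whitespace-tokenised, no stemming or stop-word handling.
    """
    context_words = set(instance.context_utterance.lower().split())
    response_words = set(instance.target_response.lower().split())
    if not response_words:
        return 0.0
    return len(response_words & context_words) / len(response_words)


# Invented example instance, not copied from the dataset.
example = ContextInstance(
    context_utterance="is there a later option",
    meaning_representation="inform_no_match(alternative=next)",
    target_response="there is no later option",
)

if __name__ == "__main__":
    print(f"lexical overlap: {lexical_overlap(example):.2f}")
```

A response that reuses the user's wording scores higher on this proxy than a paraphrase with different vocabulary, which is the kind of adaptation the description refers to.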