Question Dialogs Dataset
Please use the following text to cite this item or export to a predefined format:
Vodolán, Miroslav and Jurčíček, Filip, 2016,
Question Dialogs Dataset, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-1670.
Date issued
2016
Size
1900 items,
8533 turns
Description
A dataset collected from natural dialogs that makes it possible to test the ability of dialog systems to learn new facts interactively from user utterances during a dialog. The dataset consists of 1900 dialogs and allows simulation of interactively acquiring denotations and question explanations from users, which can then be used for interactive learning.
Acknowledgement
Ministry of Education, Youth and Sports of the Czech Republic
Project code: LK11221
Project name: Development of methods for the design of statistical spoken dialog systems
Charles University in Prague (outside GAUK)
Project code: SVV 260 224
Project name: Specific university research
Grant Agency of Charles University in Prague
Project code: GAUK 1170516
Project name: Dialog management in open domains using knowledge graphs
This item is Publicly Available
and licensed under:
Files in this item
- Name: question_dialogs-dev.json
- Size: 515.51 KB
- Format: application/octet-stream
- Description: Development part of the dataset.
- MD5: 0540cbed16c80de5a72957fad5d46e53

- Name: simple_interactive_model.py
- Size: 3 KB
- Format: application/octet-stream
- Description: Simple implementation of an interactive model.
- MD5: fcf070af58c18b8fb78bf2b7172540a8

- Name: question_dialogs-test.json
- Size: 1.22 MB
- Format: application/octet-stream
- Description: Test part of the dataset.
- MD5: 9356c5058bc890e49b580f0c69f4c1e7

- Name: question_dialogs-train.json
- Size: 1.71 MB
- Format: application/octet-stream
- Description: Training part of the dataset.
- MD5: d9c811d3a4067337ba1d1952f375fa83
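The MD5 values listed for each file can be used to check that a download arrived intact. The following is a minimal sketch, assuming the three data files were saved under their listed names; the helper names `md5_of_file` and `verify` are illustrative and are not part of the released scripts.

```python
import hashlib

# Checksums copied from the file listing on this page.
EXPECTED_MD5 = {
    "question_dialogs-dev.json": "0540cbed16c80de5a72957fad5d46e53",
    "question_dialogs-test.json": "9356c5058bc890e49b580f0c69f4c1e7",
    "question_dialogs-train.json": "d9c811d3a4067337ba1d1952f375fa83",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Return True when the file's digest matches the expected value."""
    return md5_of_file(path) == expected_md5.lower()
```

For example, `verify("question_dialogs-train.json", EXPECTED_MD5["question_dialogs-train.json"])` should return `True` for an undamaged copy of the training file.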

- Name: interactive_model_base.py
- Size: 3.95 KB
- Format: application/octet-stream
- Description: Base class for interactive model development.
- MD5: 2d6b66a79ed16ba7df43049af6996ea1

- Name: README.txt
- Size: 11.33 KB
- Format: text/plain
- Description: Readme describing the dataset.
- MD5: 99f1ec5f83c43eece65d5ae7e2367665

==========================
QUESTION DIALOGS DATASET
==========================
For more details see the paper:
"Data Collection for Interactive Learning through the Dialog", 2016,
Miroslav Vodolán, Filip Jurčíček,
http://arxiv.org/abs/1603.09631
The dataset consists of standard data split into training, development and test files:
1) question_dialogs-train.json
2) question_dialogs-dev.json
3) question_dialogs-test.json
Each dataset file contains one dialog per line. The dialogs are stored in JSON format.
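Because each line holds one dialog, a file can be read line by line and every line decoded on its own. The following is a minimal sketch, assuming UTF-8 encoding; the function name load_dialogs is illustrative and is not part of the released scripts. The structure of a single dialog is not shown in this excerpt, so the sketch returns the decoded objects without interpreting them.

```python
import json
from pathlib import Path


def load_dialogs(path):
    """Read a file that stores one JSON-encoded dialog per line."""
    dialogs = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines such as a trailing newline
                dialogs.append(json.loads(line))
    return dialogs
```

Calling len(load_dialogs("question_dialogs-train.json")) gives the number of dialogs in the training file.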
Three Python scripts are released with the dataset:
a) interactive_learning_evaluator.py - Evaluates a given model interactively on dialogs simulated from the conversations in the dataset.
b) interactive_model_base.py - Base class that simplifies the development of interactive models by providing standard routines for communicating with the interactive_learning_evaluator.py simulator.
c) simple_interactive_model.py - Simple implementation of an interactive model that can be used to test the interactive_learning_evaluator.py environment.
==============SCRIPTS USAGE==============
The interactive_learning_evaluator.py script can be run with simple_interactive_model.py as follows (for more information about its parameters, use --help):
> python interactive_learning_evaluator.py "python simple_interactive_model.py" --train_dialogs dataset/question_dialogs-train.json
which results in output similar to:
Training dialogs from files ['question_dialogs-train.json']
dialog_count: 950 # Count of dialogs in whole dataset
turn_count: 4237 # Count of turns in all dialogs
simulation_wall_time: 1s # How long the simulation took
avg_dialog_wall_time: 1ms # How long the simulation took per dialog on average
answer_correct_count: 3 # How many of the answers provided by the model were correct
answer_incorrect_count: 2 . . .
- Name: interactive_learning_evaluator.py
- Size: 19.71 KB
- Format: application/octet-stream
- Description: Script for evaluation of interactive models.
- MD5: 54dba2acbee67b4b785bf1892624c5a7
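The evaluator output shown in the README excerpt above follows a "name: value # comment" pattern, which is easy to turn into a dictionary for logging or comparison between runs. The following is a minimal sketch, assuming the full output keeps that pattern; the function name `parse_evaluator_output` is illustrative and is not part of the released scripts, and the sample text reuses statistics from the excerpt.

```python
def parse_evaluator_output(text):
    """Collect 'name: value' statistics, ignoring trailing '#' comments."""
    stats = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the explanatory comment
        key, sep, value = line.partition(":")
        value = value.strip()
        if not sep or not value:
            continue  # skip lines that are not statistics
        # Plain integers become ints; values such as '1ms' stay strings.
        stats[key.strip()] = int(value) if value.isdigit() else value
    return stats


SAMPLE = """Training dialogs from files ['question_dialogs-train.json']
dialog_count: 950 # Count of dialogs in whole dataset
turn_count: 4237 # Count of turns in all dialogs
simulation_wall_time: 1s # How long the simulation took
answer_correct_count: 3
"""

stats = parse_evaluator_output(SAMPLE)
```

Keeping durations as strings avoids guessing at unit conventions that the excerpt does not spell out.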


