
Question Dialogs Dataset

Please use the following text to cite this item:
Vodolán, Miroslav and Jurčíček, Filip, 2016, Question Dialogs Dataset, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-1670.
Date issued
2016
Size
1900 items, 8533 turns
Language(s)
Description
A dataset collected from natural dialogs that makes it possible to test the ability of dialog systems to learn new facts interactively from user utterances throughout a dialog. The dataset, consisting of 1900 dialogs, allows simulating the interactive acquisition of denotations and question explanations from users, which can then be used for interactive learning.
 Files in this item
Name
question_dialogs-dev.json
Size
515.51 KB
Format
application/octet-stream
Description
Development part of the dataset.
MD5
0540cbed16c80de5a72957fad5d46e53
Name
simple_interactive_model.py
Size
3 KB
Format
application/octet-stream
Description
Simple implementation of interactive model.
MD5
fcf070af58c18b8fb78bf2b7172540a8
Name
question_dialogs-test.json
Size
1.22 MB
Format
application/octet-stream
Description
Test part of the dataset.
MD5
9356c5058bc890e49b580f0c69f4c1e7
Name
question_dialogs-train.json
Size
1.71 MB
Format
application/octet-stream
Description
Training part of the dataset.
MD5
d9c811d3a4067337ba1d1952f375fa83
Name
interactive_model_base.py
Size
3.95 KB
Format
application/octet-stream
Description
Base class for interactive model development.
MD5
2d6b66a79ed16ba7df43049af6996ea1
Name
README.txt
Size
11.33 KB
Format
text/plain
Description
Readme describing the dataset.
MD5
99f1ec5f83c43eece65d5ae7e2367665
Preview
  File Preview
    ==========================
     QUESTION DIALOGS DATASET
    ==========================
    For more details see the paper:
        "Data Collection for Interactive Learning through the Dialog", 2016,
            Miroslav Vodolán, Filip Jurčíček,
            http://arxiv.org/abs/1603.09631
    
    
    The dataset uses a standard split into training, development and test files:
        1) question_dialogs-train.json
        2) question_dialogs-dev.json
        3) question_dialogs-test.json
    
    Dataset files contain one dialog per line. Each dialog is stored in JSON format.
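
Since each line holds one JSON-encoded dialog, a file can be read line by line. The following is a minimal sketch rather than part of the released scripts: the function name `load_dialogs` is made up for this example, and nothing is assumed about the fields inside a dialog.

```python
import json


def load_dialogs(path):
    """Parse a file that stores one JSON document per line.

    Hypothetical helper, not shipped with the dataset. It makes no
    assumption about the structure of an individual dialog.
    """
    dialogs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:  # tolerate blank lines, e.g. a trailing newline
                continue
            dialogs.append(json.loads(line))
    return dialogs


if __name__ == "__main__":
    # Assumes the training file sits in the current directory.
    try:
        dialogs = load_dialogs("question_dialogs-train.json")
        print("loaded", len(dialogs), "dialogs")
    except FileNotFoundError:
        print("question_dialogs-train.json not found")
```

The number of loaded dialogs can be compared with the dialog_count value that interactive_learning_evaluator.py reports for the same file.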
    
    Three python scripts are released with the dataset:
        a) interactive_learning_evaluator.py - Evaluates a given model interactively on dialogs simulated from the conversations in the dataset.
        b) interactive_model_base.py - Base class that simplifies development of interactive models by providing standard routines for communicating with the interactive_learning_evaluator.py simulator.
        c) simple_interactive_model.py - Simple implementation of an interactive model that can be used for testing the interactive_learning_evaluator.py environment.
    
    ==============SCRIPTS USAGE==============
    The interactive_learning_evaluator.py script can be run with simple_interactive_model.py as follows (for more information about its parameters, use --help):
    
    > python interactive_learning_evaluator.py "python simple_interactive_model.py" --train_dialogs dataset/question_dialogs-train.json
        which results in output similar to:
            Training dialogs from files ['question_dialogs-train.json']
                dialog_count: 950               # Count of dialogs in whole dataset
                turn_count: 4237                # Count of turns in all dialogs
                simulation_wall_time: 1s        # How long the simulation took
                avg_dialog_wall_time: 1ms       # How long the simulation took per dialog on average
                answer_correct_count: 3         # How many of the answers provided by the model were correct
                answer_incorrect_count: 2 . . .
Name
interactive_learning_evaluator.py
Size
19.71 KB
Format
application/octet-stream
Description
Script for evaluation of interactive models.
MD5
54dba2acbee67b4b785bf1892624c5a7
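
The MD5 values listed for each file can be used to check a download. The following is a minimal sketch, assuming the files were saved under their listed names in a single directory; the names `EXPECTED`, `md5_of` and `verify` are made up for this example.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing on this page.
EXPECTED = {
    "question_dialogs-dev.json": "0540cbed16c80de5a72957fad5d46e53",
    "simple_interactive_model.py": "fcf070af58c18b8fb78bf2b7172540a8",
    "question_dialogs-test.json": "9356c5058bc890e49b580f0c69f4c1e7",
    "question_dialogs-train.json": "d9c811d3a4067337ba1d1952f375fa83",
    "interactive_model_base.py": "2d6b66a79ed16ba7df43049af6996ea1",
    "README.txt": "99f1ec5f83c43eece65d5ae7e2367665",
    "interactive_learning_evaluator.py": "54dba2acbee67b4b785bf1892624c5a7",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each listed file name to True (match), False (mismatch) or None (missing)."""
    results = {}
    for name, expected in EXPECTED.items():
        path = Path(directory) / name
        results[name] = (md5_of(path) == expected) if path.exists() else None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "missing"}
    for name, status in verify().items():
        print(f"{name}: {labels[status]}")
```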