dc.contributor.author Korvas, Matěj
dc.contributor.author Plátek, Ondřej
dc.contributor.author Dušek, Ondřej
dc.contributor.author Žilka, Lukáš
dc.contributor.author Jurčíček, Filip
dc.date.accessioned 2014-02-21T10:41:36Z
dc.date.available 2014-02-21T10:41:36Z
dc.date.issued 2014-02-21
dc.identifier.uri http://hdl.handle.net/11858/00-097C-0000-0023-466F-C
dc.description Vystadial 2013 is a dataset of telephone conversations in English and Czech, developed for training acoustic models for automatic speech recognition in spoken dialogue systems. It ships in three parts: Czech data, English data, and scripts. The data comprise over 41 hours of speech in English and over 15 hours in Czech, plus orthographic transcriptions. The scripts implement data pre-processing and building acoustic models using the HTK and Kaldi toolkits. This is the scripts part of the dataset.
dc.description.sponsorship This research was funded by the Ministry of Education, Youth and Sports of the Czech Republic under the grant agreement LK11221.
dc.language.iso eng
dc.language.iso ces
dc.publisher Charles University, Faculty of Mathematics and Physics
dc.rights Apache License 2.0
dc.rights.uri http://opensource.org/licenses/Apache-2.0
dc.source.uri https://ufal.mff.cuni.cz/grants/vystadial
dc.subject ASR
dc.subject HTK
dc.subject Kaldi
dc.subject acoustic model
dc.title Vystadial 2013 – scripts
dc.type toolService
metashare.ResourceInfo#ContactInfo#PersonInfo.surname Korvas
metashare.ResourceInfo#ContactInfo#PersonInfo.givenName Matěj
metashare.ResourceInfo#ContactInfo#PersonInfo#OrganizationInfo.organizationName Faculty of Mathematics and Physics, Charles University in Prague, UFAL
metashare.ResourceInfo#DistributionInfo.availability unrestrictedUse
metashare.ResourceInfo#DistributionInfo#LicenseInfo.restrictionsOfUse evaluationUse
metashare.ResourceInfo#DistributionInfo#LicenseInfo.restrictionsOfUse commercialUse
metashare.ResourceInfo#DistributionInfo#LicenseInfo.restrictionsOfUse attribution
metashare.ResourceInfo#DistributionInfo#LicenseInfo.restrictionsOfUse shareAlike
metashare.ResourceInfo#DistributionInfo#LicenseInfo.distributionAccessMedium downloadable
metashare.ResourceInfo#ValidationInfo.validated True
metashare.ResourceInfo#ResourceCreationInfo#FundingInfo#ProjectInfo.projectName MŠMT LK11221 (Vývoj metod pro návrh statistických mluvených dialogových systémů)
metashare.ResourceInfo#ResourceCreationInfo#FundingInfo#ProjectInfo.fundingType National
metashare.ResourceInfo#ContactInfo#PersonInfo#OrganizationInfo#CommunicationInfo.email korvas@ufal.mff.cuni.cz
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent false
metashare.ResourceInfo#ContentInfo.detailedType tool
dc.rights.label PUB
has.files yes
branding LINDAT / CLARIAH-CZ
sponsor Ministerstvo školství, mládeže a tělovýchovy České republiky; LK11221; Vývoj metod pro návrh statistických mluvených dialogových systémů; nationalFunds
files.size 100281692
files.count 1


 Files in this item

This item is publicly available and licensed under the Apache License 2.0.
Name scripts.tgz
Size 95.64 MB
Format application/x-gzip
Description Vystadial 2013 scripts and models, tgz archive
MD5 1697afb04360e9ac2a6c4a5a55c7f63f
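
The record lists both a byte count (files.size 100281692) and an MD5 digest for scripts.tgz, so a downloaded copy can be checked before unpacking. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `verify_archive` are illustrative and are not part of the Vystadial scripts, and the local path of the archive is an assumption.

```python
import hashlib
from pathlib import Path

# Values copied from this record (files.size and the MD5 field).
EXPECTED_MD5 = "1697afb04360e9ac2a6c4a5a55c7f63f"
EXPECTED_SIZE = 100281692  # bytes, shown on the page as 95.64 MB


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so the whole archive is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True if the file has the expected size and MD5 digest."""
    path = Path(path)
    # The size check is cheap, so do it before hashing.
    if path.stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()
```

Calling `verify_archive("scripts.tgz")` on a local copy (hypothetical path) returns True only when both the size and the digest match the values above. MD5 guards against accidental corruption during download; it is not a strong protection against deliberate tampering.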
 File Preview  
  • scripts
    • README.rst (987 B)
    • kaldi
      • s5
        • cmd.sh (701 B)
        • utils (0 B)
        • steps (0 B)
        • run.sh (7 kB)
        • conf
          • train_conf.sh (1 kB)
          • decode.config (45 B)
          • mfcc.conf (279 B)
        • local
          • create_sample.sh (429 B)
          • results.py (5 kB)
          • vystadial_create_G.sh (2 kB)
          • score.sh (1 kB)
          • run_decode-lattice-faster.sh (4 kB)
          • backup.sh (1 kB)
          • run_cs_transcriptions.sh (736 B)
          • vystadial_create_LMs_dict.sh (4 kB)
          • save_check_conf.sh (2 kB)
          • phonetic_transcription_cs.pl (7 kB)
          • vystadial_data_split.sh (3 kB)
          • prepare_cmu_dict.sh (2 kB)
        • path.sh (429 B)
      • model_voip_cs
        • tri2b_bmmi.mat (18 kB)
        • tri2b_bmmi.tree (166 kB)
        • HCLG_tri2b_bmmi.fst (24 MB)
        • tri2b.mat (18 kB)
        • phones.txt (1 kB)
        • tri2b.tree (166 kB)
        • tri2b_bmmi.mdl (14 MB)
        • words.txt (205 kB)
        • tri2b.mdl (6 MB)
        • mfcc.conf (290 B)
        • tri2a.mdl (6 MB)
        • silence.csl (10 B)
        • tri2a.tree (163 kB)
      • README.rst (5 kB)
      • LICENSE-APACHE-2.0.TXT (11 kB)
      • model_voip_en
        • tri2b_bmmi.mat (18 kB)
        • tri2b_bmmi.tree (133 kB)
        • HCLG_tri2b_bmmi.fst (1 MB)
        • tri2b.mat (18 kB)
        • phones.txt (1 kB)
        • tri2b.tree (133 kB)
        • tri2b_bmmi.mdl (14 MB)
        • words.txt (9 kB)
        • tri2b.mdl (6 MB)
        • mfcc.conf (290 B)
        • tri2a.mdl (6 MB)
        • silence.csl (10 B)
        • tri2a.tree (147 kB)
    • htk
      • data_voip_en
        • dev (0 B)
        • train (0 B)
        • test (0 B)
      • README.rst (6 kB)
      • LICENSE-APACHE-2.0.TXT (11 kB)
      • model_voip_cs
        • export_models
          • config (265 B)
          • macros (597 B)
          • fulllist (718 kB)
          • julius_tiedlist (4 MB)
          • hmmdefs (56 MB)
          • monophones1 (136 B)
          • julius_hmmdefs (17 MB)
          • tiedlist (1 MB)
          • trees (141 kB)
        • temp
          • config
            • monophones1 (136 B)
            • monophones0 (132 B)
            • macros (36 B)
            • tree_ques.hed (6 kB)
            • proto (1 kB)
          • log
            • train
              • test
              • nohup_train_voip_cs.sh (42 B)
              • data_voip_cs
                • dev (0 B)
                • test (0 B)
                • train (0 B)
              • env_voip_cs.sh (1 kB)
              • common
                • template_tune_lm_test (678 B)
                • mktri_cross.led (69 B)
                • mix18.hed (52 B)
                • mix8.hed (51 B)
                • configwi (50 B)
                • mkphones0.led (314 B)
                • mix10.hed (52 B)
                • sil.hed (236 B)
                • csdict (1 B)
                • merge_sp_sil.led (349 B)
                • configrawmit (34 B)
                • configcross (50 B)
                • cmudict.0.7a (3 MB)
                • mix12.hed (52 B)
                • mix2.hed (50 B)
                • cmudict.ext (6 kB)
                • mktri.led (63 B)
                • mix14.hed (52 B)
                • mix4.hed (50 B)
                • csdict.ext (0 B)
                • mix1.hed (26 B)
                • mix16.hed (52 B)
                • mix6.hed (51 B)
                • configwav (36 B)
                • config (265 B)
                • cmudict.0.6d (3 MB)
              • nohup_train_voip_en.sh (42 B)
              • env_voip_en.sh (1 kB)
              • train_voip_cs.sh (5 kB)
              • train_voip_en.sh (5 kB)
              • model_voip_en
                • export_models
                  • config (265 B)
                  • macros (597 B)
                  • fulllist (798 kB)
                  • julius_tiedlist (4 MB)
                  • hmmdefs (33 MB)
                  • monophones1 (144 B)
                  • julius_hmmdefs (9 MB)
                  • tiedlist (1 MB)
                  • trees (87 kB)
                • temp
                  • config
                    • monophones1 (144 B)
                    • monophones0 (140 B)
                    • macros (36 B)
                    • tree_ques.hed (9 kB)
                    • proto (1 kB)
                  • log
                    • train
                      • test
                      • bin
                        • CreateMLF.py (2 kB)
                        • OutputEvery.pl (489 B)
                        • tune.sh (886 B)
                        • FindReplace.pl (391 B)
                        • eval_test.sh (1 kB)
                        • train_tied.sh (1 kB)
                        • prep_tied.sh (1 kB)
                        • AddSp.pl (1000 B)
                        • ProcessNums.pl (1 kB)
                        • train_mono.sh (1 kB)
                        • FixCMUDict.pl (2 kB)
                        • MakeClonedMono.pl (634 B)
                        • make_mlf_train.sh (1 kB)
                        • prep_cs_dict.sh (337 B)
                        • DuplicateLine.pl (448 B)
                        • prep_param_train.sh (1 kB)
                        • build_lm_cs.sh (5 kB)
                        • DuplicateSilence.pl (840 B)
                        • flat_start.sh (2 kB)
                        • CreateWordList.py (1 kB)
                        • prep_tri.sh (1 kB)
                        • CreateFullList.pl (1 kB)
                        • PhoneticTranscriptionCS.pl (7 kB)
                        • RemovePrunedFiles.pl (1 kB)
                        • train_iter.sh (2 kB)
                        • PruneWithIndex.pl (2 kB)
                        • align_mlf.sh (1 kB)
                        • MakeClusteredTri.pl (1 kB)
                        • make_mlf_test.sh (1010 B)
                        • LatticeToDot.pl (1 kB)
                        • train_mixup.sh (3 kB)
                        • eval_test_no_lat.sh (1 kB)
                        • CreateHMMDefs.pl (752 B)
                        • WordsToDictionary.pl (1 kB)
                        • eval_test_hd_no_lat.sh (1 kB)
                        • realign.sh (2 kB)
                        • build_lm_en.sh (5 kB)
                        • train_tri.sh (436 B)
                        • prep_cmu_dict.sh (487 B)
                        • ConvertToMono.pl (849 B)
                        • MergeDict.pl (1 kB)
                        • export_models.sh (1 kB)
                        • StripStress.pl (479 B)
                        • prep_param_test.sh (1 kB)
                        • WordListFromARPALM.py (446 B)