dc.contributor.author | Pražák, Aleš |
dc.contributor.author | Šmídl, Luboš |
dc.date.accessioned | 2012-03-28T14:45:25Z |
dc.date.available | 2012-03-28T14:45:25Z |
dc.date.issued | 2012-03-28 |
dc.identifier | ZCU_CZ_Parliament |
dc.identifier.uri | http://hdl.handle.net/11858/00-097C-0000-0005-CF9C-4 |
dc.description | The corpus consists of recordings from the Chamber of Deputies of the Parliament of the Czech Republic. It currently contains 88 hours of speech data, which corresponds to roughly 0.5 million tokens. The annotation process is semi-automatic: speech recognition is run on the data with high accuracy (over 90%), and the resulting automatic transcripts are then aligned with the speech. The annotator’s task is to check the transcripts, correct errors, add proper punctuation, and label speech sections with information about the speaker. The resulting corpus is therefore suitable both for acoustic model training for ASR and for training speaker identification and/or verification systems. The archive contains 18 sound files (WAV PCM, 16-bit, 44.1 kHz, mono) and the corresponding transcriptions in the XML-based standard Transcriber format (http://trans.sourceforge.net). The airing date of each recording is encoded in the filename in the form SOUND_YYMMDD_*. Note that the recordings are usually aired in the early morning of the day following the actual Parliament session. If a recording is too long to fit in the broadcasting schedule, it is divided into several parts that are aired on consecutive days. |
dc.language.iso | ces |
dc.publisher | University of West Bohemia, Department of Cybernetics |
dc.rights | Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0) |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/ |
dc.subject | speech corpus |
dc.subject | acoustic model |
dc.subject | speaker identification |
dc.subject | speaker verification |
dc.title | Czech Parliament Meetings |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | audio |
dc.rights.label | PUB |
has.files | yes |
branding | LINDAT / CLARIAH-CZ |
contact.person | Pavel Ircing <ircing@kky.zcu.cz>, University of West Bohemia, Department of Cybernetics |
files.size | 28212817896 |
files.count | 37 |
featuredService.kontext | search|http://lindat.mff.cuni.cz/services/kontext/first_form?corpname=czechparl_2012_03_28_cs_w |
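The filename convention given in the description (SOUND_YYMMDD_*) can be decoded mechanically. The following is a minimal sketch, not part of the corpus tooling: the function names are hypothetical, the two-digit year is assumed to fall in the 2000s, and the "session was the day before airing" rule is only the usual case stated in the description, so it will be off for recordings split across consecutive days.

```python
import re
from datetime import date, timedelta

# Assumed pattern: SOUND_YYMMDD_<suffix>. The meaning of the suffix is not
# documented in this record, so it is ignored here.
_NAME = re.compile(r"^SOUND_(\d{2})(\d{2})(\d{2})_")


def airing_date(filename: str) -> date:
    """Return the airing date encoded in a SOUND_YYMMDD_* filename."""
    match = _NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename!r}")
    yy, mm, dd = (int(group) for group in match.groups())
    # Assumption: two-digit years refer to the 2000s.
    return date(2000 + yy, mm, dd)


def likely_session_date(filename: str) -> date:
    """Heuristic only: the day before airing.

    Multi-part recordings aired on consecutive days break this rule.
    """
    return airing_date(filename) - timedelta(days=1)


if __name__ == "__main__":
    name = "SOUND_110212_003546.wav"
    print(airing_date(name), likely_session_date(name))
```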
Files in this item
This item is Publicly Available and licensed under: Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0)
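The description states that the transcriptions are in the Transcriber XML format and carry speaker labels. A minimal sketch of reading speaker turns from such a file with the Python standard library follows. It assumes the usual Transcriber layout (`Speaker` elements with `id` and `name`, `Turn` elements with `speaker`, `startTime`, and `endTime`, and text interleaved with `Sync` markers); the function name `read_turns` and the inline sample are hypothetical, and the actual files in this corpus have not been inspected here.

```python
import xml.etree.ElementTree as ET


def read_turns(root):
    """Return (start, end, speaker_names, text) tuples for every Turn."""
    # Map speaker ids to display names (assumed attributes: id, name).
    speakers = {s.get("id"): s.get("name") for s in root.iter("Speaker")}
    turns = []
    for turn in root.iter("Turn"):
        # Transcribed text is interleaved with empty marker elements such as
        # <Sync/>; itertext() collects all of it, then whitespace is collapsed.
        text = " ".join("".join(turn.itertext()).split())
        # A Turn may list several space-separated speaker ids, or none.
        ids = (turn.get("speaker") or "").split()
        names = [speakers.get(i, i) for i in ids]
        turns.append(
            (float(turn.get("startTime")), float(turn.get("endTime")), names, text)
        )
    return turns


# Hypothetical miniature document in the assumed layout, for illustration only.
SAMPLE = """<Trans>
<Speakers><Speaker id="spk1" name="Speaker A"/></Speakers>
<Episode><Section type="report" startTime="0" endTime="4.2">
<Turn speaker="spk1" startTime="0" endTime="4.2">
<Sync time="0"/>
first segment
<Sync time="2.1"/>
second segment
</Turn>
</Section></Episode>
</Trans>"""

if __name__ == "__main__":
    for start, end, names, text in read_turns(ET.fromstring(SAMPLE)):
        print(start, end, names, text)
```

For a downloaded transcription, `ET.parse(path).getroot()` can be passed to `read_turns` in place of the parsed sample, provided the file follows the layout assumed above.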
Name | Size | Format | MD5 | Description |
SOUND_110212_003546.trs | 857.13 KB | XML | 923015c9d2c88ac48a4698fdfc13cdc9 | |
SOUND_110212_003546.wav | 1.43 GB | WAV audio | 506c415fcb7b5b696e3cb2a77379f213 | |
SOUND_110325_010705.trs | 524.9 KB | XML | 840ccaa2f1d6bce167288970053f0b8e | |
SOUND_110325_010705.wav | 1.28 GB | WAV audio | 8311713e9695cda77f0d42c942d699df | |
SOUND_110427_005836.trs | 553.24 KB | XML | 218f4b6cb50e943e57b5da5b4c21e5b6 | |
SOUND_110427_005836.wav | 1.47 GB | WAV audio | 154a0007631e23cc797089d70e77832d | |
SOUND_110428_005954.trs | 659.81 KB | XML | bd26673fb984d0efb303a9cf835a3f8f | |
SOUND_110428_005954.wav | 1.45 GB | WAV audio | 63204a582abd41f1d93adcf7d469e5c8 | |
SOUND_110429_005820.trs | 932.46 KB | XML | f0010f19c3e84aeecfc429e260edc334 | |
SOUND_110429_005820.wav | 1.47 GB | WAV audio | e3c32c7c1ac476efedbf1e052fbfe98d | |
SOUND_110504_005351.trs | 612.66 KB | XML | 150103c0a7416dc7108c58b73977071b | |
SOUND_110504_005351.wav | 1.49 GB | WAV audio | b3713e351b4024956656156512155cde | |
SOUND_110505_005813.trs | 786.26 KB | XML | 44a583375e52be7cc06a7cbb6d81e6ab | |
SOUND_110505_005813.wav | 1.47 GB | WAV audio | 2e96fc330ea24f897f330ea265080989 | |
SOUND_110506_005821.trs | 596.67 KB | XML | a0e29f40fe8b0031f1d4ee1c69c079a1 | |
SOUND_110506_005821.wav | 1.47 GB | WAV audio | ad9fdc451d1d2fc4edbbcacfa80bc1ad | |
SOUND_110507_005803.trs | 662.77 KB | XML | d3dbe84ef877d2c61c348ef2cf495e29 | |
SOUND_110507_005803.wav | 1.47 GB | WAV audio | 6c7811256e31f2588ae8c69edcc4d018 | |
SOUND_110511_005438.trs | 801.1 KB | XML | def233edd2d38d9b2523c28d51cb1aa1 | |
SOUND_110511_005438.wav | 1.3 GB | WAV audio | c5d22f3ef79868577b678b12b7d865e5 | |
SOUND_110609_005958.trs | 537.05 KB | XML | 6f84b1cab6cb4d1ed4a8cf662d2bd120 | |
SOUND_110609_005958.wav | 1.47 GB | WAV audio | 108b68f57cf7c66d663249939ae0bfa6 | |
SOUND_110611_010015.trs | 646.92 KB | XML | e1f7374483875946cfe71cd7e0735f6c | |
SOUND_110611_010015.wav | 1.48 GB | WAV audio | f45c1c477e11bd418da137e514a72385 | |
SOUND_110617_010946.trs | 552.6 KB | XML | d9a1b7b1e9874d31d6032495b312bfee | |
SOUND_110617_010946.wav | 1.42 GB | WAV audio | 882c99431e77432219bfa0a78e70f00d | |
SOUND_110618_010259.trs | 621.36 KB | XML | a79979782169aedede65c4e93115504a | |
SOUND_110618_010259.wav | 1.46 GB | WAV audio | 75167059ab0c91442008d08d5a7fa3c7 | |
SOUND_110619_001010.trs | 720.47 KB | XML | 93ffd60d719a907f67cefe6c6fdd383d | |
SOUND_110619_001010.wav | 1.72 GB | WAV audio | 96d5ca5d56134b386bcebe82e3bdb284 | |
SOUND_110714_010025.trs | 731.2 KB | XML | 1cb47c4b229a62eb363fdeceb8ab9168 | |
SOUND_110714_010025.wav | 1.47 GB | WAV audio | f69d271e91f83d304a9ab24604921379 | |
SOUND_110715_005912.trs | 904.26 KB | XML | 50ecff541ab61b03a78d40994d183c6c | |
SOUND_110715_005912.wav | 1.47 GB | WAV audio | 9828719eaf3045561b954f8a97d04800 | |
SOUND_110831_005909.trs | 587.89 KB | XML | 35452aa527d71fe503eee99ab0029451 | |
SOUND_110831_005909.wav | 1.47 GB | WAV audio | 18cec3a52c0de6dad2b6706a0d3ebf02 | |
README.pdf | 209.35 KB | | ce1f7e9c2a178db492b2887b7e860a69 | Documentation |
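Since every file in the listing comes with an MD5 checksum, a download can be checked against it. The sketch below is one way to do this with Python's `hashlib`; the helper names `md5_of` and `verify` are hypothetical, the download directory is an assumption, and only two of the listed checksums are reproduced for brevity. Files are read in chunks because the WAV files are over 1 GB each.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, expected):
    """Return the names of files that are missing or whose MD5 differs."""
    failed = []
    for name, checksum in expected.items():
        path = Path(directory) / name
        if not path.is_file() or md5_of(path) != checksum.lower():
            failed.append(name)
    return failed


# Two entries copied from the file listing; extend with the remaining files.
EXPECTED = {
    "README.pdf": "ce1f7e9c2a178db492b2887b7e860a69",
    "SOUND_110212_003546.trs": "923015c9d2c88ac48a4698fdfc13cdc9",
}

if __name__ == "__main__":
    # Assumption: the files were downloaded into the current directory.
    print(verify(".", EXPECTED))
```

An empty list from `verify` means every listed file is present and matches its checksum. MD5 is adequate here for detecting transfer corruption, which is what the repository publishes it for; it is not a safeguard against deliberate tampering.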