
 
dc.contributor.author Šmídl, Luboš
dc.date.accessioned 2013-01-01T14:56:06Z
dc.date.available 2013-01-01T14:56:06Z
dc.date.issued 2013-01-01
dc.identifier ZCU_CZ_ ATCC-LM4ASR
dc.identifier.uri http://hdl.handle.net/11858/00-097C-0000-000D-EC92-F
dc.description The corpus contains a pronunciation lexicon and n-gram counts (unigrams, bigrams, and trigrams) that can be used to construct a language model for the air traffic control communication domain. It can be used together with the Air Traffic Control Communication corpus (http://hdl.handle.net/11858/00-097C-0000-0001-CCA1-0).
dc.description.sponsorship Technology Agency of the Czech Republic, project No. TA01030476
dc.language.iso eng
dc.publisher University of West Bohemia, Department of Cybernetics
dc.rights Attribution-NonCommercial 3.0 Unported (CC BY-NC 3.0)
dc.rights.uri http://creativecommons.org/licenses/by-nc/3.0/
dc.subject pronunciation lexicon
dc.subject n-gram counts
dc.subject language model
dc.title ATCC: Pronunciation lexicon and n-gram counts for ASR module
dc.type lexicalConceptualResource
metashare.ResourceInfo#ContactInfo#PersonInfo.surname Šmídl
metashare.ResourceInfo#ContactInfo#PersonInfo.givenName Luboš
metashare.ResourceInfo#ContactInfo#PersonInfo#OrganizationInfo.organizationName University of West Bohemia
metashare.ResourceInfo#DistributionInfo.availability restrictedUse
metashare.ResourceInfo#DistributionInfo#LicenseInfo.restrictionsOfUse academic-nonCommercialUse
metashare.ResourceInfo#DistributionInfo#LicenseInfo.restrictionsOfUse attribution
metashare.ResourceInfo#DistributionInfo#LicenseInfo.distributionAccessMedium downloadable
metashare.ResourceInfo#ValidationInfo.validated True
metashare.ResourceInfo#ResourceCreationInfo#FundingInfo#ProjectInfo.projectName Inteligentní technologie pro zvýšení bezpečnosti letového provozu
metashare.ResourceInfo#ResourceCreationInfo#FundingInfo#ProjectInfo.fundingType nationalFunds
metashare.ResourceInfo#ContentInfo.mediaType text
metashare.ResourceInfo#TextInfo#SizeInfo.size 236500
metashare.ResourceInfo#TextInfo#SizeInfo.sizeUnit other
metashare.ResourceInfo#ContactInfo#PersonInfo#OrganizationInfo#CommunicationInfo.email ircing@kky.zcu.cz
metashare.ResourceInfo#ContentInfo.detailedType other
dc.rights.label PUB
has.files yes
branding LINDAT / CLARIAH-CZ
sponsor Technologická agentura České republiky TA01030476 Inteligentní technologie pro zvýšení bezpečnosti letového provozu nationalFunds
size.info 236500 other
files.size 7896750
files.count 7
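
The description above says the n-gram counts can be used to construct a language model. As a minimal sketch of that idea, the following Python estimates unsmoothed maximum-likelihood bigram probabilities from counts that are already loaded into memory. The function name `bigram_mle` is hypothetical, the on-disk layout of the `.counts` files is not assumed here, and the toy counts are invented for illustration (only the tokens `&0`, `&1`, `&2` appear in the lexicon preview). A practical ASR language model would normally add smoothing and back-off, which this sketch omits.

```python
from collections import defaultdict

def bigram_mle(bigram_counts):
    """Return P(w2 | w1) = c(w1, w2) / sum over w of c(w1, w), without smoothing."""
    # Total count of all bigrams that share the same history word w1.
    history_totals = defaultdict(int)
    for (w1, _w2), count in bigram_counts.items():
        history_totals[w1] += count

    probs = {}
    for (w1, w2), count in bigram_counts.items():
        total = history_totals[w1]
        if total > 0:  # guard against histories whose counts are all zero
            probs[(w1, w2)] = count / total
    return probs

# Toy counts, invented for illustration only (not taken from the corpus files).
toy_bigrams = {("&1", "&0"): 3, ("&1", "&2"): 1, ("&2", "&0"): 2}
probs = bigram_mle(toy_bigrams)
print(probs[("&1", "&0")])
```

Normalizing by the summed bigram counts for each history keeps the probabilities for a given history summing to one, at the cost of ignoring the separate unigram counts.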


 Files in this item

 Download all files in this item (7.53 MB)
License category:
Publicly Available

License: Attribution-NonCommercial 3.0 Unported (CC BY-NC 3.0)
Distributed under Creative Commons Attribution Required Noncommercial
Name
v01_dict_mix.txt
Size
169.64 KB
Format
Text file
Description
Pronunciation lexicon
MD5
8de3a9ec38eda5c1ad1624c9e61de967
File preview
&.	d eh s ih m l
&0	z ia r ow
&0	z ih r ow
&0	z iy r ow
&1	w ah n
&1+	w ah n
&10	t eh n
&100	hh ah n d r ah d
&100	hh ah n d r ih d
&1000	th aw z n d
&11	ih l eh v n
&12	t w eh l v
&13	th er t iy n
&15	f ih f t iy n
&1500	f ih f t iy n hh ah n d r ah d
&1500	w ah n th aw z n d f ay v hh ah n d r ah d
&1500	w ah n th aw z n d f ay v hh ah n d r ih d
&18	ey t iy n
&19	n ay n t iy n
&2	t uw
&200	t uw hh ah n d r ah d
&200	t uw hh ah n d r ih d
&2000	t uw th aw z n d
&21	t w eh n t iy w ah n
&24	t w eh n t iy f ao r
&27	t w eh n t iy s eh v n
&28	t w eh n t iy ey t
&3	th r iy
&30	th er t iy
&300	th r iy hh ah n d r ah d
&300	th r iy hh ah n d r ih d
&31	th er t iy w ah n
&4	f ao
&4	f ao r
&4	f ow r
&5	f ay v
&50	f ih f t iy
&500	f ay v hh ah n d r ah d
&500	f ay v hh ah n d r ih d
&6	s ih k s
&60	s ih k s t iy
&7	s eh v n
&70	s eh v n t iy
&700	s eh v n hh ah n d r ah d
&700	s eh v n hh ah n d r ih d
&8	ey t
&80	ey t iy
&9	n ay n ah r
&9+	n a . . .
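
The preview above suggests that each lexicon line holds a word token, whitespace, and a space-separated phoneme sequence, and that a word may appear on several lines with different pronunciations. Assuming that layout holds for the whole file, a minimal Python sketch for loading it could look as follows. The function name `parse_lexicon` is hypothetical, and the sample lines are copied from the preview.

```python
from collections import defaultdict

def parse_lexicon(lines):
    """Map each word to a list of pronunciations, each a list of phonemes."""
    lexicon = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # First whitespace-delimited token is the word; the rest is the phoneme string.
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip entries that carry no phonemes
        word, phones = parts
        lexicon[word].append(phones.split())
    return dict(lexicon)

# Sample lines taken from the file preview.
sample = [
    "&0\tz ia r ow",
    "&0\tz ih r ow",
    "&0\tz iy r ow",
    "&1\tw ah n",
    "&4\tf ao",
    "&4\tf ao r",
    "&4\tf ow r",
]
lexicon = parse_lexicon(sample)
print(len(lexicon["&0"]), "pronunciation variants for &0")
```

For the full file, the same function can be fed an open file handle, for example `parse_lexicon(open("v01_dict_mix.txt", encoding="utf-8"))`; the UTF-8 encoding is an assumption.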
                                            
Name
v01_words_mix.1gram.counts
Size
52.17 KB
Format
Unknown
Description
Unigram counts (non-speech events included)
MD5
fe04b2e566718c79733c924146089b1d
Name
v01_words_mix.1gram.no-nse.counts
Size
52.13 KB
Format
Unknown
Description
Unigram counts (non-speech events removed)
MD5
4e7cf3fd766462eae4fc892a63f161ef
Name
v01_words_mix.2gram.counts
Size
718.53 KB
Format
Unknown
Description
Bigram counts (non-speech events included)
MD5
32c5c56c74fe3b72d263c72f1e1dd035
Name
v01_words_mix.2gram.no-nse.counts
Size
683.03 KB
Format
Unknown
Description
Bigram counts (non-speech events removed)
MD5
587c12207b27f1d3a844c5d874d30990
Name
v01_words_mix.3gram.counts
Size
3.08 MB
Format
Unknown
Description
Trigram counts (non-speech events included)
MD5
674daa0835b58c7d885045d763331658
Name
v01_words_mix.3gram.no-nse.counts
Size
2.81 MB
Format
Unknown
Description
Trigram counts (non-speech events removed)
MD5
6ccbdc77f93b9925b1716d7e22402736
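
Each file listed above comes with an MD5 checksum, which can be used to confirm that a download is intact. A minimal Python sketch using the standard `hashlib` module follows; the helper names `md5_of_file` and `verify` are hypothetical, and the local file path is whatever location the file was saved to.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, expected_md5):
    """Return True when the file's MD5 digest matches the published value."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `verify("v01_dict_mix.txt", "8de3a9ec38eda5c1ad1624c9e61de967")` should return `True` for an uncorrupted copy of the pronunciation lexicon, assuming the file was saved under that name in the current directory.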
