
Speech Commands Dataset Enhanced for Direction-of-Arrival Estimation

Please use the following text to cite this item:
Beneš, David, 2023, Speech Commands Dataset Enhanced for Direction-of-Arrival Estimation, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-5140.
Date issued
2023-04-25
Language(s)
Description
This dataset can serve as a training and evaluation corpus for keyword detection with speaker direction estimation (keyword direction of arrival, KWDOA). It was created by processing the existing Speech Commands dataset [1] with the PyroomAcoustics library so that the resulting speech recordings simulate a circular microphone array of 4 microphones with a distance of 57 mm between adjacent microphones. This simulated array design was chosen to match an existing physical microphone array from the Seeeduino series.
[1] Warden, Pete. "Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition." arXiv, 2018, arxiv.org/abs/1804.03209
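The array geometry described above can be illustrated with a short sketch. It assumes that the 57 mm figure is the straight-line (chord) distance between neighbouring microphones and that the microphones are evenly spaced on the circle; the function name, the centre position, and the starting angle are hypothetical choices for illustration and are not taken from the dataset's README.

```python
import math


def circular_array_positions(n_mics=4, adjacent_spacing_m=0.057, center=(0.0, 0.0)):
    """Return (radius, positions) for mics evenly spaced on a circle.

    Assumption: adjacent_spacing_m is the chord length between neighbours,
    so radius = spacing / (2 * sin(pi / n_mics)).
    """
    radius = adjacent_spacing_m / (2.0 * math.sin(math.pi / n_mics))
    positions = []
    for k in range(n_mics):
        angle = 2.0 * math.pi * k / n_mics  # first mic placed at angle 0 (arbitrary)
        x = center[0] + radius * math.cos(angle)
        y = center[1] + radius * math.sin(angle)
        positions.append((x, y))
    return radius, positions


if __name__ == "__main__":
    radius, mics = circular_array_positions()
    print(f"radius: {radius * 1000:.1f} mm")
    for i, (x, y) in enumerate(mics):
        print(f"mic {i}: x={x * 1000:.1f} mm, y={y * 1000:.1f} mm")
```

Under these assumptions the four microphones sit on a circle of roughly 40 mm radius. Coordinates of this kind could then be passed to a room simulator as the microphone layout; the actual simulation settings used for the dataset are not stated here.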
This item is Publicly Available
and licensed under:
Files in this item
Name                 Size      Format                    Description       MD5
train.7z             52.44 GB  application/octet-stream  train             4537ebca6dd2917e76ac6a8f9e36205d
validate.7z          6.28 GB   application/octet-stream  validate          986e3819a577913cb6ede381cae9c576
background_noise.7z  37.88 GB  application/octet-stream  background noise  dcf3ff7d27a1dda0f261b2134799ad94
test.7z              6.82 GB   application/octet-stream  test              05dd84d07107761c35e006a1a9f51ee9
README.pdf           571 KB    application/pdf           README            0bc6603449b8e20b070026ced35cb824
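Because the archives are large, it may be worth checking each download against the MD5 values listed for the files in this item. The following is a minimal sketch using only the Python standard library; the function names and the assumption that the files sit in the current directory under their listed names are illustrative, not part of the repository.

```python
import hashlib
import os

# MD5 checksums copied from the file listing of this item.
EXPECTED_MD5 = {
    "train.7z": "4537ebca6dd2917e76ac6a8f9e36205d",
    "validate.7z": "986e3819a577913cb6ede381cae9c576",
    "background_noise.7z": "dcf3ff7d27a1dda0f261b2134799ad94",
    "test.7z": "05dd84d07107761c35e006a1a9f51ee9",
    "README.pdf": "0bc6603449b8e20b070026ced35cb824",
}


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that multi-gigabyte archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the listed value for its name."""
    name = os.path.basename(path)
    if name not in expected:
        raise KeyError(f"no listed checksum for {name}")
    return md5_of_file(path) == expected[name]


if __name__ == "__main__":
    # Assumption: downloaded files are in the current working directory.
    for name in EXPECTED_MD5:
        if os.path.exists(name):
            status = "OK" if verify_download(name) else "MISMATCH"
            print(f"{name}: {status}")
        else:
            print(f"{name}: not found")
```

A mismatch usually indicates an incomplete or corrupted download, in which case the file should be fetched again before extraction.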