PCVC Speech Dataset
The PCVC Speech Dataset is a Modern Persian speech corpus for speech recognition. It contains recordings of Modern Persian consonant-vowel combinations spoken by different speakers. Each sound sample contains exactly one consonant and one vowel, so the data is effectively labeled at the phoneme level. The dataset covers 23 Persian consonants and 6 vowels, and the samples span every possible combination of a consonant and a vowel (138 samples per speaker). All speech samples have a sample rate of 48,000 Hz, meaning there are 48,000 audio samples in every second. Each speaker's recording totals 276 seconds (138 two-second samples). In each 2 s sample, on average 0.5 s is speech and the rest is silence; the first 0.25 s and the last 0.25 s of each sample are always silence.[1] Within each 2 s sample the consonant is pronounced first, followed by the vowel. All sound samples are denoised with the "Adaptive noise reduction" algorithm.[2] Compared to the Farsdat speech dataset[3] and the Persian Speech Corpus[4], it is easier to use because it is distributed as .mat data files.[5] It is also already separated at the phoneme level and denoised.
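The per-speaker figures quoted above follow from simple arithmetic. The short Python sketch below reproduces them; the constant names are illustrative only and are not part of the dataset.

```python
# Figures taken from the dataset description; names are illustrative.
N_CONSONANTS = 23
N_VOWELS = 6
CLIP_SECONDS = 2               # nominal length of one sound sample
SAMPLE_RATE_HZ = 48000         # audio samples per second
GUARD_SILENCE_SECONDS = 0.25   # silence at the start and at the end
AVG_SPEECH_SECONDS = 0.5       # average speech per sound sample

# One sound sample per consonant-vowel combination.
clips_per_speaker = N_CONSONANTS * N_VOWELS

# Total recording time per speaker, in seconds.
seconds_per_speaker = clips_per_speaker * CLIP_SECONDS

# Number of audio samples covered by each 0.25 s stretch of silence.
guard_silence_samples = int(GUARD_SILENCE_SECONDS * SAMPLE_RATE_HZ)

# Average share of each sound sample that is speech.
avg_speech_fraction = AVG_SPEECH_SECONDS / CLIP_SECONDS

print(clips_per_speaker, seconds_per_speaker,
      guard_silence_samples, avg_speech_fraction)
```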
Contents
The corpus is downloadable from its GitHub web page, and contains the following:
- .mat data files of sound samples, each stored as a 23 × 6 × 30000 matrix, where 23 is the number of consonants, 6 is the number of vowels, and 30000 is the length of each 2 s sound sample.
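As a rough illustration of working with this layout, the sketch below loads one matrix in Python and checks its shape. It assumes SciPy is installed and that the files are in a MAT-file version that `scipy.io.loadmat` can read. The `path` and `variable` arguments are placeholders, since the actual file and variable names are not given here, and the order of consonants and vowels along each axis is likewise not specified, so indices are treated as plain positions.

```python
from typing import Tuple

N_CONSONANTS = 23      # first axis, per the description above
N_VOWELS = 6           # second axis
SAMPLE_LENGTH = 30000  # third axis: values per sound sample

EXPECTED_SHAPE = (N_CONSONANTS, N_VOWELS, SAMPLE_LENGTH)


def check_shape(shape: Tuple[int, ...]) -> None:
    """Raise ValueError if an array does not have the documented layout."""
    if tuple(shape) != EXPECTED_SHAPE:
        raise ValueError(f"expected {EXPECTED_SHAPE}, got {tuple(shape)}")


def flat_index(consonant: int, vowel: int) -> int:
    """Map a (consonant, vowel) position to 0..137, consonant-major order."""
    if not 0 <= consonant < N_CONSONANTS:
        raise IndexError("consonant position out of range")
    if not 0 <= vowel < N_VOWELS:
        raise IndexError("vowel position out of range")
    return consonant * N_VOWELS + vowel


def load_speaker(path: str, variable: str):
    """Load one matrix from a .mat file (path and variable are placeholders)."""
    # Imported lazily so the helpers above work without SciPy.
    # loadmat does not read v7.3 (HDF5-based) MAT-files.
    from scipy.io import loadmat

    data = loadmat(path)[variable]
    check_shape(data.shape)
    return data
```

With a loaded matrix `data`, the waveform for the consonant at position `c` and the vowel at position `v` would be `data[c, v, :]`.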
References
- ^ Saber Malekzadeh, Mohammad Hossein Gholizadeh, Seyed Naser Razavi. "Full Persian Vowel recognition with MFCC and ANN on PCVC speech dataset" (PDF). 5th International conference of electrical engineering, computer science and information technology, Iran, Tehran, 2018.
- ^ "PCVC GitHub page".
- ^ Bijankhan, M., Sheikhzadegan, J., Roohani, M. R., Samareh, Y., Lucas, C., & Tebyani, M. (1994). FARSDAT-The Speech Database of Farsi Spoken Language. The Proceedings of the Australian Conference on Speech Science and Technology (Vol. 2, pp. 826–831).
- ^ Halabi, Nawar (2016). Modern Standard Persian Phonetics for Speech Synthesis. University of Southampton, School of Electronics and Computer Science.
- ^ "Access and change variables directly in MAT-files, without loading into memory".