Speaker Variations and Vocal Disguise

Abhishek B.P.; Diya E.S. Dinesh; Chandana S.

doi:10.37591/rrjohp.v13i2.3259

Authors

Abhishek B.P.
Diya E.S. Dinesh
Chandana S.

DOI:

https://doi.org/10.37591/rrjohp.v13i2.3259

Keywords:

Variability, speaker recognition, formants, range, acoustic parameters

Abstract

The process of recognizing a speaker based on parameters such as pitch, loudness, and other acoustic attributes is called speaker recognition. Speaker recognition is considered challenging, and voice-changing apps that facilitate vocal disguise have made it even more difficult. These apps can induce variations, some of which differ markedly from the original voice; yet even when an app disguises a person’s voice, certain parameters may remain close to those of the original (habitual) sample. The current study aimed to determine intra- and inter-speaker differences in vocal disguise in six adult speakers. The habitual voice of each speaker was recorded, and three variations were induced using a voice-changing app. As the sample size was limited, descriptive analysis was carried out for all six participants. The first three formant frequencies were measured, and the intra- and inter-speaker differences were determined. The intra-speaker differences were greater than the inter-speaker differences.
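The abstract does not state how the intra- and inter-speaker differences were computed. The sketch below shows one plausible descriptive approach, assuming each difference is a mean absolute difference in Hz across the first three formants (F1 to F3). The formant values, speaker labels, condition names, and function names are all hypothetical and are not taken from the study.

```python
from itertools import combinations
from statistics import mean

# Hypothetical formant values in Hz as (F1, F2, F3); NOT data from the study.
FORMANTS = {
    "S1": {
        "habitual": (700.0, 1200.0, 2500.0),
        "variation_1": (800.0, 1400.0, 2800.0),
        "variation_2": (600.0, 1000.0, 2300.0),
    },
    "S2": {
        "habitual": (720.0, 1250.0, 2550.0),
        "variation_1": (820.0, 1350.0, 2750.0),
        "variation_2": (650.0, 1100.0, 2400.0),
    },
}


def mean_abs_diff(a, b):
    """Mean absolute difference between two formant tuples (assumed metric)."""
    return mean(abs(x - y) for x, y in zip(a, b))


def intra_speaker_difference(conditions, baseline="habitual"):
    """Average distance between one speaker's habitual sample and each disguised sample."""
    base = conditions[baseline]
    return mean(
        mean_abs_diff(base, values)
        for name, values in conditions.items()
        if name != baseline
    )


def inter_speaker_difference(formants, condition="habitual"):
    """Average distance between every pair of speakers in a single condition."""
    return mean(
        mean_abs_diff(formants[a][condition], formants[b][condition])
        for a, b in combinations(sorted(formants), 2)
    )


if __name__ == "__main__":
    for speaker, conditions in FORMANTS.items():
        print(speaker, "intra-speaker:", round(intra_speaker_difference(conditions), 1))
    print("inter-speaker (habitual):", round(inter_speaker_difference(FORMANTS), 1))
```

With these made-up values, each speaker's habitual-to-disguised distance exceeds the distance between the two speakers' habitual voices. That is the same direction as the pattern the abstract reports, but it follows only from the illustrative numbers chosen here.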

