Research
I am a research scientist and engineer with 15 years of experience in speech technologies, deep neural networks and representation learning. My research focuses on detecting patterns in speech, including phoneme recognition, continuous speech recognition and voice activity detection. I’ve also worked on paralinguistic detection tasks, specifically speech emotion recognition and non-verbal vocalisation detection, focusing on the generalisation capability of such models. Lately, I have been interested in self-supervision and acoustic-to-articulatory inversion.
I have been a Principal Research Scientist and Tech Lead at Speech Graphics since 2017. In my work, I focus on developing and improving the audio-driven facial animation technology, which is a standard in the game industry. I have experience in leading long-term research directions and technical solutions, software development (C++, Python), project and product management, operations and infrastructure.
I obtained a Ph.D. from EPFL in Switzerland in 2016, where I worked on applying deep learning methods to speech recognition. My doctoral studies were done at Idiap Research Institute under the supervision of Ronan Collobert, Mathew Magimai Doss and Hervé Bourlard.
Awards
2023 EURASIP Best Paper Award for the Speech Communication journal
“End-to-end acoustic modeling using convolutional neural networks for HMM-based automatic speech recognition”, Palaz D., Magimai-Doss M., and Collobert R., Volume 108, April, 2019
ISCA Award for Best Paper Published in Speech Communication (2018-2022)
“End-to-end acoustic modeling using convolutional neural networks for HMM-based automatic speech recognition”, Palaz D., Magimai-Doss M., and Collobert R., Volume 108, April, 2019