Dr. Dimitri Palaz
Researcher in Speech Technology and Deep Learning


Principal Research Scientist at Speech Graphics
News and announcements
GameFace Interview
November 2023
I was recently interviewed about my journey in speech technology, machine learning and the video game industry. Check it out here.
About me
I am a research scientist with 14 years of experience in speech technologies, deep neural networks and representation learning. My research is focused on detecting patterns in speech, including phoneme recognition, continuous speech recognition and voice activity detection. I've also worked on paralinguistic detection tasks, specifically speech emotion recognition and non-verbal vocalisation detection, focusing on the generalisation capability of such models. Lately I'm interested in self-supervision and acoustic-to-articulatory inversion.
I have been Principal Research Scientist and Tech Lead at Speech Graphics since 2017. In my research, I work with deep learning models to improve audio-driven facial animation. My role also involves strategic planning, project management, MLOps, product management, software development (C++) and infrastructure.
I obtained a Ph.D. from EPFL in Switzerland in 2016, where I worked on applying deep learning methods to speech recognition, focusing on using the raw speech signal as input. My doctoral studies were done at Idiap Research Institute under the supervision of Ronan Collobert, Mathew Magimai Doss and Hervé Bourlard.
Research interests
Deep learning and representation learning
Pattern recognition in speech
Generalisation capability and biases of deep learning models
Speech processing
Automatic Speech Recognition
Signal processing