I am an AI Research Engineer at Eyes, JAPAN Co., Ltd. and hold a Ph.D. in Computer Science and Engineering from the University of Aizu, Japan (2024).
My research focuses on the intersection of speech technology, machine learning, and language education. My doctoral work centered on computational approaches to speech prosody analysis and computer-assisted pronunciation training (CAPT) with personalized assessment for learners from diverse first-language backgrounds. Currently, I work on voice conversion for pronunciation training and domain-specific speech recognition for aviation communication. I am always open to conversations and collaborations that cross disciplinary boundaries.
Previously, I earned a BSc in Air Traffic Control and Flight Support from the Belarusian State Academy of Aviation (2017).
Note: name changed in March 2023 from Veranika Mikhailava to Veronica Khaustova.
Blake, J., Bogach, N., Kusakari, A., Lezhenin, I., Khaustova, V., Xuan, S. L., & Pyshkin, E. (2024). An Open CAPT System for Prosody Practice: Practical Steps towards Multilingual Setup. Languages, 9(1), 27. DOI
Mikhailava, V., Lesnichaia, M., Bogach, N., Lezhenin, I., Blake, J., & Pyshkin, E. (2022). Language Accent Detection with CNN Using Sparse Data from a Crowd-Sourced Speech Archive. Mathematics, 10(16), 2913. DOI
Tang, L., Khaustova, V., & Villegas, J. (2025, October). Construction of a Japanese air traffic control communication corpus assisted with automatic speech recognition. In Proc. 159th Audio Eng. Soc. Conv. [Link](https://aesshow2025lb.sched.com/event/294Pt/construction-of-a-japanese-air-traffic-control-communication-corpus-assisted-with-automatic-speech-recognition)
Pyshkin, E., Blake, J., Khaustova, V., Lezhenin, I., Svechnikov, R., Efimov, D., Bogach, N. (2024, March). Multimodal Contextualizing and Targeting Exercises in iCAPT Systems. INTED2024. DOI
Khaustova, V., Pyshkin, E., Khaustov, V., Blake, J., & Bogach, N. (2023, November). CAPTuring Accents: An Approach to Personalize Pronunciation Training for Learners with Different L1 Backgrounds. In International Conference on Speech and Computer (pp. 59–70). Springer Nature Switzerland. DOI
Lesnichaia, M., Mikhailava, V., Bogach, N., Lezhenin, I., Blake, J., & Pyshkin, E. (2022, September). Classification of Accented English Using CNN Model Trained on Amplitude Mel-Spectrograms. Proc. Interspeech 2022, 3669–3673. DOI
Mikhailava, V., Blake, J., Pyshkin, E., Bogach, N., Chernonog, S., Zhuikov, A., Lesnichaya, M., Lezhenin, I., & Svechnikov, R. (2022, May). Dynamic Assessment during Suprasegmental Training with Mobile CAPT. In Proc. Speech Prosody (Vol. 2022, pp. 430–434). DOI
Mikhailava, V., Pyshkin, E., Blake, J., Chernonog, S., Lezhenin, I., Svechnikov, R., & Bogach, N. (2022, March). Tailoring computer-assisted pronunciation teaching: Mixing and matching the mode and manner of feedback to learners. In INTED2022 Proceedings (pp. 767–773). IATED. DOI
Mikhailava, V., Pyshkin, E., & Klyuev, V. (2020, February). Aesthetic evaluation of food plate images using deep learning. In 2020 22nd International Conference on Advanced Communication Technology (ICACT) (pp. 285–289). IEEE. DOI
Mikhailava, V., Khaustov, V., & Klyuev, V. (2018, November). Overview and Categorization of Recent Approaches to Microblog Classification. In Proceedings of the 3rd International Conference on Applications in Information Technology (pp. 127–130). DOI