Announcements
2026 Openings:
We are looking for motivated students in speech processing, multimodal learning, and computer vision.
Please refer to this page for details on open positions and how to apply.
About MINT Lab
In daily life, humans naturally interact with the world by seeing and hearing. At MINT Lab, we aim to bridge the gap between humans and AI by developing technologies that enable natural communication. Our research focuses on speech processing and multimodal AI, working toward a future where AI can engage with human environments just as we do.
Research Highlights
Speech Synthesis
Kim et al., "CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation," IEEE Trans. Audio Speech Lang. Process., 2025
Voice Conversion
Kim et al., "AdaptVC: High Quality Voice Conversion with Adaptive Learning," Proc. ICASSP, 2025
Video-to-Speech
Kim et al., "From Faces to Voices: Learning Hierarchical Representations for High-Quality Video-to-Speech," Proc. CVPR (Highlight), 2025
Audio-Visual Speech Enhancement
Jung et al., "FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching," Proc. Interspeech, 2024
Talking Face Generation
Jang et al., "Faces that Speak: Jointly Synthesising Talking Face and Speech from Text," Proc. CVPR, 2024
Interactive Head Generation
Kim et al., "TAVID: Text-Driven Audio-Visual Interactive Dialogue Generation," arXiv preprint, 2025
News
[Mar. 2026] MINT Lab @ Chung-Ang University opens!
84 Heukseok-ro, Dongjak-gu, Seoul, 06974, Korea
© 2026 MINT Lab @ CAU. All rights reserved.