Research

Research Overview

My research focuses on building intelligent, human‑centered systems at the intersection of computer vision, natural language processing, and human–computer interaction. I develop multimodal AI models that integrate visual, linguistic, and motion-based signals to enable natural, interpretable, and accessible interaction between humans and machines.

A central theme of my work is multimodal perception and interaction, spanning sign language technologies, AI‑driven human–computer interfaces, human activity understanding, and generative models for expressive human representation. I place particular emphasis on real‑time systems, robustness in unconstrained environments, and interpretability for safety‑critical and accessibility-oriented applications.

Key Contributions

CVPR 2026 (Main): Real-time vision-based fingertip contact detection for AI-driven human–computer interaction
ACL 2026 (Findings): Personalized emotion visualization and interpretable multimodal NLP systems
High-impact survey contributions in talking face generation and human-centered generative AI
Multimodal learning frameworks for sign language understanding and accessibility

Featured Projects

Vision-based Fingertip Contact Detection for AR/VR Interfaces
A real-time RGB-only system that fuses monocular depth estimation and motion cues to achieve millimeter-level fingertip contact detection without dedicated depth sensors. (CVPR 2026, Main Conference)

Sentimentogram: Personalized Emotion Visualization Framework
A human-centered multimodal NLP system that learns individual emotion-visualization preferences via interpretable fusion and minimal user feedback. (ACL 2026, Findings)

AI-driven Human–Computer Interaction

Developing natural, real-time, and vision-based interaction systems for AR/VR and accessibility-focused human–computer interfaces.

Hand & Finger Tracking, Virtual Keyboards, AR/VR/XR, Real-time Vision, Accessibility

Featured Project

Vision-based Fingertip Contact Detection for AR/VR Interfaces
A real-time RGB-only system that fuses monocular depth estimation and motion cues to achieve millimeter-level fingertip contact detection without dedicated depth sensors. Validated in interactive VR keyboard scenarios. (CVPR 2026, Main Conference)
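
As a rough illustration of what fusing a depth cue with a motion cue can look like, the Python sketch below blends the two into a single contact score and applies hysteresis across frames. It is not the method from the CVPR paper: the `FingertipSample` fields, the linear scoring, and every scale, weight, and threshold are hypothetical values chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class FingertipSample:
    """Cues for one fingertip in one frame (hypothetical fields)."""
    depth_gap_mm: float    # estimated fingertip-to-surface distance
    velocity_mm_s: float   # estimated fingertip speed


def contact_score(sample, gap_scale_mm=5.0, speed_scale_mm_s=50.0, depth_weight=0.7):
    """Blend a depth cue and a motion cue into a score in [0, 1]."""
    # Depth cue: 1.0 on the surface, falling linearly to 0 at gap_scale_mm.
    depth_cue = max(0.0, 1.0 - abs(sample.depth_gap_mm) / gap_scale_mm)
    # Motion cue: a fingertip that has nearly stopped is more likely resting in contact.
    motion_cue = max(0.0, 1.0 - abs(sample.velocity_mm_s) / speed_scale_mm_s)
    return depth_weight * depth_cue + (1.0 - depth_weight) * motion_cue


def detect_contacts(samples, on_threshold=0.8, off_threshold=0.6):
    """Turn per-frame scores into contact states, with hysteresis to limit flicker."""
    in_contact = False
    states = []
    for sample in samples:
        score = contact_score(sample)
        if not in_contact and score >= on_threshold:
            in_contact = True
        elif in_contact and score < off_threshold:
            in_contact = False
        states.append(in_contact)
    return states
```

Using a higher threshold to enter contact than to leave it is one simple way to keep a key press from toggling on noisy frames; a learned fusion model would replace the hand-set weights in practice.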

Generative AI for Human Interaction

Human-centered generative and multimodal AI systems for expressive communication, interactive visualization, and accessibility-oriented applications.

Multimodal NLP, Affective Computing, Interpretability, Personalization, Human-centered AI

Featured Project

Sentimentogram: Personalized Emotion Visualization Framework
A human-centered multimodal NLP framework that learns individual emotion-visualization preferences from minimal user feedback, supported by interpretable audio–text fusion and controlled human studies. (ACL 2026, Findings)

Multimodal Sign Language Technologies

Developing advanced systems for automatic sign language recognition and translation. This research combines multiple input modalities including RGB video, skeletal keypoints, and depth information to achieve robust sign language understanding.

Real-time Translation, Cross-linguistic Corpora, Wearable Sensors, Keypoint Vectorization, Vision Transformers, Multimodal Fusion
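
To make the skeletal-keypoint modality more concrete, here is a minimal Python sketch of one possible reading of keypoint vectorization: replacing raw joint coordinates with unit direction vectors between connected joints. This interpretation, the four-joint `BONES` chain, and the function name `vectorize_keypoints` are assumptions for illustration and may differ from the formulation used in the thesis.

```python
import math

# Hypothetical 4-joint chain for one finger: wrist -> knuckle -> middle joint -> tip.
BONES = [(0, 1), (1, 2), (2, 3)]


def vectorize_keypoints(keypoints, bones=BONES):
    """Convert (x, y) joint positions into unit direction vectors, one per bone."""
    features = []
    for parent, child in bones:
        dx = keypoints[child][0] - keypoints[parent][0]
        dy = keypoints[child][1] - keypoints[parent][1]
        length = math.hypot(dx, dy)
        if length == 0.0:
            # Degenerate bone (coincident joints): emit a zero vector.
            features.extend([0.0, 0.0])
        else:
            features.extend([dx / length, dy / length])
    return features
```

Because each bone is reduced to a direction, the resulting features do not change when the hand is translated in the frame or uniformly scaled, which is one reason such representations can be more robust to signer position and camera distance than raw coordinates.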

Key Contributions

PhD Thesis: "Advancing Sign Language Recognition: A Multimodal Deep Learning Framework with Keypoint Vectorization"
Deep learning pathways for automatic sign language processing (Pattern Recognition, IF: 9.84)
Interpretable sign language recognition systems

Advanced Human Activity Recognition

Recognizing and classifying human activities from multiple sensing modalities, with a particular focus on Doppler radar-based recognition for privacy-preserving applications and skeleton-based pose estimation for sports analytics.

Doppler Radar Analysis, Privacy-preserving Surveillance, Healthcare Monitoring, 3D CNN, Skeleton-based Recognition, Sports Analytics

Key Contributions

DDC3N: Doppler-driven convolutional 3D network for human action recognition (IEEE Access)
Privacy-preserving human identification in CCTV data
Athletes' action recognition through skeleton-based pose estimation

Generative AI for Human Interaction

Exploring generative models for realistic human face and body synthesis. Research includes talking face generation, speech-to-face synthesis, and 3D facial animation for virtual avatars and accessibility applications.

3D Facial Animation, Speech-to-Face Synthesis, GANs, Diffusion Models, Talking Head Generation, Virtual Avatars

Key Contributions

Talking human face generation: a survey (Expert Systems with Applications, IF: 9.29)
Generative adversarial networks and their application to 3D face generation: a survey (Image and Vision Computing)
Research on diffusion models and LLMs for sign language synthesis

AI-driven Human–Computer Interaction

Developing natural and intuitive interfaces for human–computer interaction, including hand and finger detection for virtual input devices, AR/VR interfaces, and accessibility tools.

Hand Detection, Finger Tracking, Virtual Keyboards, AR/VR/XR, Smart Glasses, Gesture Recognition

Key Contributions

Human pose, hand, and mesh estimation using deep learning: a survey (Journal of Supercomputing, IF: 3.96)
VR keyboard project at KAIST SpaceTop Research Center (ITRC)
Research on natural interaction paradigms for accessibility

Medical AI & Smart Healthcare

Applying AI and deep learning techniques to medical imaging and healthcare applications. Focus on improving diagnostic accuracy and enabling smart healthcare monitoring systems.

Medical Image Processing, Smart Healthcare, Diagnostic AI, Health Monitoring

Research Affiliations

Intelligent Network & Embedded Systems Lab

Voice AI Research Institute

SpaceTop Research Center

XVoice AI Laboratory