As AI models continue to grow in size, they consume vast amounts of energy, and sustainable AI will be infeasible if current trends persist. Consequently, the robust HW/SW computing infrastructure that underpins AI will become critical.

Our main research goal is to run AI models faster and more energy-efficiently through HW/SW co-design. Specifically, our research interests include:

  • Neural Processing Unit (NPU), domain-specific hardware, FPGA
  • Quantization, pruning, and knowledge distillation
  • Hardware-aware neural architecture search (HW-Aware NAS) and neural architecture accelerator search (NAAS)
  • Processing-in-memory (PIM)
  • Efficient LLM serving, including KV caching and other optimizations
  • On-Device AI

Latest Updates

  • News

    SHANGA CHOI has joined our group as an undergraduate intern. Welcome!

    2026-06-29

  • News

    Yeonsoo Kim has joined our group as an undergraduate intern. Welcome!

    2026-06-26

  • News

    Sung Eun Kwak has joined our group as an undergraduate intern. Welcome!

    2026-06-26

  • News

    Sihyun Lee has joined our group as an undergraduate intern. Welcome!

    2026-06-26

  • Paper

    Our paper "Token-Based Task-Aware Knowledge Distillation for Encoder Adaptation" has been accepted at IEEE Access.

    2026-06-15

Recent Publications

  • Token-Based Task-Aware Knowledge Distillation for Encoder Adaptation

    Eunjoung Yoo, Jieui Kang, Soeun Choi, Yeonhee Kim, Jaehyeong Sim

    IEEE Access, Early Access, 2026
  • QubitCache: Quantum-Inspired Probabilistic Attention Preservation for KV-Cache Compression

    Jieui Kang, Jaeyoung Choi, Wonhui Roh, Jaehyeong Sim

    IEEE Access, 2026
  • SHARP: Structured Hierarchical Attention Rank Projection for Efficient Language Model Distillation

    Jieui Kang, Eunjoung Yoo, Soeun Choi, Yeonhui Kim, Jaehyeong Sim

    IEEE Access, 2026
  • ProgressiveServe: A Progressive Model Loading and Recovery Technique for Mitigating Serverless LLM Cold Starts

    박나담, 이나경, 이주원, 심재형

    KSC 2025
  • LoRA-PIM: In-Memory Delta-Weight Injection for Multi-Adapter LLM Serving

    Soeun Choi, Jaehyeong Sim

    ISOCC 2025