AI Computing Platform Laboratory
Department of Computer Science and Engineering, College of Artificial Intelligence, Ewha Womans University

As AI models continue to grow in size, they require vast amounts of energy, making sustainable AI infeasible if current trends persist. Consequently, robust HW/SW computing infrastructure underpinning AI is becoming critical.
Our main research goal is to run AI models faster and more energy-efficiently through HW/SW co-design. Specifically, our research interests include:
- Neural Processing Unit (NPU), domain-specific hardware, FPGA
- Quantization, pruning, and knowledge distillation
- Hardware-aware neural architecture search (HW-Aware NAS) and neural architecture accelerator search (NAAS)
- Processing-in-memory (PIM)
- Efficient LLM serving, including KV caching and other optimizations
- On-Device AI
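As a toy illustration of one of these directions, the sketch below shows a minimal symmetric int8 post-training quantization round-trip (quantize weights to 8-bit integers with a single scale, then dequantize). All function names are illustrative and are not taken from any lab codebase.

```python
def quantize_int8(values):
    """Symmetric int8 quantization: map floats onto integers in [-127, 127].

    One shared scale is derived from the largest-magnitude value, so the
    extreme value maps to +/-127 and everything else scales proportionally.
    """
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float values from int8 codes and the scale."""
    return [x * scale for x in q]


weights = [0.5, -1.27, 0.02, 1.0]
q, s = quantize_int8(weights)      # q = [50, -127, 2, 100]
approx = dequantize_int8(q, s)     # close to the original weights
```

Real quantization schemes (per-channel scales, asymmetric zero-points, quantization-aware training) are considerably more involved, but the core idea of trading numeric precision for memory and energy is the same.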
Latest Updates
- Paper
Our paper "SHARP: Structured Hierarchical Attention Rank Projection for Efficient Language Model Distillation" has been accepted at IEEE Access.
2026-03-22
- News
Yejin Lee continues her research in our group as a Ph.D. student (advancing from the M.S. program).
2026-03-01
- News
Jaelin Lee has joined our group as an undergraduate intern. Welcome!
2026-03-01
- News
Jaeyoung Choi continues his research in our group as an M.S. student (advancing from the undergraduate program).
2026-03-01
- News
Eunkyeol Hong and Yejin Lee have graduated. Congratulations!
2026-02-23
Recent Publications
SHARP: Structured Hierarchical Attention Rank Projection for Efficient Language Model Distillation
Jieui Kang, Eunjoung Yoo, Soeun Choi, Yeonhui Kim, Jaehyeong Sim
IEEE Access
ProgressiveServe: Progressive Model Loading and Recovery for Mitigating Serverless LLM Cold Starts
Nadam Park, Nakyung Lee, Juwon Lee, Jaehyeong Sim
KSC2025
DS-CAE: A Dual-Stream Cross-Attentive Autoencoder for Robust and Cluster-Aware Retrieval-Augmented Generation
Soeun Choi, Yejin Lee, Juhee Kim, Minji Kim, Jaehyeong Sim
CCCI2025
GATHER: A Gated-Attention Accelerator for Efficient LLM Inference
Eunjin Lee, Eunseo Kim, Eunjoung Yoo, Jaehyeong Sim
ISOCC2025
LoRA-PIM: In-Memory Delta-Weight Injection for Multi-Adapter LLM Serving
Soeun Choi, Jaehyeong Sim
ISOCC2025