NPU and Domain-Specific Hardware

Designing Neural Processing Unit (NPU) and domain-specific processors

Research Description

We design high-performance, energy-efficient Neural Processing Units (NPUs), a class of processors dedicated to a wide range of AI workloads. NPUs exploit the high degree of parallelism inherent in deep learning algorithms. We are also exploring opportunities to design domain-specific processors for the latest state-of-the-art algorithms.
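As a toy illustration (not any particular NPU design), the parallelism mentioned above can be seen in a 2-D convolution: every output pixel is an independent multiply-accumulate (MAC) sum, so each one can be mapped to its own processing element in a hardware array.

```python
# Toy sketch: a "valid" 2-D convolution written as independent MAC sums.
# The two outer loops carry no dependence between iterations, which is
# exactly what an NPU's array of processing elements exploits in hardware.

def conv2d(ifmap, kernel):
    """Valid 2-D convolution over nested lists of numbers."""
    kh, kw = len(kernel), len(kernel[0])
    oh = len(ifmap) - kh + 1
    ow = len(ifmap[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for oy in range(oh):          # fully parallel across output rows...
        for ox in range(ow):      # ...and output columns: one PE per pixel
            acc = 0
            for ky in range(kh):  # the MAC reduction done inside each PE
                for kx in range(kw):
                    acc += ifmap[oy + ky][ox + kx] * kernel[ky][kx]
            out[oy][ox] = acc
    return out

ifmap = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
print(conv2d(ifmap, kernel))  # -> [[6, 8], [12, 14]]
```

In real accelerators the interesting design questions are how these MACs are scheduled and where operands are buffered (the "dataflow", e.g. output-stationary as in our TVLSI 2019 work), since data movement, not arithmetic, usually dominates energy.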

Your Job

  • Don’t worry! We don’t fabricate real silicon (that is the domain of EE, not CS).
  • Understanding basic computer architecture and digital logic.
  • Evaluating existing NPUs and improving them.
  • Designing novel microarchitectures for NPUs or domain-specific processors.
  • Studying the software stack (compilers, firmware, device drivers) for accelerators.

Related Papers:

  1. Accepted
    An Energy-Efficient Hardware Accelerator for On-Device Inference of YOLOX
    Kyungmi Kim, Soeun Choi, Eunkyeol Hong, Yoonseo Jang, and Jaehyeong Sim
    In 2024 21st International SoC Design Conference (ISOCC)
  2. Accepted
    BS2: Bit-Serial Architecture Exploiting Weight Bit Sparsity for Efficient Deep Learning Acceleration
    Eunseo Kim, Subean Lee, Chaeyun Kim, HaYoung Lim, Jimin Nam, and Jaehyeong Sim
    In 2024 21st International SoC Design Conference (ISOCC)
  3. SCIE
    CREMON: Cryptography Embedded on the Convolutional Neural Network Accelerator
    Yeongjae Choi, Jaehyeong Sim, and Lee-Sup Kim
    IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 67, no. 12, pp. 3337–3341, 2020
  4. SCIE
    An Energy-Efficient Deep Convolutional Neural Network Training Accelerator for In Situ Personalization on Smart Devices
    Seungkyu Choi, Jaehyeong Sim, Myeonggu Kang, Yeongjae Choi, Hyeonuk Kim, and Lee-Sup Kim
    IEEE Journal of Solid-State Circuits, vol. 55, no. 10, pp. 2691–2702, 2020
  5. Major
    A 47.4 µJ/epoch Trainable Deep Convolutional Neural Network Accelerator for In-Situ Personalization on Smart Devices
    Seungkyu Choi, Jaehyeong Sim, Myeonggu Kang, Yeongjae Choi, Hyeonuk Kim, and Lee-Sup Kim
    In 2019 IEEE Asian Solid-State Circuits Conference (A-SSCC)
  6. SCIE
    An Energy-Efficient Deep Convolutional Neural Network Inference Processor with Enhanced Output Stationary Dataflow in 65-nm CMOS
    Jaehyeong Sim, Somin Lee, and Lee-Sup Kim
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 28, no. 1, pp. 87–100, 2019
  7. Major
    TrainWare: A Memory Optimized Weight Update Architecture for On-Device Convolutional Neural Network Training
    Seungkyu Choi, Jaehyeong Sim, Myeonggu Kang, and Lee-Sup Kim
    In 2018 ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED)
  8. SCIE
    Energy-Efficient Design of Processing Element for Convolutional Neural Network
    Yeongjae Choi, Dongmyung Bae, Jaehyeong Sim, Seungkyu Choi, Minhye Kim, and Lee-Sup Kim
    IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 64, no. 11, pp. 1332–1336, 2017
  9. Top-Tier
    A Kernel Decomposition Architecture for Binary-Weight Convolutional Neural Networks
    Hyeonuk Kim, Jaehyeong Sim, Yeongjae Choi, and Lee-Sup Kim
    In 2017 IEEE/ACM 54th Annual Design Automation Conference (DAC)
  10. Top-Tier
    A 1.42 TOPS/W Deep Convolutional Neural Network Recognition Processor for Intelligent IoE Systems
    Jaehyeong Sim, Jun-Seok Park, Minhye Kim, Dongmyung Bae, Yeongjae Choi, and Lee-Sup Kim
    In 2016 IEEE International Solid-State Circuits Conference (ISSCC)