NPU and Domain-Specific Hardware

Designing Neural Processing Units (NPUs) and domain-specific processors

Research Description

We design high-performance, energy-efficient Neural Processing Units (NPUs): a new class of processor dedicated to a wide range of AI workloads. An NPU exploits the high degree of parallelism inherent in deep learning algorithms. We are also looking for opportunities to design domain-specific processors for the latest state-of-the-art algorithms.

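To make the parallelism claim concrete, here is a minimal toy sketch (not our hardware, and the function name is illustrative): every output element of a deep learning layer, such as a convolution, is an independent multiply-accumulate chain, which is exactly what an NPU's array of processing elements (PEs) computes in parallel.

```python
# Toy sketch: each output of a 1-D convolution is an independent
# dot product, so a PE array can compute all outputs simultaneously.
def conv1d(inputs, weights):
    """Valid 1-D convolution over plain Python lists."""
    k = len(weights)
    # No iteration of this comprehension depends on any other,
    # which is the parallelism an NPU exploits in hardware.
    return [
        sum(inputs[i + j] * weights[j] for j in range(k))
        for i in range(len(inputs) - k + 1)
    ]

print(conv1d([1, 2, 3, 4, 5], [1, 0, -1]))  # [-2, -2, -2]
```

In real silicon, each list-comprehension iteration would map to its own PE, so the whole output appears in roughly the time of one dot product rather than many.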

Your Job

  • Don’t worry! We don’t fabricate real silicon (that is the realm of EE, not CS).
  • Understanding basic computer architecture and digital logic.
  • Evaluating existing NPUs and improving them.
  • Designing novel microarchitectures for NPUs or domain-specific processors.
  • Studying the software stack (compilers, firmware, device drivers) for accelerators.

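As a taste of what "designing a microarchitecture" means at the dataflow level, here is a minimal software sketch of an output-stationary dataflow, the style of loop nest studied in some of the papers below (e.g. the TVLSI processor in paper 6); the function name is illustrative, not from any real codebase. Each conceptual PE owns one output accumulator that stays put ("stationary") while inputs and weights stream past it.

```python
# Sketch of an output-stationary matrix multiply: the accumulator for
# C[i][j] never moves, modeling a PE that keeps its partial sum in a
# local register while operands stream through.
def matmul_output_stationary(A, B):
    m, k, n = len(A), len(A[0]), len(B[0])
    C = [[0] * n for _ in range(m)]
    for i in range(m):          # each (i, j) pair is one PE's output
        for j in range(n):
            acc = 0             # stationary partial sum (PE-local register)
            for t in range(k):  # operands stream past the PE
                acc += A[i][t] * B[t][j]
            C[i][j] = acc       # written back exactly once
    return C

print(matmul_output_stationary([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# [[19, 22], [43, 50]]
```

The design question an NPU architect asks is which loop to keep stationary: holding the accumulator in place (as above) minimizes expensive partial-sum traffic, at the cost of re-streaming inputs and weights.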

Related Papers:

  1. An Energy-Efficient Hardware Accelerator for On-Device Inference of YOLOX
    Kyungmi Kim, Soeun Choi, Eunkyeol Hong, Yoonseo Jang, and Jaehyeong Sim
    In 2024 21st International SoC Design Conference (ISOCC)
  2. BS2: Bit-Serial Architecture Exploiting Weight Bit Sparsity for Efficient Deep Learning Acceleration
    Eunseo Kim, Subean Lee, Chaeyun Kim, HaYoung Lim, Jimin Nam, and Jaehyeong Sim
    In 2024 21st International SoC Design Conference (ISOCC)
  3. [SCIE] CREMON: Cryptography Embedded on the Convolutional Neural Network Accelerator
    Yeongjae Choi, Jaehyeong Sim, and Lee-Sup Kim
    IEEE Transactions on Circuits and Systems II: Express Briefs, vol.67, num.12, pp.3337–3341, 2020
  4. [SCIE] An Energy-Efficient Deep Convolutional Neural Network Training Accelerator for In Situ Personalization on Smart Devices
    Seungkyu Choi, Jaehyeong Sim, Myeonggu Kang, Yeongjae Choi, Hyeonuk Kim, and Lee-Sup Kim
    IEEE Journal of Solid-State Circuits, vol.55, num.10, pp.2691–2702, 2020
  5. [Major] A 47.4 µJ/epoch Trainable Deep Convolutional Neural Network Accelerator for In-Situ Personalization on Smart Devices
    Seungkyu Choi, Jaehyeong Sim, Myeonggu Kang, Yeongjae Choi, Hyeonuk Kim, and Lee-Sup Kim
    In 2019 IEEE Asian Solid-State Circuits Conference (A-SSCC)
  6. [SCIE] An Energy-Efficient Deep Convolutional Neural Network Inference Processor with Enhanced Output Stationary Dataflow in 65-nm CMOS
    Jaehyeong Sim, Somin Lee, and Lee-Sup Kim
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol.28, num.1, pp.87–100, 2019
  7. [Major] TrainWare: A Memory Optimized Weight Update Architecture for On-Device Convolutional Neural Network Training
    Seungkyu Choi, Jaehyeong Sim, Myeonggu Kang, and Lee-Sup Kim
    In 2018 ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED)
  8. [SCIE] Energy-Efficient Design of Processing Element for Convolutional Neural Network
    Yeongjae Choi, Dongmyung Bae, Jaehyeong Sim, Seungkyu Choi, Minhye Kim, and Lee-Sup Kim
    IEEE Transactions on Circuits and Systems II: Express Briefs, vol.64, num.11, pp.1332–1336, 2017
  9. [Top-Tier] A Kernel Decomposition Architecture for Binary-Weight Convolutional Neural Networks
    Hyeonuk Kim, Jaehyeong Sim, Yeongjae Choi, and Lee-Sup Kim
    In 2017 IEEE/ACM 54th Annual Design Automation Conference (DAC)
  10. [Top-Tier] A 1.42 TOPS/W Deep Convolutional Neural Network Recognition Processor for Intelligent IoE Systems
    Jaehyeong Sim, Jun-Seok Park, Minhye Kim, Dongmyung Bae, Yeongjae Choi, and Lee-Sup Kim
    In 2016 IEEE International Solid-State Circuits Conference (ISSCC)