NPU and Domain-Specific Hardware

Designing Neural Processing Units (NPUs) and domain-specific processors

Research Description

We design high-performance, energy-efficient Neural Processing Units (NPUs), a new class of processors dedicated to a wide range of AI workloads. NPUs exploit the high degree of parallelism inherent in deep learning algorithms. We also explore opportunities to design domain-specific processors for the latest state-of-the-art algorithms.

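To make the parallelism claim concrete, here is a minimal, illustrative sketch (not taken from any of the lab's designs): the core of most DNN layers reduces to a matrix multiply, and every output element is an independent multiply-accumulate (MAC) reduction. It is exactly this independence that an NPU's array of processing elements (PEs) exploits by computing many outputs at once.

```python
def matmul(a, b):
    """Naive matrix multiply: each (i, j) output is independent of every other."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):           # the (i, j) loops are fully parallel:
        for j in range(cols):       # an NPU maps these pairs onto its PE array
            acc = 0
            for k in range(inner):  # the MAC reduction each PE performs
                acc += a[i][k] * b[k][j]
            out[i][j] = acc
    return out

# A 2x3 activation matrix times a 3x2 weight matrix
print(matmul([[1, 2, 3], [4, 5, 6]], [[1, 0], [0, 1], [1, 1]]))
# → [[4, 5], [10, 11]]
```

In hardware, the two outer loops become spatial (one PE per output tile) rather than temporal, which is where the throughput and energy advantage over a general-purpose core comes from.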

Your Job

  • Don’t worry! We don’t fabricate real silicon (that is the domain of EE, not CS).
  • Understanding basic computer architecture and digital logic.
  • Evaluating existing NPUs and improving them.
  • Designing novel microarchitectures for NPUs or domain-specific processors.
  • Studying the software stack (compilers, firmware, device drivers) for accelerators.


Related Papers:

  1. An Energy-Efficient Hardware Accelerator for On-Device Inference of YOLOX
    Kyungmi Kim, Soeun Choi, Eunkyeol Hong, Yoonseo Jang, and Jaehyeong Sim
    In 2024 21st International SoC Design Conference (ISOCC)
  2. BS2: Bit-Serial Architecture Exploiting Weight Bit Sparsity for Efficient Deep Learning Acceleration
    Eunseo Kim, Subean Lee, Chaeyun Kim, HaYoung Lim, Jimin Nam, and Jaehyeong Sim
    In 2024 21st International SoC Design Conference (ISOCC)
  3. SCIE
    CREMON: Cryptography Embedded on the Convolutional Neural Network Accelerator
    Yeongjae Choi, Jaehyeong Sim, and Lee-Sup Kim
    IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 67, no. 12, pp. 3337–3341, 2020
  4. SCIE
    An Energy-Efficient Deep Convolutional Neural Network Training Accelerator for In Situ Personalization on Smart Devices
    Seungkyu Choi, Jaehyeong Sim, Myeonggu Kang, Yeongjae Choi, Hyeonuk Kim, and Lee-Sup Kim
    IEEE Journal of Solid-State Circuits, vol. 55, no. 10, pp. 2691–2702, 2020
  5. Major
    A 47.4 uJ/epoch Trainable Deep Convolutional Neural Network Accelerator for In-Situ Personalization on Smart Devices
    Seungkyu Choi, Jaehyeong Sim, Myeonggu Kang, Yeongjae Choi, Hyeonuk Kim, and Lee-Sup Kim
    In 2019 IEEE Asian Solid-State Circuits Conference (A-SSCC)
  6. SCIE
    An Energy-Efficient Deep Convolutional Neural Network Inference Processor with Enhanced Output Stationary Dataflow in 65-nm CMOS
    Jaehyeong Sim, Somin Lee, and Lee-Sup Kim
    IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 28, no. 1, pp. 87–100, 2019
  7. Major
    TrainWare: A Memory Optimized Weight Update Architecture for On-Device Convolutional Neural Network Training
    Seungkyu Choi, Jaehyeong Sim, Myeonggu Kang, and Lee-Sup Kim
    In 2018 ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED)
  8. SCIE
    Energy-Efficient Design of Processing Element for Convolutional Neural Network
    Yeongjae Choi, Dongmyung Bae, Jaehyeong Sim, Seungkyu Choi, Minhye Kim, and Lee-Sup Kim
    IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 64, no. 11, pp. 1332–1336, 2017
  9. Top-Tier
    A Kernel Decomposition Architecture for Binary-Weight Convolutional Neural Networks
    Hyeonuk Kim, Jaehyeong Sim, Yeongjae Choi, and Lee-Sup Kim
    In 2017 IEEE/ACM 54th Annual Design Automation Conference (DAC)
  10. Top-Tier
    A 1.42 TOPS/W Deep Convolutional Neural Network Recognition Processor for Intelligent IoE Systems
    Jaehyeong Sim, Jun-Seok Park, Minhye Kim, Dongmyung Bae, Yeongjae Choi, and Lee-Sup Kim
    In 2016 IEEE International Solid-State Circuits Conference (ISSCC)