Many deep learning accelerators have been proposed and designed in both academia and industry to execute deep neural networks with better power efficiency. Recently, many studies have focused on developing systems-on-chip that include both a host processor and accelerators. In this paper, we demonstrate a full software-hardware stack for accelerating deep learning benchmarks using a co-processor attached to a RISC-V core. To do so, we extend the RISC-V instruction set and modify the compilation stack, and we show a significant end-to-end performance boost compared to the CPU-only processing scenario. © 2024 IEEE.