Winograd Automatic Performance Optimization Based on TVM
Author:
Affiliation:

1. Southern University of Science and Technology; 2. Shenzhen Tencent Computer Systems Co., Ltd.; 3. Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences

Funding:

This work was supported by the Key Research and Development Project of Guangdong Province (2021B0101310002), the National Natural Science Foundation of China (62272449), the Shenzhen Basic Research Foundation (RCYX20200714114734194, KQTD20200820113106007, ZDSYS20220422103800001), and the Youth Innovation Promotion Association, CAS (Y2021101).

    Abstract:

    Convolutional neural networks (CNNs), a representative class of deep learning models, are the most widely used neural networks in tasks such as computer vision. However, convolution operations typically account for more than 90% of CNN runtime and thus become the performance bottleneck. Moreover, because of the complexity of current hardware and the diversity of workloads, the specialized optimizations in prior work often lack performance portability. To address this, we introduce BlazerML, an open-source convolution library built on code templates automatically generated by TVM, which can automatically generate high-performance convolution implementations for any input shape. BlazerML is based on the Winograd algorithm, which is known for its high performance among fast convolution algorithms. Experimental results show that BlazerML significantly outperforms current state-of-the-art open-source libraries. On x86 CPUs, for forward inference of common deep learning networks, it is 1.18 to 2.47, 1.18 to 2.27, and 1.01 to 1.66 times faster than OnnxRuntime, MNN, and the TVM community version, respectively. On ARM CPUs, for single-layer inference of common deep learning networks, it is 1.26 to 6.11 and 1.04 to 4.28 times faster than ACL and FastConv, respectively.
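
    For readers unfamiliar with the Winograd algorithm named in the abstract, the following is a minimal sketch of the textbook one-dimensional F(2,3) case, which produces two outputs of a 3-tap filter with four multiplications instead of six. It is not BlazerML code and says nothing about how BlazerML or TVM implement the algorithm; the function names `winograd_f23` and `direct_f23` are hypothetical and chosen only for this illustration.

```python
def direct_f23(d, g):
    """Reference: two outputs of a 3-tap filter over 4 inputs (6 multiplications)."""
    y0 = d[0] * g[0] + d[1] * g[1] + d[2] * g[2]
    y1 = d[1] * g[0] + d[2] * g[1] + d[3] * g[2]
    return [y0, y1]


def winograd_f23(d, g):
    """Textbook Winograd F(2,3): the same two outputs with 4 multiplications.

    d: 4 input samples, g: 3 filter taps (illustrative names, not from the paper).
    """
    # Filter transform; in practice it is computed once per filter and reused.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform (additions and subtractions only).
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]

    # Element-wise products: the only 4 multiplications involving the input.
    m0, m1, m2, m3 = u0 * v0, u1 * v1, u2 * v2, u3 * v3

    # Output transform.
    return [m0 + m1 + m2, m1 - m2 - m3]


if __name__ == "__main__":
    d = [1.0, 2.0, 3.0, 4.0]
    g = [1.0, 0.0, -1.0]
    print(direct_f23(d, g), winograd_f23(d, g))
```

    Two-dimensional variants used for image convolution nest this transform over rows and columns; the tile sizes and code-generation strategy BlazerML actually uses are not stated in the abstract.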

History
  • Received: February 02, 2024
  • Revised: February 02, 2024
  • Adopted:
  • Online: April 01, 2024
  • Published: