
Accelerator execution method and electronic equipment

An accelerator and register technology, applied in the field of electronics, that addresses the inability to optimize matrix multiplication for the hardware, programmers' lack of insight into how the hardware performs matrix multiplication, and the resulting low program execution efficiency and low tensor processing efficiency.

Active Publication Date: 2022-06-03
HEXAFLAKE (NANJING) INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

How conventional accelerators such as GPUs perform matrix multiplication internally is usually not exposed to programmers. Programmers therefore generally do not understand how the hardware carries out matrix multiplication and cannot optimize the computation for the hardware, which leads to low program execution efficiency and generally inefficient tensor processing.




Embodiment Construction

[0052] Preferred embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While preferred embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

[0053] As used herein, the term "including" and variations thereof mean open-ended inclusion, i.e., "including but not limited to". The term "or" means "and/or" unless specifically stated otherwise. The term "based on" means "based at least in part on". The terms "one example embodiment" and "one embodiment" mean "at least one example embodiment." The term "another embodiment" means "at least one additional embodiment." The terms "first", "se...



Abstract

A method performed by an accelerator and an electronic device are described herein. The method comprises the following steps: receiving a first tensor multiplication instruction for a first thread of an accelerator; broadcasting, by the first thread set, a second set of factors in the second tensor to the second thread set based on a memory logical address for the second tensor; and performing, by a first thread in the second set of threads, a dot-product operation on a first set of factors and the second set of factors based on the first factor register representation to generate a first dot-product set in a first row of a third tensor. With this method, the matrix is decomposed and threads are allocated by row, so that multiple threads can process multiple rows of the matrix tensor in parallel, which improves the processing efficiency of matrix multiplication. In addition, because programmers know the row and column structure of the matrix tensor and the state of the threads in the accelerator when programming, they can flexibly use the threads to process matrix multiplication in parallel, which improves programming flexibility.
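The row-wise decomposition described in the abstract can be illustrated with a minimal host-side sketch. This is not the patented accelerator implementation: Python worker threads stand in for accelerator threads, a shared read-only list of columns stands in for the broadcast of the second tensor's factors, and the names `row_parallel_matmul`, `compute_row`, and `num_threads` are hypothetical, chosen only for illustration. It shows the structure (one task per output row, each task computing that row's dot products), not the performance behavior of real hardware.

```python
from concurrent.futures import ThreadPoolExecutor


def row_parallel_matmul(a, b, num_threads=4):
    """Compute c = a x b, assigning each row of the result to its own task.

    Illustrative assumption: worker threads play the role of accelerator
    threads, and `b_columns` plays the role of the broadcast factor set.
    """
    if not a or not b or not b[0]:
        raise ValueError("both matrices must be non-empty")
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("row length of a must equal the number of rows of b")

    # "Broadcast": every worker reads the same column view of the second matrix.
    b_columns = list(zip(*b))

    def compute_row(i):
        # The row of `a` stands in for factors held by one thread.
        a_row = a[i]
        # One dot product per column of b yields one full row of the result.
        return [sum(x * y for x, y in zip(a_row, col)) for col in b_columns]

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map preserves input order, so rows come back in order.
        return list(pool.map(compute_row, range(len(a))))


if __name__ == "__main__":
    print(row_parallel_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Because each output row depends only on one row of the first matrix and the shared second matrix, the row tasks are independent and need no synchronization with each other, which is what makes a per-row thread assignment straightforward.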

Description

Technical field

[0001] Embodiments of the present disclosure relate generally to the field of electronics, and more particularly to a method performed by an accelerator and an accelerator.

Background technique

[0002] Parallel high-performance multi-threaded multi-core processing systems such as graphics processing units (GPUs) can process data much faster than in the past. These processing systems can break complex computations down into smaller tasks that are processed in parallel by multiple cores to increase processing efficiency and reduce processing time.

[0003] In some cases, multi-core processors such as GPUs are particularly advantageous for processing tensors with large amounts of data in the same or similar form. In the computer field, tensor data usually represents data of one-dimensional or multi-dimensional arrays. For example, image data is a conventional form of two-dimensional tensor data, which can be represented by a two-dimensional array. For another example, ...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/16, G06F7/523
CPC: G06F17/16, G06F7/523, Y02D10/00
Inventor: 杨经纬, 葛建明, 李甲, 桑永奇, 谢钢锋, 姚飞, 仇小钢
Owner: HEXAFLAKE (NANJING) INFORMATION TECH CO LTD