Persistent scratchpad memory for data exchange between programs

A scratchpad-memory data storage technology, applied in the field of shared memory, that addresses problems such as slow data exchange between programs

Pending Publication Date: 2021-05-11
NVIDIA CORP

AI Technical Summary

Problems solved by technology

Exchanging data through global memory incurs latency at least an order of magnitude higher than accessing scratchpad memory.
In addition, since the scratchpad memory is private to each SEC, different SECs executing the same kernel must also exchange data through the global memory.

Embodiment Construction

[0017] A highly parallel processor, such as a graphics processing unit (GPU), is a computing device capable of executing very large numbers of threads in parallel. In the context of the following description, a thread is a process or execution context. In one embodiment, the highly parallel processor operates as a coprocessor to the main central processing unit (CPU) or host: in other words, the data-parallel, computationally intensive parts of the application running on the host are offloaded to the coprocessor device.

[0018] More precisely, a part of an application that executes multiple times, but independently on different data, can be isolated into a kernel function that is executed on a highly parallel processor as many different threads. To this end, such functions are compiled into the Parallel Thread Execution (PTX) instruction set, and the resulting kernels are translated into the target highly parallel processor instruction set at installation time.
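
The pattern in [0018] can be sketched on the host. The following is a minimal Python emulation, not GPU code: `saxpy_kernel`, `launch`, and the thread index `tid` are hypothetical names chosen for illustration, and the emulated threads run one after another rather than in parallel. On a real device the kernel would be compiled to PTX and executed as many concurrent threads.

```python
from typing import Callable, List


def saxpy_kernel(tid: int, a: float, x: List[float], y: List[float],
                 out: List[float]) -> None:
    """Body executed by one emulated thread: it handles only element `tid`."""
    out[tid] = a * x[tid] + y[tid]


def launch(kernel: Callable[..., None], num_threads: int, *args) -> None:
    """Hypothetical launcher: run the kernel once per thread index.

    Every invocation touches different data, so execution order does not
    matter; a GPU would run these invocations in parallel.
    """
    for tid in range(num_threads):
        kernel(tid, *args)


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
result = [0.0] * len(x)
launch(saxpy_kernel, len(x), 2.0, x, y, result)
print(result)  # [12.0, 24.0, 36.0, 48.0]
```

Because each thread writes only its own output element, the same kernel body works unchanged for any number of threads, which is what makes this part of an application suitable for offloading.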

[0019] Batches...


Abstract

The invention discloses a persistent scratchpad memory for data exchange between programs. Techniques are disclosed for exchanging data among kernels (each a set of instructions) executing on a system having multiple processing units. In an embodiment, each processing unit includes an on-chip scratchpad memory that can be accessed by the kernels executing on that processing unit. All or a portion of the scratchpad memory can be allocated and configured, for example, such that the scratchpad is accessible to multiple kernels in parallel, to one or more kernels in serial, or a combination of both.

Description

Technical Field

[0001] The present disclosure relates to sharing memory between programs, and more particularly, to persistent scratchpad memory for inter-program data exchange.

Background

[0002] Typically, an amount of on-chip scratchpad memory (e.g., 16KB, 32KB, or 64KB in size) is allocated from a scratchpad memory pool and assigned to a set of one or more execution contexts (an "SEC"; execution contexts are, e.g., threads or processes) that executes kernels, which are sets of instructions such as programs, in serial and/or parallel fashion. In one example, the set of one or more execution contexts is a Cooperative Thread Array (CTA). The scratchpad memory allocated to the SEC is private to the SEC, and data stored in the scratchpad memory is not persistent once the SEC completes kernel execution. In addition, the scratchpad memory has no automatic backing store. Thus, in conventional systems, data stored in scratchpad memory is exchanged by having each SEC execut...
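
The conventional arrangement described in the background (a scratchpad that is private to each SEC, does not persist after the kernel finishes, and therefore forces hand-off through global memory) can be illustrated with a small host-side emulation. This is a sketch under stated assumptions, not the patented mechanism or any NVIDIA API: `GLOBAL_MEMORY`, `run_sec`, `producer_kernel`, and `consumer_kernel` are hypothetical names, and plain Python lists stand in for on-chip and off-chip memory.

```python
from typing import Callable, Dict, List

# Emulated off-chip global memory: the only storage that outlives a kernel.
GLOBAL_MEMORY: Dict[str, List[int]] = {}


def run_sec(kernel: Callable[[int, List[int]], None], sec_id: int,
            scratch_words: int = 4) -> None:
    """Emulate one SEC: allocate a private scratchpad, run the kernel, drop it."""
    scratchpad = [0] * scratch_words  # private to this SEC, zeroed here for clarity
    kernel(sec_id, scratchpad)
    # The scratchpad goes out of scope here: its contents are not persistent.


def producer_kernel(sec_id: int, scratchpad: List[int]) -> None:
    # Compute partial results in the (fast) scratchpad.
    for i in range(len(scratchpad)):
        scratchpad[i] = sec_id * 100 + i
    # The results must be copied out to global memory or they are lost.
    GLOBAL_MEMORY[f"sec{sec_id}"] = list(scratchpad)


def consumer_kernel(sec_id: int, scratchpad: List[int]) -> None:
    # A fresh scratchpad: nothing written by the producer survives in it.
    assert all(word == 0 for word in scratchpad)
    # Reload the producer's data from (slow) global memory before using it.
    scratchpad[:] = GLOBAL_MEMORY[f"sec{sec_id}"]
    GLOBAL_MEMORY[f"sum{sec_id}"] = [sum(scratchpad)]


for sec_id in range(2):
    run_sec(producer_kernel, sec_id)
for sec_id in range(2):
    run_sec(consumer_kernel, sec_id)
print(GLOBAL_MEMORY["sum0"], GLOBAL_MEMORY["sum1"])  # [6] [406]
```

Every value the consumer needs makes a round trip through `GLOBAL_MEMORY`; that round trip is the cost the disclosure attributes to conventional systems.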


Application Information

IPC(8): G06F9/24, G06F9/54, G06F15/167
CPC: G06F9/24, G06F9/54, G06F15/167, G06F2209/543, G06F9/5022, G06F9/5027, G06F9/544, G06F9/4843, G06F9/52
Inventors: R·达什, J·H·肖凯特, M·L·米尔顿, S·琼斯, C·F·兰姆
Owner: NVIDIA CORP