
94 results about "Pipeline (computing)" patented technology

In computing, a pipeline, also known as a data pipeline, is a set of data processing elements connected in series, where the output of one element is the input of the next one. The elements of a pipeline are often executed in parallel or in time-sliced fashion. Some amount of buffer storage is often inserted between elements.
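
The definition above can be made concrete in a few lines: stages connected in series, each running in parallel with a small buffer (a bounded queue) between them. This is a minimal sketch, not code from any of the patents below.

```python
# Minimal pipeline: elements in series, executed in parallel, with buffers.
from queue import Queue
from threading import Thread

SENTINEL = object()

def stage(fn, inbox, outbox):
    # Each element consumes from its predecessor and feeds its successor.
    while (item := inbox.get()) is not SENTINEL:
        outbox.put(fn(item))
    outbox.put(SENTINEL)

def run_pipeline(source, *fns, buffer_size=4):
    # One bounded queue between adjacent elements = the "buffer storage".
    queues = [Queue(maxsize=buffer_size) for _ in range(len(fns) + 1)]
    for fn, qi, qo in zip(fns, queues, queues[1:]):
        Thread(target=stage, args=(fn, qi, qo)).start()
    for item in source:
        queues[0].put(item)
    queues[0].put(SENTINEL)
    while (result := queues[-1].get()) is not SENTINEL:
        yield result

# Output of one element is the input of the next.
print(list(run_pipeline(range(5), lambda x: x + 1, lambda x: x * 2)))
```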

Optimization of map-reduce shuffle performance through shuffler I/O pipeline actions and planning

A shuffler receives information associated with partition segments of map task outputs and a pipeline policy for a job running on a computing device. The shuffler transmits to the operating system of the computing device a request to lock partition segments of the map task outputs in memory, and transmits an advisement to keep or load other partition segments in memory. Based on the pipeline policy, the shuffler creates a pipeline that includes the partition segments locked in memory and the partition segments advised to be kept or loaded in memory for the job, and selects the locked partition segments first, followed by the advised partition segments, as the preferential order in which to shuffle.
Owner:IBM CORP
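
The preferential ordering the abstract describes reduces to a sort over segment state: locked first, advised second, everything else last. A minimal sketch with illustrative names (the `Segment` fields stand in for mlock/madvise-style state; none of this is the patent's actual API):

```python
# Order shuffle work so in-memory segments go first.
from dataclasses import dataclass

@dataclass
class Segment:
    partition: int
    locked: bool    # pinned in memory (mlock-style request)
    advised: bool   # hinted to stay/load in memory (madvise-style)

def shuffle_order(segments):
    # Preferential order: locked, then advised, then the rest.
    rank = lambda s: 0 if s.locked else (1 if s.advised else 2)
    return sorted(segments, key=rank)

segs = [Segment(0, False, True), Segment(1, True, False), Segment(2, False, False)]
print([s.partition for s in shuffle_order(segs)])  # [1, 0, 2]
```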

Computational fluid dynamics (CFD) coprocessor-enhanced system and method

The present invention provides a system, method and product for porting computationally complex CFD calculations to a coprocessor in order to decrease overall processing time. The system comprises a CPU in communication with a coprocessor over a high-speed interconnect; an optional display may be provided for displaying the calculated flow field. The system and method include porting variables of the governing equations from the CPU to the coprocessor, receiving calculated source terms from the coprocessor, and solving the governing equations at the CPU using the calculated source terms. In a further aspect, the CPU compresses the governing equations into a combination of higher- and/or lower-order equations with fewer variables for porting to the coprocessor. The coprocessor receives the variables, iteratively solves for the source terms of the equations using a plurality of parallel pipelines, and transfers the results to the CPU. In a further aspect, the coprocessor decompresses the received variables, solves for the source terms, and then compresses the results for transfer to the CPU. In a further aspect, the governing equations are compressed and solved using spectral methods. In another aspect, the coprocessor includes a reconfigurable computing device such as a Field Programmable Gate Array (FPGA). In yet another aspect, the coprocessor may be applied to specific problems such as the Navier-Stokes or Euler equations and may be configured to solve non-linear advection terms more quickly with efficient pipeline utilization.
Owner:VIRGINIA TECH INTPROP INC
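
The CPU/coprocessor split — port variables out, receive source terms back, finish the solve on the host — can be sketched as a host loop with the offloaded kernel mocked out. Here `coprocessor_source_terms` is a plain-function stand-in for the FPGA pipelines, and the advection-like term is purely illustrative:

```python
# Host-side loop for the offload pattern described in the abstract.
import numpy as np

def coprocessor_source_terms(u):
    # Stand-in for the offloaded kernel: a nonlinear advection-like term.
    return -u * np.gradient(u)

def solve_step(u, dt=0.01):
    s = coprocessor_source_terms(u)   # 1. port variables, receive source terms
    return u + dt * s                 # 2. CPU advances the governing equations

u = np.linspace(0.0, 1.0, 16)
for _ in range(10):
    u = solve_step(u)
print(u.round(3))
```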

Branch control memory

A branch control memory stores branch instructions adapted for optimizing the performance of programs run on electronic processors. Flexible instruction parameter fields permit a variety of new branch control and branch instruction implementations best suited to a particular computing environment. These instructions also carry separate prediction bits, which are used to optimize the loading of target instruction buffers in advance of program execution, so that a pipeline within the processor achieves superior performance during actual program execution.
Owner:RENESAS ELECTRONICS CORP
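
The role of the separate prediction bit can be illustrated in miniature: each branch entry carries its own bit, and only predicted-taken targets are prefetched into the target instruction buffer before execution. All field names below are illustrative, not the patent's encoding:

```python
# Toy branch control memory: prediction bits drive target-buffer preloading.
class BranchEntry:
    def __init__(self, target, predict_taken):
        self.target = target
        self.predict_taken = predict_taken   # separate prediction bit

def prefetch_targets(branch_memory, buffer):
    for pc, entry in branch_memory.items():
        if entry.predict_taken:
            buffer[pc] = f"instructions @ {entry.target:#x}"  # preload ahead of time
    return buffer

branches = {0x100: BranchEntry(0x200, True), 0x104: BranchEntry(0x300, False)}
print(prefetch_targets(branches, {}))   # only the predicted-taken branch is loaded
```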

Convolutional neural network hardware accelerator for solidifying full network layer on reconfigurable platform

Pending · CN112116084A · Solve parallelism · Resolving conflicts between homogeneous hardware parallelisms · Neural architectures · Physical realisation · Computer hardware · Hardware structure
The invention discloses a convolutional neural network hardware accelerator that solidifies the full network layer on a reconfigurable platform. The accelerator comprises: a control module for coordinating and controlling the acceleration process, including initializing and synchronizing the other on-chip modules and starting the interaction of different types of data between each computation core and the off-chip memory; a data transmission module, comprising a memory controller and a plurality of DMAs, for data interaction between each on-chip data cache and the off-chip memory; and a computation module comprising a plurality of computation cores, where the cores correspond one-to-one with the different network layers of the convolutional neural network. Each computation core serves as one stage of a pipeline, all the cores together form a complete coarse-grained pipeline structure, and each core internally contains a fine-grained computing pipeline. By implementing end-to-end mapping between hierarchical computing and the hardware structure, the fit between software and hardware features is improved, and the utilization efficiency of computing resources is improved.
Owner:UNIV OF SCI & TECH OF CHINA
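
The coarse-grained pipeline can be pictured as a schedule: core k is hard-wired to layer k, and once the pipeline fills, every core processes a different input in the same cycle. A schematic with invented layer names and inputs:

```python
# Schedule view of a one-core-per-layer coarse-grained pipeline.
layers = ["conv1", "conv2", "fc"]          # one computation core per layer
inputs = ["img0", "img1", "img2", "img3"]

for cycle in range(len(inputs) + len(layers) - 1):
    busy = {layer: inputs[cycle - k]
            for k, layer in enumerate(layers)
            if 0 <= cycle - k < len(inputs)}
    print(f"cycle {cycle}: {busy}")        # after fill, all cores are busy at once
```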

Method of compressing and decompressing images

A method of compressing and decompressing images is disclosed for use in compression and decompression chips implementing the JBIG standard. The pipeline for computing a pixel is divided into three stages: memory access; numerical operations; and renormalization with byteout/bytein. Each stage takes one work cycle, so three pixels are processed in parallel at the same time, whereas the work cycle of the prior art without pipelining is longer. The method effectively shortens the work cycle of each image-data operation, increasing the speed of compressing and decompressing image data.
Owner:PRIMAX ELECTRONICS LTD
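
A back-of-the-envelope calculation shows why the three-way split shortens the work cycle: unpipelined, each pixel pays all three stage latencies in sequence; pipelined, the cycle time is set by the slowest stage alone. The latencies below are made-up numbers for illustration:

```python
# Throughput comparison for a 3-stage pixel pipeline vs. no pipeline.
stage_ns = {"memory access": 4, "numerical ops": 5, "byteout/bytein": 3}
pixels = 1_000_000

unpipelined = pixels * sum(stage_ns.values())                 # all stages in series
pipelined = (pixels + len(stage_ns) - 1) * max(stage_ns.values())  # slowest stage wins
print(f"unpipelined: {unpipelined} ns, pipelined: {pipelined} ns, "
      f"speedup ~ {unpipelined / pipelined:.2f}x")
```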

Systems and methods for performing programmable smart contract execution

Systems and methods related to a fixed pipeline hardware architecture configured to execute smart contracts in an isolated environment separate from a computing processing unit are described herein. Executing a smart contract may comprise performing a set of distributed ledger operations to modify a ledger associated with a decentralized application. The fixed pipeline hardware architecture may comprise and/or be incorporated within a self-contained hardware device comprising electronic circuitry configured to be communicatively coupled or physically attached to a component of a computer system. The hardware device may be specifically programmed to execute, and perform distributed ledger operations associated with, particular smart contracts, or types of smart contracts, that administer different decentralized applications and/or one or more aspects of different decentralized applications.
Owner:ACCELOR LTD
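
A "fixed pipeline" for ledger operations can be read as a hard-wired stage order that every operation traverses. The sketch below invents the stage names (decode, validate, execute, commit) purely to illustrate the shape; the patent does not enumerate them:

```python
# Fixed-order pipeline for a ledger operation; the stage sequence is hard-wired.
def decode(op):   return {"parsed": op}
def validate(op): op["valid"] = True; return op
def execute(op):  op["delta"] = f"apply {op['parsed']}"; return op
def commit(op):   return f"ledger <- {op['delta']}"

FIXED_PIPELINE = (decode, validate, execute, commit)

def run(op):
    for stage in FIXED_PIPELINE:   # order never varies, like the hardware datapath
        op = stage(op)
    return op

print(run("transfer(alice, bob, 10)"))
```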

Reconfigurable processor and configuration method

The invention discloses a reconfigurable processor and a configuration method. The reconfigurable processor comprises a reconfigurable configuration unit and a reconfigurable array. The reconfigurable configuration unit provides reconfiguration information for reconfiguring the computing structure in the reconfigurable array according to an algorithm matched with the current application scenario. The reconfigurable array comprises at least two stages of computing arrays and, according to the reconfiguration information, connects adjacent stages into a datapath pipeline structure meeting the algorithm's computing requirements. Within the same stage of the computing array, the pipeline depths of the different computing modules attached to the datapath pipeline structure are equal, so those modules output data synchronously. The reconfigurable processor can therefore configure an adapted pipeline depth for different algorithms, on which basis the data-processing operation of the reconfigurable array is fully pipelined and the data throughput of the reconfigurable processor is improved.
Owner:AMICRO SEMICON CORP
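
One way to read the equal-depth rule is as padding: every module on the datapath is brought up to the depth of the slowest module in its stage by inserting delay registers, so all outputs emerge in the same cycle. A minimal sketch with made-up module depths:

```python
# Balance pipeline depths within one stage by padding with delay registers.
modules = {"multiplier": 3, "adder": 1, "shifter": 1}  # native pipeline depths

target = max(modules.values())
for name, depth in modules.items():
    pad = target - depth
    print(f"{name}: depth {depth} + {pad} delay register(s) -> {target}")
# All modules now have equal depth, so they output data synchronously.
```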

Computing resource scheduling method and device and electronic equipment

The invention provides a computing resource scheduling method and device, and electronic equipment. The method is executed by a scheduling device. While a to-be-processed object of a target computing task is being processed through a computing flow graph, the method monitors the current computing load of each computing node in the graph. The computing flow graph comprises a plurality of computing nodes and data transmission pipelines between connected nodes; each computing node executes a sub-task of the target computing task on a thread in the scheduling device, and transmits its output downstream through a data transmission pipeline. When the current computing load of a target computing node reaches a preset computing-power bottleneck state, computing resources are scheduled to that node. The invention thereby improves the computing efficiency of the scheduling device under limited computing resources.
Owner:BEIJING KUANGSHI TECH
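
The scheduling rule can be sketched as: watch each node's load while the graph runs, and grant a spare worker thread to any node that hits the bottleneck threshold, busiest first. The threshold, node names, and loads below are all illustrative:

```python
# Grant spare threads to bottlenecked nodes, busiest first.
BOTTLENECK = 0.90   # load at which a node counts as compute-bound (illustrative)

def rebalance(nodes, spare_threads):
    for name, node in sorted(nodes.items(), key=lambda kv: -kv[1]["load"]):
        if node["load"] >= BOTTLENECK and spare_threads > 0:
            node["threads"] += 1
            spare_threads -= 1
    return nodes

graph = {"decode": {"load": 0.95, "threads": 1},
         "infer":  {"load": 0.55, "threads": 1},
         "encode": {"load": 0.92, "threads": 1}}
print(rebalance(graph, spare_threads=1))   # only the busiest node gets the thread
```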

2-D parity checkup correction method and its realization hardware based on advanced encryption standard

The invention relates to the technical field of integrated circuit design, specifically to a two-dimensional parity-check error detection method for the Advanced Encryption Standard (AES) and hardware implementing it. By performing a two-dimensional parity check in the horizontal and vertical directions on the AES data, the method completely covers all odd-numbered errors, achieves a very high coverage rate for even-numbered errors — in particular, completely covering the case of exactly two errors — and can effectively resist the impact of errors. The implementing hardware uses a fully parallel construction between the principal operation module and the two-dimensional parity-check computation module: in the principal operation module, the 128-bit data is divided into 4 groups of 32-bit data and a two-stage pipeline architecture is used, while the parity-check computation module uses a 32-bit computing mode. The hardware structure does not affect AES data throughput, the extra hardware introduced has low cost, and the invention suits application areas with high security requirements and strict hardware-area constraints.
Owner:FUDAN UNIV
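
The 2-D check can be sketched over a 128-bit state arranged as four 32-bit words: one horizontal parity per word, and 32 vertical parities across the words. Any single flipped bit breaks one row parity and one column parity, and any two-bit error is caught by at least one of the two directions (two flips in the same row break two column parities; two in the same column break two row parities):

```python
# 2-D parity over 128 bits viewed as four 32-bit words.
def parity_2d(words):  # words: four 32-bit integers
    rows = [bin(w).count("1") & 1 for w in words]      # horizontal parities
    col = words[0] ^ words[1] ^ words[2] ^ words[3]    # 32 vertical parities
    return rows, col

state = [0xDEADBEEF, 0x01234567, 0x89ABCDEF, 0xFEEDFACE]
ref = parity_2d(state)
state[2] ^= 1 << 7                       # inject a single-bit fault
assert parity_2d(state) != ref           # the fault is detected
print("fault detected")
```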

Parallel computing method and device for natural language processing model, equipment and medium

The invention discloses a parallel computing method and device for a natural language processing model, equipment and a medium. In this scheme, the computing devices within each computing node group are trained in pipeline-parallel mode, while different computing node groups share gradients in data-parallel mode; pipeline parallelism is thus confined to a limited number of nodes, avoiding the problem of overly long pipelines in large-scale training. The method is therefore well suited to parallel training of large-scale network models on large-scale computing nodes. Moreover, the scheme hides the synchronous communication between computing node groups inside the pipeline-parallel computation, so each node group can enter the next iteration as soon as its current iteration finishes. This shortens the computation time of natural language processing model training and improves the efficiency of distributed training while preserving the model's processing quality.
Owner:NAT UNIV OF DEFENSE TECH
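
A rough sketch of the hybrid layout, with invented group and stage counts: devices within a group form the pipeline over the model's stages, and matching stages across groups form the data-parallel sets that share gradients. The comment marks where the abstract's communication-hiding applies:

```python
# Hybrid pipeline + data parallel placement (sizes are illustrative).
GROUPS, STAGES = 2, 4   # 2 data-parallel groups, 4 pipeline stages each

placement = {(g, s): f"device{g * STAGES + s}"
             for g in range(GROUPS) for s in range(STAGES)}

for s in range(STAGES):
    peers = [placement[(g, s)] for g in range(GROUPS)]
    # Gradient sharing for stage s runs across groups; per the abstract it is
    # hidden inside (overlapped with) the pipeline-parallel computation.
    print(f"stage {s}: pipeline rank {s}, gradient-sharing peers {peers}")
```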

Credible platform and method for controlling hardware equipment by using same

The invention relates to a trusted platform and a method for controlling hardware devices with it, belonging to the computer field. The trusted platform comprises the hardware devices and a trusted-platform control module with active-control capability. Hardware units such as an active measurement engine, a control ruling engine, a working-mode customization engine and a trusted-control-strategy configuration engine are arranged in the control module to actively check the hardware devices' working-mode configuration information, control-strategy configuration information, firmware code, circuit working states and the like. Through the identity-legitimacy authentication and active control of trusted hardware devices realized by the trusted-pipeline technique, together with the active-control and active-check functions, a security control system for trusted hardware devices that cannot be bypassed by upper layers can still be provided to accessors of the trusted platform in untrusted or low-trust computing environments, without modifying the computing platform's system structure or noticeably reducing system performance.
Owner:BEIJING UNIV OF TECH
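
One piece of the "active measurement" idea can be sketched as a firmware check: before a device is admitted to the trusted path, hash its firmware and compare against a stored reference. The device name and reference list are invented for illustration; the patent's engines do considerably more:

```python
# Active measurement sketch: admit a device only if its firmware matches.
import hashlib

REFERENCE = {"nic": hashlib.sha256(b"known-good nic firmware").hexdigest()}

def admit(device, firmware: bytes) -> bool:
    measured = hashlib.sha256(firmware).hexdigest()
    return REFERENCE.get(device) == measured   # any mismatch is ruled out

print(admit("nic", b"known-good nic firmware"))   # True: admitted
print(admit("nic", b"tampered firmware"))         # False: blocked
```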

Efficient zero-knowledge proof accelerator and method

The invention relates to an efficient zero-knowledge proof accelerator that provides a high-compute, high-efficiency hardware carrier for zero-knowledge proof calculation. The method adopts a fine-grained pipeline architecture for multi-scalar multiplication: several elliptic-curve point-addition architectures are integrated onto a single large-number modular-multiplication hardware circuit without increasing chip area, so pipelined acceleration of elliptic-curve point addition requires only one such multiplier circuit. Multiple large-number modular-multiplication circuits are further integrated, so that point additions on several elliptic curves can be accelerated in parallel. Compared with the prior art, the design is therefore more flexible and suits ASICs and FPGAs of different scales.
Owner:SHENZHEN INST OF ADVANCED TECH CHINESE ACAD OF SCI
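
The resource-sharing idea — many point operations funneled through one big modular multiplier — can be sketched by routing every field product of an affine point doubling through a shared multiplier object and counting the calls. The toy curve and field size are illustrative, not the patent's parameters:

```python
# All field products of a point doubling go through one shared multiplier.
P = 97  # toy prime field (real accelerators use ~256-bit and larger primes)

class SharedModMul:
    calls = 0
    def mul(self, a, b):
        SharedModMul.calls += 1        # in hardware: one pipelined circuit, time-shared
        return (a * b) % P

def point_double(x, y, a, m):
    inv2y = pow(2 * y, -1, P)          # field inversion (a separate unit)
    lam = m.mul((m.mul(3, m.mul(x, x)) + a) % P, inv2y)   # slope = (3x^2+a)/2y
    x3 = (m.mul(lam, lam) - 2 * x) % P
    y3 = (m.mul(lam, (x - x3) % P) - y) % P
    return x3, y3

m = SharedModMul()
print(point_double(3, 6, 2, m), "modmul calls:", SharedModMul.calls)  # 5 calls
```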

Parallel pipeline computing device of CNC interpolation

The invention provides a parallel pipeline computing device for CNC interpolation, comprising a parallel/pipeline computing member CU3B formed from a plurality of computing units CU3, and a data memory. Each computing unit CU3 comprises six adders, two 1-bit right shifters, two 2-bit right shifters and one 3-bit right shifter. The four left-half data outputs β0l, β1l, β2l, β3l and the four right-half data outputs β0r, β1r, β2r, β3r of a computing unit CU3 are connected, respectively, to the four data inputs β0, β1, β2, β3 of the next two computing units CU3, so that 2^n−1 computing units CU3 form the parallel/pipeline computing member CU3B; the β(0.5) data output of each computing unit CU3 is connected to the data memory. Compared with the prior art, the device computes quickly with accurate results, is suitable for reconfigurable chip-level parallel pipeline computation, and meets developing industrial requirements.
Owner:柏安美创新科技 (广州) 有限公司