241 results about how to "Improve locality" patented technology

Parallel Array Architecture for a Graphics Processor

A parallel array architecture for a graphics processor includes a multithreaded core array including a plurality of processing clusters, each processing cluster including at least one processing core operable to execute a pixel shader program that generates pixel data from coverage data; a rasterizer configured to generate coverage data for each of a plurality of pixels; and pixel distribution logic configured to deliver the coverage data from the rasterizer to one of the processing clusters in the multithreaded core array. The pixel distribution logic selects one of the processing clusters to which the coverage data for a first pixel is delivered based at least in part on a location of the first pixel within an image area. The processing clusters can be mapped directly to the frame buffer partitions without a crossbar so that pixel data is delivered directly from the processing cluster to the appropriate frame buffer partitions. Alternatively, a crossbar coupled to each of the processing clusters is configured to deliver pixel data from the processing clusters to a frame buffer having a plurality of partitions. The crossbar is configured such that pixel data generated by any one of the processing clusters is deliverable to any one of the frame buffer partitions.
Owner:NVIDIA CORP
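
The location-based selection described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the mapping, so the tile size of 16 pixels, the diagonal interleave formula and the name `select_cluster` are all assumptions made for illustration only.

```python
def select_cluster(x: int, y: int, num_clusters: int, tile_size: int = 16) -> int:
    """Pick a processing cluster from a pixel's position in the image.

    Hypothetical mapping: the image is cut into square tiles and tiles are
    interleaved diagonally across clusters. The patent abstract only says the
    choice depends on pixel location; this formula is an assumption.
    """
    tile_x = x // tile_size  # tile column containing the pixel
    tile_y = y // tile_size  # tile row containing the pixel
    return (tile_x + tile_y) % num_clusters


if __name__ == "__main__":
    for pixel in [(0, 0), (16, 0), (15, 15), (16, 16)]:
        print(pixel, "-> cluster", select_cluster(*pixel, num_clusters=4))
```

Pixels in the same tile always go to the same cluster, which keeps neighbouring pixels (and the memory they touch) together.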

System and method for class loader constraint checking

An object-oriented computer system includes two or more class loaders for loading program class files into the system. A constraint checking mechanism is provided so that where a first class file loaded by a first class loader makes a symbolic reference to a second class file loaded by a second class loader, with said symbolic reference including a descriptor of a third class file, the constraint enforces that the first and second class files agree on the identity of the third class file. The constraint checking mechanism stores a list of constraints as a set of asymmetric relationships between class loaders. Each stored constraint, for a class loader which loaded a class file that contains a symbolic reference to another class file, includes a first parameter denoting the class loader which loaded the class file to which the symbolic reference is made; and a second parameter denoting a class file which is identified by a descriptor in said symbolic reference.
Owner:IBM CORP

Systems and methods for parallelizing and optimizing sparse tensor computations

Active | US20150169369A1 | Efficiently execute | Facilitate parallel execution | Program initiation/switching | Interprogram communication | Tensor | Scheduling system
A scheduling system can schedule several operations for parallel execution on a number of work processors. At least one of the operations is not to be executed, and the determination of which operations are to be executed and which are not can be made only at run time. The scheduling system partitions the subset of operations that excludes the one or more operations not to be executed into several groups, based at least in part on an irregularity of operations resulting from the one or more operations that are not to be executed. In addition, the partitioning is based at least in part on the locality of data elements associated with the subset of operations to be executed, or on the loading of the several work processors.
Owner:QUALCOMM TECHNOLOGIES INC

System and method for loop unrolling in a dynamic compiler

Provided is a method for performing loop-unrolling optimization during program execution. In one example, a method for loop optimization within a dynamic compiler system is disclosed. A computer program having a loop structure is executed, wherein the loop structure includes a loop exit test to be performed during each loop iteration. The loop structure is compiled during the execution of the computer program, and an unrolled loop structure is created during the compiling operation. The unrolled loop structure includes a plurality of loop bodies based on the original loop structure. Further, the unrolled loop structure can include the loop exit test, which can be performed once for each iteration of the plurality of loop bodies.
Owner:ORACLE INT CORP
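
The effect of unrolling on the loop exit test can be shown with a small hand-written sketch. This is not the dynamic compiler described in the abstract above: the unrolling is done by hand, and the unroll factor of 4 and the function names are assumptions chosen for illustration.

```python
def sum_original(values):
    """Original loop: the exit test runs once per element."""
    total = 0
    i = 0
    n = len(values)
    while i < n:  # loop exit test, evaluated on every iteration
        total += values[i]
        i += 1
    return total


def sum_unrolled(values):
    """Hand-unrolled by an assumed factor of 4: one exit test per four bodies."""
    total = 0
    i = 0
    n = len(values)
    while i + 4 <= n:  # one exit test covers four copies of the loop body
        total += values[i]
        total += values[i + 1]
        total += values[i + 2]
        total += values[i + 3]
        i += 4
    while i < n:  # remainder loop for the last 0 to 3 elements
        total += values[i]
        i += 1
    return total


if __name__ == "__main__":
    data = list(range(10))
    print(sum_original(data), sum_unrolled(data))
```

Both functions return the same result for any input length; the unrolled version simply evaluates the exit test about a quarter as often.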

Method and apparatus for performing fast MMU simulation, and full-system simulator

The invention provides a method for performing fast MMU simulation for a computer program in a computer system, wherein an address injection space of predetermined size, in which a virtual page number and a corresponding physical page number are stored, is allocated in the computer system. The method comprises the following steps: for a load/store instruction in a code segment of the computer program, comparing the virtual page number of the virtual address of the load/store instruction with the virtual page number stored in the address injection space; if the virtual page numbers are identical, obtaining the corresponding physical address from the physical page number stored in the address injection space; otherwise, performing a translation lookaside buffer (TLB) lookup to obtain the corresponding physical address; and reading data from, or writing data to, the obtained physical address. The invention also discloses an apparatus and a full-system simulator for implementing the above method.
Owner:INT BUSINESS MASCH CORP
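
The compare-then-fall-back translation in the abstract above can be sketched as follows. Several details are assumptions rather than part of the source: a 4 KiB page size, a single stored page-number pair, a plain dictionary standing in for the slower TLB lookup, and refreshing the stored pair after a slow lookup. The class name `FastMMU` is hypothetical.

```python
PAGE_SHIFT = 12                      # assumed 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class FastMMU:
    """Sketch of a one-entry fast path in front of a slower lookup."""

    def __init__(self, page_table):
        self.page_table = page_table  # dict vpn -> ppn; stands in for the TLB lookup
        self.cached_vpn = None        # virtual page number held in the fast-path slot
        self.cached_ppn = None        # matching physical page number
        self.fast_hits = 0
        self.slow_lookups = 0

    def translate(self, vaddr: int) -> int:
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & PAGE_MASK
        if vpn == self.cached_vpn:
            # Fast path: the stored virtual page number matches.
            self.fast_hits += 1
            ppn = self.cached_ppn
        else:
            # Slow path: full lookup (raises KeyError for an unmapped page).
            self.slow_lookups += 1
            ppn = self.page_table[vpn]
            # Assumption: remember this pair for the next access.
            self.cached_vpn, self.cached_ppn = vpn, ppn
        return (ppn << PAGE_SHIFT) | offset


if __name__ == "__main__":
    mmu = FastMMU({0x1: 0x80, 0x2: 0x90})
    for addr in (0x1004, 0x1008, 0x2000):
        print(hex(addr), "->", hex(mmu.translate(addr)))
    print("fast hits:", mmu.fast_hits, "slow lookups:", mmu.slow_lookups)
```

Consecutive accesses to the same page take the fast path, which is why the scheme benefits from locality in the simulated program's memory accesses.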

Mine roof and floor water inrush monitoring and prediction system and method

The invention discloses a mine roof and floor water inrush monitoring and prediction system and a mine roof and floor water inrush monitoring and prediction method. The system comprises a ground control room host terminal, an underground site host, a comprehensive cable assembly and a plurality of detection terminals, wherein the underground site host is connected with the ground control room host terminal; the comprehensive cable assembly is connected with the underground site host; the detection terminals are arranged on a roadway wall, a roadway floor or a roadway roof along the mine roadway direction; each detection terminal comprises a controller, a three-dimensional vibration sensor, an electrode and a memory connected with the controller; signal output ends of the three-dimensional vibration sensor and the electrode are connected with a signal input end of the controller; a data communication end of each controller is connected with the underground site host through the comprehensive cable assembly. The system and the method can significantly improve the accuracy and real-time performance of mine roof and floor water inrush monitoring results.
Owner:WUHAN CONOURISH COALMINE SAFETY TECH

Dynamic label matching scheduling method under Hadoop Platform

The present invention discloses a dynamic label matching scheduling method under a Hadoop platform, and belongs to the field of computer software. To address the large performance differences among Hadoop cluster nodes, the randomness of resource allocation and excessively long execution times, the present invention provides a scheduler that dynamically matches a node performance label (hereinafter, node label) with a job category label (hereinafter, job label). Nodes are initially classified and assigned original node labels; each node monitors its own performance indicators to generate a dynamic node label; jobs are classified according to partial run information to generate job labels; and a resource scheduler allocates node resources to the jobs with the corresponding label. Experimental results show that the scheduler provided by the present invention shortens job execution time compared with the scheduler shipped with YARN.
Owner:BEIJING UNIV OF TECH

Method and apparatus for rasterizing in a hierarchical tile order

A method and apparatus for efficiently rasterizing graphics is provided. The method is intended to be used in combination with a frame buffer that provides fast tile-based addressing. Within this environment, frame buffer memory locations are organized into a tile hierarchy. For this hierarchy, smaller low-level tiles combine to form larger mid-level tiles. Mid-level tiles combine to form high-level tiles. The tile hierarchy may be expanded to include more levels, or collapsed to include fewer levels. A graphics primitive is rasterized by selecting a starting vertex. The low-level tile that includes the starting vertex is then rasterized. The remaining low-level tiles that are included in the same mid-level tile as the starting vertex are then rasterized. Rasterization continues with the mid-level tiles that are included in the same high-level tile as the starting vertex. These mid-level tiles are rasterized by rasterizing their component low-level tiles. The rasterization process proceeds bottom-up, completing at each lower level before completing at higher levels. In this way, the present invention provides a method for rasterizing graphics primitives that accesses memory tiles in an orderly fashion. This reduces page misses within the frame buffer and enhances graphics performance.
Owner:MICROSOFT TECH LICENSING LLC
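
The bottom-up visiting order in the abstract above can be sketched for a two-level hierarchy. This is a simplification under stated assumptions: the whole grid stands in for one high-level tile, every low-level tile is treated as covered by the primitive, each mid-level tile is 2x2 low-level tiles, and the remaining mid-level tiles are visited in row-major order. The function names are hypothetical.

```python
def tiles_in_mid(mx, my, mid, grid_w, grid_h):
    """Low-level tiles (x, y) belonging to mid-level tile (mx, my)."""
    return [(x, y)
            for y in range(my * mid, min((my + 1) * mid, grid_h))
            for x in range(mx * mid, min((mx + 1) * mid, grid_w))]


def hierarchical_tile_order(start, grid_w, grid_h, mid=2):
    """Visit order of low-level tiles, finishing each mid-level tile before the next.

    `start` is the low-level tile containing the starting vertex.
    """
    sx, sy = start
    start_mid = (sx // mid, sy // mid)
    mids_x = -(-grid_w // mid)  # ceiling division
    mids_y = -(-grid_h // mid)

    # 1. The low-level tile that contains the starting vertex.
    order = [start]
    # 2. The remaining low-level tiles of the same mid-level tile.
    order += [t for t in tiles_in_mid(*start_mid, mid, grid_w, grid_h) if t != start]
    # 3. The other mid-level tiles, each completed before moving on.
    for my in range(mids_y):
        for mx in range(mids_x):
            if (mx, my) != start_mid:
                order += tiles_in_mid(mx, my, mid, grid_w, grid_h)
    return order


if __name__ == "__main__":
    for tile in hierarchical_tile_order((1, 1), 4, 4):
        print(tile)
```

Because every group of four consecutive tiles comes from one mid-level tile, memory accesses stay within a small region before moving on, which is the locality benefit the abstract attributes to the ordering.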

Method optimizing sparse matrix vector multiplication to improve incompressible pipe flow simulation efficiency

Active | CN103984527A | Improve data locality and cache hit ratio | Few influencing factors | Concurrent instruction execution | Data transmission | Decomposition
The invention discloses a method for optimizing sparse matrix-vector multiplication to improve the efficiency of incompressible pipe flow simulation. The method uses a QCSR storage structure, which combines the advantages of a quadtree structure and the CSR storage structure, to recursively decompose and rearrange a sparse matrix for storage, so that the sparse matrix-vector multiplication applies to general matrix forms and is particularly suitable for matrices that are sparse overall but contain a plurality of locally dense sub-matrices. Sparse matrix-vector multiplication based on the QCSR storage structure is implemented in a CPU/GPU (central processing unit / graphics processing unit) heterogeneous parallel system through four strategies: thread mapping optimization, data storage optimization, data transmission optimization and data reuse optimization. The method improves data locality and the cache hit rate during the numerical computation of sparse matrix-vector multiplication and achieves better computational and overall speedup, thereby improving the efficiency of incompressible pipe flow simulation.
Owner:HANGZHOU DIANZI UNIV
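
For context, the standard CSR (compressed sparse row) format that the abstract above builds on can be sketched in a few lines. This is plain CSR only, not the QCSR structure, whose quadtree decomposition the abstract does not specify; the function names are chosen for illustration.

```python
def dense_to_csr(matrix):
    """Convert a dense list-of-lists matrix to CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)    # non-zero value
                col_idx.append(j)   # its column index
        row_ptr.append(len(values))  # where the next row starts in `values`
    return values, col_idx, row_ptr


def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        # Non-zeros of row i are stored contiguously, so this inner loop
        # reads `values` and `col_idx` sequentially.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


if __name__ == "__main__":
    A = [[1, 0, 2],
         [0, 0, 3],
         [4, 5, 0]]
    vals, cols, ptrs = dense_to_csr(A)
    print(csr_spmv(vals, cols, ptrs, [1, 2, 3]))
```

In plain CSR the reads of `x[col_idx[k]]` are irregular, which is the kind of access pattern that blocking schemes for locally dense sub-matrices aim to improve.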

Data block balancing method in operation process of HDFS (Hadoop Distributed File System)

The invention discloses a data block balancing method for use during HDFS (Hadoop Distributed File System) operation. The method comprises the following steps: first, pre-processing the local task list of each node and dividing it into entirely local tasks and non-entirely local tasks, which provides the basis for deciding when to start HDFS data block balancing; second, estimating the operation rate of each node and predicting its task requests; third, designing and implementing an assignment process for each node after the preceding steps are complete; fourth, selecting suitable nodes and moving a data block between them so that the distribution of data blocks matches the predicted sequence of node task requests; and finally, balancing the data blocks. By predicting node task requests in advance, the method identifies non-local map task executions that may occur and moves the appropriate data blocks between the corresponding nodes, so that nodes can be assigned local map tasks when they send their actual task requests. The completion efficiency of the Map step is thereby improved.
Owner:XI AN JIAOTONG UNIV