231 results about "Parallel optimization" patented technology

Software performance optimization method based on central processing unit (CPU) multi-core platform

The invention provides a software performance optimization method based on a CPU multi-core platform. The method comprises software characteristic analysis, formulation of a parallel optimization scheme, and implementation and iterative tuning of that scheme. Specifically, the method covers application software characteristic analysis, serial algorithm analysis, CPU multi-in / thread parallel algorithm design, multi-buffer design, design of communication modes among threads, memory access optimization, cache optimization, processor vectorization optimization, mathematical function library optimization and the like. The method is widely applicable wherever multi-thread parallel processing is required. It guides software developers in parallelizing existing software with multiple threads quickly and efficiently, with short development cycles and low development costs. It optimizes the software's use of system resources, overlaps data reading, computation and data write-back so that each masks the latency of the others, minimizes software running time, markedly improves hardware resource utilization, and enhances computing efficiency and overall software performance.
Owner:LANGCHAO ELECTRONIC INFORMATION IND CO LTD

System and method for assisting customers in choosing a bundled set of commodities using customer preferences

A system and method for assisting a customer in choosing a combination of commodities based on preferences of the customer. A combination is a set of related commodities, wherein bundling discounts may be applied to particular bundles of related commodities. Combination options are created by optimizing the categories within the combination in parallel, and then selecting the best value options from each category into a grouping. The effective cost of a grouping is calculated as the total of the effective costs of each option within the grouping. The effective costs consider weighted values of performance features in addition to the actual cost of a commodity. The groupings are ranked and presented to the user, so that the user may select a grouping as a combination of commodities for purchase.
Owner:GLOBYS
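
The ranking described above can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the catalog, the preference weights, and the reading of "effective cost" as price minus the weighted feature value are all assumptions, and bundling discounts are not modeled.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical catalog: category -> list of (name, price, {feature: score}).
CATALOG = {
    "phone": [("basic", 20.0, {"minutes": 2}), ("plus", 35.0, {"minutes": 8})],
    "internet": [("dsl", 30.0, {"speed": 3}), ("fiber", 50.0, {"speed": 9})],
}
# Hypothetical customer preference weights (currency units per feature point).
WEIGHTS = {"minutes": 2.0, "speed": 3.0}


def effective_cost(option):
    """Price minus the weighted value of the option's features (assumed formula)."""
    _, price, features = option
    return price - sum(WEIGHTS.get(f, 0.0) * v for f, v in features.items())


def best_options(options, keep=2):
    """Return the `keep` options with the lowest effective cost."""
    return sorted(options, key=effective_cost)[:keep]


def rank_groupings(catalog, keep=2):
    # Optimize each category independently, in parallel.
    with ThreadPoolExecutor() as pool:
        shortlists = list(pool.map(lambda opts: best_options(opts, keep),
                                   catalog.values()))
    # Take one option per category; a grouping costs the sum of its options.
    groupings = [
        (sum(effective_cost(o) for o in combo), tuple(o[0] for o in combo))
        for combo in product(*shortlists)
    ]
    return sorted(groupings)  # cheapest grouping first


ranked = rank_groupings(CATALOG)
```

Because the categories do not interact until the final combination step, each shortlist can be computed on its own worker.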

Design optimization method and optimization device of power assembly mounting system

The invention provides a design optimization method and an optimization device for a power assembly mounting system. The design optimization method comprises the steps of: establishing a differential equation for a spatial six-degree-of-freedom vibration model of the power assembly mounting system; analyzing the inherent characteristics of the differential equation to obtain the natural frequencies, the natural mode shapes and the vibration energy coupling among the six degrees of freedom; establishing a multi-objective optimization function of the power assembly mounting system from the natural frequency of each order, the mode shapes and the vibration energy coupling; and carrying out the optimization design with a particle swarm optimization algorithm. A dynamic model and the optimization function of the power assembly mounting system are established, the multi-objective optimization function is defined with a reasonable distribution of mounting modal frequencies and the degree of energy decoupling as its targets, and a parallel multi-objective optimization algorithm is then used to obtain a set of multi-objective optimization schemes, so that the designed power assembly mounting system best meets the performance requirements for energy decoupling and modal distribution.
Owner:BAIC MOTOR CORP LTD
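
As a rough illustration of the particle swarm step, the sketch below minimizes a weighted sum of two toy objectives. It is a simplification under stated assumptions: the patent describes a multi-objective formulation over mounting parameters, whereas the objective function, the weights, and the coefficients `w`, `c1`, `c2` here are hypothetical, and the particles are evaluated sequentially rather than in parallel.

```python
import random


def pso(objective, dim, bounds, particles=30, iters=200, seed=0):
    """Minimize `objective` over a box with a basic global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive and social coefficients
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]
    for _ in range(iters):
        for i in range(particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c2 * rng.random() * (g_pos[d] - pos[i][d]))
                # Keep the particle inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:          # update the personal best
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:            # update the swarm best
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def scalarized(x):
    """Hypothetical stand-in: weighted sum of two design targets."""
    frequency_error = (x[0] - 1.0) ** 2
    coupling_error = (x[1] + 2.0) ** 2
    return 0.6 * frequency_error + 0.4 * coupling_error


best, value = pso(scalarized, dim=2, bounds=(-10.0, 10.0))
```

Each particle's objective evaluation is independent of the others within an iteration, which is what makes the swarm a natural fit for parallel evaluation.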

Concurrent optimization of physical design and operational cycle assignment

Some embodiments provide a method of designing a configurable integrated circuit (“IC”) with several configurable circuits. The method receives a design having several different operations for the configurable circuits to perform in different operational cycles. The method assigns the operations concurrently to different operational cycles and different configurable circuits. In some embodiments, the method concurrently optimizes the assignment of the operations to different operation cycles and different configurable circuits. In some embodiments, the optimization includes moving the operations between different operational cycles and different configurable circuits in order to identify an assignment of the operations that satisfies a set of optimization criteria.
Owner:ALTERA CORP

Data parallel processing method and system

The invention provides a data parallel processing method comprising the following steps: 1, a main management node receives data and acquires the incidence relation of the data; 2, the main management node calculates the allocatable GPUs and the GPU workloads of the work computing nodes; 3, the main management node partitions the data and distributes the partitions to the work computing nodes; 4, the work computing nodes process the received data in parallel and transmit the results back to the main management node; 5, the main management node merges the results and outputs them. The method has the following advantages: a master-slave architecture is adopted for high-performance, large-scale data parallel processing; the specific operations converted from application programs are partitioned into stages according to DNA feature modeling; operations are deployed at node granularity according to the partition result; and the execution efficiency of parallel data-flow tasks within a single node is improved by a thread parallel optimization mechanism that makes full use of multiple computing cores.
Owner:SHANGHAI JIAO TONG UNIV +1
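
Steps 3 to 5 above follow a familiar partition, process, merge pattern. The sketch below shows that pattern only: threads stand in for the GPU work nodes, and the capacity list, the `worker` computation, and the proportional split are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(data, capacities):
    """Split data into contiguous chunks sized in proportion to worker capacity."""
    total = sum(capacities)
    chunks, start = [], 0
    for k, cap in enumerate(capacities):
        # The last worker takes whatever remains so that no item is dropped.
        if k == len(capacities) - 1:
            end = len(data)
        else:
            end = start + len(data) * cap // total
        chunks.append(data[start:end])
        start = end
    return chunks


def worker(chunk):
    """Stand-in for the per-node computation: a sum of squares."""
    return sum(x * x for x in chunk)


def master(data, capacities):
    chunks = partition(data, capacities)            # step 3: partition, distribute
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        partials = list(pool.map(worker, chunks))   # step 4: parallel processing
    return sum(partials)                            # step 5: merge the results


result = master(list(range(10)), capacities=[1, 2, 2])
```

Sizing the chunks by capacity mirrors step 2, where the master estimates how much work each node can absorb before distributing data.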

Treating method and system for MDX multidimensional data search statement

The present invention discloses a method and system for processing MDX multidimensional data query statements. The user sends an MDX multidimensional query task to a query processing subsystem; the query processing subsystem reads the task, parses it directly, applies serial and parallel optimization, and decomposes it into several sub-query tasks; the query processing subsystem sends the sub-query plans to the corresponding data warehouse servers; the data warehouse servers execute the plans and return the results to the query processing subsystem; and the query processing subsystem combines the results and returns them to the user. The present invention parses and processes the MDX language directly, achieves high query efficiency, and is especially suitable for multidimensional data queries.
Owner:陈红 +5

Parallel optimizer hints with a direct manipulation user interface

A method, apparatus, and article of manufacture for directly manipulating a query for a relational database management system (RDBMS). The query is transformed into an operator tree that is displayed on a monitor, wherein the operator tree includes nodes for data sources and operators referenced in the query, and lines between the nodes. The RDBMS alters an execution plan for the query in response to one or more manipulations made to the displayed operator tree by the user. Generally, these manipulations comprise hints for an optimizer function of the RDBMS that allow an efficient execution plan to be generated for the query. Specifically, the hints influence the optimizer to choose one execution plan over another when there is insufficient information for the optimizer function to make a choice on its own.
Owner:TERADATA US

FPGA parallel acceleration system based on CNN image quality enhancement algorithm

The invention discloses an FPGA parallel acceleration system based on a CNN image quality enhancement algorithm. The FPGA parallel acceleration system comprises a central processing unit, a DMA controller, a bus module, an accelerator IP core module, an on-chip memory BRAM and an off-chip memory SDRAM. The central processing unit performs fixed-point quantification on the weight data of the trained convolutional neural network model to obtain quantified weight data and stores the quantified weight data in the off-chip SDRAM; the DMA controller carries the weight data pre-stored in the off-chip SDRAM and the video image data to be processed to the on-chip memory BRAM for block storage; the accelerator IP core module is optimized with multiplier parallel optimization, dimension conversion, pipelined line caching and a shared ping design; the central processing unit starts the accelerator IP core module, which obtains data from the BRAM to carry out the forward calculation of the network, and the resulting picture is carried to the off-chip SDRAM. According to the invention, power consumption is greatly reduced, FPGA resource utilization and operating efficiency are balanced, and the video image application requirements of an actual embedded scene can be met.
Owner:SOUTHEAST UNIV

Software hybrid measure method based on trusted computing

The invention relates to a software hybrid measure method based on trusted computing. The method comprises two stages. In the preprocessing stage, the program source code is analyzed and instrumented, the behavior characteristics of the software are extracted, a software behavior characteristic library is generated, a software integrity measure strategy is embedded, and an executable program to be measured is generated. In the measuring stage, integrity is measured with a parallel optimization algorithm when the executable program is started, according to the integrity measure strategy and the software behavior characteristic library, and the running program is measured dynamically in real time. Static and dynamic software measure are both supported: software integrity measure and real-time dynamic behavior measure are combined by means of parallel optimization, strategy embedding, instrumentation and system call division, and the method offers good measure efficiency and low measure overhead.
Owner:THE PLA INFORMATION ENG UNIV

Ultra-dimension fluvial dynamics self-adapting parallel monitoring method

The invention discloses a super-dimensional river dynamics self-adaptive parallel monitoring method, which includes the following steps: inputting super-dimensional data into a system and classifying the data by dimension; creating a super-dimensional unstructured-grid river dynamics model based on a characteristic-type high-resolution numerical algorithm; and performing intra-dimensional and inter-dimensional calculations with an efficient parallel algorithm built on a super-dimensional fluid splitting scheme. The calculation region is divided into a plurality of sub-regions, each sub-region is mapped onto a calculation node of the parallel system, the nodes communicate through a standard message passing interface, a parallel optimization technique overlaps calculation and communication on the self-adaptive grid, and variables associated with space are calculated independently. The method runs the splitting scheme for super-dimensional river dynamics efficiently in parallel on the adaptive grid while handling changes of dimension, and monitors rivers conveniently, promptly and accurately.
Owner:SHENZHEN INST OF ADVANCED TECH

Improved fuzzy C-mean clustering method based on quantum particle swarm optimization

The invention relates to a clustering method, in particular relates to an improved fuzzy C-mean clustering method based on quantum particle swarm optimization, and belongs to the technical field of data mining and artificial intelligence. The improved fuzzy C-mean clustering method comprises the steps of: firstly, based on the conventional fuzzy C-mean clustering algorithm, improving the fuzzy accuracy of the conventional clustering algorithm by using a novel distance standard in place of a Euclidean standard; meanwhile classifying singly and quickly through using an AFCM (Adaptive Fuzzy C-means) algorithm to replace a randomly distributed initial clustering center to reduce the sensitivity of the clustering algorithm on the initial clustering center; and finally, introducing a QPSO (AQPSO (Adaptive-Quantum Particle Swarm Optimization)) parallel optimization concept based on distance improvement in a clustering process, so that the clustering algorithm has relatively strong overall search capability, relatively high convergence precision, and can guarantee the convergence speed and obviously improve the clustering effect.
Owner:重庆高新技术产业研究院有限责任公司

Quick digital image correlation measurement method based on stochastic parallel gradient descent optimization technology

The invention provides a quick digital image correlation measurement method based on stochastic parallel gradient descent optimization technology. The method comprehensively considers the related parameters, such as displacement and its derivatives, of any point in the speckle field of the object to be measured, and uses the stochastic parallel optimization technology to realize quick digital image correlation measurement. By applying stochastic parallel perturbations to the deformation parameters, the method drives the correlation coefficient to converge to a globally unique extremum, which yields the deformation parameters. The method is simple in principle and easy to implement. It is a DIC (digital image correlation) measurement method based on an entirely new concept, achieves quick DIC measurement with high precision and high reliability, and is expected to enable real-time online DIC measurement.
Owner:NAT UNIV OF DEFENSE TECH
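
The stochastic parallel perturbation idea can be sketched as follows. This is an illustrative toy, not the patented procedure: the `correlation` function is a hypothetical smooth stand-in for a real speckle-image correlation coefficient, and `TRUE_SHIFT`, `gain`, and `amp` are assumed values. All parameters are perturbed at once with random signs, the metric is measured at the two perturbed points, and each parameter moves along its own perturbation in proportion to the measured change.

```python
import random

# Hypothetical ground-truth displacement that the metric peaks at.
TRUE_SHIFT = (1.5, -0.5)


def correlation(params):
    """Toy correlation metric with a single maximum of 1.0 at TRUE_SHIFT."""
    return 1.0 - sum((p - t) ** 2 for p, t in zip(params, TRUE_SHIFT))


def spgd_maximize(metric, params, gain=5.0, amp=0.1, iters=300, seed=0):
    """Stochastic parallel gradient ascent using two-sided perturbations."""
    rng = random.Random(seed)
    u = list(params)
    for _ in range(iters):
        # Perturb every parameter simultaneously with a random sign.
        delta = [amp * rng.choice((-1.0, 1.0)) for _ in u]
        j_plus = metric([a + d for a, d in zip(u, delta)])
        j_minus = metric([a - d for a, d in zip(u, delta)])
        # Move each parameter along its own perturbation, scaled by the change.
        dj = j_plus - j_minus
        u = [a + gain * dj * d for a, d in zip(u, delta)]
    return u


estimate = spgd_maximize(correlation, [0.0, 0.0])
```

Only two metric evaluations are needed per iteration regardless of how many deformation parameters there are, which is the main appeal of this family of methods.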

Acquisition method of three freedom-degree transportation industrial robot multiple-objective optimization design parameter

The invention relates to a method for multiobjective optimization design of industrial robots, which consists of three steps: firstly, obtaining four performance indexes representing mechanical arm working space, strength, control energy and control time and a computing method thereof through establishment of a mechanical arm kinematics model, a strength analysis model, a dynamic model based on an electromechanical coupling system and a systemic closed-loop model controlled by inverse dynamics; secondly, selectively optimizing robot design parameters related to the four performance indexes to establish a multiobjective optimization design model; and finally, through a control method, adjusting mechanical arm design parameters, optimizing four objectives parallelly, and finally obtaining design values of the design parameters meeting the four performance requirements simultaneously so as to provide a method for overall improving the performance of the industrial robots.
Owner:SOUTHEAST UNIV

Distributed database multi-join query optimization algorithm

The invention provides a query optimization algorithm in the technical field of database, and is mainly used for solving the problem of distributed database multi-join query optimization. The technical scheme adopted by the invention is as follows: 1. pre-optimizing over-ternary relation join, and reducing searching space optimized by the operation sequence of the relation join; 2. formulating a pretreatment rule, and merging all relation joins after pre-optimization; 3. loading database statistic information, evaluating load of each processor, taking balanced load and minimum transmission cost between processors as a target, and adopting a graph partitioning method to distribute the relation joins to multiple processors for optimization. The invention can reduce the searching space optimized by the operation sequence of the joins by pre-optimizing multi-join, uses a collateral mechanism to reduce the scale of the optimization subproblem, and improves the efficiency of the multi-join query optimization effectively.
Owner:山东省标准化研究院

Picture searching method based on content and parallel optimization technique thereof

Inactive | CN102141994A | In line with visual habits | Optimize the key parts of the search system | Special data processing applications | Feature vector | Feature extraction
The invention discloses a picture searching method based on content and a parallel optimization technique thereof, and relates to internet picture search engine technology. It aims to accurately and rapidly find pictures whose content is similar to that of a picture submitted by a user. The user searches by picture: the user submits a query picture, and the picture search system returns pictures that are visually similar to it. The picture searching method comprises two parts, front-end query and back-end processing: the front end comprises a user input interface and a result return interface; the back end comprises feature extraction, similarity computation, feature vector dimension reduction and indexing. By exploiting the parallelism of the search system, the performance of the whole picture search system is optimized in both its serial and parallel aspects, thereby improving the response speed of queries.
Owner:苗乾坤

Convolution neural network algorithm optimization method and device based on Neon instruction

The invention provides a convolution neural network algorithm optimization method based on the Neon instruction. The method comprises the steps that matrix processing is carried out on the convolution kernel image of a convolution layer to acquire a corresponding A matrix, and the A matrix columns are aligned to a multiple of four; a to-be-convoluted image is input; matrix processing is carried out on the input image to acquire a corresponding B matrix; the B matrix rows are aligned to a multiple of four; the B matrix is transposed to acquire a transposed matrix Bt; the row-by-row dot product of the A matrix and the Bt matrix is calculated; and the Neon instruction is used for parallel optimization. Compared with the prior art, the method provided by the invention can effectively improve the computing performance of the convolution neural network.
Owner:BEIJING ICETECH SCI & TECH CO LTD
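
The data layout described above can be illustrated without any SIMD hardware. The sketch below is plain Python under stated assumptions: a single input channel, stride one, no border padding, and the usual CNN convention of not flipping the kernel. The helper names `pad4` and `dot4` are hypothetical, and `dot4` only mimics a four-lane register by accumulating in groups of four; real Neon intrinsics are not used.

```python
def pad4(row):
    """Zero-pad a list so that its length is a multiple of four."""
    return row + [0.0] * (-len(row) % 4)


def dot4(a, b):
    """Dot product accumulated four lanes at a time, like a 4-wide SIMD register."""
    acc = [0.0, 0.0, 0.0, 0.0]
    for i in range(0, len(a), 4):
        for lane in range(4):
            acc[lane] += a[i + lane] * b[i + lane]
    return sum(acc)


def conv2d(image, kernels):
    """Convolve one image with several kernels via the A / Bt row layout."""
    kh, kw = len(kernels[0]), len(kernels[0][0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    # A: one flattened kernel per row, columns padded to a multiple of four.
    A = [pad4([float(v) for r in k for v in r]) for k in kernels]
    # Bt: one flattened image patch per row (the transpose of matrix B),
    # padded the same way so that rows of A and Bt line up lane for lane.
    Bt = [pad4([float(image[y + dy][x + dx])
                for dy in range(kh) for dx in range(kw)])
          for y in range(oh) for x in range(ow)]
    # Row-by-row dot products of A and Bt yield every output pixel.
    return [[[dot4(a, Bt[y * ow + x]) for x in range(ow)] for y in range(oh)]
            for a in A]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
diag = conv2d(image, [[[1, 0], [0, 1]]])
ones = conv2d(image, [[[1, 1, 1], [1, 1, 1], [1, 1, 1]]])
```

Transposing B means both operands of each dot product are contiguous rows, which is what lets a vector unit load four values from each with a single instruction.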

ETL (Extract Transform Load)-based data optimization method and equipment

The embodiment of the invention provides an ETL (Extract Transform Load)-based data optimization method and ETL-based data optimization equipment. The method comprises the following steps of: previously arranging a plurality of data processing units according to a data extract, transform and load process ETL; previously setting a communication mechanism for the data processing units; acquiring instruction information including source data input by a user; constructing a data processing flow corresponding to the instruction information according to the source data; and optimizing the data processing flow according to the data processing units and the preset communication mechanism. By previously setting the data processing units and the communication mechanism, simplified optimization, branch parallel optimization and parallel optimization between records of data are realized, the processing efficiency of data optimization is increased, and hardware resources are saved.
Owner:BEIJING JOIN CHEER SOFTWARE

Mud-rock flow disaster process rapid simulation and visualization analysis method in network environment

The present invention belongs to the geographic information system virtual geographical environment research field, and especially relates to a mud-rock flow disaster simulation and visualization analysis technology. The present invention provides a mud-rock flow disaster process rapid simulation and visualization analysis method in a network environment. The method performs tight integration of the model, the visualization and the analysis and provides a parameter visualization arrangement interface to facilitate obtaining and setting of parameters; and moreover, the mud-rock flow disaster process rapid simulation and visualization analysis method in the network environment employs a parallel optimization method and a scale optimization selection method to greatly improve the accuracy and efficiency of the simulation calculation, the visualization and the analysis of the mud-rock flow disaster, constructs the network service and provides sharing and publishing of the disaster situation information so as to effectively support the emergency disposal of the mud-rock flow disaster.
Owner:朱军

Bimodal fusion tomography method based on iterative shrinkage

The invention belongs to the field of medical molecular imaging, and relates to an autofluorescence tomography and computed tomography bimodal fusion method, in particular to a bimodal fusion tomography method based on iterative shrinkage. The technology is used for quantifying the intensity of a light source in a reconstructed target body and positioning the light source, and for solving the inverse problem of recovering the complete internal light intensity distribution from the limited light intensity distribution on the surface of the target body. The main points of the technical scheme are: the surface light intensity information obtained in autofluorescence tomography and the internal geometric structure information obtained in computed tomography are fused; the complicated multi-dimension optimization process in reconstruction is converted by iterative shrinkage into an efficient cyclic process of one-dimension parallel optimizations; and reconstruction results that are accurate and robust to the regularization parameter, the lp norm, noise and the initial value are obtained. The technology can be effectively applied to research on the systemic physiologic metabolism of a target body, has a high reconstruction efficiency and is suitable for a condition with lower imaging system performance.
Owner:INST OF AUTOMATION CHINESE ACAD OF SCI
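
To show how iterative shrinkage reduces a coupled problem to independent one-dimensional updates, the sketch below runs the classic iterative shrinkage-thresholding scheme on a tiny least-squares problem. It assumes an l1 penalty (the abstract mentions a general lp norm), and the matrix `A`, data `b`, weight `lam`, and step size are made-up values rather than anything from the patent.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t: the closed-form one-dimensional minimizer."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(A, b, lam, step, iters=200):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 by iterative shrinkage.

    `step` should not exceed 1 / L, where L is the largest eigenvalue of A^T A.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        residual = [sum(A[i][j] * x[j] for j in range(n)) - b[i]
                    for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Every coordinate is updated independently of the others,
        # so this step can be carried out in parallel.
        x = [soft_threshold(x[j] - step * grad[j], step * lam)
             for j in range(n)]
    return x


A = [[2.0, 0.0], [0.0, 1.0]]
b = [4.0, 0.3]
x = ista(A, b, lam=0.5, step=0.25)
```

In this example the second coordinate is shrunk exactly to zero, which is the sparsity-promoting behavior that makes shrinkage attractive for localizing a small light source.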

Method and system for selecting optimal commodities based upon business profile and preferences

A system and method for assisting a customer in choosing a combination of commodities based on preferences of the customer. A combination is a set of related commodities, wherein bundling discounts may be applied to particular bundles of related commodities. Combination options are created by optimizing the categories within the combination in parallel, and then selecting the best value options from each category into a grouping. The effective cost of a grouping is calculated as the total of the effective costs of each option within the grouping. The effective costs consider weighted values of performance features in addition to the actual cost of a commodity. The groupings are ranked and presented to the user, so that the user may select a grouping as a combination of commodities for purchase.
Owner:GLOBYS

Parallel optimization method of low-illumination image enhancement based on CUDA

Inactive | CN104881848A | Improve parallelism | Meet the real-time processing effect | Image enhancement | Illuminance | Therapeutic effect
The invention discloses a parallel optimization method of low-illumination image enhancement based on CUDA. In a CPU / GPU heterogeneous mode, the low-illumination image enhancement algorithm runs entirely on the GPU, and input and output data are copied between the CPU and the GPU. Three kernels are used on the GPU; each kernel is launched with as many threads as there are image pixels; and the three kernels are responsible, respectively, for image inversion and estimation of the dark channel prior, for estimation of the overall atmospheric light, and for calculation of the transmissivity and the haze-removal model together with the final inversion of the image. In the low-illumination image enhancement algorithm based on dark channel prior haze-removal technology, the calculation of overall atmospheric light, which is not well suited to the GPU, is improved: the atmospheric light value is estimated from the brightness of the dark channel prior and of the image, which reduces data dependence. According to the invention, the visual effect of night images is improved and real-time processing is achieved.
Owner:XIDIAN UNIV

A supercomputer-based optimization method for fluid machinery simulation program

The invention discloses a supercomputer-based optimization method for a fluid machinery simulation program. Based on the architecture and programming characteristics of the Sunway TaihuLight supercomputing system and the characteristics of the fluid machinery simulation program, a feasible optimization scheme is proposed, which applies, in turn, block-based multi-core parallel optimization, DMA transmission optimization, data layout optimization, double buffer optimization, SIMD vectorization optimization and register communication optimization. The method gives developers who write, port or optimize fluid machinery simulation programs for the Sunway TaihuLight supercomputing platform a general optimization approach, makes full use of Sunway TaihuLight computing resources, improves program performance, and shortens simulation time.
Owner:XI AN JIAOTONG UNIV

Fast image interpolation method for mobile terminal

Inactive | CN105023241A | Improve interpolation quality | Smooth interpolation | Geometric image transformation | Video monitoring | Image resolution
The present invention discloses a fast image interpolation method for a mobile terminal. The method comprises the steps of: firstly, performing edge detection on a low-resolution original image to obtain edge information, calculating the strength of each edge according to that edge information and the human visual system, expanding the edges according to their strength, dividing the image into an edge region and a non-edge region according to the expanded edges, processing the edge region with a relatively high-fidelity interpolation algorithm, and storing the edge information; and secondly, processing the non-edge region with a faster bicubic interpolation algorithm, then further sharpening the edges of the interpolated image according to the stored edge information to reduce edge blur and improve the visual quality of the image, and finally combining the interpolation algorithm with NEON parallel technology to obtain the high-resolution image with parallel optimization. Applying the technology to a mobile video monitoring system shows that it allows multimedia applications on a mobile phone to interpolate high-resolution images smoothly for playback.
Owner:SOUTH CHINA UNIV OF TECH

OpenCL-based parallel optimization method of image de-noising algorithm

The invention discloses an OpenCL-based parallel optimization method of an image de-noising algorithm. Following the idea of image layering, an image is divided into a high-contrast base layer and a low-contrast detail layer by using a joint bilateral filtering algorithm and a joint WLS algorithm; de-noising is carried out on the detail layer by using the Stockham FFT; and the image is then restored by spectrum contraction and image addition, thereby realizing the de-noising effect. Because the functions that obtain the base layer and de-noise the detail layer process large volumes of data with high data parallelism, the OpenCL platform model is used and the image de-noising algorithm is computed in parallel on the GPU; details of the calculation process are then tuned, including the use of local memory, the choice of a suitable work-group size, and the use of parallel reduction. The final implementation of the de-noising algorithm reaches a speed-up of more than 30 times, which substantially improves the practicability of the algorithm.
Owner:XIDIAN UNIV

Method for realizing automatic pipeline parallelism

Inactive | CN101944014A | Balance workload | Added optimization capabilities for automatic parallel optimization | Concurrent instruction execution | Array data structure | Thread scheduling
The invention belongs to the technical field of program compilation and in particular relates to a method for realizing automatic pipeline parallelism. The method mainly comprises the following steps: (1) identification of pipeline parallelism, namely identifying a loop structure that has cross-iteration dependences whose dependence distance vector is constant; (2) synchronization among threads, namely inserting synchronization according to the dependence distance vector and deleting redundant synchronization with the same distance vector; and (3) thread scheduling with a static step length, namely defining a custom thread scheduling strategy that balances the workload of each thread and reduces communication overhead. Identification of the loop structure type relies on conventional array data flow analysis and dependence tests, and pipeline parallelism only handles regular loop structures with backward cross-iteration dependences. Because the synchronization overhead of pipeline parallelism is high, it is applied only to the outermost loop of a loop nest. The benefit of pipeline parallelism depends on the program: the larger the number of loop iterations and the longer the dependence distance, the greater the performance gain. The method improves the capability of automatic parallel optimization and helps to further improve the performance of scientific computing programs.
Owner:FUDAN UNIV
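
A hand-written version of the transformation such a compiler might produce is sketched below. It is only an illustration under assumptions: the loop nest, the block size, and the round-robin row assignment are hypothetical, and Python threads are used to show the synchronization structure, not to gain speed. Each row depends on the row above it (a constant dependence distance of one in the outer loop), so a thread may start a block of its row as soon as the thread above has finished the same block.

```python
import threading


def make_grid(rows, cols):
    """Row 0 holds input data; the remaining rows are computed."""
    return [[j + 1 for j in range(cols)]] + [[0] * cols for _ in range(rows - 1)]


def serial_sweep(grid):
    """Reference loop nest: each cell depends on the cell above and to the left."""
    for i in range(1, len(grid)):
        for j in range(len(grid[0])):
            left = grid[i][j - 1] if j > 0 else 0
            grid[i][j] = grid[i - 1][j] + left
    return grid


def pipelined_sweep(grid, num_threads=3, block=2):
    rows, cols = len(grid), len(grid[0])
    done = [0] * rows      # done[i] = number of columns of row i finished so far
    done[0] = cols         # row 0 is input, so it is complete from the start
    cond = threading.Condition()

    def worker(tid):
        # Static schedule: thread tid owns rows tid+1, tid+1+num_threads, ...
        for i in range(tid + 1, rows, num_threads):
            for start in range(0, cols, block):
                end = min(start + block, cols)
                with cond:  # wait until the row above has produced this block
                    cond.wait_for(lambda: done[i - 1] >= end)
                for j in range(start, end):
                    left = grid[i][j - 1] if j > 0 else 0
                    grid[i][j] = grid[i - 1][j] + left
                with cond:  # publish progress for the thread that owns row i+1
                    done[i] = end
                    cond.notify_all()

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return grid


expected = serial_sweep(make_grid(6, 5))
actual = pipelined_sweep(make_grid(6, 5))
```

Synchronizing once per block rather than once per element is one way to keep the synchronization overhead mentioned above in check.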

Mixed intelligent boiler comprehensive combustion optimization method

The invention discloses a mixed intelligent boiler comprehensive combustion optimization method. A simple and practical index positively related with boiler efficiency is established aiming at the problems in boiler combustion efficiency and coal mill power consumption optimization, the index is combined with a coal mill power consumption index, the boiler comprehensive combustion optimization method with high learning capacity is provided, and economical efficiency is optimized. According to the technical scheme, through data acquisition of a boiler, a model is established aiming at the index of boiler combustion efficiency and the coal mill power consumption index, parallel optimization algorithm optimizing and other means are applied, and the boiler comprehensive combustion optimization method is determined; by means of the method, the efficiency of boiler comprehensive combustion optimization can be effectively improved, and offline optimization and online real-time combustion optimization can be carried out.
Owner:HANGZHOU DIANZI UNIV

Layering modeling and optimizing method targeting complicated manufacture system

The invention provides a layering modeling and optimizing method targeting a complicated manufacture system, which is used for the total target optimization of a layering manufacture system comprising a plurality of elements. The method comprises the following steps of: (1) dividing the manufacture system into a layering system comprising a plurality of elements, wherein each element comprises a computer and carries out data exchange with other elements by a computer interface; (2) confirming the reaction, a contact variable and a local variable of all the elements and leading all elements to contact mutually; (3) establishing models of all the elements, which comprise an optimizing design model and an analyzing model; and (4) carrying out optimization solving on the models of the layering manufacture system. The invention can realize the modeling and optimizing unification of the layering manufacture system comprising the elements and has the advantages of consistence with the traditional manufacture system topological structure, capability of parallel optimization, unlimited layering grade number, and the like, and all the elements can select different optimization algorithms, therefore, different optimization algorithms can be integrated into one system.
Owner:SOUTH CHINA UNIV OF TECH
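
The idea in the abstract above (elements with local variables, a shared contact variable, parallel solves, and a different algorithm per element) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the patent: the quadratic cost `element_cost`, the targets `a`, the single shared variable `z`, and the alternating update scheme (a simple block coordinate descent).

```python
from concurrent.futures import ThreadPoolExecutor


def element_cost(x, a, z):
    """Hypothetical element cost: stay near local target a and shared value z."""
    return (x - a) ** 2 + (x - z) ** 2


def solve_closed_form(a, z):
    """Element solver 1: exact minimizer of element_cost over x."""
    return (a + z) / 2.0


def solve_ternary_search(a, z, lo=-100.0, hi=100.0, iters=200):
    """Element solver 2: numeric minimizer of the same convex cost."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if element_cost(m1, a, z) < element_cost(m2, a, z):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0


def optimize_system(elements, z=0.0, rounds=60):
    """Alternate between parallel element solves and a coordinator update of z."""
    xs = []
    with ThreadPoolExecutor() as pool:
        for _ in range(rounds):
            # Each element optimizes its local variable for the current z.
            futures = [pool.submit(solver, a, z) for a, solver in elements]
            xs = [f.result() for f in futures]
            # Coordinator: the z minimizing the summed cost is the mean of xs.
            z = sum(xs) / len(xs)
    return z, xs


if __name__ == "__main__":
    elements = [
        (1.0, solve_closed_form),
        (4.0, solve_ternary_search),
        (7.0, solve_closed_form),
    ]
    z, xs = optimize_system(elements)
    print(z, xs)
```

For this particular cost the shared variable settles at the mean of the element targets. A real layered system would replace the toy cost with each element's own design and analysis models.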

Deep convex network with joint use of nonlinear random projection, restricted boltzmann machine and batch-based parallelizable optimization

A method is disclosed herein that includes an act of causing a processor to access a deep-structured, layered or hierarchical model, called a deep convex network, retained in a computer-readable medium, wherein the deep-structured model comprises a plurality of layers with weights assigned thereto. This layered model can produce output that serves as scores to be combined with the transition probabilities between states in a hidden Markov model and with language model scores to form a full speech recognizer. The method makes joint use of nonlinear random projections and RBM weights, and it stacks a lower module's output with the raw data to form the input of the immediately higher module. Batch-based convex optimization is performed to learn a portion of the deep convex network's weights, which makes the training suitable for parallel computation. The method can further include the act of jointly substantially optimizing the weights, the transition probabilities, and the language model scores of the deep-structured model using an optimization criterion based on a sequence rather than a set of unrelated frames.
Owner:MICROSOFT TECH LICENSING LLC
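
The stacking and batch-based convex step described in the abstract above can be sketched in a few lines. This is a minimal pure-Python sketch under stated assumptions, not the patented training procedure: the lower weights `w` are a fixed random projection only (RBM initialization is omitted), the upper weights `u` come from a ridge-regularized least-squares solve, and the names `train_module`, `stack`, `lam` and the toy data are all hypothetical.

```python
import math
import random


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a @ u = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def train_module(x, t, n_hidden, lam, rng):
    """Fit one module: fixed random projection, then convex upper-weight solve."""
    # Lower weights: random projection, left untrained in this sketch.
    w = [[rng.gauss(0.0, 1.0) for _ in range(n_hidden)] for _ in x[0]]
    h = [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in matmul(x, w)]
    # Upper weights: minimize ||h u - t||^2 + lam ||u||^2 over the whole batch.
    ht = transpose(h)
    gram = matmul(ht, h)
    for i in range(n_hidden):
        gram[i][i] += lam
    u = solve(gram, matmul(ht, t))
    return h, u, matmul(h, u)


def stack(x, y):
    """Input of the next module: raw features followed by the lower output."""
    return [xr + yr for xr, yr in zip(x, y)]


if __name__ == "__main__":
    rng = random.Random(0)
    x = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
    t = [[row[0] * row[1]] for row in x]  # hypothetical regression target
    _, _, y1 = train_module(x, t, 5, 1e-3, rng)
    _, _, y2 = train_module(stack(x, y1), t, 5, 1e-3, rng)
    print(len(stack(x, y1)[0]), y2[0])
```

Because the upper-weight problem is a single linear solve over batch statistics (`gram` and `ht @ t`), those sums can be accumulated over data shards independently, which is what makes this step amenable to parallel computation.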

Predicate-based automatic parallel optimizing method

Inactive | CN101944040A | Eliminate data dependencies | Automatic Parallel Optimization Implementation | Program control | Memory systems | Data stream | Array data structure
The invention belongs to the technical field of program compilation and in particular relates to a predicate-based automatic parallel optimizing method. The method mainly comprises: (1) a predicate establishment step, which builds a parallel predicate for the program from the various kinds of information known about the user program and removes the program's simple dependences; and (2) a parallel loop structure establishment step, which performs the subsequent parallel analysis and decides, under the restriction of the predicate condition, whether to use the parallel predicate. Parallel predicate establishment builds on traditional array data flow analysis and loop dependence testing. Establishing predicates eliminates the simple loop dependences caused by imprecise loop information, which widens the analysis range and improves the effect of traditional automatic parallel optimization. During actual execution, if the predicate is not satisfied, the program runs the original serial version, and the added test and jump operations have almost no effect on overall performance; if the predicate is satisfied, the parallel version of the loop structure runs and program performance improves markedly.
Owner:FUDAN UNIV
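
The run-time behaviour described in the abstract above (test a predicate, then run either the parallel or the original serial loop) can be sketched as follows. This is an illustrative sketch, not the compiler transformation from the patent: the loop `a[i + k] = a[i] + b[i]`, the predicate `k == 0 or k >= n`, and all function names are assumptions chosen to show one case where the offset `k` is unknown at compile time.

```python
from concurrent.futures import ThreadPoolExecutor


def serial_loop(a, b, n, k):
    """Original serial loop; may carry a dependence between iterations."""
    for i in range(n):
        a[i + k] = a[i] + b[i]


def independence_predicate(n, k):
    """True when no iteration reads an element that another iteration writes.

    Conservative: k == 0 touches only a[i]; k >= n writes outside the read range.
    """
    return k == 0 or k >= n


def parallel_loop(a, b, n, k, workers=4):
    """Parallel version: the same loop body run over disjoint index chunks."""
    def run_chunk(start, stop):
        for i in range(start, stop):
            a[i + k] = a[i] + b[i]

    step = max(1, (n + workers - 1) // workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(run_chunk, s, min(s + step, n)) for s in range(0, n, step)
        ]
        for f in futures:
            f.result()


def guarded_loop(a, b, n, k):
    """Run-time test: parallel version if the predicate holds, else serial."""
    if independence_predicate(n, k):
        parallel_loop(a, b, n, k)
        return "parallel"
    serial_loop(a, b, n, k)
    return "serial"


if __name__ == "__main__":
    n = 8
    b = [1] * n
    for k in (1, n):
        a = list(range(2 * n))
        print(k, guarded_loop(a, b, n, k), a)
```

The guard costs one comparison per loop entry, which matches the abstract's point that the added test and jump barely affect the serial path. Python threads are used here only to show the control flow; they are not expected to speed up this loop.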