Text topic model LDA high-performance computing method based on CPU-GPU collaborative parallelism

A high-performance computing and topic-model technology, applied in computing, unstructured text data retrieval, text database clustering/classification, etc., which addresses problems such as low computing efficiency and a single implementation platform

Active Publication Date: 2019-11-05
WUHAN UNIV

AI Technical Summary

Problems solved by technology

[0007] In view of this, the present invention provides a high-performance computing method for the text topic model LDA based on CPU-GPU cooperative parallelism, which is used to solve, or at least partially solve, the technical problems of a single implementation platform and low computing efficiency in existing prior-art methods.


Examples


Embodiment 1

[0064] This embodiment provides a high-performance computing method for the text topic model LDA based on CPU-GPU cooperative parallelism; referring to Figure 1, the method includes:

[0065] Step S1: Based on a dynamic programming algorithm, optimize the allocation of the two types of heterogeneous computing resources, CPU and GPU, to obtain an optimal resource allocation plan.

[0066] Specifically, in a CPU-GPU heterogeneous system, reasonable resource allocation is crucial to efficiently utilizing the system's computing power. When the present invention uses a dynamic programming algorithm for resource allocation, on the CPU side, computing threads and task-allocation threads can be allocated reasonably according to the number of threads supported by the CPU; on the GPU side, GPU hardware resource constraints, the algorithm's storage requirements, and general GPU programming optimization rules transform the problem of optimally allocating GPU computing resources into a problem of...
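To make Step S1 concrete, here is a minimal sketch of a dynamic-programming allocation of a discrete resource budget between a CPU and a GPU. It is an illustration under assumptions, not the patented procedure: the `cpu_gain` and `gpu_gain` throughput models, the assumed core count, and the "budget units" abstraction are placeholders, since the paragraph above is truncated before it states the actual objective.

```python
# Hedged sketch: dynamic-programming split of a resource budget between CPU and GPU.
# The gain functions are hypothetical stand-ins for measured throughput, NOT values
# taken from the patent.

def cpu_gain(threads: int) -> float:
    # Assumed: near-linear speedup up to 16 physical cores, little gain beyond.
    return min(threads, 16) + 0.1 * max(0, threads - 16)

def gpu_gain(blocks: int) -> float:
    # Assumed: throughput saturates as resident thread blocks fill the SMs.
    return 40.0 * (1.0 - 0.95 ** blocks)

def allocate(budget: int, gain_fns):
    """dp[i][b]: best total gain using the first i devices and b resource units."""
    n = len(gain_fns)
    dp = [[0.0] * (budget + 1) for _ in range(n + 1)]
    pick = [[0] * (budget + 1) for _ in range(n + 1)]
    for i, gain in enumerate(gain_fns, start=1):
        for b in range(budget + 1):
            for k in range(b + 1):            # give k units to device i
                cand = dp[i - 1][b - k] + gain(k)
                if cand > dp[i][b]:
                    dp[i][b], pick[i][b] = cand, k
    alloc, b = [], budget                     # backtrack the chosen split
    for i in range(n, 0, -1):
        alloc.append(pick[i][b])
        b -= pick[i][b]
    return dp[n][budget], list(reversed(alloc))

if __name__ == "__main__":
    total_gain, (cpu_units, gpu_units) = allocate(64, [cpu_gain, gpu_gain])
    print(f"estimated gain={total_gain:.1f}, CPU units={cpu_units}, GPU units={gpu_units}")
```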



Abstract

The invention discloses a high-performance computing method for the text topic model LDA based on CPU-GPU collaborative parallelism. The method comprises the following steps: first, achieving the optimal configuration of the two types of heterogeneous computing resources, CPU and GPU; then, completing GPU performance evaluation based on a logarithmic function model and completing optimal-granularity division of the text data; realizing CPU-GPU cooperative parallel computing of the latent Dirichlet allocation (LDA) model based on an exponential stochastic cellular automaton algorithm; and further, carrying out self-adaptive heterogeneous scheduling between the CPU and the GPU based on an improved greedy strategy to realize load balancing. The method achieves high-performance modeling of the text topic model and can quickly discover the topic information implied in the text, thereby meeting the efficient processing requirements of applications such as massive document-set classification and text data stream computing.
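As an illustration of the "logarithmic function model plus optimal granularity division" steps named in the abstract, the sketch below fits a placeholder throughput model r(n) = a·ln(n) + b from two timed probe runs and then searches for the CPU/GPU document split that balances the predicted finishing times. The throughput form, the probe numbers, and the exhaustive split search are assumptions for illustration, not the patented formulas.

```python
import math

# Illustrative sketch: device throughput (documents/second) is assumed to follow
# r(n) = a*ln(n) + b, fitted from two probe batches; all numbers are made up.

def fit_log_throughput(n1, r1, n2, r2):
    """Fit r(n) = a*ln(n) + b through two measured (batch_size, docs_per_sec) probes."""
    a = (r2 - r1) / (math.log(n2) - math.log(n1))
    b = r1 - a * math.log(n1)
    return lambda n: max(a * math.log(n) + b, 1e-9) if n > 0 else 0.0

def predicted_time(n_docs, throughput):
    return 0.0 if n_docs == 0 else n_docs / throughput(n_docs)

def split_documents(total_docs, gpu_tput, cpu_tput, step=1000):
    """Choose the CPU/GPU share that minimizes the slower side's predicted time."""
    best = (0, total_docs, float("inf"))
    for gpu_docs in range(0, total_docs + 1, step):
        cpu_docs = total_docs - gpu_docs
        makespan = max(predicted_time(gpu_docs, gpu_tput),
                       predicted_time(cpu_docs, cpu_tput))
        if makespan < best[2]:
            best = (gpu_docs, cpu_docs, makespan)
    return best

if __name__ == "__main__":
    # Hypothetical probe measurements: (batch size, measured docs/sec).
    gpu_tput = fit_log_throughput(10_000, 8_000, 100_000, 20_000)
    cpu_tput = fit_log_throughput(10_000, 3_000, 100_000, 4_000)
    gpu_docs, cpu_docs, makespan = split_documents(1_000_000, gpu_tput, cpu_tput)
    print(f"GPU: {gpu_docs} docs, CPU: {cpu_docs} docs, predicted makespan {makespan:.1f}s")
```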

Description

Technical field

[0001] The invention relates to the technical field of high-performance computing in heterogeneous environments, and in particular to a high-performance computing method for the text topic model LDA based on CPU-GPU cooperative parallelism.

Background technique

[0002] With the rapid development of the Internet, a large amount of network text rich in implicit information (such as Weibo posts, product reviews, and news reports) is constantly being produced and has become a kind of basic data that is widely valued. Text topic extraction is an important step in text data mining. Among topic models, latent Dirichlet allocation (LDA) is a classic model from which a large number of variants have been derived, and it is widely used in computing scenarios such as text topic extraction and document-collection classification. However, the standard LDA model requires a large number of iterative calculations, and its computational complexity is proportional to the amount of data. ...
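For context on why the standard model is expensive, the following is a minimal textbook collapsed Gibbs sampler for serial LDA (it is not the patent's CPU-GPU parallel, cellular-automaton-based algorithm): every iteration resamples a topic for every token of every document, so the cost grows with the iteration count, the corpus size, and the number of topics K.

```python
import numpy as np

# Minimal textbook collapsed Gibbs sampler for standard serial LDA, shown only to
# illustrate the per-token iterative cost; this is NOT the patent's parallel method.
# docs: list of documents, each a list of word ids in [0, V); K topics.

def lda_gibbs(docs, V, K, alpha=0.1, beta=0.01, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    ndk = np.zeros((len(docs), K))           # document-topic counts
    nkw = np.zeros((K, V))                   # topic-word counts
    nk = np.zeros(K)                         # tokens assigned to each topic
    z = [rng.integers(K, size=len(d)) for d in docs]
    for d, doc in enumerate(docs):           # initialize counts from random topics
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    for _ in range(iters):                   # total work ~ iters * tokens * K
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    return ndk, nkw

if __name__ == "__main__":
    toy_docs = [[0, 1, 2, 1], [2, 3, 3, 0], [1, 1, 4, 2]]
    doc_topic, topic_word = lda_gibbs(toy_docs, V=5, K=2, iters=50)
    print(doc_topic)
```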


Application Information

IPC(8): G06F16/35; G06F9/48; G06F9/50
CPC: G06F9/4843; G06F9/5088; G06F16/35
Inventors: 李锐, 王鸿琰, 舒时立
Owner: WUHAN UNIV