A parallel inference method for large-scale Bayesian networks based on MapReduce

A Bayesian network inference technology, applied in the fields of electrical digital data processing, special data processing applications, instruments, etc. It addresses the problems of low inference efficiency and heavy computation, and achieves efficient inference, ease of implementation, and good scalability.

Active Publication Date: 2017-02-01
云南云商汇网络科技有限公司

AI Technical Summary

Problems solved by technology

A large-scale Bayesian network has a large number of nodes, or a large number of conditional probability parameters per node, which leads to low inference efficiency and heavy computation. With overcoming this efficiency bottleneck as the main goal, the method uses the distributed database HBase to store large-scale Bayesian networks, establishes the relationship between HBase query processing and Bayesian network inference tasks, and realizes parallel inference over the Bayesian network based on MapReduce.


Examples


Embodiment

[0032] Example: "Credit Card Fraud Detection" Bayesian Network Inference

[0033] (1) Distributed storage of Bayesian network

[0034] The nodes F, G, J, A, and S of the "Credit Card Fraud Detection" Bayesian network are stored in the HBase table T_BN. The Map function reads each value in each node's conditional probability table in parallel and stores it in T_BN in the form of […]. For F, if P(F = f_1) = 0.1 and P(F = f_2) = 0.9, then the corresponding row in the HBase table T_BN uses F as the row identifier, and the column family holds (F = f_1 || 0.1) and (F = f_2 || 0.9). The "Credit Card Fraud Detection" Bayesian network as stored in T_BN is shown in Table 1.
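The storage layout described in [0034] can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: a plain Python dictionary stands in for the HBase table T_BN, the Map step runs sequentially rather than in parallel, and the names `map_cpt_entry` and `store_network` are hypothetical. Only the two probabilities for F given in the text are used.

```python
from collections import defaultdict


def map_cpt_entry(node, assignment, probability):
    """Hypothetical Map step: turn one CPT entry into a (row key, cell) pair.

    The node name is the row identifier; the cell follows the
    "(assignment || probability)" layout described in [0034].
    """
    return node, f"{assignment} || {probability}"


def store_network(cpt_entries):
    """Group the mapped cells by row key, mimicking rows of T_BN.

    A dict of lists is an assumption standing in for an HBase table.
    """
    table = defaultdict(list)
    for node, assignment, probability in cpt_entries:
        row_key, cell = map_cpt_entry(node, assignment, probability)
        table[row_key].append(cell)
    return dict(table)


# CPT entries for node F, taken from the example in the text.
cpt_entries = [("F", "F=f1", 0.1), ("F", "F=f2", 0.9)]
t_bn = store_network(cpt_entries)
print(t_bn)  # {'F': ['F=f1 || 0.1', 'F=f2 || 0.9']}
```

In a real deployment each call to the Map function would run on a separate split of the conditional probability tables, and the cells would be written to HBase rather than collected in memory.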

[0035] (2) Decomposing the probabilistic inference task

[0036] If the evidence node values are known (A = a_4, J = j_1) and the query node value is F = f_1, the inference task is to compute P(F = f_1 | A = a_4, J = j_1). Because […], thus transforming this inferenc...
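The expression that follows "because" did not survive in this text. As a hedged reminder only, and not necessarily the exact expression the patent uses, the standard definition of conditional probability rewrites the query as a ratio of two joint probabilities, each of which can be obtained by summing the network's factorized joint distribution over the unobserved nodes:

```latex
P(F = f_1 \mid A = a_4, J = j_1)
  = \frac{P(F = f_1, A = a_4, J = j_1)}{P(A = a_4, J = j_1)},
\qquad
P(X_1, \dots, X_n) = \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
```

Here Pa(X_i) denotes the parents of node X_i in the network; the symbols X_i and Pa are introduced for illustration and do not appear in the original text.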


Abstract

The invention relates to a parallel inference method for large-scale Bayesian networks based on MapReduce. The method targets the low inference efficiency and heavy computation caused by a large number of nodes in a Bayesian network or by a large number of conditional probability parameters per node. With breaking this efficiency bottleneck as the main objective, the large-scale Bayesian network is stored in the distributed database HBase, the relationship between HBase query processing and Bayesian inference tasks is established, and parallel inference over the Bayesian network is achieved with MapReduce. The method fits the characteristics of practical problems in fields such as data analysis, medical diagnosis, industrial control, and economic forecasting, removes the limitation on the number of nodes in the Bayesian network, and provides a supporting technology for the representation, inference, and application of uncertain knowledge.

Description

Technical field

[0001] The invention discloses a parallel inference method for large-scale Bayesian networks based on MapReduce. It involves storing large-scale Bayesian networks in the distributed database HBase using MapReduce, converting probabilistic inference over a Bayesian network into data query processing on HBase, and implementing Bayesian network probabilistic inference with MapReduce. It belongs to the field of artificial intelligence and information processing.

Background

[0002] With the increasing diversity of data collection methods and data formats and the rapid growth of data scale, the expression, understanding, and application of the knowledge contained in data have attracted more and more attention. A Bayesian network uses a graphical model to express uncertain knowledge that includes both probability distributions and causality, and expresses the interdependence between random variables in a qualitative and qua...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
CPC: G06F16/24532, G06F16/27
Inventor: 岳昆, 徐娟, 方启宇, 张骥先, 田凯琳, 刘惟一
Owner: 云南云商汇网络科技有限公司