
A Text Phrase Weight Calculation Method Based on Semantic Network

A phrase weighting technique based on a semantic network, applicable to computing, semantic analysis, and semantic tool creation. It addresses the limitations of existing weighting methods by reducing text noise, improving the text signal-to-noise ratio, and improving accuracy and robustness.

Active Publication Date: 2017-09-19
ZHEJIANG UNIV
3 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Existing phrase weighting methods based on the vector space model are simple, intuitive, and fast, but they have significant limitations in both theory and practical application.



Examples


Embodiment 1

[0075] A method for calculating weights of text phrases based on semantic networks, comprising the following steps:

[0076] 1) Read in a text, remove its stop words, treat each phrase in the document as a node of the semantic network, and construct a bidirectional semantic network according to the relative positions of the phrases in the text.

[0077] 2) The connection between two phrases is regarded as an edge of the semantic network, and the weight of the edge is calculated using the following formula:

[0078]

[0079] In the formula, Edge(i,j) is the weight of the edge connecting node i and node j; 1( ) is an indicator function that takes the value 1 when its condition is met and 0 otherwise; and N is the number of phrases in the text. This yields the bidirectional semantic network with edge information shown in Figure 1.

[0080] 3) Normalize the obtained matrix Edge by row,

[0081]

[0082] In the formula, M is the n...
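Steps 1) to 3) can be sketched in Python as follows. The edge formula in [0078] is not reproduced in this text, so the sketch substitutes an assumed rule: each pair of phrases that are adjacent after stop-word removal adds 1 to Edge(i,j) in both directions. The stop-word list, the whitespace tokenizer, and the names `build_edge_matrix` and `normalize_rows` are also illustrative assumptions and do not come from the patent.

```python
# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "in", "is", "and", "to"}


def build_edge_matrix(text):
    """Return (phrases, edge) for a bidirectional network built from `text`.

    Assumed rule (the patent's own formula is not available here): every
    pair of phrases that are adjacent after stop-word removal adds 1 to
    Edge(i, j) and to Edge(j, i).
    """
    tokens = [t for t in text.lower().split() if t not in STOP_WORDS]
    # Each distinct phrase becomes one node, indexed in order of first use.
    index = {}
    for t in tokens:
        index.setdefault(t, len(index))
    n = len(index)
    edge = [[0.0] * n for _ in range(n)]
    for left, right in zip(tokens, tokens[1:]):
        i, j = index[left], index[right]
        if i != j:  # skip self-loops caused by a repeated phrase
            edge[i][j] += 1.0
            edge[j][i] += 1.0
    return list(index), edge


def normalize_rows(matrix):
    """Divide each row by its sum so that every non-empty row sums to 1."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([v / total for v in row] if total else list(row))
    return result


phrases, edge = build_edge_matrix("the cat chased the mouse and the cat slept")
normalized = normalize_rows(edge)
```

After row normalization each row sums to 1, so the matrix can be read as the transition probabilities of a random walk over the phrases, which is how the abstract describes the later steps.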



Abstract

The invention discloses a text phrase weight calculation method based on a semantic network. The method comprises the following steps: first, stop words are removed from the text and a semantic network is constructed from the remaining words, with each phrase in the text taken as a node of the network; next, a random walk method is used to calculate the probability of walking from one node to another within a limited number of steps, giving a probability for every node; finally, one node is removed, the probabilities of all nodes are calculated again, and the difference between the two sets of probabilities is used as the weight of that phrase in the text. Drawing on graph theory and Markov chain theory, the method converts the text into a graph and models it as a Markov chain, makes use of the relative position of phrases in the text, and thereby improves the accuracy of phrase weight calculation. The method calculates phrase weights effectively from the actual text, reduces text noise, and improves the signal-to-noise ratio.
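The leave-one-out idea in the abstract can be sketched as follows. The abstract does not state the number of steps, the starting distribution, or how the two sets of probabilities are compared, so this sketch assumes a uniform start, a fixed step count `steps`, a self-loop for any node left without neighbours, and the sum of absolute probability changes over the remaining nodes as the weight. All function names are hypothetical.

```python
def transition_matrix(edge):
    """Row-normalize `edge`; an isolated node gets a self-loop (assumption)."""
    n = len(edge)
    rows = []
    for i, row in enumerate(edge):
        total = sum(row)
        if total:
            rows.append([v / total for v in row])
        else:
            rows.append([1.0 if j == i else 0.0 for j in range(n)])
    return rows


def walk_probabilities(edge, steps=3):
    """Node probabilities after `steps` random-walk steps from a uniform start."""
    n = len(edge)
    if n == 0:
        return []
    transition = transition_matrix(edge)
    prob = [1.0 / n] * n
    for _ in range(steps):
        prob = [sum(prob[i] * transition[i][j] for i in range(n))
                for j in range(n)]
    return prob


def remove_node(edge, k):
    """Return a copy of `edge` without row k and column k."""
    return [[v for j, v in enumerate(row) if j != k]
            for i, row in enumerate(edge) if i != k]


def phrase_weights(edge, steps=3):
    """Weight of node k: total change in the other nodes' probabilities
    when node k is removed (assumed comparison rule)."""
    base = walk_probabilities(edge, steps)
    weights = []
    for k in range(len(edge)):
        reduced = walk_probabilities(remove_node(edge, k), steps)
        others = [p for i, p in enumerate(base) if i != k]
        weights.append(sum(abs(a - b) for a, b in zip(others, reduced)))
    return weights


# Toy symmetric edge matrix for four phrases; node 0 is linked to all others.
edge = [
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
]
weights = phrase_weights(edge)
```

In this toy network the hub node 0 receives the largest weight, because removing it changes the probabilities of the other nodes the most. A different step count or comparison rule would give different values.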

Description

Technical field

[0001] The invention belongs to the field of text classification and relates to a method for calculating the weight of phrases in a text.

Background

[0002] Text classification is an important branch of data mining. However, how a text is represented in the vector space, that is, how the weights of its phrases are set, limits the accuracy of text classification. Because real documents are noisy, describing a document by word frequency alone buries part of its information in the noise. A good phrase weighting method must effectively improve the signal-to-noise ratio of the text and so perform noise reduction. In recent years many phrase weighting schemes have been proposed, but most are based on the Vector Space Model (VSM).

[0003] The basic idea of the vector space method is to use the bag-of-words model to represent text, treat each phrase in the cor...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30, G06F17/27
CPC: G06F16/35, G06F16/36, G06F40/30
Inventors: 于慧敏, 孙孟孟
Owner: ZHEJIANG UNIV