Short text classification method based on semantic graphs

The method applies semantic graph technology to text classification, within the field of special data processing applications and electronic digital data processing. It addresses three problems: the lack of root features, the loss of semantic connections between concepts, and the failure to distinguish how much each word contributes to the document topic.

Status: Inactive
Publication date: 2012-07-18
Applicant: XIDIAN UNIV
Cites: 2; cited by: 26

AI Technical Summary

Problems solved by technology

The vector space model (VSM) is an effective representation for English text, but it has several shortcomings when applied to Chinese text: (1) Chinese lacks root features, so a text is usually represented as a high-dimensional sparse vector; (2) it does not distinguish the information entropy carried by different words or their contribution to the document topic; (3) because Chinese words are rich in meaning, documents with the same or similar semantics often share few identical words, so the natural semantic connections between concepts are lost in the text representation.
Existing methods are all flawed to some degree.
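
Shortcoming (3) can be illustrated with a minimal sketch. It assumes a plain bag-of-words VSM with cosine similarity and hand-tokenized example texts; the function names and the example words are illustrative and are not taken from the patent.

```python
import math
from collections import Counter


def bow_vector(tokens):
    """Bag-of-words vector: maps each word to its count in the text."""
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical pre-segmented short texts.
doc1 = ["电脑", "价格", "便宜"]    # "the computer's price is cheap"
doc2 = ["计算机", "售价", "低廉"]  # same meaning, entirely different words
doc3 = ["电脑", "价格", "昂贵"]    # opposite meaning, two shared words

sim_12 = cosine(bow_vector(doc1), bow_vector(doc2))
sim_13 = cosine(bow_vector(doc1), bow_vector(doc3))
```

Under these assumptions the two synonymous texts share no words, so `sim_12` is zero, while the text with the opposite meaning scores higher because it shares two surface words with `doc1`.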

Examples

Embodiment 1

[0048] Referring to Figures 1 to 4, a short text classification method based on semantic graphs comprises the following steps:

[0049] Step A, constructing a text semantic graph model for each piece of text information, and merging the text semantic graph models;

[0050] Step B, comparing the similarity between different texts using a similarity calculation method for text semantic graph models;

[0051] Step C, classifying the texts with a text semantic graph classifier according to their degree of similarity.
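
Steps B and C can be sketched as follows. The page does not give the similarity formula or the classifier design, so this sketch substitutes two stand-ins that are assumptions, not the patented method: weighted Jaccard similarity over edge weights for step B, and a 1-nearest-neighbor rule for step C. Graphs are assumed to be already built by step A and are represented here as a hypothetical mapping from a word pair to an edge weight.

```python
from collections import Counter


def graph_similarity(g1, g2):
    """Stand-in for step B: weighted Jaccard similarity over edge weights."""
    edges = g1.keys() | g2.keys()
    union = sum(max(g1[e], g2[e]) for e in edges)
    if union == 0:
        return 0.0
    shared = sum(min(g1[e], g2[e]) for e in edges)
    return shared / union


def classify(graph, labeled_graphs):
    """Stand-in for step C: return the label of the most similar labeled graph."""
    best_label, _ = max(
        labeled_graphs,
        key=lambda item: graph_similarity(graph, item[1]),
    )
    return best_label


# Hypothetical labeled graphs: (word, word) edge -> weight.
training = [
    ("sports", Counter({("goal", "match"): 2, ("match", "team"): 1})),
    ("tech", Counter({("battery", "phone"): 2, ("battery", "charge"): 1})),
]
query = Counter({("goal", "match"): 1, ("goal", "score"): 1})

label = classify(query, training)
sim_sports = graph_similarity(query, training[0][1])
```

Any graph similarity measure and any similarity-based classifier could take the place of these two functions; the point of the sketch is only the data flow from graph models to a similarity score to a class label.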

[0052] In this embodiment, step A includes the following steps:

[0053] Step A-1, identifying the core words of each sentence and tabulating them in a sentence core word list;

[0054] Step A-2, based on the sentence core word list, constructing the text semantic graph model corresponding to each sentence, and then merging the per-sentence text semantic graph models to output the text semantic graph model of the entire article.
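
The build-then-merge structure of step A-2 can be sketched as follows. The detailed construction rules are truncated on this page, so the graph form used here is an assumption: nodes are core words weighted by frequency, and an undirected edge links every pair of distinct core words that appear in the same sentence. The function names and the example core word lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def sentence_graph(core_words):
    """Build one sentence's graph from its core word list (assumed form)."""
    nodes = Counter(core_words)
    edges = Counter()
    # Sorting gives each undirected edge a single canonical key.
    for u, v in combinations(sorted(set(core_words)), 2):
        edges[(u, v)] += 1
    return nodes, edges


def merge_graphs(graphs):
    """Merge per-sentence graphs by summing node and edge weights."""
    nodes, edges = Counter(), Counter()
    for sentence_nodes, sentence_edges in graphs:
        nodes.update(sentence_nodes)
        edges.update(sentence_edges)
    return nodes, edges


# Hypothetical output of step A-1: one core word list per sentence.
core_word_lists = [["phone", "battery", "life"], ["battery", "charge"]]

doc_nodes, doc_edges = merge_graphs(sentence_graph(s) for s in core_word_lists)
```

In this sketch a word that recurs across sentences, such as "battery", gains node weight and becomes the point where the per-sentence graphs connect in the merged article graph.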

[0055] Step A-2 comprises the following steps:

[0056] Step A-2...

Abstract

The invention discloses a short text classification method based on semantic graphs, comprising the following steps: A, constructing a semantic graph model for each piece of text information and merging the semantic graph models; B, comparing the similarity between different texts using a similarity calculation method for semantic graph models; and C, classifying the texts with a text semantic graph classifier according to their similarity. Representing texts as graph models highlights the semantic content of the documents, and the text semantic graph (TSG) models constructed by the method describe the latent semantic information and topic features of the texts accurately to a large extent. As a result, the TSG classification method is more reliable and efficient than other classification methods, which greatly reduces labor costs: manual sorting of text information is largely avoided, and the text information is organized automatically by computer.

Description

Technical field

[0001] The invention relates to the field of language processing, representation and text classification, and in particular to a short text classification method based on semantic graphs.

Background

[0002] The rapid development of Internet technology has brought human society into an era of extremely rich and rapidly updated information. Especially in recent years, with the emergence of various social networks, massive amounts of text are generated and disseminated every day, and this text usually consists of short passages. People have to deal with massive information resources every day, but processing them manually is very inefficient. The urgent problem is therefore how to make better use of the latent semantic information in the massive information on the Internet to organize and classify text efficiently, so that massive text collections can be managed and maintained more effectively. In recent ye...

Application Information

IPC(8): G06F17/30
Inventors: 宋胜利, 陈平
Owner: XIDIAN UNIV