Method and system for classifying short message text

The classification method and system apply to the field of short message (SMS) text classification. They address the heavy computation and storage cost of building a class library, which makes it difficult to classify text messages in real time, and thereby achieve real-time classification.

Active Publication Date: 2017-11-10
CHINA UNITED NETWORK COMM GRP CO LTD
Cites: 3; Cited by: 0

AI Technical Summary

Problems solved by technology

[0004] In existing SMS text classification methods, the class library is built by manual accumulation or by clustering. Because the class library contains a large number of SMS samples, building it requires a large amount of computation and storage space, so it is difficult to meet the real-time requirements of SMS text classification.

Examples

Detailed Description of the Embodiments

[0034] To help those skilled in the art understand the present invention, it is further described below with reference to the accompanying drawings, which are not intended to limit the scope of protection of the present invention.

[0035] Referring to Figure 1, the present invention proposes a short message classification method, comprising:

[0036] Step 100: calculate the feature vectors of all short message samples on the Hadoop Distributed File System (HDFS).

[0037] In this step, the original storage format of an SMS sample on HDFS is mobile phone number + SMS content. For example: 1309461xxxx|The set starts from 7380 yuan/㎡ to buy a thousand-acre large-scale property in Jiangbei [Peking University Resources·Jiangshan Famous Gate] 104-143㎡ first-line view of the river house in the set, with light rail, famous schools, and view of Jiangshan 81958888.
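The stored record format described in [0037] can be illustrated with a minimal parsing sketch. It assumes that the "|" character seen in the example is the field separator and that only the first "|" marks the boundary; the names SmsSample and parse_sms_record are hypothetical and do not come from the patent.

```python
from typing import NamedTuple


class SmsSample(NamedTuple):
    """One stored sample: sender phone number and message text."""
    phone: str
    content: str


def parse_sms_record(line: str, sep: str = "|") -> SmsSample:
    """Split one stored line into phone number and SMS content.

    Assumption: the first separator is the field boundary, so any
    later separator inside the message body is kept as content.
    """
    phone, found, content = line.rstrip("\n").partition(sep)
    if not found:
        raise ValueError(f"record has no {sep!r} separator: {line!r}")
    return SmsSample(phone=phone.strip(), content=content.strip())


# Usage with a shortened version of the example record.
sample = parse_sms_record("1309461xxxx|first-line view of the river, 104-143㎡")
print(sample.phone)
print(sample.content)
```

Splitting on the first separator only is a defensive choice, since advertising messages may themselves contain the separator character.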

[0038] In this step, when storing the SMS samples on HDFS in the origina...

Abstract

The invention discloses a method and system for classifying short message texts. The method includes: calculating the feature vectors of all short message samples on HDFS; identifying the unclassified short message samples on HDFS and classifying them according to the calculated feature vectors; saving the categories of the short message samples and the calculated feature vectors to form a first class library on HDFS; converting the first class library into a second class library supported by a stream-oriented computation system; and classifying the short messages to be classified through the stream-oriented computation system according to the second class library. The method and system allow short message texts to be classified in real time.
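The flow in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text available here does not specify how feature vectors are computed, how similarity is measured, or what the conversion to the second class library involves. The sketch assumes term-frequency vectors over whitespace tokens, cosine similarity, and a conversion that merges the per-sample records into one in-memory vector per category. All function names are hypothetical, and plain Python data structures stand in for HDFS and the stream-oriented computation system.

```python
import math
from collections import Counter, defaultdict


def feature_vector(text):
    """Term-frequency vector over lowercase whitespace tokens (assumed scheme)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def build_first_library(labelled_samples):
    """Offline stage: one (category, feature vector) record per sample."""
    return [(cat, feature_vector(text)) for cat, text in labelled_samples]


def convert_to_second_library(first_library):
    """Hypothetical conversion: merge records into one vector per category."""
    merged = defaultdict(Counter)
    for cat, vec in first_library:
        merged[cat].update(vec)
    return dict(merged)


def classify(text, second_library):
    """Online stage: return the category whose vector is most similar."""
    vec = feature_vector(text)
    return max(second_library, key=lambda cat: cosine(vec, second_library[cat]))


# Usage with made-up samples.
first = build_first_library([
    ("ad", "buy house cheap price"),
    ("ad", "cheap house sale"),
    ("normal", "see you at dinner tonight"),
])
second = convert_to_second_library(first)
print(classify("cheap house for sale", second))
```

Keeping one compact vector per category is one way the online stage could avoid scanning every stored sample, which matches the stated goal of reducing computation and storage for real-time use; the patent's actual conversion may differ.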

Description

Technical Field

[0001] The invention relates to short message text processing technology, and in particular to a short message text classification method and system.

Background

[0002] In the era of Internet big data, real-time processing and analysis of user behavior is an important application area. Take text message processing as an example: spam text messages, including fraudulent messages, advertising and sales promotions, and reactionary information, have proliferated and caused great harm to users, so operators need to filter spam by identifying the content of text messages. Because short messages are time-sensitive, they must be processed and delivered within a relatively short period, which places higher demands on the real-time performance of the processing system.

[0003] The existing methods for classifying short message texts are as follows: a class library of short message samples is formed in advance, and...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
CPC: G06F16/182, G06F16/35
Inventor: 李浩, 罗云彬, 王志军, 王伟华
Owner: CHINA UNITED NETWORK COMM GRP CO LTD