Method and system for filtering and classifying short messages

A classification method, SMS technology

Status: Inactive
Publication Date: 2010-07-21
北京炎黄新星网络科技有限公司

AI Technical Summary

Problems solved by technology

The short message filtering functions in common use today filter out all spam short messages indiscriminately and cannot be customized by users. ... should not be treated as spam


Examples


Embodiment Construction

[0020] The steps of the short message filtering and secondary classification method provided by the invention are as follows:

[0021] Step 1: preprocess the short message text (keyword processing, blacklist and whitelist processing).

[0022] Before word segmentation, the short message content must first be preprocessed, which includes deletion, standardization, marking, and similar operations. Preprocessing provides an initial semantic segmentation, improves the accuracy of word segmentation, marks some important features of spam content, and lays the foundation for subsequent analysis.

[0023] First, delete or mark the invalid parts of the SMS content to reduce interference and improve the efficiency of subsequent processing.

[0024] Next, normalize the SMS content: for example, convert full-width digits and symbols into standard half-width digits and symbols, and identify certain special substitutions in the SMS content, such as "O" for "0" and "I" for "1".
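The normalization described in [0024] could be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: Unicode NFKC normalization stands in for the full-width to half-width conversion, and the function name `normalize_sms`, the look-alike table, and the minimum run length of 5 are all hypothetical choices. The source names only the substitutions "O" for "0" and "I" for "1".

```python
import re
import unicodedata

# Look-alike substitutions named in the source text: "O" for "0", "I" for "1".
LOOKALIKES = str.maketrans({"O": "0", "I": "1"})

# Assumption: a run of 5 or more characters drawn from digits and look-alikes
# is a candidate number (for example a disguised phone number).
DIGIT_RUN = re.compile(r"[0-9OI]{5,}")


def normalize_sms(text: str) -> str:
    """Fold full-width characters to half-width and undo digit look-alikes."""
    # NFKC maps full-width digits, letters, and symbols to their ASCII forms.
    folded = unicodedata.normalize("NFKC", text)

    def fix_run(match: re.Match) -> str:
        run = match.group(0)
        # Rewrite only runs that contain at least one real digit, so that
        # ordinary upper-case words made of "O" and "I" are left untouched.
        if any(ch.isdigit() for ch in run):
            return run.translate(LOOKALIKES)
        return run

    return DIGIT_RUN.sub(fix_run, folded)


if __name__ == "__main__":
    print(normalize_sms("请拨打１３８OOI２３４５６"))
```

Restricting the substitution to runs that already contain digits is a design choice made for this sketch. It avoids corrupting ordinary words, but it would miss a number written entirely in look-alike letters.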

[0025] Extract an...


Abstract

The invention provides a spam short message filtering method that builds on traditional short message filtering. It relies on both sending-volume characteristics and message-content characteristics, and it combines Chinese character regular expressions with an improved Bayesian algorithm. The method improves the identification accuracy for spam short messages while reducing the false positive rate and the false negative (missed detection) rate. It also classifies the identified spam short messages a second time so that users can configure personalized settings. The method comprises the following steps: (1) preprocessing the short message text; (2) sending-volume matching: matching the sent content against the sending volume; (3) performing word segmentation using Chinese character regular expressions together with a dictionary and part-of-speech method; (4) classifying with a spam short message classifier: computing the probability with the improved Bayesian algorithm and using short message feature rules defined by Chinese character regular expressions to distinguish spam from non-spam short messages; and (5) using a short message type classifier to classify and process the identified spam short messages.
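Steps (4) and (5) of the abstract can be illustrated with a small two-stage sketch. The source does not describe the details of its "improved Bayesian algorithm", so a plain multinomial naive Bayes classifier with Laplace smoothing is used here as a stand-in. The class `NaiveBayes`, the function `classify_sms`, the rule in `SPAM_RULES`, the toy training data, and the category names "advertising" and "fraud" are all hypothetical. Word segmentation (step 3) is assumed to have been done already, so messages arrive as token lists.

```python
import math
import re
from collections import Counter, defaultdict


class NaiveBayes:
    """Plain multinomial naive Bayes with Laplace smoothing (a stand-in,
    not the improved algorithm referred to in the source)."""

    def __init__(self, alpha: float = 1.0):
        self.alpha = alpha
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of messages
        self.vocab = set()

    def fit(self, docs, labels):
        for tokens, label in zip(docs, labels):
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical feature rule: "发票" (invoice) with optional whitespace inserted.
SPAM_RULES = [re.compile(r"发\s*票")]


def classify_sms(raw_text, tokens, spam_model, type_model):
    """Stage 1: spam vs. non-spam. Stage 2: spam category (spam only)."""
    rule_hit = any(rule.search(raw_text) for rule in SPAM_RULES)
    if not (rule_hit or spam_model.predict(tokens) == "spam"):
        return ("ham", None)
    return ("spam", type_model.predict(tokens))


# Toy training data, invented purely for illustration.
spam_model = NaiveBayes().fit(
    [["代开", "发票", "优惠"], ["中奖", "领取", "奖金"],
     ["明天", "开会", "时间"], ["晚上", "一起", "吃饭"]],
    ["spam", "spam", "ham", "ham"],
)
type_model = NaiveBayes().fit(
    [["代开", "发票", "优惠"], ["中奖", "领取", "奖金"]],
    ["advertising", "fraud"],
)

if __name__ == "__main__":
    print(classify_sms("恭喜中奖，请领取奖金", ["中奖", "领取", "奖金"],
                       spam_model, type_model))
```

Keeping the second-stage classifier separate from the spam decision is what would allow per-user settings, such as blocking one spam category while allowing another, without retraining the first stage.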

Description

Technical field:

[0001] The invention is used for intercepting spam short messages, and in particular relates to a method and a system for filtering and secondary classification of short messages in the short message center of a telecommunication operator.

Background technique:

[0002] Mobile text messages have become a very important form of communication for Chinese people. However, while enjoying this convenience at our thumbs, we face harassment from "spam text messages" at any time. Spam text messages are not only a nuisance; more seriously, they have become a tool for some criminals to spread illegal and criminal information.

[0003] At present, the commonly used SMS filtering methods and mechanisms mainly include: filtering based on keywords, filtering based on content, filtering based on SMS sending volume and sender analysis, etc. Most of the filtering methods follow the general spam processing methods, such ...


Application Information

IPC(8): H04W4/14, G06F17/30
Inventor: 柳呈文
Owner: 北京炎黄新星网络科技有限公司