Classification and identification method and device for junk short messages, computer equipment and storage medium
A technique for classifying and identifying spam text messages, applied in the field of data processing. It addresses the irregular writing of spam text messages and the resulting poor classification results, and achieves accurate classification and identification of spam text messages and accurate extraction of the entity information they contain.
Examples
Embodiment 1
[0023] Figure 1 is a flow chart of a method for classifying and identifying spam text messages provided by Embodiment 1 of the present invention. This embodiment is applicable to identifying spam text messages among a massive number of text messages, classifying them, and extracting the entity information they contain. The method can be executed by a device for classifying and identifying spam text messages; the device can be implemented in software and/or hardware and is generally integrated into computer equipment.
[0024] As shown in Figure 1, the technical solution of this embodiment specifically includes the following steps:
[0025] S110. Perform text filtering on the short message text collection to obtain a spam short message text collection.
[0026] Wherein, the short message text collection includes a plurality of short message texts obtained from the short message platform, and the short message text collection ...
Embodiment 2
[0044] Figure 2 is a flow chart of a method for classifying and identifying spam text messages provided by Embodiment 2 of the present invention. On the basis of the above embodiment, this embodiment further specifies the process of text filtering, the process of classifying the spam short message text collection into spam short message text collections of multiple categories, and the process of entity information extraction. It also adds whitelist and/or blacklist filtering before spam classification, and text preprocessing after text filtering.
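The whitelist and/or blacklist filtering mentioned in this embodiment can be illustrated with a minimal sketch. The excerpt does not say what the lists contain or what happens to a matched message, so the following are assumptions made only for illustration: both lists hold sender numbers, whitelisted senders are treated as trusted, blacklisted senders are flagged directly, and everything else is passed on to the later classification steps. The name `apply_lists` and the `(sender, text)` record layout are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical record layout: (sender, text). Not taken from the patent.
Message = Tuple[str, str]


def apply_lists(
    messages: Iterable[Message],
    whitelist: Set[str],
    blacklist: Set[str],
) -> Tuple[List[Message], List[Message], List[Message]]:
    """Split messages into (trusted, flagged, undecided) by sender.

    Assumed semantics: a whitelisted sender is trusted, a blacklisted
    sender is flagged, and any other message is left undecided for the
    downstream classification models.
    """
    trusted: List[Message] = []
    flagged: List[Message] = []
    undecided: List[Message] = []
    for sender, text in messages:
        if sender in whitelist:
            trusted.append((sender, text))
        elif sender in blacklist:
            flagged.append((sender, text))
        else:
            undecided.append((sender, text))
    return trusted, flagged, undecided
```

In this sketch the whitelist takes precedence when a sender appears in both lists; that ordering is also an assumption.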
[0045] Correspondingly, as shown in Figure 2, the technical solution of this embodiment specifically includes the following steps:
[0046] S210. According to the tagged training short message text collection and the constructed variant font library, train the machine learning model to obtain an entity information extractio...
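The excerpt does not define the variant font library referred to in S210 or say how it enters training. One plausible reading, given that spam text messages are written irregularly, is a table that maps variant or obfuscated characters back to a canonical form before the text reaches the model. The sketch below shows only that normalization step under this assumption; the table entries and the name `normalize` are hypothetical and are not taken from the patent.

```python
from typing import Dict

# Hypothetical variant-character table: variant glyph -> canonical form.
# The entries are illustrative only.
VARIANT_LIBRARY: Dict[str, str] = {
    "薇": "微",   # look-alike character substituted to evade keyword checks
    "①": "1",    # circled digit
    "②": "2",
    "３": "3",    # full-width digit
}


def normalize(text: str, library: Dict[str, str] = VARIANT_LIBRARY) -> str:
    """Replace every character found in the library with its canonical form."""
    # str.maketrans accepts a dict whose keys are single characters.
    return text.translate(str.maketrans(library))
```

For example, `normalize("加薇信①②３")` maps each variant character through the table and leaves all other characters unchanged.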
Embodiment 3
[0086] Figure 3 is a schematic structural diagram of a device for classifying and identifying spam text messages provided by Embodiment 3 of the present invention. The device can be implemented in software and/or hardware and is generally integrated into computer equipment. The device includes a text filtering module 310, a category spam short message text collection acquisition module 320, and an entity information extraction module 330, where:
[0087] The text filtering module 310 is configured to perform text filtering on the short message text collection to obtain a spam short message text collection;
[0088] The category spam short message text collection acquisition module 320 is configured to input the spam short message text collection into the primary classification model and the secondary classification model in succession to obtain spam short message text collections of multiple categories;
[0089] The entity information extraction module 330 is configured to input the spam short message text collections of each category into the entity informatio...
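The data flow through modules 310, 320 and 330 can be sketched as three stages chained together. This is a rough illustration, not the patented implementation: the models are passed in as plain callables, all function names are hypothetical, and the way the secondary classifier receives the primary label is an assumption, since the excerpt only says the two models are applied in succession.

```python
import re
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Category = Tuple[str, str]  # (primary label, secondary label); assumed layout


def classify_and_extract(
    messages: Iterable[str],
    is_spam: Callable[[str], bool],
    primary: Callable[[str], str],
    secondary: Callable[[str, str], str],
    extract: Callable[[str], List[str]],
) -> Dict[Category, List[List[str]]]:
    """Filter, classify in two stages, then extract entities per category."""
    # Stage 1 (role of module 310): keep only texts judged to be spam.
    spam = [m for m in messages if is_spam(m)]

    # Stage 2 (role of module 320): primary model, then secondary model.
    by_category: Dict[Category, List[str]] = defaultdict(list)
    for m in spam:
        coarse = primary(m)
        fine = secondary(m, coarse)
        by_category[(coarse, fine)].append(m)

    # Stage 3 (role of module 330): entity extraction on each category set.
    return {cat: [extract(m) for m in msgs] for cat, msgs in by_category.items()}


# Toy stand-ins for the trained models, for demonstration only.
def toy_is_spam(text: str) -> bool:
    return "prize" in text.lower() or "loan" in text.lower()


def toy_primary(text: str) -> str:
    return "finance" if "loan" in text.lower() else "promotion"


def toy_secondary(text: str, coarse: str) -> str:
    return "with_contact" if re.search(r"\d{5,}", text) else "no_contact"


def toy_extract(text: str) -> List[str]:
    return re.findall(r"\d{5,}", text)
```

Passing the models in as callables keeps the sketch independent of any particular classifier or extractor; in practice each stand-in would be replaced by a trained model.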