
A method for intelligently extracting tags from text

This tag-extraction technology, applied in special data processing applications, instruments, electrical digital data processing, and related fields, addresses the problem of inaccurate tag extraction and yields more accurate tag content.

Status: Inactive. Publication Date: 2011-12-21
BEIJING JINHER SOFTWARE

AI Technical Summary

Problems solved by technology

[0004] Another object of the present invention is to solve the problem of inaccurate label extraction, make




Embodiment Construction

[0023] The present invention will be further described below in conjunction with the accompanying drawings, so that those of ordinary skill in the art can implement it after referring to this specification.

[0024] As shown in Figure 1, the method of the present invention for intelligently extracting tags from text comprises the following steps:

[0025] Step 1. The system receives the text string input by the user and stores it in memory.

[0026] Step 2. The text string is split into keywords using a Chinese word segmentation algorithm.
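The source does not name the segmentation algorithm used in Step 2. As a minimal sketch, assuming a dictionary-based forward maximum matching segmenter (a common baseline, not necessarily the patented one), the splitting could look like the following. The function name `forward_max_match`, the `max_len` parameter, and the toy dictionary are all hypothetical.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Greedy forward maximum matching over a set of known words.

    Assumption: this stands in for the unspecified Chinese word
    segmentation algorithm of Step 2.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a dictionary hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Toy dictionary invented for this example.
TOY_DICT = {"移动", "互联网", "产品", "标签", "提取"}

if __name__ == "__main__":
    print(forward_max_match("移动互联网产品标签提取", TOY_DICT))
```

Characters that match no dictionary entry fall through as single-character tokens, so the segmenter always terminates and never drops input.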

[0027] Step 3. Define a data structure for storing keywords that contains attributes such as word frequency, word length, and part of speech. For each word produced by the Chinese word segmentation algorithm, attribute information such as word length, word frequency, and part of speech must be extracted and digitized to form attribute values. If the total number of words in a file is 100, and the word "mobile ...
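The keyword record described in Step 3 could be sketched as below. This is an illustration under stated assumptions: the class name `Keyword`, the helper `build_keywords`, and the numeric part-of-speech codes in `POS_CODES` are hypothetical, and expressing word frequency as occurrences divided by total word count is one plausible way to digitize it, not something the visible text confirms.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical numeric codes; the source does not list actual values.
POS_CODES = {"noun": 3, "verb": 2, "adjective": 1}


@dataclass
class Keyword:
    word: str
    frequency: float  # assumption: occurrences / total number of words
    length: int       # number of characters in the word
    pos_code: int     # numeric stand-in for the part of speech


def build_keywords(tagged_words):
    """Build one Keyword per distinct word.

    tagged_words: list of (word, part_of_speech) pairs, as a segmenter
    with part-of-speech tagging might produce.
    """
    total = len(tagged_words)
    counts = Counter(word for word, _ in tagged_words)
    # If a word appears with several tags, the last one seen is kept.
    pos_of = {word: pos for word, pos in tagged_words}
    return [
        Keyword(word, counts[word] / total, len(word),
                POS_CODES.get(pos_of[word], 0))
        for word in counts
    ]


if __name__ == "__main__":
    sample = [("tag", "noun"), ("extract", "verb"),
              ("tag", "noun"), ("fast", "adjective")]
    for keyword in build_keywords(sample):
        print(keyword)
```

Unknown parts of speech map to code 0 here, which keeps the record numeric without guessing at a value the source does not give.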


Abstract

The invention discloses a method for intelligently extracting tags from text. The method applies a Chinese word segmentation algorithm to a text string held in memory to split it into keywords, calculates a weight for each word based on word frequency, word length, part of speech, and similar attributes, sorts the words in descending order of weight, and outputs a specified number of words as the result. Because all processing takes place in memory and the algorithm design is concise, analysis speed is maintained. A piece of text passes through word segmentation, weighting, word formation, filtering, and sorting, which improves tag accuracy. The algorithm can be packaged independently as a component, so it can be applied to any product that requires tag extraction from text.
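The weighting, descending sort, and top-N selection summarized in the abstract can be illustrated with a short sketch. The abstract does not give the weight formula, so the linear combination and the default coefficients below are assumptions made purely for illustration; `top_tags` and its parameters are hypothetical names.

```python
def top_tags(records, limit, coefficients=(10.0, 1.0, 1.0)):
    """Return the `limit` highest-weighted words.

    records: iterable of (word, frequency, length, pos_code) tuples.
    coefficients: hypothetical multipliers for frequency, length, and
    part-of-speech code; the source does not specify a formula.
    """
    c_freq, c_len, c_pos = coefficients
    scored = [
        (c_freq * freq + c_len * length + c_pos * pos_code, word)
        for word, freq, length, pos_code in records
    ]
    # Highest weight first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [word for _, word in scored[:limit]]


if __name__ == "__main__":
    sample = [
        ("internet", 0.10, 8, 3),
        ("is", 0.20, 2, 0),
        ("extract", 0.05, 7, 2),
        ("tag", 0.15, 3, 3),
    ]
    print(top_tags(sample, 2))
```

With these invented coefficients, a frequent but short function word such as "is" ranks below longer content words, which is the kind of effect combining several attributes is meant to produce.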

Description

Technical field

[0001] The invention relates to text mining technology in the field of artificial intelligence, in particular to text mining technology applied to extracting tags from text in Internet products.

Background art

[0002] With the rapid development of the Internet, the amount of information on the network grows daily, and the Internet has become an important source of information for people. As the Internet and information technology develop, we face the dilemma of having excess information but little knowledge. How to quickly and effectively discover valuable, usable information in massive amounts of data, accurately locate the required information, and filter information well has become a mainstream concern in the information field. At present, many Internet products, such as blogs and microblogs, use tags to describe the core ideas expressed in a passage of text. These p...


Application Information

IPC(8): G06F17/30, G06F17/27
Inventor: 李军锋, 吕福军, 李跃海
Owner: BEIJING JINHER SOFTWARE