Electronic text similarity processing method and system convenient for query

An electronic text processing technology, applied in the fields of electronic digital data processing, special data processing applications, instruments, etc. It addresses the problems that search results are not precise or convenient enough, that similarity comparison takes too much calculation time, and that existing methods cannot identify whether a difference or commonality between texts is important.

Inactive Publication Date: 2008-08-20
刘二中

AI Technical Summary

Problems solved by technology

However, this technique has serious drawbacks.
[0007] The first is that the amount of calculation is too large: when each text to be compared is long and the number of texts is large, considerable calculation time is required.
Several targeted improvements have been proposed, such as Yahoo's US Patent No. 6990628 on "measurement of electronic text similarity"; IBM's Chinese patent CN1112647C, "system and method for classifying documents in document collections in response to a query"; Fudan University's Chinese patent CN1220159C, "a high-dimensional vector data fast similarity retrieval method"; Hewlett-Packard's Chinese patent CN1269064C on "document and information retrieval method and equipment"; and Baidu's Chinese patent CN1209726C on "a method for identifying mirror image and quasi-mirror image websites on the Internet", which only performs a similarity comparison of home pages. These have made very limited improvements to the first defect above.
[0008] The second defect is that the results of similarity processing are often of limited help to the inquirer. Although similar files have obvious commonalities, they also differ in certain respects, and the information the inquirer is interested in is likely to lie where they differ; key differences tend to significantly affect the category of a text.
Existing technologies, including U.S. Patent No. 6990628, cannot identify whether a given difference or similarity between two texts is important, so the search results they provide are neither rigorous nor convenient.


Embodiment Construction

[0079] Hereinafter, the computer-implemented method for processing multiple electronic texts provided by the present invention is described in detail by way of example.

[0080] To use the method of the present invention, it is first necessary to:

[0081] [i] Obtain multiple electronic texts containing the same keyword query item.

[0082] The electronic text, or text, refers to a file, text, web page, abstract, catalog, title, index, chapter, paragraph, or other information containing text or character content, held in a computer, database, information storage device, the Internet, a server, a search engine database, a data processor, or a similar device.

[0083] Further, [ii] determine the same delineation range for the adjacent content of the keyword query item in each text. The adjacent content of the keyword query item is the content, other than the keyword query item itself, that lies within a delineated range adjacent to it in the text. Specifically, the ...
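
Steps [i] and [ii] can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `texts_with_keyword` and `adjacent_content` are hypothetical, and the delineation range is assumed here to be a fixed number of characters on each side of the keyword, since the excerpt does not say how the range is actually chosen.

```python
def texts_with_keyword(texts: list[str], keyword: str) -> list[str]:
    """Step [i] (sketch): keep only the texts that contain the keyword query item."""
    return [t for t in texts if keyword in t]


def adjacent_content(text: str, keyword: str, window: int = 30) -> list[str]:
    """Step [ii] (sketch): for each occurrence of the keyword, return the text
    within `window` characters before and after it, excluding the keyword itself.

    The fixed character window is an assumption made for illustration only.
    """
    if not keyword:
        raise ValueError("keyword must be a non-empty string")
    contexts = []
    start = text.find(keyword)
    while start != -1:
        end = start + len(keyword)
        before = text[max(0, start - window):start]
        after = text[end:end + window]
        contexts.append(before + after)
        # Continue searching after the current occurrence.
        start = text.find(keyword, end)
    return contexts
```

Applying the same `window` value to every text corresponds to using "the same delineation range" across texts, so that the extracted contents are comparable.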

Abstract

Disclosed are an electronic text processing method convenient for query and search, and a retrieval or search system containing a similar-content comparison processing device based on keyword search. The invention compares the contents within a delineated range around the keyword query item in each text, determines whether those contents are similar, and classifies the texts accordingly, so as to perform operations such as separating subsets, arranging sequences, forming directories, sorting, and displaying an interface. The invention can considerably improve the convenience and rigor of information retrieval or network information search.
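
The classification described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the excerpt does not name a similarity measure, so the sketch substitutes the ratio from Python's standard `difflib.SequenceMatcher`, and the function name `group_by_similarity`, the greedy grouping strategy, and the `threshold` value are all hypothetical.

```python
from difflib import SequenceMatcher


def group_by_similarity(contexts: list[str], threshold: float = 0.8) -> list[list[int]]:
    """Separate texts into subsets by comparing their keyword-adjacent contents.

    `contexts[i]` is the adjacent content extracted from text i. Each text joins
    the first existing group whose first member is similar enough, otherwise it
    starts a new group. Returns groups as lists of indices into `contexts`.
    """
    groups: list[list[int]] = []
    for i, ctx in enumerate(contexts):
        for group in groups:
            representative = contexts[group[0]]
            # Assumed similarity measure: difflib's ratio in [0, 1].
            if SequenceMatcher(None, representative, ctx).ratio() >= threshold:
                group.append(i)
                break
        else:
            # No existing group was similar enough.
            groups.append([i])
    return groups
```

Because only the short adjacent contents are compared rather than whole documents, a scheme of this kind compares far less text per document, which is the kind of saving the stated calculation-time problem calls for.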

Description

(1) Technical field

[0001] The present invention relates to electronic text processing and to retrieval or search technology for computers and search engines.

(2) Background technology

[0002] Over the past 20 years, computer database retrieval technology has developed greatly. In particular, advances in network technology such as the Internet have made the scale of the databases that people can share reach astronomical figures. To make it easier for users to find the information or files they need, classification or directory retrieval systems emerged. This technology is well suited to mature classification fields that people know well, but in the wider field of massive information such systems are difficult to establish and difficult to master and use.

[0003] Retrieval and search engine technology built around keyword search brings convenience to users. The system can obtain the keyword query request of the inquirer through the inte...

Application Information

IPC(8): G06F17/30
Inventor: 刘二中
Owner: 刘二中