
Fuzzy keyword query method and system based on weighted edit distance

A weighted edit distance query method, applied in the field of keyword query and search, that accounts for the error probability behind an edit distance (adjacent keys and similarly shaped characters are easily mistyped for one another) and achieves high time efficiency and interactivity.

Inactive Publication Date: 2010-12-15
WUHAN UNIV
3 Cites, 23 Cited by

AI Technical Summary

Problems solved by technology

However, existing fuzzy search techniques do not consider how likely each specific error behind an edit distance is.
For example, because of the keyboard layout, adjacent keys are more likely to be mistyped for one another than keys that are far apart, and characters that are similar in shape are also easily confused.
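
The two error sources described above can be illustrated with a small substitution-cost function. This is only a sketch under stated assumptions: the simplified QWERTY grid, the look-alike pairs, the `discount` parameter, and the function names are hypothetical and do not come from the patent text.

```python
# Hypothetical sketch: the layout, the look-alike pairs, and the 0.5 discount
# are illustrative assumptions, not values taken from the patent.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]

# Map each key to its (row, column) position on the simplified grid.
KEY_POS = {
    ch: (r, c)
    for r, row in enumerate(QWERTY_ROWS)
    for c, ch in enumerate(row)
}

# Assumed examples of visually similar characters.
SIMILAR_SHAPES = {frozenset("0o"), frozenset("1l"), frozenset("5s")}


def are_adjacent_keys(a: str, b: str) -> bool:
    """True if a and b are different keys at most one row and one column apart."""
    if a == b or a not in KEY_POS or b not in KEY_POS:
        return False
    (r1, c1), (r2, c2) = KEY_POS[a], KEY_POS[b]
    return abs(r1 - r2) <= 1 and abs(c1 - c2) <= 1


def substitution_cost(a: str, b: str, discount: float = 0.5) -> float:
    """Cost of substituting b for a: 0 if equal, reduced if a likely slip, else 1."""
    if a == b:
        return 0.0
    if are_adjacent_keys(a, b) or frozenset((a, b)) in SIMILAR_SHAPES:
        return 1.0 - discount
    return 1.0
```

The grid ignores the stagger of a physical keyboard, so the adjacency test is approximate; a real system would presumably use a measured confusion table instead.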



Examples


Embodiment 1

[0126] Suppose q is the query entered by the user, k is the maximum number of results the system returns to the user, δ is the edit distance threshold between every returned result and q, η is the basic weight of the weighted edit distance, W is the set of keywords, and TR is the Trie tree built on W.

[0127] The specific method flow is as follows:

[0128] ① Assume that triples (t, ξ, θ) are stored in sets P and P′, where t represents a node on the Trie tree and also the corresponding prefix string; ed represents the edit distance; wed represents the weighted edit distance. Initialize P = {(t, ξ, θ) | t ∈ W ∧ length(t) ≤ δ ∧ ξ ← length(t) ∧ θ ← length(t)*(1-η)}, where length(t) is the length of the string t. Set the variable i ← 1.

[0129] ② If the length of the string q is less than i, go to ⑦; otherwise, set c ← q[i] and go to ③.

[0130] ③ If P is empty, go to ⑥; otherwise, go to ④.

[0131] ④ Take an element from P, and delete the element from P. If ξ to P' (delete operation). Fo...
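
The initialization in step ① can be sketched as follows. This is a minimal illustration, assuming each element of P is a triple (t, ξ, θ) and that t ranges over the Trie prefixes of the keywords in W; the names `TrieNode`, `build_trie`, and `init_active_set` are hypothetical, and the later steps are not reproduced because the excerpt is truncated.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    prefix: str = ""            # the string spelled by the path from the root
    is_word: bool = False       # True if this prefix is a complete keyword
    children: dict = field(default_factory=dict)


def build_trie(words):
    """Build a Trie over the keyword set W."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode(prefix=node.prefix + ch)
            node = node.children[ch]
        node.is_word = True
    return root


def init_active_set(root, delta, eta):
    """Collect (node, xi, theta) for every Trie node with length(t) <= delta.

    Following step 1: xi <- length(t) and theta <- length(t) * (1 - eta).
    Including the root (empty prefix) is an assumption of this sketch.
    """
    active = []
    stack = [root]
    while stack:
        node = stack.pop()
        depth = len(node.prefix)
        active.append((node, depth, depth * (1 - eta)))
        if depth < delta:
            # Children have depth + 1, which is still within the threshold.
            stack.extend(node.children.values())
    return active
```

Only nodes within δ of the root are touched, so the cost of this step depends on the Trie's fan-out near the root rather than on the size of W.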



Abstract

The invention relates to the technical field of keyword query and search, in particular to a fuzzy keyword query method and system based on weighted edit distance. A traditional information retrieval system requires users to provide a precise query word in order to retrieve results. Existing fuzzy retrieval systems overcome this shortcoming and support fault-tolerant search. However, when sorting the returned results, these systems do not consider that input errors are more probable between characters on adjacent keys and between characters of similar shape, which greatly lowers user satisfaction. The invention therefore provides a weighted edit distance that accounts for adjacent-key characters and shape-similar characters, and assigns an appropriate weight to approximately matching keywords that fit these two situations so that they rank nearer the front. Based on the weighted edit distance, the search algorithm provided by the invention adopts a Trie tree structure and is real-time and interactive. The invention more effectively returns the data users actually intend to query and improves user satisfaction.
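
The ranking idea in the abstract can be sketched with a standard dynamic-programming edit distance whose substitution cost is pluggable. This is an illustration under assumptions, not the patented algorithm: it scans the keyword list directly instead of walking a Trie, and `likely_pairs`, `discount`, and both function names are hypothetical. The parameters k and δ follow their use in the embodiment (result limit and edit distance threshold).

```python
def levenshtein(a, b, sub_cost):
    """Edit distance between a and b with unit insert/delete and a custom substitution cost."""
    prev = [float(j) for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [float(i)]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1.0,                   # delete ca
                curr[j - 1] + 1.0,               # insert cb
                prev[j - 1] + sub_cost(ca, cb),  # substitute or match
            ))
        prev = curr
    return prev[-1]


def fuzzy_top_k(query, keywords, k, delta, likely_pairs, discount=0.5):
    """Return up to k keywords within plain edit distance delta, ranked by weighted distance."""

    def plain(x, y):
        return 0.0 if x == y else 1.0

    def weighted(x, y):
        if x == y:
            return 0.0
        # Assumed rule: substitutions listed as likely slips cost less.
        return 1.0 - discount if frozenset((x, y)) in likely_pairs else 1.0

    scored = []
    for word in keywords:
        if levenshtein(query, word, plain) <= delta:
            scored.append((levenshtein(query, word, weighted), word))
    scored.sort()  # smaller weighted distance ranks first
    return [word for _, word in scored[:k]]
```

For example, with the query "cat", the keywords "cst" and "cpt" are both at plain edit distance 1, but if a/s is listed in `likely_pairs` then "cst" ranks ahead of "cpt".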

Description

Technical field

[0001] The invention relates to the technical field of keyword query and search, in particular to a fuzzy keyword query method and system based on weighted edit distance.

Background

[0002] In traditional information retrieval systems, users must input a precise query word to retrieve the desired information. When the query is incomplete or wrong, the system often fails to return any results, which greatly reduces user satisfaction (see Document 1, Document 2).

[0003] To remedy this drawback of traditional information retrieval systems, the most commonly used solution is automatic completion (see Document 8, Document 9). When the user inputs part of the query word, the system displays the query words the user may need according to its internal data. When the required query word has appeared in the list box, the user can directly s...


Application Information

IPC(8): G06F17/30
Inventor 李石君, 顾小燕, 江会福, 方传云
Owner WUHAN UNIV