Bit mark character string retrieval technique

A string retrieval technology applied in digital data information retrieval, unstructured text data retrieval, electronic digital data processing, etc.

Active Publication Date: 2009-07-22
徐文新

AI Technical Summary

Problems solved by technology

[0005] On October 19, 2004, I applied for a patent titled "Prime Number Substitution String Retrieval Technology", application number 200410067258.X. To achieve better results with that technique, more space is needed to store the prime number product value.



Examples


Embodiment Construction

[0154] The present invention has been implemented successfully for fuzzy retrieval over both Chinese and English character string databases. Below is VB6.0 code that applies "1" marking to a Chinese character string database in SQL Server 2000. Fuzzy retrieval of bit-marked strings in other programming languages and databases can follow this implementation.

[0155] 1. Create a database

[0156] Suppose the database shuku has a table biao with a field shuming of data type nvarchar and length 40. Create another field, wei, of data type "long integer", that is, 4 bytes (32 bits). One of these bits is the sign bit, and the remaining 31 bits can be used for marking.

[0157] 2. VB6.0 has no command that directly sets a bit to "1" or "0", so the bitwise "Or" operation is used to "mark" the database strings.

[0158] Dim shuzu(30) As Long

[0159] ' Define an array of Long integers with 31 elements (indices 0 to 30).

[0160] shuzu(0) = 1

[0161]...
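
The marking step can also be sketched outside VB6.0. The Python sketch below is an illustration under stated assumptions, not the patent's code. It assumes one single-bit mask per usable bit (1, 2, 4, ...), which may or may not match the elided array entries. It also uses a hypothetical grouping rule, code point modulo 31, because the excerpt does not show how characters are assigned to groups. The names `MASKS`, `group_of`, and `bit_value` are invented for this example.

```python
N_BITS = 31  # usable bits of a signed 32-bit integer, as in the wei field

# Assumed layout: one single-bit mask per group, 1, 2, 4, ..., 2**30.
MASKS = [1 << i for i in range(N_BITS)]


def group_of(ch: str) -> int:
    """Hypothetical grouping rule: code point modulo the number of bits."""
    return ord(ch) % N_BITS


def bit_value(s: str) -> int:
    """1 marking: start from all zeros and OR in the mask of each character's group."""
    w = 0
    for ch in s:
        w |= MASKS[group_of(ch)]
    return w


if __name__ == "__main__":
    # 'a', 'b', 'c' fall into groups 4, 5, 6 under the assumed rule.
    print(bit_value("abc"))  # 16 + 32 + 64 = 112
```

Because marking only ever ORs bits in, repeated characters and character order do not change the result, and the value always fits in the 31 non-sign bits.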


Abstract

The invention relates to a string retrieval technique in which one bit corresponds to several character elements and n bits correspond to all character elements; that is, all character elements are divided into n groups. The n bits of a data item, initially all 0 and denoted W, are used to mark the character element information of a string. If a character element P1 of a string S belongs to the nth group, the nth bit of W is marked as 1; W is marked in the same way according to the groups of the other character elements P2, P3, P4, and so on, until all character elements of S have been marked. W then records the information of S and is called the bit value of S; this mode is called 1 marking. By the principles of logic algebra, the n bits of a data item that are initially all 1, also denoted W, can equally be used to mark the character elements that form a string: if a character element P of S belongs to the nth group, the corresponding nth bit of W is marked as 0. This mode is called 0 marking. Whether a string Sb does not include, includes, or may include all character elements of the retrieval keyword Sa can be judged by comparing the bit value Wa of Sa with the bit value Wb of Sb. For example, a bitwise implication operation is carried out on Wa and Wb; if the implication holds for every bit, Sb includes or may include all character elements of Sa. If necessary, the final result can be obtained by character-by-character comparison. Bit marking can be used for normal retrieval as well as for reverse retrieval. If the number of bits n used for marking is more than twice the average string length m, a combination of bits can be made to correspond to each group of character elements in order to improve screening efficiency; this method is called multi-bit marking. With multi-bit marking, character-by-character comparison can likewise be used for the final judgment of whether Sb includes Sa.
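
The screening step described in the abstract can be sketched as follows. This is a minimal Python illustration under assumptions, not the patented implementation: the grouping rule (code point modulo 31) and the names `bit_value`, `may_contain`, and `search` are hypothetical, and a plain substring test stands in for the final character comparison.

```python
N_BITS = 31
FULL = (1 << N_BITS) - 1  # mask covering the 31 marking bits


def group_of(ch: str) -> int:
    """Hypothetical grouping rule: code point modulo the number of bits."""
    return ord(ch) % N_BITS


def bit_value(s: str) -> int:
    """1 marking: set the bit of every group that occurs in s."""
    w = 0
    for ch in s:
        w |= 1 << group_of(ch)
    return w


def may_contain(wa: int, wb: int) -> bool:
    """Bitwise implication Wa -> Wb: every bit set in Wa is also set in Wb."""
    return (wa & ~wb & FULL) == 0


def search(keyword: str, records):
    """records is a list of (text, stored_bit_value) pairs."""
    wa = bit_value(keyword)
    hits = []
    for text, wb in records:
        if not may_contain(wa, wb):
            continue  # some character of the keyword is certainly absent
        if keyword in text:  # final confirmation by character comparison
            hits.append(text)
    return hits


if __name__ == "__main__":
    texts = ["database retrieval", "string", "bit mark"]
    records = [(t, bit_value(t)) for t in texts]
    print(search("ring", records))
```

A failed implication rules a record out with certainty, while a successful one only means the record may match. For example, "ba" passes the bit test for the keyword "ab" but fails the final comparison.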

Description

Technical field

[0001] The invention is a character string retrieval technology whose purpose is to improve the speed of fuzzy retrieval over character strings. One bit corresponds to several character elements, and n bits correspond to all character elements; that is, all character elements are divided into n groups. The n bits of a data item, all 0 and denoted W_F, are used to mark the information of the character elements that make up a string. If a character element P_1 of a string S belongs to the nth group, the nth bit of W is marked as 1; W is marked in the same way according to the groups to which the other character elements P_2, P_3, P_4 ... of S belong. After all character elements have been marked, W records the information of S and is called the "bit value" of S; this method is called 1 marking. According to the principles of logic algebra, it is also possible to use the n bits of a data item that are all 1, denoted W_T, to mark the meta-information of the characte...
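
The relationship between the two marking modes (an all-0 word marked with 1s, and an all-1 word marked with 0s) can be illustrated with a short Python sketch. It rests on assumptions not stated in the source: a hypothetical grouping rule (code point modulo 31) and invented function names. Under these assumptions the 0-marked value is the bitwise complement of the 1-marked value within the 31 marking bits, so the direction of the containment test is reversed.

```python
N_BITS = 31
FULL = (1 << N_BITS) - 1  # all 31 marking bits set


def group_of(ch: str) -> int:
    """Hypothetical grouping rule: code point modulo the number of bits."""
    return ord(ch) % N_BITS


def bit_value_1(s: str) -> int:
    """1 marking: start from all zeros, set the bit of each group present."""
    w = 0
    for ch in s:
        w |= 1 << group_of(ch)
    return w


def bit_value_0(s: str) -> int:
    """0 marking: start from all ones, clear the bit of each group present."""
    w = FULL
    for ch in s:
        w &= ~(1 << group_of(ch))
    return w


def may_contain_0(wa_t: int, wb_t: int) -> bool:
    """Under 0 marking the test flips: every bit still set in Wb must be set in Wa."""
    return (wb_t & ~wa_t & FULL) == 0


if __name__ == "__main__":
    s = "string"
    # The two bit values are complements of each other within 31 bits.
    print(bit_value_0(s) == FULL ^ bit_value_1(s))
    print(may_contain_0(bit_value_0("ring"), bit_value_0(s)))
```

Either mode carries the same information; the choice only affects which bitwise test is applied during screening.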


Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06F17/30
CPC: G06F17/30613, G06F16/31
Inventor: 徐文新
Owner: 徐文新