Index tree building method, Chinese vocabulary searching method and related device

Index and sub-index technology, applied in the search field, addresses problems such as hash collisions and high repetition rates, and achieves few hash collisions, simple construction, and fast search.

Inactive Publication Date: 2014-01-15
SHENZHEN LONG VISION

AI Technical Summary

Problems solved by technology

Chinese is written in a logographic script, also called morpheme-syllable or word-syllable writing, in which each graphic symbol represents both a morpheme and a syllable. Because many characters share the same pronunciation, sorting and searching by pinyin yields too high a repetition rate.
If a hash table is used instead, hash collisions become difficult to resolve as the Chinese vocabulary continues to grow.



Embodiment Construction

[0047] In order to make the object, technical solution, and advantages of the present invention clearer, the present invention will be further described in detail below with reference to the accompanying drawings and examples.

[0048] As shown in Figure 1, Embodiment 1 of the method for constructing an index tree provided by the present invention includes:

[0049] Step S101, establishing keywords and index information corresponding to the keywords, where the keywords include at least one Chinese character.

[0050] Specifically, when building a vertical search engine for a particular professional field, metadata must be collected from the network and analyzed to obtain the original keywords it contains. To support fuzzy search, each original keyword can be expanded and split to generate a keyword set; each keyword in the set is an approximate form of the original keyword, and the index information of each keyword is ...
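The text does not say how an original keyword is "expanded and split". As a minimal sketch under one assumed reading, the keyword set could consist of every contiguous substring of the original keyword above a minimum length. The function name `expand_keyword` and the `min_len` parameter are hypothetical and do not come from the patent.

```python
from typing import Set


def expand_keyword(original: str, min_len: int = 1) -> Set[str]:
    """Return every contiguous substring of `original` that has at least
    `min_len` characters. Assumption: 'approximate words' are substrings;
    the source does not specify the actual expansion rule."""
    n = len(original)
    return {
        original[i:j]
        for i in range(n)
        for j in range(i + min_len, n + 1)
    }


if __name__ == "__main__":
    # Print the assumed keyword set for one original keyword.
    for kw in sorted(expand_keyword("搜索引擎", min_len=2), key=len):
        print(kw)
```

Under this assumption the original keyword itself is always a member of the set, so an exact query still matches.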


Abstract

The invention discloses a method and device for building an index tree. The method comprises the following steps: a keyword and its corresponding index information are established, where the keyword comprises at least one Chinese character; the keyword is split into a first etymon sequence, and a pre-built index tree is searched according to that sequence to determine whether the tree contains a path that starts at the root node and whose nodes, taken in order, form an etymon sequence matching the first etymon sequence; if no such path exists, the last node in the index tree matched by the first etymon sequence is obtained, a sub-index tree is built under that node for the unmatched remainder of the etymon sequence, and the keyword and its index information are stored in the last node of the sub-index tree. The invention further provides a Chinese vocabulary searching method and device. Chinese vocabulary is divided into CXME to build a vertical index tree; the method and device are simple to build, fast to search, and produce few hash collisions.
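The insertion procedure in the abstract resembles insertion into a prefix tree (trie) keyed by etymons. The following is a minimal sketch of that reading, not the patent's implementation. The decomposition table `ETYMONS`, the `Node` class, and the functions `split_into_etymons`, `longest_match`, `insert`, and `search` are all hypothetical names introduced for illustration. What happens when the full path already exists is not covered by the abstract; this sketch assumes the keyword is stored in the existing final node.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical decomposition table; the source does not provide one.
ETYMONS: Dict[str, List[str]] = {
    "好": ["女", "子"],
    "妈": ["女", "马"],
    "明": ["日", "月"],
}


def split_into_etymons(keyword: str) -> List[str]:
    """Split a keyword into an etymon sequence; characters missing from
    the table are kept whole (an assumption of this sketch)."""
    seq: List[str] = []
    for ch in keyword:
        seq.extend(ETYMONS.get(ch, [ch]))
    return seq


@dataclass
class Node:
    children: Dict[str, "Node"] = field(default_factory=dict)
    # keyword -> index information; a dict because different keywords
    # can decompose into the same etymon sequence.
    entries: Dict[str, object] = field(default_factory=dict)


def longest_match(root: Node, seq: List[str]) -> Tuple[Node, int]:
    """Walk from the root while etymons match; return the last matched
    node and the number of etymons matched."""
    node, i = root, 0
    while i < len(seq) and seq[i] in node.children:
        node = node.children[seq[i]]
        i += 1
    return node, i


def insert(root: Node, keyword: str, index_info: object) -> None:
    seq = split_into_etymons(keyword)
    node, matched = longest_match(root, seq)
    # Build the sub-index tree for the unmatched remainder, if any.
    for etymon in seq[matched:]:
        child = Node()
        node.children[etymon] = child
        node = child
    # Store the keyword and its index information in the last node.
    node.entries[keyword] = index_info


def search(root: Node, word: str) -> Optional[object]:
    seq = split_into_etymons(word)
    node, matched = longest_match(root, seq)
    if matched < len(seq):
        return None  # no complete path for this etymon sequence
    return node.entries.get(word)


if __name__ == "__main__":
    root = Node()
    insert(root, "好", {"doc": 1})
    insert(root, "妈", {"doc": 2})
    print(search(root, "好"), search(root, "妈"), search(root, "明"))
```

In this sketch, "好" and "妈" share the node for the etymon "女", so the second insertion adds only one new node. That sharing is what keeps such a tree compact compared with one entry per whole keyword.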

Description

Technical field

[0001] The invention relates to the search field, in particular to a method for constructing an index tree, a method for searching Chinese words and related devices.

Background technique

[0002] Search engines use specific computer programs to collect information from the Internet according to certain strategies, organize and process the information, provide retrieval services for users, and display the retrieved relevant information to users. Existing search engines include full-text index, directory index, meta search engine, vertical search engine and so on.

[0003] With the development of the Internet, information is growing explosively, and search technology is becoming more and more important to netizens. The full-text search led by Google, Yahoo, and Baidu is well known to everyone. However, because the full-text search engine uses pure keyword matching, the recall rate and precision rate of information are still quite low. When a user enters a keyw...


Application Information

IPC(8): G06F17/30
CPC: G06F16/907, G06F16/951
Inventor: 李勇
Owner: SHENZHEN LONG VISION