Method for conducting term matching on basis of double-array lexicographic tree

A double-array dictionary tree (trie) technique, applied in the field of computer communication, addresses slow term indexing and slow term queries. It speeds up matching, improves overall performance, and gives a better user experience.

Active Publication Date: 2017-05-10
IOL WUHAN INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0004] The technical problem to be solved by the present invention is that current database-backed term matching engines are relatively slow at word lookup. The remedy is to build a fast index over the terms in the database; introducing a double-array dictionary tree addresses the slow indexing and slow querying of a large number of terms.

Examples

Embodiment Construction

[0035] The technical solutions of the present invention will be further specifically described below in conjunction with the accompanying drawings and specific embodiments.

[0036] To solve the above technical problems, the present invention provides a method for term matching based on a double-array dictionary tree, characterized in that it includes a step of building an index and a step of using the index to perform term query matching, as shown in Figure 1.
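For readers unfamiliar with the data structure, the following is a minimal sketch of a generic double-array dictionary tree using the textbook base/check formulation. It is not taken from the patent: the function names `build` and `contains`, the end-of-term code `END`, the character-to-code mapping, and the first-fit search for a free base value are all illustrative assumptions.

```python
from collections import deque

END = 1    # assumed code reserved for the end-of-term marker
FREE = -1  # marks an unused slot in the check array


def build(terms):
    """Build (codes, base, check) for a set of terms (illustrative sketch)."""
    terms = list(terms)
    alphabet = sorted({ch for term in terms for ch in term})
    codes = {ch: i + 2 for i, ch in enumerate(alphabet)}  # character codes start at 2

    # Step 1: build an ordinary nested-dict trie keyed by character code.
    root = {}
    for term in terms:
        node = root
        for ch in term:
            node = node.setdefault(codes[ch], {})
        node[END] = {}

    # Step 2: pack the trie into two flat arrays, breadth first.
    base, check = [0], [0]  # slot 0 holds the root state
    queue = deque([(0, root)])
    while queue:
        state, children = queue.popleft()
        if not children:
            continue
        labels = sorted(children)
        b = 1
        # First-fit search: find a base value whose child slots are all free.
        while any(b + c < len(check) and check[b + c] != FREE for c in labels):
            b += 1
        top = b + labels[-1]
        if top >= len(check):
            grow = top + 1 - len(check)
            base.extend([0] * grow)
            check.extend([FREE] * grow)
        base[state] = b
        for c in labels:
            check[b + c] = state  # slot b + c is owned by this parent state
            queue.append((b + c, children[c]))
    return codes, base, check


def contains(dat, term):
    """Return True if term was indexed: follow t = base[s] + c while check[t] == s."""
    codes, base, check = dat
    state = 0
    for ch in term:
        c = codes.get(ch)
        if c is None:
            return False
        t = base[state] + c
        if t >= len(check) or check[t] != state:
            return False
        state = t
    t = base[state] + END
    return t < len(check) and check[t] == state


dat = build(["term", "terminal", "tree"])
print(contains(dat, "term"))  # True: "term" was indexed
print(contains(dat, "ter"))   # False: only a prefix of indexed terms
```

Each lookup step costs two array reads and one comparison, which is the usual reason this layout is preferred over pointer-based tries for large term sets.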

[0037] Building an index with the double-array dictionary tree comprises the following three steps, as shown in Figure 2:

[0038] (1) Locate the double-array dictionary tree

[0039] Given the specified number of double-array dictionary trees, compute the hash value of the term to be inserted with a hash algorithm, then take that value modulo the number of double-array dictionary trees to obtain the position number of the double-array dictionary ...
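The hash-then-modulo step described in [0039] can be sketched as follows. The excerpt does not name the hash algorithm, so CRC32 from Python's `zlib` module is used here purely as an assumption; the function name `trie_position` and the UTF-8 encoding are likewise hypothetical. A stable hash is chosen because Python's built-in `hash()` for strings varies between processes, which would send the same term to different trees at index time and at query time.

```python
import zlib


def trie_position(term: str, num_tries: int) -> int:
    """Map a term to the position number of one of num_tries trees.

    Assumption: CRC32 stands in for the unspecified hash algorithm.
    """
    if num_tries <= 0:
        raise ValueError("num_tries must be a positive integer")
    hash_value = zlib.crc32(term.encode("utf-8"))  # stable across runs
    return hash_value % num_tries                  # position in [0, num_tries)


# The same function would be used when inserting a term and when querying it,
# so that both operations reach the same tree.
position = trie_position("double-array trie", 8)
print(position)
```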

Abstract

The invention provides a method for term matching based on a double-array lexicographic tree. The method comprises two steps. The first is index creation, in which the location of the double-array lexicographic tree is generated, the ID of a secondary index in a memory cache system is calculated, and an index is created for the items. The second is term querying and matching through the index, in which the location of the double-array lexicographic tree is generated, word segmentation is performed, and term matching is performed on the basis of the index. The method meets multiple query requirements of term matching and also improves overall matching performance.

Description

Technical field

[0001] The invention belongs to the field of computer communication, and in particular relates to a method for term matching based on a double-array trie.

Background

[0002] At present, computer-assisted translation is an important means of improving the consistency and efficiency of translation. It requires the software to continuously memorize the latest terms and corpus entries, and to promptly retrieve the terms or corpus entries that meet the relevant conditions so that they can be selected during subsequent translation. As the terminology and corpus keep growing, retrieving complete translation information directly from the source text or the translation using a traditional relational database, or a newer non-relational database, becomes significantly less efficient. For relatively large manuscripts to be translated, the resulting speed is unacceptable. Taking a MongoDB database as an example, each document record contains document ID, original text...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/28, G06F17/27, G06F17/30
CPC: G06F16/334, G06F40/289, G06F40/58
Inventor: 冯泽康
Owner: IOL WUHAN INFORMATION TECH CO LTD