Method and system for realizing inverted index based on character string segmentation on SQL On HBase

Inverted-index and string-segmentation technology, applied in the field of SQL On HBase databases, addresses the time cost of full table scans and improves query efficiency by narrowing the scope of each query.

Pending Publication Date: 2020-09-04
贵州易鲸捷信息技术有限公司
7 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

In this process, even if the user-defined function and the Solr API respond quickly, a full table scan of the source HBase table cannot be avoided. If the original table holds many records, that full table scan is very time-consuming.

Examples

Embodiment Construction

[0046] The technical solutions in the embodiments of the present invention are clearly and completely described below in conjunction with the drawings in the embodiments of the present invention.

[0047] The embodiment of the present invention discloses a method and system for implementing an inverted index based on character string segmentation on SQL On HBase. The index narrows the scope of a query, and query efficiency improves because the exact query on the base table then runs against the reduced number of records. The method does not rely on third-party components and is based entirely on the database's own architecture. It provides an inverted index of the kind previously available only on traditional relational databases, and it supports front fuzzy, middle fuzzy, rear fuzzy, and front-and-rear fuzzy queries, which greatly improves the efficiency of fuzzy queries and also improves support for unstructured data.
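The two-phase idea in [0047] (narrow the candidate records through the index, then query the base table exactly) can be illustrated with a minimal in-memory sketch. It is not the patented implementation: the segmentation scheme is not given in the available text, so overlapping two-character segments are assumed here, and the names `segments`, `build_index`, and `fuzzy_query` are hypothetical. Plain Python dictionaries stand in for the HBase base table and index table.

```python
from collections import defaultdict

N = 2  # assumed segment length (bigrams); the source does not specify one


def segments(text, n=N):
    """Split text into the set of its overlapping n-character segments."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def build_index(base_table):
    """Map each segment to the set of row keys whose value contains it."""
    index = defaultdict(set)
    for row_key, value in base_table.items():
        for seg in segments(value):
            index[seg].add(row_key)
    return index


def fuzzy_query(base_table, index, needle):
    """Emulate LIKE '%needle%': narrow via the index, then verify on the base table."""
    segs = segments(needle)
    if not segs:
        # Needle is shorter than one segment, so the index cannot narrow
        # anything; fall back to scanning every row.
        candidates = set(base_table)
    else:
        # A matching row must contain every segment of the needle.
        candidates = set.intersection(*(index.get(s, set()) for s in segs))
    # Exact check on the reduced candidate set removes false positives.
    return sorted(k for k in candidates if needle in base_table[k])


base = {"r1": "xxabcxx", "r2": "abxbc", "r3": "hello"}
idx = build_index(base)
print(fuzzy_query(base, idx, "abc"))  # ['r1']
```

In this example the index yields two candidates, `r1` and `r2`, because both contain the segments `ab` and `bc`; the exact check on the base table then discards `r2`. Only two of three rows are read instead of the whole table, which is the narrowing effect the paragraph describes.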

[0048] The invention solves the problem of creating an inverted index on the st...

Abstract

The invention discloses a method and a system for realizing an inverted index based on character string segmentation on SQL On HBase. The method comprises the following specific steps: generating an index structure, optimizing the index syntax, optimizing the database, inserting records, updating records, disabling the index, deleting data, generating new index data, and querying the data table according to the index data to obtain the corresponding query result. The method narrows the query range and improves query efficiency by running the exact query on the base table against the reduced number of records. Compared with the prior art, the method does not depend on third-party components and is based entirely on the architecture of the database. It realizes the inverted indexing and the support for front fuzzy, middle fuzzy, rear fuzzy, and front-and-rear fuzzy queries found on traditional relational databases, greatly improves the efficiency of fuzzy queries, and improves support for unstructured data.
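The record-level steps listed in the abstract (inserting, updating, and deleting records while keeping index data current) can be sketched as follows. This is an illustration under stated assumptions, not the patent's design: the class `IndexedTable`, its method names, and the two-character segmentation are all hypothetical, and in-memory dictionaries stand in for the HBase base table and index table. The syntax optimization, database optimization, and index-disabling steps are not covered.

```python
from collections import defaultdict


def segments(text, n=2):
    """Overlapping n-character segments of text (n=2 is an assumption)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


class IndexedTable:
    """Base table plus a segment-to-row-keys inverted index kept in sync."""

    def __init__(self):
        self.rows = {}                 # stands in for the base table
        self.index = defaultdict(set)  # stands in for the index table

    def insert(self, key, value):
        # Drop stale postings first if the key already exists.
        if key in self.rows:
            self.delete(key)
        self.rows[key] = value
        for seg in segments(value):
            self.index[seg].add(key)

    def delete(self, key):
        value = self.rows.pop(key)
        for seg in segments(value):
            self.index[seg].discard(key)
            if not self.index[seg]:
                del self.index[seg]  # remove empty posting lists

    def update(self, key, new_value):
        # An update is modeled as delete-then-insert, so the old
        # segments are removed before the new ones are generated.
        self.delete(key)
        self.insert(key, new_value)


table = IndexedTable()
table.insert("r1", "abc")
table.update("r1", "xyz")
print(sorted(table.index))  # ['xy', 'yz']
```

Modeling an update as a delete followed by an insert keeps the sketch short; a real system would more likely compute only the segments that changed, to avoid rewriting unaffected index entries.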

Description

Technical field

[0001] The present invention relates to the technical field of SQL On HBase databases, and more specifically to a method and system for realizing an inverted index based on character string segmentation on SQL On HBase.

Background

[0002] The SQL On HBase database itself can store structured and unstructured data, but it does not support fuzzy queries, especially front-and-rear fuzzy queries (for example, LIKE '%abc%'). The full-text search engines Solr and Elasticsearch support many data formats, can handle large amounts of data, and are very efficient, but they cannot integrate directly with the SQL engine of the database. In addition, the data in the SQL On HBase database is stored on HBase. To integrate the database with a full-text search engine, for example SQL On HBase + Solr, the user can use a user-defined function to call the Solr interface to retrieve the data transmitted from HBase. The up...

Application Information

IPC(8): G06F16/22, G06F16/2458, G06F16/28
CPC: G06F16/2228, G06F16/2468, G06F16/284, Y02D10/00
Inventor 杨永锋
Owner 贵州易鲸捷信息技术有限公司