Chinese full-text searching method based on database

A database-based search method, applied in the fields of electronic digital data processing, special-purpose data processing applications, and natural language data processing. It addresses the problem that existing tools either do not support Chinese full-text search or provide only word segmentation without a retrieval function, and it achieves fast, efficient full-text search.

Status: Inactive. Publication date: 2018-04-06
HANGZHOU ANHENG INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

Existing mainstream databases, such as MySQL and PostgreSQL, do not support Chinese full-text search. SCWS, one of the most convenient free and open-source Chinese word segmentation plug-ins, can split a whole passage of Chinese text into words, but the plug-in provides word segmentation only and has no search function of its own.

Embodiment Construction

[0022] The present invention will be described in further detail below in conjunction with the examples, but the protection scope of the present invention is not limited thereto.

[0023] The invention relates to a database-based Chinese full-text search method. Implementing the invention involves several software function modules. After carefully reading the application documents and accurately understanding the principle and purpose of the invention, a person skilled in the art can fully implement it by combining existing known technology with ordinary software programming skills. The software function modules include, but are not limited to, Chinese word segmentation plug-ins, interpreters, GIN indexes, and triggers. All modules mentioned in the application documents of the present invention fall into this category, and the applicant will not lis...

Abstract

The invention relates to a database-based Chinese full-text search method. A Chinese word segmentation module is integrated into the database and an interpreter is generated. The Chinese data to be searched is segmented into words, and both the segmented data and the original, unsegmented data are stored in the database with an association established between them. An index is built on the field that holds the segmented data, and searches run against that field. The association between the segmented data and the original data is then used to fetch the original data, which forms the full-text search result. The method performs fast, efficient full-text search over large amounts of data. In testing on a data set of 10 million entries, a full-text search using LIKE with wildcard characters took 20166.568 milliseconds, while the same search with this method took only 0.651 milliseconds, a speedup of roughly 31,000 times. The method fills the gap in Chinese full-text search in current mainstream databases, including MySQL and PostgreSQL.
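
The steps in the abstract (segment, store the segmented form next to the original, index the segmented form, search it, then follow the association back to the original text) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: SQLite stands in for the database, a plain B-tree index on a term table stands in for a GIN index, a toy forward-maximum-matching segmenter with a hypothetical `VOCAB` dictionary stands in for SCWS, and the table and function names (`docs`, `terms`, `segment`, `build`, `search`) are invented.

```python
import sqlite3

# Hypothetical toy dictionary; a real deployment would rely on a segmenter such as SCWS.
VOCAB = {"数据库", "全文", "检索", "中文", "分词", "方法"}
MAX_WORD_LEN = max(len(word) for word in VOCAB)


def segment(text):
    """Forward maximum matching: take the longest dictionary word at each position,
    falling back to a single character when nothing matches."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in VOCAB:
                words.append(piece)
                i += size
                break
    return words


def build(conn, documents):
    """Store each original text, store its segmented terms linked by doc_id, then index the terms."""
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("CREATE TABLE terms (term TEXT, doc_id INTEGER REFERENCES docs(id))")
    for body in documents:
        doc_id = conn.execute("INSERT INTO docs (body) VALUES (?)", (body,)).lastrowid
        conn.executemany(
            "INSERT INTO terms (term, doc_id) VALUES (?, ?)",
            [(term, doc_id) for term in set(segment(body))],
        )
    # The index on the segmented field is what lets a search avoid scanning every row.
    conn.execute("CREATE INDEX terms_by_term ON terms (term)")


def search(conn, query):
    """Return the original texts that contain every term of the segmented query."""
    terms = sorted(set(segment(query)))
    if not terms:
        return []
    placeholders = ",".join("?" * len(terms))
    rows = conn.execute(
        "SELECT d.body FROM docs d JOIN terms t ON t.doc_id = d.id "
        f"WHERE t.term IN ({placeholders}) "
        "GROUP BY d.id HAVING COUNT(DISTINCT t.term) = ? ORDER BY d.id",
        (*terms, len(terms)),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build(conn, ["基于数据库的中文全文检索方法", "中文分词插件", "数据库索引"])
    print(search(conn, "中文检索"))
```

The contrast with `LIKE '%…%'` is that a wildcard match has to examine every stored string, whereas the lookup above touches only the index entries for the query terms and then the matching rows.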

Description

Technical field

[0001] The present invention relates to the technical field of data recognition, data representation, record carriers, and record carrier handling, and in particular to a database-based Chinese full-text search method.

Background

[0002] With the continuous and rapid development of Internet technology, human society has entered an unprecedented information age. Data has penetrated every industry and business function and has become an important factor of production. The era of big data has arrived. In this era, the data that people hold is growing explosively, and the form of that data is also changing fundamentally. Methods for storing and analyzing big data have become the key to processing it, and research on how to handle large-scale data has become essential to letting people obtain useful information quickly.

[0003] Full-text retrieval is a very i...

Application Information

IPC(8): G06F17/30, G06F17/27
CPC: G06F16/313, G06F16/3335, G06F40/284, G06F40/289
Inventor: 徐顺格, 范渊
Owner: HANGZHOU ANHENG INFORMATION TECH CO LTD