Data query method and device, electronic equipment and storage medium

A data query method and device, applied to electronic equipment and storage media, for real-time data queries over large-scale data volumes. It addresses the high resource consumption of large-scale data statistics and data deduplication, and thereby increases the value and significance of query results.

Pending Publication Date: 2021-08-24

西安交大捷普网络科技有限公司

Problems solved by technology

[0003] Elasticsearch (ES for short) is a distributed full-text search engine built on Lucene. By improving its data storage mechanism and filtering performance, it achieves fast queries to a certain extent. It nonetheless has obvious shortcomings: with large-scale data volumes, searching, filtering, and aggregating data for different business needs consumes more resources. Therefore, to ensure normal business operation, the whole aggregation analysis needs to be optimized to provide a better query service.

Examples

Embodiment 1

[0032] As shown in Figure 1, a real-time data query method comprises:

[0033] Obtaining a query request for real-time data and determining the size of the query's target data;

[0034] If the amount of target data pointed to by the query request is less than the first threshold, obtaining the query content from ElasticSearch (hereinafter ES, for convenience of description) or ClickHouse;

[0035] If the amount of target data pointed to by the query request is greater than the first threshold, having ClickHouse deduplicate and count the total data, then taking the resulting data and feeding them one by one into ES for filtering and sub-aggregation, and finally summarizing the aggregation results and returning them.
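The two branches in [0034] and [0035] can be sketched as a small routing function. This is only an illustration under stated assumptions: both engines are replaced by in-memory stand-ins, and `FIRST_THRESHOLD`, `choose_path`, and `large_query` are hypothetical names. The source gives no threshold value and does not say what happens when the amount equals the threshold; the sketch sends that case down the large-data branch.

```python
from collections import Counter

# Hypothetical value; the source does not state the first threshold.
FIRST_THRESHOLD = 1_000_000


def choose_path(target_rows, threshold=FIRST_THRESHOLD):
    """Return which branch of the method handles a query of this size."""
    if target_rows < threshold:
        # [0034]: small targets are answered directly by ES or ClickHouse.
        return "direct"
    # [0035]: large targets go through ClickHouse first, then ES.
    # Equality is not covered by the source; this sketch treats it as large.
    return "clickhouse_then_es"


def large_query(rows, key, sub_key):
    """In-memory stand-in for the large-data branch.

    Step 1 plays ClickHouse's role: deduplicate `key` and count rows per value.
    Step 2 plays ES's role: for each distinct value, filter the matching rows
    and sub-aggregate them on `sub_key`. The per-value results are merged.
    """
    totals = Counter(row[key] for row in rows)  # dedup + count
    result = {}
    for value, total in totals.items():  # fed in one by one
        matching = [row for row in rows if row[key] == value]  # filter
        sub_counts = Counter(row[sub_key] for row in matching)  # sub-aggregation
        result[value] = {"total": total, "by_" + sub_key: dict(sub_counts)}
    return result
```

In a real deployment the first step would be a ClickHouse query and the second a filtered ES aggregation request; the stand-ins only show how the two results are combined.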

[0036] Nested aggregation aggregates data on multiple fields in sequence. For example, the "gender" field is aggregated first, and then the "age" field is aggregated within each gender bucket (sub-aggregation); that is, one aggregation is nested within another aggregati...
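The "gender" then "age" example can be written as an ES request body in which one `terms` aggregation holds another under its own `aggs` key. The sketch below builds that body and, for comparison, computes the same buckets in memory. The helper names and the sample records are illustrative assumptions, not taken from the patent.

```python
from collections import defaultdict


def nested_terms_query(outer_field, inner_field):
    """Build an ES request body with one terms aggregation nested in another."""
    return {
        "size": 0,  # return only aggregation buckets, not matching documents
        "aggs": {
            "by_" + outer_field: {
                "terms": {"field": outer_field},
                "aggs": {
                    "by_" + inner_field: {"terms": {"field": inner_field}},
                },
            }
        },
    }


def nested_counts(records, outer_field, inner_field):
    """In-memory equivalent: count per outer value, then per inner value."""
    buckets = defaultdict(lambda: defaultdict(int))
    for rec in records:
        buckets[rec[outer_field]][rec[inner_field]] += 1
    return {outer: dict(inner) for outer, inner in buckets.items()}


# Made-up sample data for illustration only.
people = [
    {"gender": "F", "age": 30},
    {"gender": "F", "age": 30},
    {"gender": "M", "age": 41},
]
body = nested_terms_query("gender", "age")
counts = nested_counts(people, "gender", "age")
```

Here `counts` groups the two records with gender "F" and age 30 into one inner bucket, mirroring the bucket tree that ES would return for `body`.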

Embodiment 2

[0044] As shown in Figure 2, before the query request described in Embodiment 1 is obtained, the target data collected in real time is split by data type and stored in different Kafka topics. A topic is the basic unit of Kafka write operations. Producers (such as various network security devices) publish data (such as security event logs) to a chosen topic, and each record published to a topic is delivered to one consumer instance within each subscribing consumer group; the consumer instances may be distributed across multiple processes or multiple machines. ClickHouse and ES, as the data consumers in this embodiment, consume data from the same topic through the Flink stream processing engine and store it separately. ClickHouse stores only the fields that participate in aggregation analysis.

[0045] Kafka is a distributed, partitioned, multi-replica messaging system. Its biggest feature is that it can proces...
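The ingestion step in [0044] can be sketched without a Kafka or Flink client: records are grouped into topics by data type, and each record is then shaped differently for the two stores. Everything named here is an assumption made for illustration, including the `logs.<type>` topic naming, the `AGG_FIELDS` list, and the choice to give ES the full record (the source only states that ClickHouse keeps the aggregation fields).

```python
from collections import defaultdict

# Hypothetical list of fields that take part in aggregation analysis.
AGG_FIELDS = ("src_ip", "event_type", "severity")


def topic_for(record):
    """Map a record to a topic name by its data type (naming scheme assumed)."""
    return "logs." + record["data_type"]


def split_by_topic(records):
    """Group collected records into per-topic lists, keyed by topic name."""
    topics = defaultdict(list)
    for rec in records:
        topics[topic_for(rec)].append(rec)
    return dict(topics)


def for_es(record):
    """Assumed here: ES receives a copy of the whole record."""
    return dict(record)


def for_clickhouse(record, fields=AGG_FIELDS):
    """ClickHouse receives only the aggregation fields present in the record."""
    return {name: record[name] for name in fields if name in record}


# Made-up security event logs for illustration only.
events = [
    {"data_type": "ids", "src_ip": "10.0.0.1", "event_type": "scan",
     "severity": 3, "raw": "..."},
    {"data_type": "fw", "src_ip": "10.0.0.2", "event_type": "deny",
     "severity": 1, "raw": "..."},
]
topics = split_by_topic(events)
ch_row = for_clickhouse(events[0])
```

Keeping the ClickHouse row narrow is what lets it handle deduplication and counting cheaply, while ES retains the fields needed for filtering and nested aggregation.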

Embodiment 3

[0055] As shown in Figure 3, a data query device is provided, comprising:

[0056] A query receiving module, configured to obtain the query request initiated for real-time data and parse it to obtain the aggregation analysis dimensions;

[0057] A query judging module, configured to judge whether the amount of target data pointed to by the query request is greater than a preset first threshold;

[0058] A query processing module, configured to initiate the corresponding data aggregation analysis according to the amount of target data pointed to by the query request, and to return the aggregation result.

[0059] Preferably, the query processing module is configured to:

[0060] If the amount of target data pointed to by the query request is less than the first threshold, obtain the query content from ElasticSearch or ClickHouse;

[0061] If the amount of target data pointed to by the query request is greater than the first threshold, ClickHouse deduplicates and counts the total amou...
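One way to picture the three modules of Embodiment 3 is as three small classes wired together by a device object. This is a sketch under assumptions, not the patented implementation: the request format (`group_by`, `target_rows`), the class and method names, and the two handler callables are all hypothetical, and the source does not say how the target data amount is determined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParsedQuery:
    dimensions: tuple  # aggregation analysis dimensions, outermost first
    target_rows: int   # amount of target data (assumed to arrive with the request)


class QueryReceivingModule:
    def receive(self, request):
        """Parse a raw request dict into its aggregation dimensions and size."""
        return ParsedQuery(tuple(request["group_by"]), int(request["target_rows"]))


class QueryJudgingModule:
    def __init__(self, first_threshold):
        self.first_threshold = first_threshold

    def exceeds(self, query):
        """True when the target data amount is greater than the first threshold."""
        return query.target_rows > self.first_threshold


class QueryProcessingModule:
    def __init__(self, small_handler, large_handler):
        self.small_handler = small_handler  # direct ES or ClickHouse lookup
        self.large_handler = large_handler  # ClickHouse dedup/count, then ES

    def process(self, query, is_large):
        handler = self.large_handler if is_large else self.small_handler
        return handler(query)


class DataQueryDevice:
    def __init__(self, first_threshold, small_handler, large_handler):
        self.receiver = QueryReceivingModule()
        self.judge = QueryJudgingModule(first_threshold)
        self.processor = QueryProcessingModule(small_handler, large_handler)

    def handle(self, request):
        query = self.receiver.receive(request)
        return self.processor.process(query, self.judge.exceeds(query))


# Placeholder handlers that only report which path was taken.
device = DataQueryDevice(
    first_threshold=100,
    small_handler=lambda q: ("direct", q.dimensions),
    large_handler=lambda q: ("clickhouse_then_es", q.dimensions),
)
small = device.handle({"group_by": ["gender", "age"], "target_rows": 10})
large = device.handle({"group_by": ["gender"], "target_rows": 5000})
```

Passing the handlers in as callables keeps the judging logic independent of how each engine is actually queried.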

Abstract

The invention discloses a real-time data query method, device, equipment, and storage medium. ES and ClickHouse simultaneously consume data from the same source and store it separately, and different engines respond depending on the volume of target data that a query request points to. This overcomes the obvious disadvantages of ES in data deduplication and counting while making full use of its flexibility in nested aggregation. Very large data sets can be aggregated and analyzed quickly and the result returned in near real time, which improves the value and significance of the query results.

Description

Technical field

[0001] The invention belongs to the technical field of data analysis, and in particular relates to a real-time data query method, device, electronic equipment, and storage medium for large-scale data volumes.

Background

[0002] With the advent of the big data era, traditional data analysis methods face great challenges, on the one hand because of the explosive growth in data volume and on the other because of the growing variety of data types. Efficient request response is crucial to delivering big data services effectively. To process certain queries and data mining applications quickly, the database needs to perform statistical analysis on some data fields along various dimensions or combinations of dimensions, such as computing sums, counts, maximum values, minimum values, or other custom statistical functions over grouped data, and aggregating them to obtain specific overviews of som...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06F16/2455, G06F16/215, G06F16/2458

CPC: G06F16/215, G06F16/2455, G06F16/2462

Inventors: 李福宜, 赵彦林, 李周, 王平, 陈宏伟, 何建锋

Owner: 西安交大捷普网络科技有限公司
