A data query processing method

A data query processing method in the field of data query statistics. It addresses the long processing time of large queries and the system resources they consume, and it saves computing resources while sparing users a long wait.

Active Publication Date: 2017-02-08
中科曙光国际信息产业有限公司

AI Technical Summary

Problems solved by technology

For example, to estimate the per capita income of an area, one can average the incomes of a sample of its residents. The estimate deviates somewhat from the result computed over the whole population, but it still has reference value.
[0006] 3) When data processing takes a long time and the user quits the calculation halfway, all the work already done becomes wasted computation, which consumes system resources heavily when processing large data sets.

Examples

Embodiment 1

[0069] The scenario is to compute statistics over the consumption data of all customers of a large e-commerce website. The website has more than ten million customer records and one billion consumption records. When statistics such as the total consumption of customers within a given period are needed, partial statistics can be computed first and the results revised continuously.

[0070] First, the consumption records are split into data subsets according to data volume. After the first data subset has been processed, the total consumption of the customers counted so far (that is, those in the first data subset) is available; after the second data subset has been processed, the total consumption of the customers in the first two data subsets is available. Each data subset is processed in turn and the statistical results are revised continuously until all data has been processed, and the final statist...
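The subset-by-subset revision described in this embodiment can be sketched as follows. This is a minimal illustration, not the patented implementation: the record layout `(customer_id, amount)`, the function names `split_into_subsets` and `running_totals`, and the sample data are all assumptions introduced here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical record layout: (customer_id, amount spent).
Record = Tuple[str, float]


def split_into_subsets(records: List[Record], subset_size: int) -> Iterator[List[Record]]:
    """Split the records into consecutive subsets of at most subset_size items."""
    for start in range(0, len(records), subset_size):
        yield records[start:start + subset_size]


def running_totals(subsets: Iterable[List[Record]]) -> Iterator[Dict[str, float]]:
    """Yield a snapshot of per-customer totals after each subset is processed."""
    totals: Dict[str, float] = defaultdict(float)
    for subset in subsets:
        for customer_id, amount in subset:
            totals[customer_id] += amount
        # Each snapshot revises the previous one with the newest subset.
        yield dict(totals)


# Illustrative data only; a real data set would be far larger.
records = [("alice", 30.0), ("bob", 12.5), ("alice", 7.5), ("carol", 20.0), ("bob", 2.5)]
snapshots = list(running_totals(split_into_subsets(records, subset_size=2)))
```

Each element of `snapshots` is a usable partial result, and the last one equals the totals over the full record list.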

Embodiment 2

[0073] The scenario is to compute the average monthly call time of the users of a mobile communication operator in a given year. First, all of the operator's call records for that year are divided into 12 data subsets, one per month. The query proceeds as follows. After the system queries the call records for January (the first data subset), suppose the average user call time in January, $R_1$, is 300 minutes; then the approximate monthly average call time is $T_1 = R_1 = 300$ minutes, and the query progress is $1/12 \times 100\% = 8.3\%$. The system continues: after it queries the call records for February (the second data subset), suppose the average user call time in February, $R_2$, is 460 minutes; then the approximate monthly average call time is $T_2 = T_1 + \Delta_2 = T_1 + (R_2 - T_1)/2 = 300 + (460 - 300)/2 = 380$ minutes, and the query progress is $2/12 \times 100\% = 16.7\%$. After it queries the call records for March (the third data subset), assuming that the average call duration R3...
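The correction step in this embodiment can be sketched in a few lines. The source shows the update explicitly only for the second subset; the sketch assumes it generalizes to $T_k = T_{k-1} + (R_k - T_{k-1})/k$, which weights every month equally. The function name `progressive_average` and its parameters are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def progressive_average(
    subset_averages: Iterable[float], total_subsets: int
) -> Iterator[Tuple[float, float]]:
    """Yield (approximate average T_k, progress in percent) after each subset.

    Assumes the update T_k = T_(k-1) + (R_k - T_(k-1)) / k, a generalization
    of the k = 2 step given in the text.
    """
    estimate = 0.0
    for k, r_k in enumerate(subset_averages, start=1):
        estimate += (r_k - estimate) / k  # correct the previous approximation
        yield estimate, k / total_subsets * 100


# The two monthly averages stated in the text: January and February.
steps = list(progressive_average([300.0, 460.0], total_subsets=12))
for estimate, progress in steps:
    print(f"T = {estimate:.0f} minutes, progress = {progress:.1f}%")
```

With the January and February figures from the text, the two steps reproduce the 300-minute and 380-minute approximations and the 8.3% and 16.7% progress values.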

Abstract

The invention discloses a data query processing method. The method comprises: dividing an original data set into a plurality of data subsets; querying the first data subset and taking its query result as an approximate result; querying each remaining data subset in turn and using its query result to correct the previous approximate result; and, once all data subsets have been queried in this way, obtaining the final query result. With this technical scheme, users can suspend the query at any time and obtain a reasonably accurate approximate value before all the data has been processed, which avoids long waits and also saves computing resources to a certain extent.
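The general flow in the abstract, including the ability to suspend at any time, could be sketched with a Python generator. This is an illustration under stated assumptions: `progressive_query`, the `query` and `correct` callables, and the sample subsets are hypothetical names and data, and a running sum stands in for whatever statistic is being queried.

```python
from typing import Callable, Iterator, List, Optional, Tuple, TypeVar

S = TypeVar("S")  # type of one data subset
R = TypeVar("R")  # type of a query result


def progressive_query(
    subsets: List[S],
    query: Callable[[S], R],
    correct: Callable[[R, R, int], R],
) -> Iterator[Tuple[R, float]]:
    """Yield (approximate result, fraction of subsets processed) per subset."""
    approx: Optional[R] = None
    for k, subset in enumerate(subsets, start=1):
        partial = query(subset)
        # The first subset's result is the initial approximation; later
        # results correct the previous approximation.
        approx = partial if k == 1 else correct(approx, partial, k)
        yield approx, k / len(subsets)


subsets = [[1, 2, 3], [4, 5], [6]]


def add(prev: int, part: int, k: int) -> int:
    """Correction rule for a running sum: add the new partial sum."""
    return prev + part


# Simulate a user who suspends the query once half the subsets are done.
results = []
for approx, progress in progressive_query(subsets, sum, add):
    results.append((approx, progress))
    if progress >= 0.5:
        break

# Running to completion gives the final query result.
final = list(progressive_query(subsets, sum, add))[-1][0]
```

Because a generator does no work until the next value is requested, stopping the loop early means the remaining subsets are never queried, and the last yielded approximation remains available.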

Description

Technical field

[0001] The invention relates to the technical field of data query statistics.

Background

[0002] With the continuous improvement of data acquisition technology and data processing requirements, today's society has entered the era of information explosion, which the industry calls the "big data" era. Big data has the following 4V characteristics: huge data volume (Volume), diverse data types (Variety), high processing speed requirements (Velocity), and huge value (Value). Depending on the type of data, the existing technologies mainly include parallel database processing technology for massive structured data, and Hadoop / MapReduce processing technology for massive unstructured data. What these technologies have in common is that they start multiple parallel processes / threads on multiple servers and perform data reads, writes, and computation at the same time, so as to speed up data processing. The...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
Inventor: 宋怀明, 苗艳超, 刘新春, 邵宗有
Owner: 中科曙光国际信息产业有限公司