A database health scoring method and scoring system based on machine learning

Machine learning and database technology, applied to database design and maintenance, electronic digital data processing, and structured data retrieval. It addresses problems such as the heavy workload of manual analysis, the complex relationships between indicators, and the difficulty of finding patterns by hand.

Active Publication Date: 2019-01-25
INFORMATION & COMM BRANCH OF STATE GRID JIANGSU ELECTRIC POWER +2

AI Technical Summary

Problems solved by technology

However, the experience of expert DBAs alone is not enough to handle the range of problems that arise in database operation and maintenance.
A database exposes a very large number of monitoring indicators, so manual analysis is too costly. The correlations between indicators are complicated, and patterns are hard to find by hand. Manual work can detect that a problem exists but has difficulty locating its cause. Monitoring metrics differ between database products, and the relationships grow more complex as the system grows.
In summary, the expert model has three main shortcomings: 1. Experts select indicators based on experience, which leaves a large number of indicators unselected, and experts cannot say whether these unselected indicators matter to database health. Even for the selected indicators, the analysis workload is too large to complete manually. 2. The expert model cannot analyze the correlations between indicators; each indicator is treated in isolation. 3. The expert model cannot predict the health score for a future period; it can only compute the current score from the currently obtained indicators.



Examples


Embodiment 1

[0027] A machine-learning-based method for scoring database health, as shown in Figure 1, comprises the following steps:

[0028] Step 1. Collect database monitoring indicators and obtain health scores from the expert model; the collected raw data and the scores together form the sample set.

[0029] In this embodiment, 250 database operation indicators are collected as monitoring indicators, including database connection status, CPU usage, memory usage, disk reads and writes, cache size, latency, and response time. The health score ranges from 0 to 100. The scores produced by the expert model serve as the manually labeled data of the sample set.
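The Background section describes the expert model as scoring each selected indicator against a manually set threshold and summing the results. A minimal Python sketch of that labeling step follows. The indicator names, thresholds, and point values in `RULES` are hypothetical, since the source does not list the real ones, and the "at or below threshold is healthy" rule is also an assumption.

```python
# Hypothetical rules: indicator name -> (threshold, points awarded when healthy).
# The points are chosen to sum to 100 so the score stays in the 0-100 range.
RULES = {
    "cpu_usage": (0.80, 40),
    "memory_usage": (0.85, 30),
    "response_time_ms": (200.0, 30),
}

def expert_score(indicators):
    """Sum per-indicator points; an indicator at or below its threshold earns full points."""
    total = 0
    for name, (threshold, points) in RULES.items():
        if indicators[name] <= threshold:
            total += points
    return total

# One collected sample and the label the expert model would attach to it.
sample = {"cpu_usage": 0.55, "memory_usage": 0.90, "response_time_ms": 120.0}
label = expert_score(sample)
```

Each (sample, label) pair produced this way would become one entry of the sample set used in the later steps.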

[0030] Step 2. Preprocess the data in the sample set (denoising, normalization, and so on), and divide it into training data, validation data, and test data.
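The normalization and splitting part of Step 2 could look like the following sketch. The source does not say which normalization or which split ratio is used, so min-max scaling and a 70/15/15 split are assumptions, as are the function names.

```python
import random

def min_max_normalize(rows):
    """Scale every column to [0, 1]; a constant column maps to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

def split_samples(samples, train_pct=70, validation_pct=15, seed=0):
    """Shuffle a copy of the samples and cut it into train/validation/test parts."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * validation_pct // 100
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

rows = [[10.0, 200.0], [20.0, 400.0], [30.0, 300.0]]
normalized = min_max_normalize(rows)
train_set, validation_set, test_set = split_samples(list(range(20)))
```

The sketch normalizes the whole sample set before splitting, following the order in which the step lists the operations. A common alternative is to compute the scaling bounds on the training data only and reuse them for the other two parts.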

[0031] Denoising the sample set includes removing outliers and missing values, and removing indicators that take only a single value. Among the va...
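The three denoising operations named above can be sketched in Python as follows. The visible text does not say how an outlier is defined, so the z-score rule and the `z_limit` parameter are assumptions. Missing values are represented as `None` for illustration.

```python
from statistics import mean, pstdev

def denoise(rows, z_limit=3.0):
    """Return (cleaned_rows, kept_column_indices)."""
    # 1. Drop samples that contain a missing value.
    complete = [r for r in rows if all(v is not None for v in r)]
    # 2. Drop indicators (columns) that only ever take a single value.
    columns = list(zip(*complete))
    kept = [i for i, col in enumerate(columns) if len(set(col)) > 1]
    reduced = [[r[i] for i in kept] for r in complete]
    # 3. Drop samples with a value more than z_limit standard deviations
    #    from its column mean (assumed outlier rule).
    stats = [(mean(col), pstdev(col)) for col in zip(*reduced)]
    cleaned = [r for r in reduced
               if all(abs(v - m) <= z_limit * s for v, (m, s) in zip(r, stats))]
    return cleaned, kept

rows = [
    [0.50, 1, 10.0],
    [0.52, 1, 11.0],
    [None, 1, 12.0],   # missing value
    [0.48, 1, 10.5],
    [0.51, 1, 11.5],
    [0.49, 1, 500.0],  # outlier in the last column
    [0.50, 1, 10.0],
]
cleaned, kept = denoise(rows, z_limit=2.0)
```

In this toy input the middle column is constant and is dropped, the row containing `None` is removed, and the row containing 500.0 is removed as an outlier.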

Embodiment 2

[0046] This embodiment differs from Embodiment 1 in that step (3) uses the random forest algorithm to build a random forest regression prediction model. The model comprises p decision trees, and the depth q of the decision trees is determined as follows:

[0047] Set an upper limit Q on the decision tree depth. Let t run from 1 to Q, training the model once for each value of t (Q training runs in total) and computing the loss function value of each run. The value of t with the smallest loss among the Q runs is taken as the decision tree depth q.

[0048] In this embodiment, the random forest regression prediction model comprises 100 decision trees, and the upper limit of the decision tree depth is set to 10. After 10 training runs, the optimal decision tree depth is found to be 3.
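The depth search in [0047] amounts to a one-dimensional grid search, sketched below in Python. The names `select_tree_depth` and `train_and_loss` are hypothetical. In a real setup `train_and_loss(t)` would fit a random forest regressor with 100 trees limited to depth t (for example scikit-learn's `RandomForestRegressor(n_estimators=100, max_depth=t)`) and return its loss; the source does not state which loss function or which data split is used, so the sketch substitutes a made-up loss curve.

```python
def select_tree_depth(max_depth, train_and_loss):
    """Train once per depth t = 1..max_depth; return the depth with the smallest loss."""
    losses = {t: train_and_loss(t) for t in range(1, max_depth + 1)}
    best = min(losses, key=losses.get)  # on ties, the shallowest depth wins
    return best, losses

def toy_train_and_loss(depth):
    """Stand-in for real training: a made-up loss curve with its minimum at depth 3."""
    return (depth - 3) ** 2 + 1.0

# Q = 10 as in this embodiment, so the model is "trained" 10 times.
best_depth, losses = select_tree_depth(10, toy_train_and_loss)
```

With the toy loss curve the search returns depth 3, which mirrors the result reported in this embodiment but is not a reproduction of it.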

[0049] Those skilled in the art should understand that the embodiments of the present application may be provided as methods, systems, or computer program products. Accordin...


Abstract

The invention discloses a machine-learning-based database health scoring method and scoring system. The scoring method comprises the following steps: 1. collect database monitoring indicators and obtain health scores from an expert model, using the collected raw data and scores as the sample set; 2. preprocess the data in the sample set (for example, denoising and normalization) and divide it into training data, validation data, and test data; 3. build a regression prediction model with a regression prediction algorithm, training the model parameters on the training data, tuning them on the validation data, and testing the model's performance on the test data; 4. read the database's monitoring indicators over a period of time, preprocess them, and feed them to the regression prediction model, whose output is the database's health score for the current or a future period. By building the regression prediction model, the method can analyze a large number of database monitoring indicators and produce current or future database health scores.

Description

Technical field

[0001] The invention belongs to the field of database operation and maintenance, and in particular relates to a method and system for scoring and predicting database health using artificial intelligence.

Background

[0002] At present, large-scale database systems are operated and maintained mainly by senior DBAs (database administrators). A DBA scores the overall health of a running database by reviewing its various indicators; this approach is called the "expert model". The expert model relies on experts with many years of database operation and maintenance experience, who manually select the indicators with the greatest impact on database health, score each indicator against manually set thresholds, and sum the scores to obtain the final health score. However, it is difficult to deal with various difficulties in database op...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/21, G06F11/36, G06N99/00
CPC: G06F11/3688
Inventor: 王会羽, 钱琳, 俞俊, 朱广新, 李凡
Owner: INFORMATION & COMM BRANCH OF STATE GRID JIANGSU ELECTRIC POWER