A Statistical Analysis-Based Model Validation Method for Small-Sample Data

A model validation method for small-sample data, applicable to special-purpose data processing, complex mathematical operations, and design optimization/simulation. It addresses the low accuracy of estimation results that arises when the distribution of regenerated samples deviates from the true distribution, with the effect of improving accuracy and extending the range.

Active Publication Date: 2022-07-08
HARBIN INST OF TECH

AI Technical Summary

Problems solved by technology

[0004] The purpose of the present invention is to solve the problem that, in the traditional Bootstrap method, the range of the regenerated samples is limited to the range of the original sample. Especially when the sample size is small, this can cause the distribution of the regenerated samples to deviate from the true distribution, lowering the accuracy of the estimation results and introducing a degree of risk. To address this, a small-sample data model validation method based on statistical analysis is proposed.
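
The range limitation described above can be seen in a minimal sketch. It assumes the plain Bootstrap (resampling with replacement); the function name `bootstrap_resamples` and the sample values are hypothetical and chosen only for illustration.

```python
import random


def bootstrap_resamples(sample, n_resamples, seed=0):
    """Draw bootstrap resamples (sampling with replacement) from `sample`."""
    rng = random.Random(seed)
    return [rng.choices(sample, k=len(sample)) for _ in range(n_resamples)]


# Hypothetical small reference sample (illustrative values only).
reference = [9.8, 10.4, 10.1, 9.6, 10.9]
resamples = bootstrap_resamples(reference, n_resamples=1000)

lo, hi = min(reference), max(reference)
# Every regenerated value is one of the original observations, so no
# resample can ever extend beyond the interval [lo, hi].
all_inside = all(lo <= x <= hi for r in resamples for x in r)
print(all_inside)
```

With only five observations, the tails of the underlying distribution are never represented in any resample, which is the deviation the paragraph refers to.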

Examples

Specific Embodiment 1

[0023] Embodiment 1: In this embodiment, the small-sample data model validation method based on statistical analysis proceeds as follows:

[0024] Step 1. Perform a normality test on the reference sample and the simulation sample. If both samples follow a normal distribution, proceed to Step 2. Otherwise, use a nonparametric test method to analyze the similarity of the cumulative probability distributions of the reference sample and the simulation sample;

[0025] The reference sample is experimental data from a real physical system, such as experimental data obtained from an aircraft system;

[0026] The simulation sample is experimental data obtained from the simulation model corresponding to the real physical system, such as experimental data from an aircraft simulation model;

[0027] The nonparametric test methods include the K-S test, the signed rank test, and the runs test;
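
As a rough illustration of the nonparametric branch of Step 1, the sketch below computes the two-sample K-S statistic, that is, the largest gap between the empirical cumulative distributions of the two samples. The function name `ks_two_sample_statistic` is hypothetical, and the decision step (comparing the statistic with a critical value) is deliberately left out because the source does not state which critical values it uses.

```python
from bisect import bisect_right


def ks_two_sample_statistic(reference, simulated):
    """Largest gap between the empirical CDFs of the two samples."""
    ref = sorted(reference)
    sim = sorted(simulated)
    n, m = len(ref), len(sim)
    d = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to compare them at every observation from either sample.
    for x in ref + sim:
        f_ref = bisect_right(ref, x) / n  # share of reference values <= x
        f_sim = bisect_right(sim, x) / m  # share of simulated values <= x
        d = max(d, abs(f_ref - f_sim))
    return d


# Illustrative values only: identical samples give 0, disjoint ones give 1.
d_same = ks_two_sample_statistic([1, 2, 3], [1, 2, 3])
d_apart = ks_two_sample_statistic([1, 2, 3], [4, 5, 6])
print(d_same, d_apart)
```

A small statistic indicates similar cumulative distributions; how small is small enough depends on the sample sizes and the chosen significance level.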

[0028] Step 2: Determine the reference sample size n, and select t...

Specific Embodiment 2

[0035] Embodiment 2: This embodiment differs from Embodiment 1 in that the normality test performed on the reference sample and the simulation sample in Step 1 proceeds as follows:

[0036] The normality test uses the W test, whose test statistic is:

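[0037] $$W = \frac{\left[\sum_{i=1}^{k} a_i \left(X_{(n+1-i)} - X_{(i)}\right)\right]^2}{\sum_{i=1}^{n} \left(X_{(i)} - \bar{X}\right)^2}$$

[0038] where $n$ is the sample size; $k = n/2$ when $n$ is even and $k = (n-1)/2$ when $n$ is odd;

[0039] $X_{(1)} \le X_{(2)} \le \dots \le X_{(n)}$ are the sample values sorted in ascending order, and $\bar{X}$ is the sample mean;

[0040] $a_i$ ($i = 1, \dots, k$) are the calculation coefficients (available from the table);

[0041] The rejection region of the W test is $W \le W_\alpha$,

[0042] where $W_\alpha$ is the $\alpha$ quantile (available from the table), and $\alpha$ is the significance level;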

[0043] An example of a normality test is given below:

[0044] For example, given 10 data values: 2.7, -1.2, -1.0, 0, 0.7, 2.0, 3.7, -0.6, 0.8, -0.3, use the W test to test whether the data obeys a normal distr...
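
The W statistic can be computed from the ten values listed above with a short sketch. It assumes the standard Shapiro-Wilk form of W (squared weighted sum of differences between symmetric order statistics, divided by the sum of squared deviations from the mean). The names `w_statistic`, `rejects_normality`, and `A_N10` are hypothetical, and the coefficients are quoted from the standard W-test table for n = 10 as an assumption to be checked against the table actually used. The source's own worked result is truncated, so the value printed here is not claimed to be the source's.

```python
def w_statistic(sample, coeffs):
    """W statistic from tabulated coefficients a_1..a_k (k = n // 2)."""
    x = sorted(sample)
    n = len(x)
    k = n // 2  # n/2 for even n, (n-1)/2 for odd n
    if len(coeffs) != k:
        raise ValueError("expected k = n // 2 coefficients")
    mean = sum(x) / n
    # Numerator: weighted sum of differences between symmetric order statistics.
    b = sum(coeffs[i] * (x[n - 1 - i] - x[i]) for i in range(k))
    # Denominator: sum of squared deviations from the sample mean.
    ss = sum((v - mean) ** 2 for v in x)
    return b * b / ss


def rejects_normality(w, w_alpha):
    """Rejection region of the W test: W <= W_alpha (W_alpha from a table)."""
    return w <= w_alpha


data = [2.7, -1.2, -1.0, 0, 0.7, 2.0, 3.7, -0.6, 0.8, -0.3]
# Assumed coefficients for n = 10 (standard W-test table); verify before use.
A_N10 = [0.5739, 0.3291, 0.2141, 0.1224, 0.0399]
w = w_statistic(data, A_N10)
print(round(w, 4))
```

The critical value is passed to `rejects_normality` as a parameter rather than hard-coded, since it depends on both n and the significance level.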

Specific Embodiment 3

[0050] Embodiment 3: This embodiment differs from Embodiment 1 or 2 in that, in Step 2.1, when the reference sample size n ≥ 30, the U test for the means of two normal populations is used to analyze the consistency of the reference sample and the simulation sample, determining whether their means are consistent. The specific process is as follows:

[0051] Let the reference sample $X = (X_1, \dots, X_n)$ follow the normal population $N(\mu_1, \sigma_1^2)$, and the simulation sample $Y = (Y_1, \dots, Y_m)$ follow the normal population $N(\mu_2, \sigma_2^2)$;

[0052] $(X_1, \dots, X_n)$ are the $n$ experimental data values from the real physical system, that is, the reference sample; $(Y_1, \dots, Y_m)$ are the $m$ experimental data values output by the simulation model, that is, the simulation sample; $n$ is the reference sample size and $m$ is the simulation sample size; $m$ and $n$ are both positive integers; $\mu_1$ is the mean of the overall experimental data of ...
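
The source's own U-test formula is truncated here. As a sketch only, the code below assumes the common large-sample form of the two-sample U statistic, with the sample variances substituted for the unknown population variances, and a two-sided decision at significance level alpha. The function name `u_test_means` and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def u_test_means(reference, simulated, alpha=0.05):
    """Two-sided large-sample U test for equality of two means.

    Returns (u, critical_value, consistent). Assumes both samples have
    nonzero sample variance.
    """
    n, m = len(reference), len(simulated)
    # Standardized difference of the sample means.
    u = (mean(reference) - mean(simulated)) / sqrt(
        variance(reference) / n + variance(simulated) / m
    )
    # Upper alpha/2 quantile of the standard normal distribution.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return u, critical, abs(u) <= critical


# Illustrative data only: 30 reference values and a shifted copy.
ref = [float(i % 5) for i in range(30)]
sim_shifted = [v + 10.0 for v in ref]
print(u_test_means(ref, ref)[2], u_test_means(ref, sim_shifted)[2])
```

Under this assumption the means are judged consistent when the absolute value of U does not exceed the critical value; the source may define its statistic or rejection region differently.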

Abstract

A small-sample data model validation method based on statistical analysis. The purpose of the invention is to solve the problem that, in the traditional Bootstrap method, the range of the regenerated samples is limited to the range of the original sample; especially when the sample size is small, the distribution of the regenerated samples may deviate from the true distribution, making the estimation results inaccurate and introducing a degree of risk. The process is: 1. Perform a normality test on the reference sample and the simulation sample; if they follow a normal distribution, perform step 2. 2. When n ≥ 30, use the U test; when 10 < n < 30, use the t test or F test; when 3 < n ≤ 10, conduct single normal population parameter tests on the simulation sample; these tests determine whether the means and variances of the reference sample and the simulation sample are consistent. When n < 3, do not conduct model validation. The invention is used in the field of simulation model validation.
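
The sample-size rules in the abstract can be summarized as a small dispatch function. This is a sketch: the name `select_procedure` and the returned labels are hypothetical, and the case n = 3 is reported as unspecified because the abstract's ranges (n < 3 and 3 < n ≤ 10) do not cover it.

```python
def select_procedure(n):
    """Map the reference sample size n to the procedure named in the abstract."""
    if n < 3:
        return "no model validation"
    if n == 3:
        # The abstract gives n < 3 and 3 < n <= 10, leaving n = 3 uncovered.
        return "unspecified in the source"
    if n <= 10:
        return "single normal population parameter tests"
    if n < 30:
        return "t test or F test"
    return "U test"


for size in (2, 3, 8, 20, 30):
    print(size, select_procedure(size))
```

The function applies only after both samples have passed the normality test in step 1.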

Description

Technical field

[0001] The invention relates to a small-sample data model validation method.

Background

[0002] Model validation is an important means of ensuring that a simulation model can correctly replace the real system in experiments, and it is one of the key issues in simulation research. The main idea of model validation is to analyze the consistency between the reference data output by experiments on the real physical system and the simulation data output by experiments on the simulation model under the same input conditions; whether the simulation model is credible is determined by whether the simulation sample is consistent with the reference sample. In practical engineering applications, such as aircraft simulation models, limits on test conditions, test funding, and other factors make it impossible to carry out a large number of repeated tests, so that the data sample size output by the real system is small...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F30/20, G06F17/18
CPC: G06F30/20, Y02T90/00
Inventor: 马萍, 周玉臣, 宋婷, 方可, 杨明
Owner: HARBIN INST OF TECH