LOF outlier detection method and system based on grid pruning
An outlier detection technology, applied in the field of data processing, that addresses the problems of poor practicability and high complexity.
Examples
Embodiment 1
[0067] This embodiment provides a method for detecting LOF outliers based on grid pruning.
[0068] As shown in Figure 1, the method includes the following steps:
[0069] S1: Input the data set and preprocess the data set;
[0070] S2: Given an input number of intervals s, divide each dimension of the data set into s equal-length intervals, calculate the boundary range of each grid, and number the grids;
[0071] In a high-dimensional space, each dimension is divided into s segments, and the data set is partitioned along the dividing lines marked on each dimension. The sections cut out in this way define the grid boundaries. The specific boundary values are determined by the dimensionality of the data, the size of the data set, and the given number of division intervals s.
[0072] S3: Compare each data object in the data set with the boundary range of each grid to find the grid to which it belongs;
...
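Steps S2 and S3 above can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`grid_boundaries`, `cell_of`, `cell_number`, `assign_to_grid`) are hypothetical, the boundary range of each dimension is assumed to run from the minimum to the maximum observed value, and the grids are assumed to be numbered in row-major order. The later steps of the method are not shown here.

```python
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]


def grid_boundaries(data: List[Point], s: int) -> List[List[float]]:
    """S2: per dimension, return the s + 1 edges of s equal-length intervals.

    Assumption: the range of a dimension is [min, max] of the observed values.
    """
    bounds = []
    for d in range(len(data[0])):
        lo = min(p[d] for p in data)
        hi = max(p[d] for p in data)
        width = (hi - lo) / s
        bounds.append([lo + i * width for i in range(s)] + [hi])
    return bounds


def cell_of(point: Point, bounds: List[List[float]], s: int) -> Tuple[int, ...]:
    """S3: compare a point with the boundaries; return one interval index per dimension."""
    idx = []
    for d, edges in enumerate(bounds):
        lo, hi = edges[0], edges[-1]
        if hi == lo:  # constant dimension: everything falls in interval 0
            idx.append(0)
            continue
        i = int((point[d] - lo) / (hi - lo) * s)
        # Clamp so that the maximum value lands in the last interval.
        idx.append(min(max(i, 0), s - 1))
    return tuple(idx)


def cell_number(idx: Tuple[int, ...], s: int) -> int:
    """Number a grid by flattening its indices in row-major order (an assumption)."""
    n = 0
    for i in idx:
        n = n * s + i
    return n


def assign_to_grid(data: List[Point], s: int) -> Dict[int, List[Point]]:
    """Map each grid number to the data objects that fall inside that grid."""
    bounds = grid_boundaries(data, s)
    cells: Dict[int, List[Point]] = {}
    for p in data:
        cells.setdefault(cell_number(cell_of(p, bounds, s), s), []).append(p)
    return cells
```

For example, `assign_to_grid(points, 4)` on two-dimensional data returns a dictionary keyed by grid number, with at most 16 non-empty grids. Only occupied grids are stored, which keeps memory proportional to the number of data objects rather than to s raised to the number of dimensions.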
Embodiment 2
[0107] This embodiment provides a detection system applying the grid pruning LOF outlier detection method described in Embodiment 1.
[0108] As shown in Figure 2, the system includes a data preprocessing module, a data storage module, a data cleaning module, and a Spark distributed computing module;
[0109] The input of the data preprocessing module is connected to the external data source, and its output is connected to the data storage module; the output of the data storage module is connected to the data cleaning module; the data cleaning module is connected to the Spark distributed computing module; and the Spark distributed computing module is finally connected back to the data storage module;
[0110] The data preprocessing module is responsible for data import and preprocessing, and outputs the preprocessed data to the data storage module;
[0111] The data storage module includes a distributed file system, and the data storage...