The invention relates to a high sliding window
data stream anomaly detection method based on layered clustering, and aims to solve the problem that the accuracy of
data stream anomaly detection is reduced by the influence of stale and historical data. According to the method, by means of a layered clustering
algorithm, the final clustering result need not be considered during clustering, so arriving data are processed at a higher speed. Because the off-line layer uses only the cluster structure to answer user queries, its data volume is far smaller than
that of the original data, so that the data can be stored efficiently and a more accurate clustering result can be obtained. For the sliding
window model, an exponential histogram of cluster features
is adopted as the summary structure, so that
insertion of new data and deletion of stale data can be performed more efficiently. A cosine coefficient is taken as the similarity measure, so that good clustering and
anomaly detection results can be obtained. The high sliding window
data stream anomaly detection method is applicable to fields such as sensors, network click
streams, stock trading and the like.
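
As an illustration only, the scheme described above might be sketched in Python as follows. This is not the claimed implementation: the class names, the fixed-size bucketing rule (a simplification of a histogram of cluster features), the parameters `sim_threshold`, `bucket_size` and `min_count`, and the rule that a point landing in a sparsely populated cluster is reported as anomalous are all assumptions introduced here. The sketch shows only the on-line part: each cluster is summarised by time-stamped buckets so that stale data can be dropped bucket by bucket, and the cosine coefficient decides which cluster an arriving point joins.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine coefficient of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class WindowedCluster:
    """Cluster summary kept as a queue of buckets: [newest_timestamp, count, linear_sum]."""

    def __init__(self, bucket_size):
        self.bucket_size = bucket_size  # hypothetical cap on points per bucket
        self.buckets = deque()

    def insert(self, t, x):
        # Add the point to the newest bucket, or open a new bucket if it is full.
        if self.buckets and self.buckets[-1][1] < self.bucket_size:
            b = self.buckets[-1]
            b[0], b[1] = t, b[1] + 1
            b[2] = [s + v for s, v in zip(b[2], x)]
        else:
            self.buckets.append([t, 1, list(x)])

    def expire(self, now, window):
        # Drop whole buckets whose newest point has left the sliding window.
        while self.buckets and self.buckets[0][0] <= now - window:
            self.buckets.popleft()

    def count(self):
        return sum(b[1] for b in self.buckets)

    def centroid(self):
        n = self.count()
        dim = len(self.buckets[0][2])
        return [sum(b[2][i] for b in self.buckets) / n for i in range(dim)]


class SlidingWindowDetector:
    def __init__(self, window, sim_threshold=0.9, bucket_size=4, min_count=3):
        self.window = window
        self.sim_threshold = sim_threshold
        self.bucket_size = bucket_size
        self.min_count = min_count
        self.clusters = []

    def process(self, t, x):
        """Absorb point x arriving at time t; return True if it looks anomalous."""
        for c in self.clusters:
            c.expire(t, self.window)
        self.clusters = [c for c in self.clusters if c.buckets]

        # Pick the cluster whose centroid is most similar under the cosine coefficient.
        best, best_sim = None, -1.0
        for c in self.clusters:
            s = cosine(x, c.centroid())
            if s > best_sim:
                best, best_sim = c, s
        if best is None or best_sim < self.sim_threshold:
            best = WindowedCluster(self.bucket_size)
            self.clusters.append(best)
        best.insert(t, x)

        # Assumed rule: a point in a cluster with few live points is anomalous.
        return best.count() < self.min_count


stream = [[1.0, 0.1], [1.0, 0.0], [0.9, 0.1], [1.0, 0.05], [0.0, 1.0], [1.0, 0.1]]
detector = SlidingWindowDetector(window=100)
flags = [detector.process(t, x) for t, x in enumerate(stream)]
print(flags)  # -> [True, True, False, False, True, False]
```

In this toy run the first two points are flagged only because their cluster has not yet reached `min_count`; a practical system would need a warm-up period or a different scoring rule. The fifth point, which points in a nearly orthogonal direction, opens a new cluster and is flagged.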