Real-time log clustering analysis method based on inverted table

A log clustering analysis technology, applied in the computer field, that addresses the inability of traditional log analysis to meet enterprise requirements and improves template versatility and processing speed.

Active Publication Date: 2020-01-14
上海擎创信息技术有限公司

AI Technical Summary

Problems solved by technology

[0003] However, as the volume of log information keeps growing, analyzing logs with traditional methods can no longer meet the requirements of enterprises.


Embodiment Construction

[0033] Embodiments of the present invention are described in detail below in conjunction with the accompanying drawings. As shown in Figure 1, the implementation steps are as follows:

[0034] Step 1: Initialization. Define the encapsulation structure of each word in the log; there are four types of encapsulation structure: normal, regex, important and verb;
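The four-type encapsulation structure of Step 1 could be sketched roughly as follows. This is only an illustration: the names `WordType`, `Word` and `wrap` are hypothetical, and the source does not say which fields the real structure carries beyond its type.

```python
from dataclasses import dataclass
from enum import Enum


class WordType(Enum):
    """The four encapsulation types named in Step 1."""
    NORMAL = "normal"
    REGEX = "regex"          # placeholder produced by a regular-expression replacement
    IMPORTANT = "important"  # sensitive word
    VERB = "verb"


@dataclass(frozen=True)
class Word:
    """Hypothetical wrapper: the word's text plus its encapsulation type."""
    text: str
    type: WordType = WordType.NORMAL


def wrap(text, word_type=WordType.NORMAL):
    """Encapsulate a raw token; tokens default to the normal type."""
    return Word(text=text, type=word_type)
```

For instance, `wrap("failed", WordType.IMPORTANT)` yields a word whose type later stages can inspect without re-parsing the text.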

[0035] Step 2: Raw log preprocessing:

[0036] The first sub-step is regular-expression replacement: use regular expressions to replace the IP addresses, port numbers, and times in the original log with placeholder strings such as $IP, $IPPort, and DateTime, and apply a simple encapsulation to them;

[0037] The second sub-step is sensitive-word library extraction: based on semantic analysis and the configured sensitive-word patterns, the sensitive words in the original log are processed and their type is set to important;

[0038] The third sub-step is word segmentation with a word segmenter: in order to perform accurate word segmentation on the ...
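The first two preprocessing sub-steps can be sketched as below. Everything concrete here is an assumption made for illustration: the regular expressions, the reading of $IPPort as an `ip:port` pair, the sample sensitive-word set, and the whitespace split that stands in for the word segmenter, whose behavior the source does not spell out.

```python
import re

# Hypothetical patterns. Order matters: "ip:port" must be tried before a bare IP.
PLACEHOLDER_PATTERNS = [
    ("DateTime", re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}")),
    ("$IPPort", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}:\d{1,5}\b")),
    ("$IP", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
]
PLACEHOLDERS = {name for name, _ in PLACEHOLDER_PATTERNS}

# Hypothetical sensitive-word library.
SENSITIVE_WORDS = {"error", "failed", "timeout"}


def preprocess(line):
    """Return a list of (token, type) pairs for one raw log line."""
    # Sub-step 1: regular-expression replacement with placeholder strings.
    for placeholder, pattern in PLACEHOLDER_PATTERNS:
        line = pattern.sub(placeholder, line)

    tagged = []
    # Whitespace splitting is only a stand-in for the real word segmenter.
    for token in line.split():
        if token in PLACEHOLDERS:
            tagged.append((token, "regex"))
        elif token.lower().strip(",.;:") in SENSITIVE_WORDS:
            # Sub-step 2: sensitive words get the important type.
            tagged.append((token, "important"))
        else:
            tagged.append((token, "normal"))
    return tagged
```

Replacing variable fields before tagging means two log lines that differ only in address or timestamp produce the same token sequence, which is what makes later template matching possible.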


Abstract

The invention provides a real-time log clustering analysis method based on an inverted table. The method comprises the following steps: 1, initialization: defining an encapsulation structure for each word in a log; 2, preprocessing the original log, including regular-expression replacement, sensitive-word library extraction, word segmentation by a word segmenter, part-of-speech tagging and public variable extraction; and 3, obtaining a template, including log grouping, inverted table scoring, template obtaining, template display layer content updating and inverted table updating. The method clusters logs in real time, improves the universality of the templates, allows logs to be processed in parallel, and increases the speed of analysis and processing.
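As a rough illustration of how an inverted table can drive template matching in step 3, the sketch below keeps a token-to-template index, scores an incoming log by counting index hits, and then either merges the log into the best template or creates a new one. It is not the patented procedure, whose details are not given here: the class name, the scoring formula, the threshold, the restriction to templates of equal token count, and the `*` wildcard are all assumptions.

```python
from collections import defaultdict

WILDCARD = "*"  # hypothetical marker for a position that varies between logs


class InvertedTableClusterer:
    """Toy clusterer; the inverted table maps token -> ids of templates containing it."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold
        self.templates = []               # template id -> list of tokens
        self.inverted = defaultdict(set)  # token -> set of template ids

    def _index(self, template_id):
        for token in self.templates[template_id]:
            if token != WILDCARD:
                self.inverted[token].add(template_id)

    def _unindex(self, template_id):
        for token in self.templates[template_id]:
            if token != WILDCARD:
                self.inverted[token].discard(template_id)

    def add(self, tokens):
        """Assign one tokenized log to a template and return the template id."""
        distinct = set(tokens)
        # Scoring: count how many distinct tokens hit each same-length template.
        hits = defaultdict(int)
        for token in distinct:
            for template_id in self.inverted.get(token, ()):
                if len(self.templates[template_id]) == len(tokens):
                    hits[template_id] += 1
        best_id, best_score = None, 0.0
        for template_id, count in hits.items():
            score = count / len(distinct)
            if score > best_score:
                best_id, best_score = template_id, score

        if best_id is None or best_score < self.threshold:
            # No template is close enough: the log becomes a new template.
            self.templates.append(list(tokens))
            best_id = len(self.templates) - 1
            self._index(best_id)
            return best_id

        # Merge: positions that differ become wildcards, then refresh the table.
        self._unindex(best_id)
        self.templates[best_id] = [
            old if old == new else WILDCARD
            for old, new in zip(self.templates[best_id], tokens)
        ]
        self._index(best_id)
        return best_id
```

Because each lookup touches only the templates that share at least one token with the incoming log, the cost per log depends on index hits rather than on the total number of templates, which is the usual reason for choosing an inverted structure in a real-time setting.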

Description

Technical field

[0001] The present invention relates to a technology in the computer field, and in particular to a real-time log clustering analysis method based on an inverted table.

Background technique

[0002] Log analysis is particularly important for an enterprise. If the enterprise's operation and maintenance personnel cannot understand the security status of its servers in real time, the enterprise may suffer inestimable losses. Analyzing logs reveals not only the operating status of software and hardware equipment but also the source of error logs, for example whether an error is caused by the application or by the system itself, so that timely remedies can be made and the high availability of the enterprise's hardware and software equipment can be improved. In short, log analysis has two most direct and obvious purposes: one is website security self-inspection, to understand the security incidents happening on the server, and the other...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/31, G06F16/35, G06F16/36
CPC: G06F16/319, G06F16/35, G06F16/374, Y02D10/00
Inventor: 杨辰, 葛晓波, 殷传旺
Owner: 上海擎创信息技术有限公司