Log carrier format extraction method and device based on natural language

A natural-language-based method for extracting log carrier formats, applied in the field of log processing. It addresses the problem that information cannot be identified and extracted when keywords are missing from the log carrier format, and it achieves the effect of reducing manual intervention and improving analysis.

Active Publication Date: 2020-10-02
北京安帝科技有限公司

AI Technical Summary

Problems solved by technology

However, the log format may differ from device to device. For example, when the log string obtained from a device is "date=1972-03-29,time=12:30:33, devname=S124DN3W16007342,device is up,server is not down", the keywords "device" and "server" are missing from the log carrier format, so the relevant information cannot be identified and extracted.



Examples


Embodiment 1

[0045] As shown in Figure 1, a natural-language-based log carrier format extraction method according to an embodiment of the present invention may include:

[0046] S101. Split the accessed original log stream into streams corresponding to each log data segment through context word segmentation. Preferably, in this step, character strings separated by delimiters in the original log stream are extracted as streams corresponding to each log data segment.

[0047] The present invention is applicable to an original log stream in a predetermined format, preferably a log in which multiple log data segments are separated by delimiters and each log data segment includes a data field (key), a connector or operator, and a data value (value). An example is the power plant equipment status information log, referred to below as the power plant equipment log. The log data segments contained in the power plant equipment log include but are not limited to: log date, log time, power plant...
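The splitting described in S101 can be illustrated with a minimal Python sketch. It assumes the delimiter is a comma and the connector is `=`, as in the example log string quoted in this document. The names `Segment` and `split_log_stream` are hypothetical and do not come from the patent.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    raw: str                  # segment text as found between delimiters
    key: Optional[str]        # data field, if a connector was found
    connector: Optional[str]  # connector or operator, e.g. "="
    value: Optional[str]      # data value, if a connector was found


def split_log_stream(line: str, delimiter: str = ",", connector: str = "=") -> list[Segment]:
    """Split one log line into segments, then each segment into key/connector/value."""
    segments = []
    for raw in line.split(delimiter):
        raw = raw.strip()
        if not raw:
            continue
        key, sep, value = raw.partition(connector)
        if sep:
            segments.append(Segment(raw, key.strip(), sep, value.strip()))
        else:
            # No connector present: keep the free-text segment whole.
            segments.append(Segment(raw, None, None, None))
    return segments


line = "date=1972-03-29,time=12:30:33, devname=S124DN3W16007342,device is up,server is not down"
segments = split_log_stream(line)
for seg in segments:
    print(seg)
```

Segments such as "device is up" contain no connector, which is why a format built only from preset keywords cannot parse them.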

Embodiment 2

[0084] On the basis of Embodiment 1, the present invention also provides a power plant equipment log parsing method, the flow chart of which is shown in Figure 2.

[0085] The power plant equipment log parsing method of Embodiment 2 includes the following steps:

[0086] S1. Access the original log stream;

[0087] S2. Obtain the stored log carrier format;

[0088] S3. Use the stored log carrier format to match and parse the accessed original log stream. If matching and parsing succeeds, go to step S6; otherwise, go to step S4. In this step, all stored log carrier formats can be tried against the original log stream in turn: as soon as one log carrier format matches, matching and parsing is considered successful; if none of the log carrier formats matches, matching and parsing is considered to have failed;

[0089] In this step, the following regular expressions can be used to match and analyze the incomin...
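The "try every stored format in turn" logic of step S3 might look like the following Python sketch. It assumes the stored log carrier formats are kept as regular expression strings. The function name `parse_with_stored_formats` and the two example formats are hypothetical. What happens afterwards (steps S4 and S6) is not visible in this excerpt, so the sketch only reports success or failure.

```python
import re
from typing import Optional


def parse_with_stored_formats(log_line: str, stored_formats: list[str]) -> Optional[tuple[str, ...]]:
    """Try each stored format in turn; return the captured values of the first match."""
    for pattern in stored_formats:
        match = re.fullmatch(pattern, log_line)
        if match:
            return match.groups()
    return None  # no stored format matched: matching and parsing failed


# Hypothetical stored formats, for illustration only.
stored_formats = [
    r"date=(.*),time=(.*),devname=(.*)",
    r"date=(.*),time=(.*), devname=(.*),device is (.*),server is not (.*)",
]

line = "date=1972-03-29,time=12:30:33, devname=S124DN3W16007342,device is up,server is not down"
values = parse_with_stored_formats(line, stored_formats)
print(values)
```

Returning `None` rather than raising keeps the failure branch explicit for the caller.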

Embodiment 3

[0097] On the basis of the first embodiment, the present invention also provides a method for judging abnormality of a power plant equipment log.

[0098] In Embodiment 3, the natural-language-based log carrier format extraction method can be used to extract the log carrier format of the power plant equipment during a period of normal operation and save it. The saved log carrier format is then used to match and parse newly obtained logs. If matching and parsing succeeds, the power plant equipment is judged to be normal; if it fails, the power plant equipment is judged to be faulty and an alarm message is generated. Preferably, the matching method is regular expression matching, for example with the following regular expression:

[0099] $pattern = '/date=(.*),time=(.*), devname=(.*),device is (.*),server is not (.*)/';

[0100] preg_match_all($pattern, original log, $matches), $matches is the ma...
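The normal/faulty judgment of Embodiment 3 can be sketched as follows. The source uses PHP's `preg_match_all`; this sketch uses Python's `re` module instead. It assumes the saved format is the regular expression above, with spacing taken from the example log string quoted earlier in this document. The function name `judge_log` and the alarm text are hypothetical.

```python
import re
from typing import Optional

# Saved log carrier format from a period of normal operation (assumed spacing).
NORMAL_FORMAT = re.compile(
    r"date=(.*),time=(.*), devname=(.*),device is (.*),server is not (.*)"
)


def judge_log(log_line: str) -> tuple[bool, Optional[str]]:
    """Return (is_normal, alarm_message); alarm_message is None when the log matches."""
    if NORMAL_FORMAT.fullmatch(log_line):
        return True, None
    # Hypothetical alarm text; the patent excerpt does not specify its content.
    return False, f"ALARM: log does not match the saved format: {log_line!r}"


normal = "date=1972-03-29,time=12:30:33, devname=S124DN3W16007342,device is up,server is not down"
faulty = "date=1972-03-29,time=12:31:02, devname=S124DN3W16007342,device is up,server is down"

print(judge_log(normal))
print(judge_log(faulty))
```

Because the capture groups accept any text, this check detects a change in the constant part of the log (here, the missing "not"), not a change in the variable values.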


Abstract

The invention provides a natural-language-based log carrier format extraction method and device and relates to the technical field of log processing. The method comprises the following steps: splitting an accessed original log stream, through context word segmentation, into streams corresponding to the individual log data segments; identifying the variables and constants in each stream, deleting the character strings of the variables, and retaining the character strings of the constants; and combining the constant character strings of the streams by string concatenation to obtain a log carrier format for storage. The invention further provides a power plant equipment log parsing method, a power plant equipment log abnormality judgment method, and a power plant area equipment abnormality judgment method. With this method, the log carrier format can be extracted without keywords being set in advance, and it can then be used for log parsing and abnormality judgment.
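The three steps in the abstract (split, delete variables and keep constants, splice) can be illustrated with a short Python sketch. The rule used to tell variables from constants is not visible in this excerpt, so the sketch makes two loudly labeled assumptions: in a segment containing `=`, everything after the `=` is the variable; in any other segment, the last word is the variable. The function name `extract_carrier_format` is hypothetical, and each deleted variable is replaced by a `(.*)` capture group so the result can be used as a regular expression.

```python
import re


def extract_carrier_format(line: str, delimiter: str = ",", connector: str = "=") -> str:
    """Build a regex-style carrier format from one log line (illustrative only)."""
    parts = []
    for segment in line.split(delimiter):
        if connector in segment:
            # ASSUMPTION: everything after the connector is the variable.
            constant = segment.split(connector, 1)[0] + connector
        else:
            # ASSUMPTION: the last word of a free-text segment is the variable.
            head, space, _variable = segment.rpartition(" ")
            constant = head + space
        # Keep the constant text; put a capture group where the variable was.
        parts.append(re.escape(constant) + "(.*)")
    # Splice the per-segment constants back together with the delimiter.
    return re.escape(delimiter).join(parts)


line = "date=1972-03-29,time=12:30:33, devname=S124DN3W16007342,device is up,server is not down"
carrier_format = extract_carrier_format(line)

other = "date=2020-01-01,time=00:00:01, devname=X1,device is down,server is not up"
print(re.fullmatch(carrier_format, line).groups())
print(re.fullmatch(carrier_format, other).groups())
```

No keyword list is supplied anywhere in the sketch: the constants "device is " and "server is not " are recovered from the log line itself, which is the property the abstract claims.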

Description

Technical field

[0001] The invention relates to the technical field of log processing, in particular to a natural-language-based log carrier format extraction method and device, a power plant equipment log parsing method, a power plant equipment log abnormality judgment method, and a power plant area equipment abnormality judgment method.

Background

[0002] At present, devices generate logs to record events while they are running, and each log line records related information such as the date, the time, and device information. Log analysis plays a very important role in troubleshooting and performance analysis.

[0003] Usually, the log is parsed directly using a log carrier format with preset keywords. For example, the keywords date (log date), time (log time), devname (power plant equipment number), BaseTrapSeverity (baseline average value) have been preset, and the format of the log carrier is the following string consisting...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/103, G06F40/279
CPC: G06F40/103, G06F40/279
Inventor: 王晓辉, 姜双林, 周磊, 饶志波
Owner: 北京安帝科技有限公司