Malicious webpage identification and detection method based on static field, computer and storage medium

A malicious web page identification and detection method, applied in the field of malicious web page identification and detection. It addresses three problems: fingerprints that occupy too much storage space, detection methods that cannot be applied in real time, and reliance on one-sided plain-text information. Its effect is a shorter web page matching time.

Pending Publication Date: 2022-04-19
HARBIN INST OF TECH +1

AI Technical Summary

Problems solved by technology

[0006] The fingerprint used in the deduplication algorithm based on web page fingerprints is composed of feature keywords and their position vectors extracted from web pages, and the feature words are extracted from the plain text information in the pages. If the text is too large, the fingerprints may take up too much storage space. Considering only the plain text displayed on the page is also too one-sided: the fingerprint extraction technique proposed in this algorithm is only suitable for identifying web pages with a large amount of text content and is not generally applicable.
[0007] In detection methods based on machine learning, feature extraction and model training consume a lot of resources, so these methods cannot meet the real-time detection requirements of practical applications.



Examples


Embodiment 1

[0053] Embodiment 1. This embodiment is described with reference to Figures 1-3. The present invention provides a method for identifying and detecting malicious web pages based on static domains, including the following steps:

[0054] Step 1, monitor the webpage traffic in real time, and extract the URL address of the HTTP header;

[0055] Step 2. Match the URL address described in step 1 with the URL address stored in the blacklist library; if the match is successful, block the traffic, and if the match fails, perform step 3;

[0056] Step 3, parse the webpage traffic for which matching failed. The parsing method of the present invention improves the efficiency of web page parsing, responds to and tolerates grammatical and format errors in the web page, and, by setting a maximum parsing depth, limits the running time of the program and removes deep nodes as noise. Specifically include the following st...
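Steps 1 to 3 can be illustrated with a minimal Python sketch. It is not the patented implementation: the blacklist contents, the `extract_url`, `is_blacklisted`, and `parse_page` names, the exact-match lookup, and the default depth of 3 are all assumptions made for illustration, and the standard-library `html.parser` stands in for whatever parser the invention uses.

```python
from html.parser import HTMLParser

# Hypothetical blacklist; the source does not describe its storage format.
BLACKLIST = {"evil.example/login"}

# Elements that never have a closing tag, so they must not increase depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


def extract_url(raw_request: bytes) -> str:
    """Step 1: rebuild host + path from the request line and Host header."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    path = lines[0].split(" ")[1]  # request line, e.g. "GET /login HTTP/1.1"
    host = ""
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip()
    return host + path


def is_blacklisted(url: str) -> bool:
    """Step 2: exact-match lookup (an assumed matching rule)."""
    return url in BLACKLIST


class DepthLimitedParser(HTMLParser):
    """Step 3: record tag names, skipping nodes deeper than max_depth."""

    def __init__(self, max_depth: int):
        super().__init__()
        self.max_depth = max_depth
        self.depth = 0
        self.tags = []

    def handle_starttag(self, tag, attrs):
        node_depth = self.depth + 1
        if node_depth <= self.max_depth:
            self.tags.append(tag)
        if tag not in VOID_TAGS:
            self.depth = node_depth

    def handle_endtag(self, tag):
        # Stray closing tags in malformed pages must not push depth below 0.
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1


def parse_page(html: str, max_depth: int = 3) -> list:
    parser = DepthLimitedParser(max_depth)
    parser.feed(html)
    parser.close()
    return parser.tags
```

In this sketch the depth limit only controls which nodes are recorded; the tokenizer still scans the whole document, so a production parser would need to stop descending early to bound running time as the text describes.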

Embodiment 2

[0100] Embodiment 2. A computer. The computer device of the present invention may be a device including a processor and a memory, such as a single-chip microcomputer including a central processing unit. Moreover, when the processor is used to execute the computer program stored in the memory, the steps of the above-mentioned recommendation method based on CREO software that can modify the recommendation data driven by the relationship are realized.

[0101] The processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocesso...

Embodiment 3

[0103] Embodiment 3, computer-readable storage medium

[0104] The computer-readable storage medium of the present invention can be any form of storage medium readable by the processor of the computer device, including but not limited to non-volatile memory, volatile memory, ferroelectric memory, etc. A computer program is stored on the computer-readable storage medium, and when the processor of the computer device reads and executes the computer program stored in the memory, the steps of the above-mentioned modeling method based on CREO software that can modify the relationship-driven modeling data can be realized.

[0105] The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a ...


Abstract

The invention provides a malicious webpage identification and detection method based on a static field, a computer, and a storage medium, and belongs to the technical field of webpage identification and detection. The method comprises the following steps: step 1, monitoring webpage traffic in real time and extracting the URL address from the HTTP header; step 2, matching the URL address against the URL addresses stored in a blacklist library; step 3, parsing the webpage traffic for which matching failed; step 4, crawling the JS (JavaScript) and CSS (Cascading Style Sheets) files in the parsed webpage traffic; step 5, extracting the webpage fingerprint of the target webpage; step 6, identifying the webpage traffic; step 7, comparing the URL addresses of the two webpages. If the URL addresses are the same, the webpage in the traffic is a normal webpage and a matching log is stored; if the URL addresses are different, the webpage in the traffic is a malicious webpage and it is blocked. This solves the technical problem that the requirement of real-time detection in practical applications cannot be met, and achieves the technical effect of reducing the time cost of the webpage matching process.
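The decision in steps 6 and 7 can be sketched as follows. This is a hedged illustration only: the `FINGERPRINT_LIBRARY` mapping, the `classify` function, the fingerprint value, and the `"unknown"` outcome for pages with no matching fingerprint are assumptions that the abstract does not specify.

```python
# Hypothetical fingerprint library: fingerprint -> URL of the legitimate page.
FINGERPRINT_LIBRARY = {"a3f1": "bank.example/login"}


def classify(observed_url: str, observed_fingerprint: str) -> str:
    """Compare the observed URL with the URL of the page it resembles."""
    legit_url = FINGERPRINT_LIBRARY.get(observed_fingerprint)
    if legit_url is None:
        # Assumed outcome; the abstract does not cover unmatched pages.
        return "unknown"
    if observed_url == legit_url:
        return "normal"      # same URL: store a matching log
    return "malicious"       # same look, different URL: block the traffic
```

The idea the sketch captures is that a page whose fingerprint matches a known target page but whose URL differs is treated as an imitation of that page.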

Description

Technical field

[0001] The present application relates to a detection method, in particular to a static domain-based malicious web page identification and detection method, a computer, and a storage medium, belonging to the technical field of web page identification and detection.

Background

[0002] A phishing attack is a cyber crime that steals users' private data through social engineering or technical means. In recent years, many criminals have engaged in illegal activities by building malicious websites and have used various means of concealment (such as URL obfuscation), which renders traditional defense and detection technology ineffective.

[0003] The web page fingerprint is a byte sequence calculated by a hash operation over the key-value pairs in the header of the response message and a series of special elements (tags, attributes, etc.) extracted from the web page document. Web page identification is to identify the web page that best matches the target web page from th...
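A fingerprint of the kind defined in [0003] can be sketched in Python as below. The source does not name the hash function, the serialization, or the element format, so SHA-256, the `name:value` line encoding, the header sorting, and the `web_fingerprint` name are all assumptions chosen for illustration.

```python
import hashlib
from typing import Mapping, Sequence


def web_fingerprint(headers: Mapping[str, str], elements: Sequence[str]) -> bytes:
    """Hash response-header key-value pairs plus extracted page elements.

    SHA-256 and the line-based encoding are assumptions, not the source's choice.
    """
    digest = hashlib.sha256()
    # Lower-case and sort header names so header order does not matter.
    pairs = sorted((name.lower(), value) for name, value in headers.items())
    for name, value in pairs:
        digest.update(f"{name}:{value}\n".encode("utf-8"))
    # Elements (e.g. tag or attribute strings) are hashed in document order.
    for element in elements:
        digest.update(element.encode("utf-8") + b"\n")
    return digest.digest()
```

Because the result is a fixed-length digest (32 bytes with SHA-256), its storage cost does not grow with the size of the page, unlike the keyword-based fingerprints criticized in [0006].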


Application Information

IPC(8): G06F21/56, G06F16/955, G06F16/951, G06F16/9535, G06F40/284, G06F40/216
CPC: G06F21/563, G06F16/9566, G06F16/951, G06F16/9535, G06F40/284, G06F40/216
Inventor 余翔湛, 刘立坤, 陈巍, 史建焘, 葛蒙蒙, 叶麟, 于喜东, 王永强, 冯帅, 赵跃, 王久金, 宋赟祖, 郭明昊, 胡智超, 苗钧重, 刘凡, 李精卫, 石开宇, 韦贤葵, 孔德文, 羿天阳, 刘奉哲, 李竑杰
Owner HARBIN INST OF TECH