Semi-automatic anti-crawling system based on behavior characteristics

A semi-automatic, behavior-based technique, applied in transmission systems, database management systems, special data processing applications, etc., that addresses the problems of existing anti-crawling systems: a long cycle from identification to interception, high labor costs, heavy data-labeling requirements, and difficulty of quick deployment in new scenarios.

Pending Publication Date: 2020-11-10
北京人人云图信息技术有限公司

AI Technical Summary

Problems solved by technology

Existing anti-crawling systems generally take one of two approaches. The first, based on manual strategies and character-feature matching, requires business experts and strategy experts to cooperate in extracting and applying crawler identification features. It has a long cycle from identification to interception, high labor costs, and low versatility.
The second introduces machine learning methods to identify crawlers. Its advantage is that the algorithm learns the rules automatically from data samples. Its disadvantage is that it requires a large amount of data annotation, and generalizing or updating the model is costly.
In addition, if a crawler changes its behavior and the pre-designed machine learning features cannot characterize the new behavior, the new crawler is difficult to catch.
Beyond these shortcomings, existing anti-crawling systems share a common problem: low generalization ability.
Manual rules, character features, and machine learning models are usually derived from specific business scenarios, so they are difficult to deploy quickly and make effective in new scenarios.

Embodiment Construction

[0021] In order to understand the above-mentioned purpose, features and advantages of the present invention more clearly, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments. It should be noted that, in the case of no conflict, the embodiments of the present application and the features in the embodiments can be combined with each other.

[0022] In the following description, many specific details are set forth so that the present invention can be fully understood. However, the present invention can also be implemented in ways other than those described here. Therefore, the protection scope of the present invention is not limited by the specific embodiments disclosed below.

[0023] This embodiment proposes a semi-automatic anti-crawling system based on behavioral characteristics. As shown in Figure 1, it includes an ETL processing unit, a log analysis engine, a beh...

Abstract

The invention belongs to the field of network data security and relates to a semi-automatic anti-crawling system based on behavior characteristics. The system comprises an ETL processing unit, a behavior analysis and management unit, a log analysis engine, and a request protection processing unit. The ETL processing unit performs ETL (extract, transform, load) processing on the request information of requests initiated by the user side to obtain UID (Uniform Identifier) and URI (Uniform Resource Identifier) data. The behavior analysis and management unit aggregates the received data with the received URI as the primary key and deduplicates the URI data to generate behavior sets and statistical indexes related to the service scenario. It then aggregates with the behavior set as the primary key and generates an analysis view used to judge whether a behavior set is a threat. The unit also manages threat behavior sets: if a behavior set is judged to be a threat, the unit records and tracks it, automatically generates disposal strategy parameters according to the behavior set's access frequency, URI data type, and URI data quantity, and pushes those parameters to a database for real-time crawler monitoring.
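
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It rests on several assumptions that do not come from the source: requests arrive as dictionaries with hypothetical `uid` and `url` fields, a behavior set is taken to be the deduplicated set of URI paths requested by one UID, the "URI data type" is approximated by the first path segment, and the threat judgment itself is left to a human reviewer reading the analysis view.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def etl(requests):
    """Reduce raw request records to (uid, uri) pairs.

    The 'uid' and 'url' field names are hypothetical.
    """
    pairs = []
    for req in requests:
        uid, url = req.get("uid"), req.get("url")
        if not uid or not url:
            continue  # skip records that cannot be attributed to a user
        pairs.append((uid, urlsplit(url).path))  # drop host and query string
    return pairs


def behavior_sets(pairs):
    """Deduplicate URIs per UID and count requests per UID."""
    uris = defaultdict(set)
    counts = defaultdict(int)
    for uid, uri in pairs:
        uris[uid].add(uri)
        counts[uid] += 1
    # frozenset is hashable, so a behavior set can later serve as a key
    return {uid: frozenset(s) for uid, s in uris.items()}, dict(counts)


def analysis_view(sets_by_uid, counts):
    """Aggregate by behavior set: which UIDs share it, and total requests."""
    view = defaultdict(lambda: {"uids": [], "requests": 0})
    for uid, bset in sets_by_uid.items():
        view[bset]["uids"].append(uid)
        view[bset]["requests"] += counts[uid]
    return dict(view)


def disposal_params(bset, stats):
    """Build illustrative disposal parameters for a set judged to be a threat.

    The output fields are assumptions chosen to mirror the three inputs the
    abstract names: access frequency, URI data type, URI data quantity.
    """
    uri_types = sorted({uri.strip("/").split("/")[0] for uri in bset})
    return {
        "uris": sorted(bset),
        "uri_count": len(bset),
        "uri_types": uri_types,
        "access_count": stats["requests"],
        "uid_count": len(stats["uids"]),
    }
```

In this sketch, several UIDs that map to the same behavior set end up in one row of the analysis view, which is the kind of pattern a reviewer could inspect before marking the set as a threat and calling `disposal_params` for it.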

Description

Technical field

[0001] The invention belongs to the field of network data security and relates to a system for automatic crawler identification and disposal based on user behavior analysis.

Background

[0002] Existing anti-crawling systems based on back-end data generally take one of two directions: one is crawler identification and interception based on manual strategies and character-feature matching, and the other is crawler identification and interception combined with supervised and unsupervised machine learning methods. The former requires the cooperation of business experts and strategy experts to extract and apply crawler identification features. It has a long cycle from identification to interception, high labor costs, and low versatility. The latter introduces machine learning methods to identify crawlers. Its advantage is that the algorithm learns the rules automatically from data samples. The disadvantage is tha...

Application Information

IPC(8): G06F21/56, H04L29/06, G06F16/25
CPC: G06F21/562, H04L63/1425, H04L63/1441, H04L63/101, G06F16/254
Inventor: 陈芝茂同锋蔡月月
Owner: 北京人人云图信息技术有限公司