
Anti-crawler method based on computing power

A computing-power-based anti-crawler technique, applied in the field of crawler prevention. It addresses problems of existing measures such as unsatisfactory protection, poor user experience, and crawlers being unable to fetch website content at all, and it alleviates the consumption of server resources.

Active Publication Date: 2018-02-23
成都知道创宇信息技术有限公司

AI Technical Summary

Problems solved by technology

However, User-Agent and other HTTP request header values can be customized. By randomizing these values, a crawler can bypass the configured interception rules, so the protection effect is not ideal.
[0005] When verification codes of other forms are used for human-machine identification, web crawlers cannot crawl website content at all because they cannot directly enter the correct verification code. The experience for normal visitors is also poor, because they must enter verification codes frequently.


Examples


Embodiment Construction

[0022] The present invention is described in further detail below in conjunction with specific embodiments. It is mainly used on a WEB server: the server returns an encrypted page, and the client decrypts it using its own computing power. This alleviates the consumption of server resources by web crawlers and hinders web crawlers from copying web page content. The flow is shown in Figure 1, and the specific steps are as follows:

[0023] Step 1: Generate the page requested by the client on the server side.

[0024] If the client initiates the following request:

[0025]

[0026] Normally, the WEB server will read the content of the file index.html and return it to the client. The response page generated at this time is assumed to be:

[0027]

[0028]

[0029] Step 2: Encrypt the webpage using a randomly generated key and a high-strength encryption algorithm, and generate the JavaScript decryption code.

[0030] The dec...
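Steps 1 and 2 can be sketched on the server side roughly as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the names `make_challenge`, `keystream_xor`, and `hidden_bits` are hypothetical, a SHA-256 counter-mode keystream stands in for the unspecified "high-strength encryption algorithm", the published hint is assumed to be the real key with its low bits cleared, and the `check` tag is an assumed way for the client to recognise the correct key. The sketch is written in Python for brevity and does not generate the JavaScript decryption code itself.

```python
import hashlib
import hmac
import secrets

KEY_BITS = 128  # assumed key size for this sketch


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter-mode keystream (stand-in cipher, not AES)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def make_challenge(page: bytes, hidden_bits: int) -> dict:
    """Encrypt a page under a fresh random key and publish only part of the key."""
    if not 0 <= hidden_bits <= KEY_BITS:
        raise ValueError("hidden_bits out of range")
    key_int = secrets.randbits(KEY_BITS)
    key = key_int.to_bytes(KEY_BITS // 8, "big")
    # Assumed form of the "adjacent" key: the real key with its low bits cleared.
    hint_int = (key_int >> hidden_bits) << hidden_bits
    return {
        "ciphertext": keystream_xor(key, page),
        "key_hint": hint_int.to_bytes(KEY_BITS // 8, "big"),
        "hidden_bits": hidden_bits,
        # Tag that lets the client recognise the right key without knowing the page.
        "check": hmac.new(key, b"page-check", hashlib.sha256).hexdigest(),
    }
```

In this sketch the server would embed `ciphertext`, `key_hint`, `hidden_bits`, and `check` in the response instead of the plain page. Raising `hidden_bits` is the knob that makes the client's search more expensive.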


Abstract

The invention discloses an anti-crawler method based on computing power. The method includes the following steps: generating the page requested by a client on the server side; encrypting the webpage using a randomly generated key and an encryption algorithm, and generating JavaScript decryption code that includes a key adjacent to or associated with the correct decryption key; after the client receives the response, executing the JavaScript decryption code, which tries to decrypt the webpage by brute force; and rendering the decrypted webpage in the browser. In this scheme the webpage is encrypted and the encrypted page is returned together with the decryption code, and the client recovers the key by brute force. By adjusting the encryption strength, the client's CPU resources can be consumed to different extents, which prevents the same client from capturing a large amount of website content in a short time and effectively alleviates the consumption of server resources by large-scale web crawlers.
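The client-side brute-force step, and how its cost scales with the encryption strength, can be modelled as below. This is a hedged sketch rather than the patent's code: in the described scheme this logic runs as JavaScript in the browser, while Python is used here only to model it. The names `brute_force_decrypt`, `check_tag`, `expected_trials`, and `hidden_bits` are hypothetical, the SHA-256 keystream is a stand-in cipher, and it is assumed that the hint key differs from the real key only in its low `hidden_bits` bits.

```python
import hashlib
import hmac


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter-mode keystream (stand-in cipher)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + (offset // 32).to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def check_tag(key: bytes) -> str:
    """Assumed key-check value the server publishes alongside the ciphertext."""
    return hmac.new(key, b"page-check", hashlib.sha256).hexdigest()


def brute_force_decrypt(ciphertext: bytes, key_hint: bytes,
                        hidden_bits: int, check: str):
    """Try every key near the hint; return (plaintext, number of keys tried)."""
    base = (int.from_bytes(key_hint, "big") >> hidden_bits) << hidden_bits
    for trials, low in enumerate(range(1 << hidden_bits), start=1):
        candidate = (base | low).to_bytes(len(key_hint), "big")
        if hmac.compare_digest(check_tag(candidate), check):
            return keystream_xor(candidate, ciphertext), trials
    raise ValueError("no candidate key matched the check tag")


def expected_trials(hidden_bits: int) -> float:
    """Mean trials for a sequential search over a uniformly random hidden part."""
    return ((1 << hidden_bits) + 1) / 2
```

Under these assumptions each additional hidden bit doubles the search space, so the server can tune per-page client cost in powers of two while its own cost stays at one encryption per response.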

Description

Technical field

[0001] The invention relates to the technical field of crawler prevention, and in particular to a computing-power-based anti-crawler method.

Background technique

[0002] At present, traditional anti-crawler systems usually restrict web crawlers with IP blacklists, blacklists on User-Agent and other access parameters, request-frequency limits, and verification codes of various interaction types.

[0003] With the IP blacklist method, when the WEB server receives a request, it first calculates the access frequency of the requesting IP and returns an error page to the client when the frequency exceeds the set threshold. However, false positives may occur in a NAT network environment, and the IP access frequency limit can be bypassed by using proxy IPs, so the protection effect is not very satisfactory.

[0004] With the User-Agent method, some web crawler request characteristics (User-Agent and other HTTP request header information) are obtained through WEB acces...


Application Information

IPC(8): H04L9/06, H04L29/06, H04L29/08
CPC: H04L9/0625, H04L9/0631, H04L9/065, H04L63/0428, H04L63/10, H04L63/145, H04L67/02
Inventor: 罗智高
Owner: 成都知道创宇信息技术有限公司