Multi-threaded web crawler processing method based on optimized management of connection proxies
A web crawler processing method, applied in the field of web crawler processing and the design of optimized connection proxy management
Embodiment Construction
[0025] The present invention will be further described in detail below with reference to the drawings and embodiments.
[0026] Figure 1 further illustrates the process of the present invention. In the figure, the process in dashed box A is the initial work needed to build the crawler, and it only needs to be executed once. The process in dashed box B is the crawling of web pages by the crawler, which is repeated until crawling ends.
[0027] (1) Obtain a proxy server and store it in the proxy server pool.
[0028] (2) Test the network connection performance of the proxy server.
[0029] (3) Create a number of threads determined by the performance of the proxy servers.
[0030]-[0031] (4) Convert the crawler's starting target address into an HTTP request, obtain a valid proxy server from the proxy server pool, and set the HTTP request to be executed through that proxy server.
[0032] (5) Add the HTTP request to an HTTP request que...
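Steps (1) to (4) above can be illustrated with a minimal Python sketch. It is not the patented implementation. The names ProxyPool, Proxy, http_probe, thread_count and build_request, the "host:port" address format, the test URL, and the thread-sizing rule (a fixed number of threads per valid proxy, up to a cap) are all assumptions introduced here for illustration; the description only states that the thread count depends on proxy performance. Step (5) is left out because its text is cut off.

```python
import time
import urllib.request
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Proxy:
    address: str                      # "host:port" (assumed format)
    latency: Optional[float] = None   # seconds; None = untested or failed


class ProxyPool:
    """Step (1): holds the proxy servers available to the crawler."""

    def __init__(self, addresses: List[str]) -> None:
        self.proxies = [Proxy(a) for a in addresses]

    def test_all(self, probe: Callable[[str], float]) -> None:
        """Step (2): measure each proxy; a failed probe marks it invalid."""
        for proxy in self.proxies:
            try:
                proxy.latency = probe(proxy.address)
            except OSError:
                proxy.latency = None

    def valid(self) -> List[Proxy]:
        """Proxies that passed the test, fastest first."""
        ok = [p for p in self.proxies if p.latency is not None]
        return sorted(ok, key=lambda p: p.latency)

    def acquire(self) -> Proxy:
        """Step (4): pick a valid proxy (here simply the fastest one)."""
        ok = self.valid()
        if not ok:
            raise LookupError("no valid proxy in pool")
        return ok[0]


def http_probe(address: str, test_url: str = "http://example.com/",
               timeout: float = 5.0) -> float:
    """Time one GET through the proxy; raises OSError (URLError) on failure."""
    handler = urllib.request.ProxyHandler({"http": f"http://{address}"})
    opener = urllib.request.build_opener(handler)
    start = time.monotonic()
    with opener.open(test_url, timeout=timeout):
        pass
    return time.monotonic() - start


def thread_count(pool: ProxyPool, per_proxy: int = 4, cap: int = 32) -> int:
    """Step (3): hypothetical sizing rule, per_proxy threads per valid proxy."""
    return max(1, min(cap, per_proxy * len(pool.valid())))


def build_request(url: str, proxy: Proxy) -> urllib.request.Request:
    """Step (4): turn a target address into a request routed via the proxy."""
    request = urllib.request.Request(url)
    request.set_proxy(proxy.address, "http")
    return request
```

Because test_all takes the probe as a parameter, the pool logic can be exercised without network access by passing a stub in place of http_probe. A real crawler would likely also rotate among valid proxies rather than always taking the fastest one.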