Directional and quantitative Internet data acquisition method and system
A technology in the field of data collection and the Internet, applicable to network data query, network data retrieval, network data browsing optimization, and similar areas. It addresses the problems of heavy occupation of collection nodes, long collection time, and missed target data, and achieves fewer occupied collection nodes, a shorter collection time, and the avoidance of missed data.
Examples
Embodiment 1
[0040] As shown in Figure 1, the directional quantitative Internet data collection method of the present invention sends a retrieval request to a website using a self-defined data display upper limit and offset value, and obtains the associated customized retrieval result. The full data set is obtained by traversing with one or a few requests; the retrieved results are then merged, processed into structured form, and saved to a database, which achieves the purpose of data collection. Here, the associated customized retrieval result refers to the response data that matches the customized data display upper limit and offset value. Obtaining the full data through one or a few requests means that the number of access requests needed to obtain all the data published by the website is one, or at least fewer than the total number of pages the website displays by default. The details are as follows:
[004...
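The collection loop described in this embodiment can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `plan_offsets`, `collect_all`, and `fake_site`, the `fetch_page` callback, the record layout, and the in-memory SQLite table are all hypothetical and introduced only for this example. The offset is treated as a zero-based page index. A real collector would replace `fake_site` with an HTTP request to the target website that carries the customized parameters.

```python
import json
import math
import sqlite3
from typing import Callable, Dict, List


def plan_offsets(total: int, limit: int) -> List[int]:
    """Return the zero-based page indexes needed to cover `total` records
    when each request returns at most `limit` records."""
    if total < 0 or limit <= 0:
        raise ValueError("total must be >= 0 and limit must be > 0")
    return list(range(math.ceil(total / limit)))


def collect_all(fetch_page: Callable[[int, int], List[Dict]],
                total: int, limit: int,
                conn: sqlite3.Connection) -> int:
    """Request every planned block, merge the results, and store them."""
    merged: List[Dict] = []
    for offset in plan_offsets(total, limit):
        merged.extend(fetch_page(limit, offset))

    # Structured storage: one row per record, keyed by the record id.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records (id INTEGER PRIMARY KEY, payload TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO records (id, payload) VALUES (?, ?)",
        [(r["id"], json.dumps(r)) for r in merged],
    )
    conn.commit()
    return len(merged)


def fake_site(limit: int, offset: int) -> List[Dict]:
    """Hypothetical stand-in for the target website: 95 records in total."""
    data = [{"id": i, "title": f"item {i}"} for i in range(1, 96)]
    start = offset * limit
    return data[start:start + limit]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    stored = collect_all(fake_site, total=95, limit=50, conn=conn)
    print(stored, "records stored using", len(plan_offsets(95, 50)), "requests")
```

In this toy setup, a site that shows 10 records per page by default would need 10 page requests for 95 records, whereas a customized upper limit of 50 covers the same data in 2 requests.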
Embodiment 2
[0052] The directional quantitative Internet data collection system of the present invention includes:
[0053] The default parameter acquisition module, which is used to intercept the retrieval request or page-turning request sent to the target website, using browser developer tools or data collection tools, and to obtain the name and value of each request parameter, including the display upper limit per page and the current page number (i.e., the offset value);
[0054] The parameter customization module, which is used to manually increase the value of the display upper limit according to the total amount of target data on the website, to set a reasonable offset, and to divide the full data set into a number of blocks smaller than the total number of pages of the website; here, the product of the maximum offset value and the display upper limit per page is less than or equal to the total amount of target data.
[0055] The test request sending module is used fo...
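As a rough illustration of the default parameter acquisition and parameter customization modules, the sketch below parses an intercepted request URL to list its parameter names and values, then rewrites the display upper limit and the offset. The URL, the parameter names `pageSize` and `pageNum`, and the helper names `extract_params` and `customize_request` are hypothetical. Real websites use their own parameter names, and many send them in a POST body rather than in the query string.

```python
from typing import Dict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def extract_params(url: str) -> Dict[str, str]:
    """Return every query parameter name and value found in `url`."""
    return dict(parse_qsl(urlsplit(url).query, keep_blank_values=True))


def customize_request(url: str, limit_name: str, offset_name: str,
                      limit: int, offset: int) -> str:
    """Return a copy of `url` with the display upper limit and the offset
    replaced by customized values; all other parameters are kept."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    for name in (limit_name, offset_name):
        if name not in params:
            raise KeyError(f"parameter {name!r} not found in intercepted request")
    params[limit_name] = str(limit)
    params[offset_name] = str(offset)
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    # Hypothetical intercepted page-turning request.
    intercepted = "https://example.com/search?keyword=patent&pageSize=10&pageNum=3"
    print(extract_params(intercepted))
    print(customize_request(intercepted, "pageSize", "pageNum", limit=50, offset=0))
```

Raising an error when an expected parameter is missing is a deliberate choice here: it surfaces a wrong guess about the parameter names early, before any customized request is sent.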
Embodiment 3
[0059] Embodiments of the present invention also provide an electronic device, including a memory and at least one processor;
[0060] wherein the memory stores computer-executable instructions;
[0061] The at least one processor executes the computer-executable instructions stored in the memory, so that the at least one processor performs the directional quantitative Internet data collection method of Embodiment 1.