Universal distributed acquisition system
A general-purpose, high-efficiency distributed collection system, applied in the field of real-time data collection. It can solve problems related to large-scale expansion and machine performance, and its stated effects are avoiding webpage collection and efficient, reasonable use.
Embodiment Construction
[0042] Figure 1 shows a general distributed collection system comprising a seed warehouse, a task scheduling module, a data capture module, and a text page warehouse. The seed warehouse stores the URLs of the demand sites and sets the information source category and collection time interval. The task scheduling module coordinates the task load of each collection node. The data capture module captures information for the collection tasks allocated to it; capture is divided into list page capture and text page capture. Both the task scheduling module and the data capture module include a server and a client, and both adopt a distributed communication framework. The text page warehouse stores the parsed text web page links and provides the sites to be accessed for text page capture in the data capture module.
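The data flow in paragraph [0042] can be illustrated with a minimal Python sketch. It is not the patented implementation: the names Seed, TextPageWarehouse, capture_list_page and capture_text_pages are hypothetical, the fetch functions are passed in by the caller instead of performing real network access, and skipping links that are already stored is an assumption that the text above does not state.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set


@dataclass
class Seed:
    """One seed warehouse entry (field names are illustrative)."""
    url: str               # URL of a demand site (its list page)
    category: str          # information source category
    interval_seconds: int  # collection time interval


@dataclass
class TextPageWarehouse:
    """Stores parsed text-page links for the later text page capture."""
    links: List[str] = field(default_factory=list)
    _seen: Set[str] = field(default_factory=set)

    def add(self, link: str) -> bool:
        # Assumption: a link that is already stored is not stored again.
        if link in self._seen:
            return False
        self._seen.add(link)
        self.links.append(link)
        return True


def capture_list_page(
    seed: Seed,
    fetch_links: Callable[[str], Iterable[str]],
    warehouse: TextPageWarehouse,
) -> int:
    """List page capture: parse the seed's page, store the new links.

    Returns the number of links that were newly stored.
    """
    return sum(warehouse.add(link) for link in fetch_links(seed.url))


def capture_text_pages(
    warehouse: TextPageWarehouse,
    fetch_text: Callable[[str], str],
) -> Dict[str, str]:
    """Text page capture: fetch every link held in the warehouse."""
    return {link: fetch_text(link) for link in warehouse.links}
```

In a distributed deployment the two capture functions would run on collection nodes chosen by the task scheduling module; here they are plain functions so the two capture phases and the role of the text page warehouse stay visible.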
[0043] Figure 2 is a schematic diagram of a dynamic hash task allocation algorithm based on machine performance. It is assumed that A, B, and C are ph...
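Because paragraph [0043] is truncated here, the details of the patent's allocation algorithm are not available. As a stand-in, the following Python sketch shows one common way to make hash-based task allocation depend on machine performance: consistent hashing in which each node receives a number of ring slots proportional to a performance weight. The class name WeightedHashRing, the slots_per_unit parameter, the use of MD5 and the weight values are all assumptions for illustration; only the node labels A, B and C come from the text.

```python
import bisect
import hashlib
from typing import Dict, List, Tuple


def _hash(key: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class WeightedHashRing:
    """Consistent-hash ring; a node's slot count scales with its weight."""

    def __init__(self, weights: Dict[str, int], slots_per_unit: int = 50):
        self._slots_per_unit = slots_per_unit
        self._ring: List[Tuple[int, str]] = []
        self._keys: List[int] = []
        self.update(weights)

    def update(self, weights: Dict[str, int]) -> None:
        """Rebuild the ring, e.g. after node performance is re-measured."""
        ring = []
        for node, weight in weights.items():
            for i in range(weight * self._slots_per_unit):
                ring.append((_hash(f"{node}#{i}"), node))
        if not ring:
            raise ValueError("at least one node with positive weight is required")
        ring.sort()
        self._ring = ring
        self._keys = [position for position, _ in ring]

    def node_for(self, url: str) -> str:
        """Return the node owning the first slot clockwise from the URL."""
        idx = bisect.bisect_right(self._keys, _hash(url)) % len(self._ring)
        return self._ring[idx][1]


# Hypothetical weights: node A is treated as three times as capable as B or C.
ring = WeightedHashRing({"A": 3, "B": 1, "C": 1})
assigned_node = ring.node_for("http://example.com/list?page=1")
```

With these weights, node A owns about three fifths of the ring slots, so it should receive the largest share of URLs, and a given URL always maps to the same node until update is called with new weights.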