Keyword-based topic-focused web crawler design method
A design method in the field of web crawler technology, applied in web data indexing, web data retrieval, computing, etc., to achieve the effects of increasing the number, improving crawling efficiency, and improving universality.
Embodiment
[0098] Figure 1 is a flow chart of the specific implementation of the producer thread. The specific steps are as follows:
[0099] (1) Configure the description information of the domain ontology and use it as a template for the topic crawler. The description information includes subject keywords and crawling keywords.
[0100] (2) Determine the subject keyword set for "food safety" and obtain the foodsecure subject keyword table, foodsecureWord.
[0101] In this implementation, Baidu, Google, Bing and 360 are used as search engines, and the topic is set to "food safety". Keywords related to food safety, such as "production safety standards", "food exceeding the standard", and "food additives", are stored in the database table foodsecureWord; this is the process of manually selecting subject keywords. These keywords are then used as search terms in the search engines, and the retrieved content is stored in a text file. Finally, after word segmentation and ...
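The keyword table and producer thread described above can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes an in-memory SQLite database for the foodsecureWord table and a thread-safe queue for handing search tasks to a consumer. The search URLs and query parameter names in SEARCH_ENGINES, and the function names, are hypothetical choices made for illustration; the source names the engines but not how they are queried. No network request is made: the producer only builds search URLs.

```python
import queue
import sqlite3
import threading
from urllib.parse import urlencode

# Assumed endpoints and query parameter names (not given in the source).
SEARCH_ENGINES = {
    "baidu": ("https://www.baidu.com/s", "wd"),
    "google": ("https://www.google.com/search", "q"),
    "bing": ("https://www.bing.com/search", "q"),
    "360": ("https://www.so.com/s", "q"),
}


def create_keyword_table(conn, keywords):
    """Store the manually selected subject keywords in foodsecureWord."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS foodsecureWord (word TEXT PRIMARY KEY)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO foodsecureWord (word) VALUES (?)",
        [(k,) for k in keywords],
    )
    conn.commit()


def load_keywords(conn):
    """Read the subject keywords back, in alphabetical order."""
    rows = conn.execute("SELECT word FROM foodsecureWord ORDER BY word")
    return [row[0] for row in rows]


def producer(keywords, tasks):
    """Producer thread body: enqueue one search task per (keyword, engine)."""
    for word in keywords:
        for engine, (base, param) in SEARCH_ENGINES.items():
            url = f"{base}?{urlencode({param: word})}"
            tasks.put((engine, word, url))
    tasks.put(None)  # sentinel: tells the consumer that no more tasks follow


def build_tasks(keywords):
    """Fill the table, run the producer thread, and drain the queue."""
    conn = sqlite3.connect(":memory:")
    create_keyword_table(conn, keywords)
    stored = load_keywords(conn)
    conn.close()

    tasks = queue.Queue()
    worker = threading.Thread(target=producer, args=(stored, tasks))
    worker.start()
    worker.join()

    collected = []
    while (item := tasks.get()) is not None:
        collected.append(item)
    return collected


if __name__ == "__main__":
    seeds = [
        "production safety standards",
        "food exceeding the standard",
        "food additives",
    ]
    for engine, word, url in build_tasks(seeds):
        print(engine, word, url)
```

In a fuller crawler, a consumer thread would take tasks from the queue as they arrive, fetch each URL, and write the retrieved content to a text file for the later word segmentation step; here the queue is simply drained after the producer finishes to keep the sketch short.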