Searching system and method based on web page extraction
A search system and search method, applied in the field of information search, that address the low extraction accuracy and poor operability of search engines and achieve the effect of reducing complexity and speeding up
Embodiment Construction
[0037] The present invention uses a preset template to extract target content accurately and eliminate irrelevant information, which improves the accuracy and fault tolerance of information extraction and, in turn, the accuracy of search results. Unlike ordinary text files, HTML pages contain explicit hierarchical information that can be described as a tree structure, namely the DOM (Document Object Model). Because the DOM has a unified specification and programming interface, this embodiment builds a DOM tree for the HTML page, so that any node in the tree can be accessed conveniently through the DOM interface.
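The idea of building a DOM tree and using a preset template to pick out the target node can be illustrated with a minimal Python sketch. It is not the patented implementation: the page content, the `TEMPLATE` dictionary format, and the helper names `iter_text` and `extract` are all hypothetical, and the sketch assumes the page is well-formed XML so that the standard-library `xml.dom.minidom` parser can read it (real-world HTML would usually need a more tolerant parser).

```python
from xml.dom import minidom

# Hypothetical page; it must be well-formed XML for minidom to parse it.
PAGE = """<html>
  <body>
    <div id="nav"><a href="/home">Home</a></div>
    <div id="content"><h1>Title</h1><p>Target text.</p></div>
    <div id="footer">Copyright notice</div>
  </body>
</html>"""

# Hypothetical template format: the element that holds the target content.
TEMPLATE = {"tag": "div", "attr": "id", "value": "content"}


def iter_text(node):
    """Yield the non-empty, stripped text of every text node under `node`."""
    if node.nodeType == node.TEXT_NODE:
        stripped = node.data.strip()
        if stripped:
            yield stripped
    else:
        for child in node.childNodes:
            yield from iter_text(child)


def extract(page, template):
    """Return the text of the first element matching the template, or None."""
    dom = minidom.parseString(page)  # build the DOM tree
    for element in dom.getElementsByTagName(template["tag"]):
        if element.getAttribute(template["attr"]) == template["value"]:
            return " ".join(iter_text(element))
    return None


if __name__ == "__main__":
    print(extract(PAGE, TEMPLATE))
```

In this sketch the navigation and footer blocks are never returned because the template names only the `content` element, which mirrors the stated goal of extracting target content while discarding irrelevant information.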
[0038] Figure 1 is a schematic structural diagram of an embodiment of the search system based on web page extraction according to the present invention. In this embodiment, the search system includes a web page downloading unit 11, a web page extracting unit 12, a template storage unit 13, and a result storage unit 14. W...