51 results about "Deep Web" patented technology

The deep web, invisible web, or hidden web are parts of the World Wide Web whose contents are not indexed by standard web search engines. The opposite of the deep web is the "surface web", which is accessible to anyone using the Internet. Computer scientist Michael K. Bergman is credited with coining the term "deep web" in 2001 as a search-indexing term.

Data mining device based on Deep Web deep dynamic data and method thereof

The invention discloses a data mining device based on Deep Web deep dynamic data, and a method thereof. The device comprises a commercial server, a data storage server, a data index server and a file server. The systems built on the device comprise an acquisition simulative theme thesaurus management system, an acquisition task scheduling management system, an acquisition server and an acquisition storage scheduling system. The invention provides a means of acquiring dynamic data in large quantity, with high data quality and strong real-time performance, in a form that is easy to analyze in depth, and it compensates for the limited quantity and quality of data available from conventional search engines. The invention is simple and practical to operate, offers rich customization, and has good extensibility and robustness. A user can customize, acquire and rebuild a management database according to specific or highly specialized requirements, which greatly improves data utilization efficiency and expands data sources and information resources.
Owner:TONGFANG KNOWLEDGE NETWORK TECH CO LTD (BEIJING)

Integration method of Deep Web query interface based on tree merging

The invention discloses a method for integrating Deep Web query interfaces based on tree merging. A pattern tree represents each query interface, and the structural features of the tree capture the logical relations between query attributes that are implied by the physical layout. In addition to calculating the semantic similarity of attributes, as in conventional pattern matching, the matching process introduces the structural similarity of attributes within the pattern tree and puts forward a method for calculating the structural similarity between nodes, thereby improving the accuracy of attribute matching. Query interfaces are integrated by tree merging, which not only inherits the structural features of the initial query interface but also admits a new query interface with a single merge, giving good extensibility. Besides generating the integrated interface, the invention can conveniently generate the attribute mappings between the original query interfaces and the integrated interface.
Owner:ZHEJIANG UNIV
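
The tree-merging idea in the abstract above can be illustrated with a small Python sketch. The abstract does not give the patent's similarity formulas or merge rules, so everything here is an assumption for illustration: the `Node` class, token-overlap (Jaccard) as a stand-in for semantic similarity, child-label overlap as a stand-in for structural similarity, and the `alpha` and `threshold` values are all hypothetical.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a pattern tree: a query attribute or a group of attributes."""
    label: str
    children: list["Node"] = field(default_factory=list)


def jaccard(a: set, b: set) -> float:
    """Set overlap in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def semantic_sim(x: Node, y: Node) -> float:
    # Assumption: label-token overlap stands in for semantic similarity.
    return jaccard(set(x.label.lower().split()), set(y.label.lower().split()))


def structural_sim(x: Node, y: Node) -> float:
    # Assumption: nodes whose children carry the same labels are structurally alike.
    return jaccard({c.label.lower() for c in x.children},
                   {c.label.lower() for c in y.children})


def similarity(x: Node, y: Node, alpha: float = 0.7) -> float:
    # Hypothetical weighting between the semantic and structural scores.
    return alpha * semantic_sim(x, y) + (1 - alpha) * structural_sim(x, y)


def all_labels(node: Node):
    """Yield the label of a node and of every descendant."""
    yield node.label
    for child in node.children:
        yield from all_labels(child)


def merge(target: Node, source: Node, threshold: float = 0.6) -> dict[str, str]:
    """Merge source's children into target in place.

    Returns a mapping from source labels to labels in the merged tree.
    Labels are assumed unique within a tree (a simplification).
    """
    mapping: dict[str, str] = {}
    candidates = list(target.children)  # only match against pre-existing nodes
    for s_child in source.children:
        best = max(candidates, key=lambda t: similarity(t, s_child), default=None)
        if best is not None and similarity(best, s_child) >= threshold:
            mapping[s_child.label] = best.label
            mapping.update(merge(best, s_child, threshold))
        else:
            # No good match: graft a copy of the whole subtree as a new attribute.
            target.children.append(copy.deepcopy(s_child))
            mapping.update({lbl: lbl for lbl in all_labels(s_child)})
    return mapping
```

Merging a second interface into the first takes one call, such as `merge(interface_a, interface_b)`. The returned dictionary plays the role of the attribute mapping between an original interface and the integrated one.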

Automatic extraction method oriented to data of deep web pages

The invention discloses an automatic extraction method for data in deep web pages, and belongs to the field of computer data mining. The method first obtains two deep web pages from the same website and marks them as a first page and a second page; converts the HTML (hypertext markup language) documents of the two pages into XHTML (extensible hypertext markup language) documents; removes noise from both pages; and eliminates repeated patterns in both pages to generate a webpage data extraction wrapper. To extract a page, the method first removes noise from the page containing the data to be extracted, then marks the page with the webpage data extraction wrapper, and finally extracts data from the marked page. The method improves the efficiency of the repeated-pattern elimination algorithm and of the matching algorithm and reduces extraction complexity. The matching and extraction algorithms, which are designed around the characteristics of the repeated-pattern elimination algorithm, are simple and fast, and data extraction accuracy is improved.
Owner:CHONGQING UNIV
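
The two-page wrapper idea in the abstract above can be sketched roughly in Python. This is not the patent's algorithm: the XHTML conversion and the repeated-pattern elimination step are skipped, and a plain token diff (`difflib.SequenceMatcher`) is swapped in to separate shared template tokens from varying data. The function names, the noise rules (scripts, styles, comments), and the regex-based matching are all assumptions for illustration.

```python
import re
from difflib import SequenceMatcher

# Assumed noise: comments, <script> and <style> blocks.
NOISE = re.compile(r"<!--.*?-->|<(script|style)\b.*?</\1\s*>", re.S | re.I)
# A token is either one tag or one run of non-space text.
TOKEN = re.compile(r"<[^>]+>|[^<\s]+")


def tokenize(html: str) -> list[str]:
    """Strip noise, then split the page into tag and text tokens."""
    return TOKEN.findall(NOISE.sub("", html))


def build_wrapper(page1: str, page2: str) -> list:
    """Tokens shared by both pages form the template; differing runs become None slots."""
    a, b = tokenize(page1), tokenize(page2)
    wrapper: list = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            wrapper.extend(a[i1:i2])
        elif not wrapper or wrapper[-1] is not None:
            wrapper.append(None)  # one slot per run of differing tokens
    return wrapper


def extract(wrapper: list, page: str):
    """Match a new page against the wrapper; return slot values, or None on mismatch."""
    parts = ["(.*?)" if tok is None else re.escape(tok) for tok in wrapper]
    pattern = r"\s*".join(parts)
    text = " ".join(tokenize(page))
    match = re.fullmatch(pattern, text, re.S)
    return list(match.groups()) if match else None
```

A wrapper built once from two sample pages of a site can then be reused on further pages with the same layout, for example `extract(build_wrapper(page1, page2), page3)`. Pages whose template differs from the samples return `None` in this sketch.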