Structured document retrieval device and program
A structured-document and tree-structure technology, applied in special data processing applications, instruments, electrical digital data processing, etc., addresses problems such as the inability to perform retrieval that combines structural conditions with annotations.
Examples
First embodiment
[0044] (Summary)
[0045] This embodiment describes a structured document retrieval device that preprocesses a collection of XML documents and a collection of annotation data to generate retrieval data in advance, compares the retrieval data with a retrieval query, and outputs the documents and elements that match the query as search results. The retrieval data in this embodiment is a text-shared DOM tree, which integrates the structural information of the XML tags and the annotation tags.
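The preprocess-then-match flow described above can be illustrated with a minimal sketch. It is not the patented implementation: it assumes that annotations are given as stand-off character ranges over the document text and that "text sharing" means XML elements and annotation elements both refer to the same text by offset. The function names index_xml, add_annotations, and search are hypothetical.

```python
import xml.etree.ElementTree as ET

def index_xml(xml_string):
    """Preprocess one XML document.

    Returns (shared_text, nodes), where each node is a tuple
    (kind, name, start, end) with character offsets into shared_text.
    """
    root = ET.fromstring(xml_string)
    parts = []   # pieces of the shared text, in document order
    nodes = []
    pos = 0      # current offset into the shared text

    def walk(elem):
        nonlocal pos
        start = pos
        if elem.text:
            parts.append(elem.text)
            pos += len(elem.text)
        for child in elem:
            walk(child)
            if child.tail:
                parts.append(child.tail)
                pos += len(child.tail)
        nodes.append(("xml", elem.tag, start, pos))

    walk(root)
    return "".join(parts), nodes

def add_annotations(nodes, annotations):
    """Merge stand-off annotations (name, start, end) into the node list.

    Assumption: annotations address the same shared text by offset.
    """
    return nodes + [("annotation", n, s, e) for (n, s, e) in annotations]

def search(shared_text, nodes, name, keyword):
    """Return the text of every element called `name` that contains `keyword`."""
    return [shared_text[s:e] for (_kind, n, s, e) in nodes
            if n == name and keyword in shared_text[s:e]]

text, xml_nodes = index_xml(
    "<doc><title>XML search</title><p>Annotations share text.</p></doc>")
all_nodes = add_annotations(xml_nodes, [("term", 10, 21)])
hits = search(text, all_nodes, "term", "Annot")
```

Because both kinds of element point into one text, a single query routine can serve XML tags and annotation tags alike.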
[0046] (device structure)
[0047] Figure 1-1 shows a configuration example of the structured document retrieval device 400. The structured document retrieval device 400 is configured as a computer including a CPU (Central Processing Unit) 401, a main storage device (memory) 402, an auxiliary storage device 403A, and a user interface unit 406. The structured document retrieval device 400 is connected to an external network device via a net...
Second embodiment
[0129] In this embodiment, an inclusion relationship between elements of different types is defined, either between XML elements and annotation elements or between annotation elements belonging to different annotation groups. For this purpose, this embodiment uses a DOM DAG (Directed Acyclic Graph), which extends the text-shared DOM tree structure of the first embodiment. The basic structure of the structured document retrieval device 400 of this embodiment is the same as that of the first embodiment, that is, the structure shown in Figure 1-1 and Figure 1-2. However, in this embodiment, the DOM DAG construction unit 422 is used instead of the text-shared DOM tree construction unit 415.
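A minimal sketch of how inclusion across element types can turn a tree into a DAG is given below. It assumes, purely for illustration, that every element is a character range over the shared text, that A includes B when A's range strictly covers B's, and that only the nearest enclosing elements become parents. The Node type and the functions contains and build_dag are hypothetical and are not taken from the source.

```python
from typing import NamedTuple

class Node(NamedTuple):
    group: str   # "xml" or the name of an annotation group
    name: str
    start: int   # inclusive offset into the shared text
    end: int     # exclusive offset

def contains(a, b):
    """True if a's range covers b's range and is strictly longer.

    Requiring a strictly longer range means edges always point from a
    longer range to a shorter one, so the graph cannot contain a cycle.
    """
    return (a.start <= b.start and b.end <= a.end
            and (a.end - a.start) > (b.end - b.start))

def build_dag(nodes):
    """Map each node to its direct parents (nearest enclosing nodes)."""
    parents = {n: [] for n in nodes}
    for child in nodes:
        ancestors = [a for a in nodes if contains(a, child)]
        for a in ancestors:
            # a is a direct parent if no other ancestor sits between a and child
            if not any(contains(a, mid) for mid in ancestors):
                parents[child].append(a)
    return parents

p = Node("xml", "p", 0, 20)
b = Node("xml", "b", 4, 15)
phrase = Node("groupA", "phrase", 0, 9)   # crosses the boundary of <b>
word = Node("groupB", "word", 4, 9)       # lies inside both <b> and phrase
dag = build_dag([p, b, phrase, word])
```

In this example the element word ends up with two parents, b and phrase, which overlap each other without nesting. A single tree cannot express that, which is why a DAG is needed.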
[0130] (Summary of preprocessing)
[0131] As described above, in this embodiment, a structure search using a DOM DAG considered as a parent-child relationship will be described regarding the inclusion relationship of tex...
Third embodiment
[0151] As described above, using a DOM DAG makes it possible to search on the structural relationships between tags of different types. However, when the search query is a location path, tracing every constructed DOM DAG from its root element is inefficient.
[0152] Therefore, this embodiment defines a path DAG, a data structure that aggregates the structures of a plurality of DOM DAGs. Searches are then performed on an inverted index whose entries are elements of the path DAG and whose values are elements of the DOM DAGs, which enables efficient search when a location path is used as the search query.
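The idea of an index keyed by aggregated structure can be sketched as follows. The source does not say how the path DAG is built, so this sketch makes a simplifying assumption: each path DAG entry is identified by a sequence of element names from a root, and every DOM DAG element is filed under each such sequence that reaches it. The identifiers, the sample data, and the functions label_paths, build_index, and lookup are hypothetical.

```python
from collections import defaultdict

def label_paths(node, labels, parents, cache):
    """Return every root-to-node sequence of element names for `node`."""
    if node not in cache:
        if not parents[node]:
            cache[node] = {(labels[node],)}
        else:
            cache[node] = {
                path + (labels[node],)
                for parent in parents[node]
                for path in label_paths(parent, labels, parents, cache)
            }
    return cache[node]

def build_index(labels, parents):
    """Build the index: name sequence -> set of DOM DAG elements."""
    index = defaultdict(set)
    cache = {}
    for node in labels:
        for path in label_paths(node, labels, parents, cache):
            index[path].add(node)
    return index

def lookup(index, location_path):
    """Answer a simple location path such as '/p/b' with one index probe."""
    key = tuple(location_path.strip("/").split("/"))
    return sorted(index.get(key, ()))

# Two small DOM DAGs (documents d1 and d2); `parents` maps child -> parents.
labels = {"d1:p": "p", "d1:b": "b", "d1:phrase": "phrase",
          "d1:word": "word", "d2:p": "p", "d2:b": "b"}
parents = {"d1:p": [], "d1:b": ["d1:p"], "d1:phrase": ["d1:p"],
           "d1:word": ["d1:b", "d1:phrase"], "d2:p": [], "d2:b": ["d2:p"]}
index = build_index(labels, parents)
```

A query such as lookup(index, "/p/b") then reads the matching elements of all documents from one entry instead of walking each DOM DAG from its root. An element with several parents is filed under several entries, so the number of entries can grow quickly in densely connected DAGs.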
[0153] In this embodiment, the basic structure of the structured document retrieval device 400 is the same as that of the first embodiment, that is, the structure shown in Figure 1-1 and Figure 1-2. However, in the case of this embodiment, the functions of the D...