
778 patent results for "Structured document"

A structured document is an electronic document in which some method, such as markup or embedded coding, is used to identify the whole document and its parts as having meanings beyond their formatting. For example, a structured document might identify a certain portion as a "chapter title" (or "code sample" or "quatrain") rather than as "Helvetica bold 24" or "indented Courier". Such portions are commonly called "components" or "elements" of a document.
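As a minimal sketch of this idea, the Python snippet below parses a small document whose element names describe meaning rather than appearance, then lists each component with its text. The document content and the element names (`chapter-title`, `para`, `code-sample`) are invented for illustration and do not come from any listed patent.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: element names describe meaning, not appearance.
DOC = """
<document>
  <chapter>
    <chapter-title>Getting Started</chapter-title>
    <para>Install the package first.</para>
    <code-sample>pip install example</code-sample>
  </chapter>
</document>
"""

def components(xml_text):
    """Return (element name, text) pairs for every leaf element that has text."""
    root = ET.fromstring(xml_text)
    return [
        (el.tag, el.text.strip())
        for el in root.iter()
        if len(el) == 0 and el.text and el.text.strip()
    ]

for name, text in components(DOC):
    print(f"{name}: {text}")
```

Because the markup records what each portion is, a renderer is free to decide later how a `chapter-title` should look on a given device.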

Enhanced transcoding of structured documents through use of annotation techniques

Methods, systems, and computer program products for improving the transcoding operations which are performed on structured documents (such as those encoded in the Hypertext Markup Language, or "HTML") through use of annotations. Source documents may be annotated according to one or more types of annotations. Representative types of annotations direct an annotation engine to perform selective clipping of document content, provide enhanced HTML form support, request node and / or attribute replacement or the insertion of HTML or other rendered markup syntax, and direct a transcoding engine to provide fine-grained transcoding preference support (such as controlling transcoding of tables on a per-row or per-column basis). The disclosed techniques may be used with statically-generated document content and with dynamically-generated content. Annotation is performed as a separate step preceding transcoding, and a modified document resulting from processing annotations may therefore be re-used for multiple different transcoding operations.
Owner:IBM CORP
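
The separation of annotation from transcoding described above can be sketched as follows. This is an illustrative assumption, not the patented implementation: the annotation format (`"clip"` tuples), the sample page, and both transcoder functions are hypothetical. The sketch shows only the clipping annotation type and the reuse of one annotated document by two different transcoders.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical source page (well-formed markup so ElementTree can parse it).
PAGE = (
    "<html><body>"
    "<div id='nav'>Menu</div>"
    "<div id='main'><p>Story</p></div>"
    "</body></html>"
)

# Hypothetical annotation format: (action, path to parent, path to child).
ANNOTATIONS = [("clip", "./body", "div[@id='nav']")]

def annotate(root, annotations):
    """Apply clipping annotations to a copy of the document and return the copy."""
    result = copy.deepcopy(root)
    for action, parent_path, child_path in annotations:
        if action != "clip":
            continue  # other annotation types are out of scope for this sketch
        for parent in result.findall(parent_path):
            for child in parent.findall(child_path):
                parent.remove(child)
    return result

def to_plain_text(root):
    """Hypothetical transcoder 1: flatten the document to plain text."""
    return " ".join(t.strip() for t in root.itertext() if t.strip())

def to_compact_markup(root):
    """Hypothetical transcoder 2: serialize the remaining markup."""
    return ET.tostring(root, encoding="unicode")

# Annotation runs once; the modified document feeds both transcoders.
annotated = annotate(ET.fromstring(PAGE), ANNOTATIONS)
print(to_plain_text(annotated))
print(to_compact_markup(annotated))
```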

Integrated retrieval scheme for retrieving semi-structured documents

An integrated retrieval scheme retrieves data contained in a plurality of semi-structured documents scattered across open networks and collects the required information item by item through a unified interface, without regard to differences in the document structures, presentation styles, and elements of the semi-structured documents. The search scheme receives from a user a query consisting of search items and search conditions. Using location data that specifies the location of each semi-structured document, it finds the location of each semi-structured document that contains all of the search items. If necessary, it converts the item presentation styles of the entered query into those of the found documents according to style conversion data. It then forms queries for the found documents, transmits the queries to the found locations, and obtains the documents. It extracts item data from the obtained documents according to structure data, which is used to delimit a document into items, and attribute data, which is used for conditional retrieval. Finally, it prepares a search result and, if necessary, converts the item presentation styles of the search result into those of each user according to the style conversion data.
Owner:NIPPON TELEGRAPH & TELEPHONE CORP

High-performance extensible document transformation

The present invention provides a method, system, and computer program product for applying transformations to extensible documents, enabling reductions in the processing time required to transform arbitrarily-structured documents having particular well-defined elements. Signatures for structured document types are defined, along with one or more transformations to be performed upon documents of that type. The transformations are specified using syntax elements referred to as maps. A map specifies an operation code for the transformation to be performed, and describes the input and output of the associated transformation. A special map processing engine locates an appropriate transformation object for a particular input document at run-time, and applies the transformation operation according to the map definition. This technique is preferably used for a set of predetermined core transformations, with other transformations being processed using stylesheet engines of the prior art. The input documents may be encoded in the Extensible Markup Language (XML), or in other structured notations. The techniques of the present invention are particularly well suited to use in high-volume and throughput-sensitive environments such as those encountered by business-to-business transaction servers.
Owner:IBM CORP
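
One way to picture the signature-and-map mechanism is the sketch below. It is an assumption-laden illustration rather than the patented design: the signature format (root tag plus required child tags), the `RENAME_ROOT` operation code, the map fields, and the `fallback` parameter standing in for a stylesheet engine are all hypothetical.

```python
import xml.etree.ElementTree as ET

def rename_root(root, spec):
    """Hypothetical core transformation: rename the root element."""
    root.tag = spec["output"]
    return root

# Operation code -> transformation object (hypothetical registry).
OPERATIONS = {"RENAME_ROOT": rename_root}

# Each map names an operation code and describes its input and output.
MAPS = [
    {
        "signature": ("order", frozenset({"id", "total"})),
        "opcode": "RENAME_ROOT",
        "input": "order",
        "output": "purchaseOrder",
    },
]

def signature_matches(root, signature):
    """A document matches when the root tag and required children are present."""
    root_tag, required = signature
    return root.tag == root_tag and required <= {child.tag for child in root}

def transform(xml_text, fallback=None):
    """Apply the first matching map; otherwise defer to a fallback engine."""
    root = ET.fromstring(xml_text)
    for spec in MAPS:
        if signature_matches(root, spec["signature"]):
            operation = OPERATIONS[spec["opcode"]]
            return ET.tostring(operation(root, spec), encoding="unicode")
    if fallback is None:
        raise LookupError("no map matches this document")
    return fallback(xml_text)

print(transform("<order><id>7</id><total>9.50</total></order>"))
```

The lookup is a cheap signature check followed by a direct function call, which is the kind of shortcut that could avoid general stylesheet processing for a small set of core transformations.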

Remote operation system, communication apparatus remote control system and document inspection apparatus

This invention relates to a remote operation apparatus that transmits information to a terminal device having a display part for displaying received information. Its purpose is to improve operability when structured documents such as Web pages are inspected on a document inspection apparatus with a small screen. The apparatus is equipped with: an input part that inputs various instructions; a display part that displays various information; a communication processing part that obtains, through a network, display information shown on the display screen of the display part; an area recognition processing part that extracts the size of a rectangular area included in a window obtained by the communication processing part and displayed on the display part, together with the display information in that rectangular area; a storage part that stores size information for the display screen of the terminal device's display part; an area change processing part that modifies the size of the rectangular area to the display screen size held in the storage part and obtains the display information in the rectangular area; and control means that controls the communication processing part so as to transmit the display information modified by the area change processing part to the terminal device.
Owner:PANASONIC CORP

Method for extracting, interpreting and standardizing tabular data from unstructured documents

A system, method, and computer program are provided for automatically identifying, parsing, and interpreting tabular data from unstructured documents stored in various formats such as ASCII text, Unicode text, HTML, PDF text, and PDF image format. A set of table identification, parsing / tokenizing, and interpreting / mapping rules is developed with grammar descriptors. These rules are then applied to a set of documents to identify a table, parse the content of the table, and interpret the parsed content, if required, thereby standardizing the tabular data.
Owner:RAGE FRAMEWORKS +1
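
The identify, parse, and interpret stages can be illustrated on plain ASCII text with the rough sketch below. The rules are hypothetical stand-ins for the grammar descriptors mentioned in the abstract: a line counts as a table row if gaps of two or more spaces split it into at least three cells, and parenthesized amounts are interpreted as negative numbers. The sample text is invented.

```python
import re

# Identification/parsing rule (assumed): columns are separated by 2+ spaces.
COLUMN_GAP = re.compile(r"\s{2,}")
# Interpretation rule (assumed): optional "(", optional "$", digits with commas.
NUMBER = re.compile(r"^(\()?\$?([\d,]*\d(?:\.\d+)?)\)?$")

TEXT = """Annual summary for the fiscal year.

Item        2023      2022
Revenue     4,500     3,900
Net loss    (1,200)   (300)

See notes for details.
"""

def find_table_rows(text):
    """Identify table lines and tokenize each into cells."""
    rows = []
    for line in text.splitlines():
        cells = COLUMN_GAP.split(line.strip())
        if len(cells) >= 3:
            rows.append(cells)
    return rows

def standardize(cell):
    """Convert numeric cells to floats; parentheses mean a negative value."""
    match = NUMBER.match(cell)
    if not match:
        return cell
    value = float(match.group(2).replace(",", ""))
    return -value if match.group(1) else value

def extract_table(text):
    """Return one dict per body row, keyed by the header row's cells."""
    rows = find_table_rows(text)
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, (standardize(c) for c in row))) for row in body]

for record in extract_table(TEXT):
    print(record)
```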