Text correction for PDF converters
Converter and PDF technology, applied in the field of information processing, that can solve problems such as improperly removed spaces and errors introduced when converting documents.
Embodiment Construction
[0009] With reference to the FIGURE, a text-based document 10 is to be corrected for errors such as extraneous spaces, extraneous hyphens, or missing spaces. The document 10 may have been generated, for example, by a PDF-to-XML converter 12 that converts a PDF document 14 into XML. In this case, the document 10 may also have text flow problems, for example when text from contiguous tags is concatenated: deleting tags in order to build a paragraph can introduce such problems. While the illustrated document 10 is an XML document, other text-based formats such as RTF, HTML, ASCII, and so forth can also be corrected using the methods and apparatuses disclosed herein.
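The error classes named in paragraph [0009] can be illustrated with a short Python sketch. It is only an illustration of how concatenating text from contiguous tags can produce such errors, not the correction method of the disclosure. The `<page>` and `<line>` tag names, the sample text, and the `concatenate_lines` helper are assumptions made for this example and do not come from the source.

```python
import xml.etree.ElementTree as ET

# Hypothetical converter output: one <line> element per physical line of the PDF.
# The tag names and the text are assumptions for illustration only.
SAMPLE_XML = (
    "<page>"
    "<line>The conver-</line>"
    "<line>ter removes</line>"
    "<line>spaces</line>"
    "</page>"
)


def concatenate_lines(xml_text, separator):
    """Drop the <line> tags and join their text with the given separator."""
    root = ET.fromstring(xml_text)
    return separator.join((line.text or "") for line in root.iter("line"))


# Joining with nothing keeps the line-break hyphen and loses a space.
joined_without_space = concatenate_lines(SAMPLE_XML, "")

# Joining with a space keeps the hyphen and adds a space inside the word.
joined_with_space = concatenate_lines(SAMPLE_XML, " ")

print(joined_without_space)
print(joined_with_space)
```

Neither fixed separator gives clean text here: the first result contains an extraneous hyphen and a missing space, and the second contains an extraneous hyphen followed by an extraneous space.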
[0010] A text extractor 16 extracts a portion of text from the text-based document 10 for processing. The portion of text can be a selected portion of the text of the document, for example a portion delineated by text block-delineating mark-up tag pairs, or can be the entire text of the document 10. In the c...
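The extraction step described in paragraph [0010] could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the text extractor 16 itself: the function name `extract_text_portions`, the `block_tag` parameter, and the `<doc>` and `<para>` tags in the sample are hypothetical, and the sketch assumes that block tags are not nested inside one another.

```python
import xml.etree.ElementTree as ET


def extract_text_portions(xml_text, block_tag=None):
    """Return a list of text portions taken from an XML document.

    If block_tag is None, the entire text of the document is returned as a
    single portion. Otherwise one portion is returned per element named
    block_tag, that is, per text block delineated by a mark-up tag pair.
    """
    root = ET.fromstring(xml_text)
    if block_tag is None:
        return ["".join(root.itertext())]
    # itertext() also collects text held by inline child elements such as <b>.
    return ["".join(element.itertext()) for element in root.iter(block_tag)]


# Hypothetical sample document; the tag names are assumptions.
SAMPLE_XML = "<doc><para>First <b>block</b>.</para><para>Second block.</para></doc>"

per_block = extract_text_portions(SAMPLE_XML, block_tag="para")
whole_document = extract_text_portions(SAMPLE_XML)

print(per_block)
print(whole_document)
```

Returning a list in both cases lets any later correction step treat a single block and a whole document in the same way.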