Device, method, and program for automatically extracting article titles and related information

A technology for extracting article titles, applied in the field of article title extraction devices. It addresses problems in existing approaches, such as a tendency toward misjudgment, insufficient extraction rates, and difficulty handling articles other than technical papers, with the aim of improving accuracy and reducing incompleteness.

Status: Inactive
Publication Date: 2007-05-02
FUJIFILM BUSINESS INNOVATION CORP
2 Cites, 6 Cited by

AI Technical Summary

Problems solved by technology

[0010] However, because the title extraction device of Patent Document 1 targets unformatted articles and extracts the title using the layout features of the line areas, its extraction rate is insufficient. Patent Document 2 evaluates titles using several title attributes, but for articles containing many short character string rectangles, misjudgment is likely because many of those short rectangles have title attributes.
[0011] In addition, the techniques disclosed in Non-Patent Document 1 and Non-Patent Document 2 depend on the structure of the article, so they are difficult to apply to articles other than technical papers, and they cannot extract the title correctly when little information is available at the beginning of the article.

Embodiment Construction

[0040] Hereinafter, preferred embodiments of the present invention will be described with reference to the drawings.

[0041] (Example)

[0042] FIG. 1 is a block diagram showing an article title extracting apparatus according to an embodiment of the present invention. The title extraction device 10 includes an input device 12, a display device 14, a main storage device 16, a storage device 18, a central processing unit (CPU) 20, and a bus 22 connecting these devices.

[0043] The input device 12 includes a keyboard for inputting information by keyboard operation, an optical reading device (scanner) for optically reading text or the like written in a document, an input interface for inputting data from an external device or an external memory, and the like. The display device 14 includes a display or the like for displaying titles extracted from articles, their related information, and the like. The main storage device 16 includes ROM or RAM, and stores programs and processe...

Abstract

An automatic extraction device for article titles and related information comprises: a title candidate sentence extraction unit that extracts multiple title candidate sentences from an article input through an article input unit; a feature value extraction unit that extracts feature values from each of the title candidate sentences; and a title determination unit that determines the title from among the title candidate sentences according to the extracted feature values.
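
The three units named in the abstract can be pictured as a small pipeline. The following is only a minimal sketch under stated assumptions: the abstract does not say how candidates are chosen, which feature values are used, or how the title is decided, so the candidate rule (first non-empty lines), the three features, the equal weights, and all function names here are hypothetical illustrations rather than the patented method.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Candidate:
    text: str
    position: int  # index of the line among the article's non-empty lines


def extract_candidates(article: str, max_lines: int = 10) -> List[Candidate]:
    """Candidate extraction unit (assumed rule: first non-empty lines)."""
    lines = [ln.strip() for ln in article.splitlines() if ln.strip()]
    return [Candidate(text=ln, position=i) for i, ln in enumerate(lines[:max_lines])]


def extract_features(c: Candidate) -> Dict[str, float]:
    """Feature value extraction unit (hypothetical features, not from the source)."""
    words = c.text.split()
    return {
        "is_early": 1.0 / (1 + c.position),          # earlier lines score higher
        "is_short": 1.0 if len(words) <= 15 else 0.0,
        "no_final_period": 0.0 if c.text.endswith(".") else 1.0,
    }


# Hypothetical weights, chosen only for illustration.
WEIGHTS = {"is_early": 1.0, "is_short": 1.0, "no_final_period": 1.0}


def decide_title(article: str) -> Optional[str]:
    """Title determination unit: pick the candidate with the highest weighted score."""
    candidates = extract_candidates(article)
    if not candidates:
        return None

    def score(c: Candidate) -> float:
        features = extract_features(c)
        return sum(WEIGHTS[name] * value for name, value in features.items())

    return max(candidates, key=score).text
```

A weighted sum is used here only because it is the simplest way to combine several feature values into one decision; any other decision rule would fit the same three-unit structure.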

Description

Technical field

[0001] The present invention relates to an article title extraction device that automatically extracts article titles from articles read by a scanner or the like.

Background art

[0002] Devices that extract article titles from digitized image data, obtained by reading paper manuscripts with an optical scanner or the like, have been put into practical use. For example, Patent Document 1 relates to a headline extraction device that extracts the headline of an article from the image obtained when the article is converted into image data. This device extracts each rectangular area circumscribing a group of connected black pixels in the article image as a character rectangle, combines a plurality of adjacent character rectangles, and extracts the rectangular area circumscribing them as a character string rectangle. Then, according to the underline attribute, framed attribute, table attributes and...
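
The character rectangle and character string rectangle steps attributed to Patent Document 1 can be illustrated roughly as follows. This is a sketch under assumptions, not the prior-art implementation: the image is taken to be a list of rows with 1 for black pixels, connectivity is assumed to be 8-neighbour, and the merge rule (vertical overlap plus a hypothetical `max_gap` threshold, applied in a single left-to-right pass) is invented for illustration.

```python
from collections import deque

# A box is (top, left, bottom, right) with inclusive pixel coordinates.


def character_rectangles(image):
    """Bounding boxes of 8-connected black-pixel regions (1 = black)."""
    rows = len(image)
    cols = len(image[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            top, left, bottom, right = r, c, r, c
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:  # breadth-first flood fill over one region
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def string_rectangles(boxes, max_gap=2):
    """Merge boxes that overlap vertically and lie within max_gap columns.

    max_gap is a hypothetical threshold on (left edge - previous right edge).
    """
    merged = []
    for box in sorted(boxes, key=lambda b: b[1]):  # process left to right
        for i, m in enumerate(merged):
            overlaps_vertically = box[0] <= m[2] and m[0] <= box[2]
            close_horizontally = box[1] - m[3] <= max_gap
            if overlaps_vertically and close_horizontally:
                merged[i] = (min(m[0], box[0]), min(m[1], box[1]),
                             max(m[2], box[2]), max(m[3], box[3]))
                break
        else:
            merged.append(box)
    return merged
```

For a tiny image with two nearby glyphs and one distant glyph on the same line, `string_rectangles(character_rectangles(image))` groups the first two into one string rectangle and leaves the third on its own.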

Application Information

Patent Type & Authority: Applications (China)
IPC: IPC(8): G06K9/20
Inventor: 张正操, 孙茂松, 刘绍明
Owner: FUJIFILM BUSINESS INNOVATION CORP