
Image multi-subtitle automatic generation method based on multiscale hierarchical residual network

An automatic multi-scale generation technology, applied in the field of multi-subtitle acquisition. It addresses the problem that image details are easily ignored, solves the gradient vanishing and gradient explosion problems, and reduces the number of parameters while extending the funnel structure.

Active Publication Date: 2018-03-27
ZHEJIANG GONGSHANG UNIVERSITY

AI Technical Summary

Problems solved by technology

[0008] Detection-based method: although sequence-based methods achieve high accuracy on the subtitle acquisition task, they tend to ignore details in the image, so detection-based methods have been proposed to address this problem.



Examples


Detailed description of the embodiments

[0052] In order to describe the present invention more specifically, the technical solutions of the present invention will be described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0053] The multi-subtitle acquisition method provided in this embodiment obtains a variable (non-fixed) number of category target descriptors from an image. It can be applied to semantic image search, visual intelligence for chat robots, subtitle acquisition for images and videos shared on social media, and similar tasks.

[0054] In this embodiment, semantically describing the targets in an image with the automatic image multi-subtitle generation method based on the multi-scale layered residual network involves two parts: training and testing. Before explaining these two parts, the following first introduces the multi-subtitle generation model adopted in this embodiment.

[0055] Figure 1 is a schematic diagram of the framework of the multi-subtitle ge...


Abstract

The invention discloses an automatic image multi-subtitle generation method based on a multiscale hierarchical residual network, which adopts an improved funnel network to capture multiscale target information. In constructing the funnel framework network, a densely connected aggregated residual block is proposed, and a residual LSTM (Long Short-Term Memory) is further proposed in order to solve the problems of gradient vanishing and gradient explosion. The method achieves strong experimental performance and has a clear advantage on the multi-subtitle acquisition task.
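
The abstract names a residual LSTM but this excerpt does not give its equations. The following is only a minimal sketch of one common reading, assuming an identity shortcut that adds the step input to the hidden output of a standard LSTM step (which requires the input and hidden sizes to match). The function names `lstm_step`, `residual_lstm_step`, and `zero_params`, and the parameter layout, are illustrative assumptions and are not taken from the patent.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, z, b):
    # Computes W @ z + b for a weight matrix given as a list of rows.
    return [sum(w * v for w, v in zip(row, z)) + b_k for row, b_k in zip(W, b)]

def lstm_step(x, h_prev, c_prev, params):
    # One standard LSTM step. params maps each gate name ("i", "f", "o", "g")
    # to a (W, b) pair, where W acts on the concatenation [x; h_prev].
    z = x + h_prev  # list concatenation
    def gate(name, act):
        W, b = params[name]
        return [act(v) for v in affine(W, z, b)]
    i = gate("i", sigmoid)    # input gate
    f = gate("f", sigmoid)    # forget gate
    o = gate("o", sigmoid)    # output gate
    g = gate("g", math.tanh)  # candidate cell update
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c

def residual_lstm_step(x, h_prev, c_prev, params):
    # ASSUMPTION: identity shortcut y = h + x; not confirmed by the source text.
    h, c = lstm_step(x, h_prev, c_prev, params)
    y = [h_k + x_k for h_k, x_k in zip(h, x)]
    return y, h, c

def zero_params(n):
    # Hypothetical helper: all-zero weights and biases for input size n and hidden size n.
    return {name: ([[0.0] * (2 * n) for _ in range(n)], [0.0] * n)
            for name in ("i", "f", "o", "g")}

if __name__ == "__main__":
    x = [0.5, -1.0, 2.0]
    y, h, c = residual_lstm_step(x, [0.0] * 3, [0.0] * 3, zero_params(3))
    print(y, h, c)
```

With all-zero parameters the LSTM branch contributes nothing, so the shortcut passes the input through unchanged. That identity path is what lets a signal, and its gradient, bypass the gated branch in a deep stack of such layers.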

Description

Technical field

[0001] The invention relates to multi-subtitle acquisition technology, in particular to an automatic image multi-subtitle generation method based on a multi-scale layered residual network.

Background

[0002] Multi-caption acquisition obtains a non-fixed number of category target descriptors in an image. This work serves as a foundational service for many important applications, such as semantic image search, visual intelligence for chatbots, sharing images and videos on social media, helping people perceive the world around them, and more.

[0003] Current research combines convolutional neural networks and recurrent neural networks to predict captions from image feature maps. However, some bottlenecks have been encountered in improving performance: 1) target detection is still an open problem in computer vision; 2) the mapping from image feature space to description space is a nonlinear multimodal mapping; 3) Deeper network It is easier to learn thi...


Application Information

IPC(8): G06K9/00, G06N3/04, G06N3/08
CPC: G06N3/08, G06V30/413, G06V30/40, G06N3/045
Inventors: 田彦, 王勋, 黄刚
Owner: ZHEJIANG GONGSHANG UNIVERSITY