Cross-modal image generation method and device based on audio-tactile signal fusion

A technology for image generation and signal fusion, applied in neural learning methods, character and pattern recognition, biological neural network models, etc. It addresses problems such as complex representation, blurred autoencoder images, and unstable training, with the effect of improving image quality, eliminating discrepancies, and improving accuracy and completeness.

Active Publication Date: 2021-11-09
NANJING UNIV OF POSTS & TELECOMM

AI Technical Summary

Problems solved by technology

[0006] In addition, existing generative models are mainly represented by the generative adversarial network (GAN) and the variational autoencoder (VAE). GANs suffer from vanishing gradients and unstable training, while images generated by autoencoders are relatively blurred. At the same time, cross-modal image generation research is mainly text-based, and these models can only handle generation from a single modality. Even if they could be extended, tactile and audio signals are usually represented as time-domain sequences, which are more complex than the word-level features of text, so text-based cross-modal models are not suitable for multi-modal scenarios.



Embodiment Construction

[0092] In order to make the object, technical solution and advantages of the present invention clearer, the present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.

[0093] An efficient and accurate cross-modal image generation method is needed, one capable of fusing different modalities and reconstructing high-quality image data. In recent years, generative adversarial models have achieved good results in the field of image generation, and knowledge distillation models also provide a simple and efficient way to improve fine-grained image generation. Therefore, the present invention proposes a cross-modal image generation method based on audio-tactile signal fusion. The fusion method based on deep semantics can improve the accuracy of model reconstruction; latent space learning maps the semantic features of cross-modal data to a "latent learning space" to measure the similarity of dif...
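The idea of mapping features from different modalities into one shared space and comparing them there can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the patent: the random linear projections stand in for learned mappings, cosine similarity is one common choice of similarity measure, and the feature vectors and dimensions are hypothetical.

```python
import math
import random


def make_projection(in_dim, latent_dim, seed):
    """Build a random linear map (assumption: stands in for a learned projection)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(latent_dim)]


def project(matrix, vector):
    """Apply the linear map: one dot product per latent dimension."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical semantic features; the two modalities have different dimensions.
fused_feature = [0.2, -0.1, 0.7, 0.4, 0.0, 0.3]  # fused audio-tactile feature
image_feature = [0.5, 0.1, -0.3, 0.8]            # image feature

# Each modality gets its own projection into the same 3-dimensional latent space.
to_latent_fused = make_projection(in_dim=6, latent_dim=3, seed=0)
to_latent_image = make_projection(in_dim=4, latent_dim=3, seed=1)

z_fused = project(to_latent_fused, fused_feature)
z_image = project(to_latent_image, image_feature)

similarity = cosine_similarity(z_fused, z_image)
print(f"latent similarity: {similarity:.3f}")
```

Because both projections end in the same latent dimension, features that started with different shapes become directly comparable. In a trained model the projections would be optimized so that matching samples score high.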


Abstract

The invention discloses a cross-modal image generation method based on audio-tactile signal fusion. The method comprises the following steps: 1) selecting a multi-modal data set containing audio data, image data and tactile signals, and dividing the data set into a training set and a test set; 2) designing a cross-modal image generation model based on audio-tactile signal fusion, wherein the model comprises a deep semantic fusion module, a latent space learning module and a cross-modal image generation module; 3) training the model on the training set to obtain the optimal parameters; and 4) using the trained model to generate the corresponding images in a cross-modal manner from the tactile signals and audio data in the test set. The invention also discloses a cross-modal image generation device based on audio-tactile signal fusion, which introduces a strong generative adversarial mechanism, makes use of label information, and effectively improves the accuracy and robustness of image generation.
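The data flow in steps 1), 2) and 4) can be sketched as follows. This is only a structural illustration under stated assumptions: the names `Sample`, `split_dataset` and `CrossModalGenerator` are hypothetical, the 80/20 split ratio is an arbitrary choice, and the three stage functions are trivial placeholders that show how the modules chain together, not how the patent implements them. Training (step 3) is omitted.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Sample:
    """One aligned multi-modal record (hypothetical layout)."""
    audio: List[float]
    tactile: List[float]
    image: List[float]  # flattened pixel values
    label: int


def split_dataset(samples: Sequence[Sample], test_fraction: float = 0.2,
                  seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Step 1: shuffle a copy and divide it into (training set, test set)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


@dataclass
class CrossModalGenerator:
    """Step 2: three stages chained in the order the abstract lists them."""
    fuse: Callable[[List[float], List[float]], List[float]]
    to_latent: Callable[[List[float]], List[float]]
    generate: Callable[[List[float]], List[float]]

    def __call__(self, audio: List[float], tactile: List[float]) -> List[float]:
        fused = self.fuse(audio, tactile)
        latent = self.to_latent(fused)
        return self.generate(latent)


# Placeholder stages (assumptions, not the patent's modules).
model = CrossModalGenerator(
    fuse=lambda a, t: a + t,                       # concatenation as stand-in fusion
    to_latent=lambda f: f[:4],                     # truncation as stand-in latent mapping
    generate=lambda z: [abs(v) % 1.0 for v in z],  # values in [0, 1) as stand-in pixels
)

# Synthetic data, for illustration only.
samples = [
    Sample(audio=[i * 0.1] * 3, tactile=[i * 0.2] * 3, image=[0.0] * 4, label=i % 2)
    for i in range(10)
]
train_set, test_set = split_dataset(samples)

# Step 4: generate an image from the audio and tactile signals of a test sample.
generated = model(test_set[0].audio, test_set[0].tactile)
print(len(train_set), len(test_set), len(generated))
```

Keeping the three stages as separate callables mirrors the module boundaries named in the abstract, so any one of them could be swapped for a trained network without changing the surrounding code.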

Description

Technical field

[0001] The invention relates to the technical field of image generation, in particular to a cross-modal image generation method and device based on audio-tactile signal fusion.

Background

[0002] With the rapid development of wireless communication and multimedia technology, people have begun to pursue a more realistic immersive experience. Touch, as a new sensory dimension, is gradually being integrated into traditional audio-visual services and enhancing them, forming cross-modal services. Cross-modal communication, in which audio, visual and tactile signals are transmitted collaboratively, is considered a reasonable and efficient way to support such services. However, because transmission is unreliable and communication quality differs between signals of different modalities, visual signals often suffer severe loss, and restoration and reconstruction are urgently needed.

[0003] Existing image generation work mainly utilizes the inherent inf...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06K9/62, G06K9/46, G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/045, G06F18/241, G06F18/253
Inventor: 姚玉媛, 魏昕, 高赟, 周亮
Owner: NANJING UNIV OF POSTS & TELECOMM