Multi-modal data expansion method and system, medium, computer equipment and terminal
A multi-modal data technology, applied in computer parts, computing, image data processing, etc. It addresses the difficulty of accurately correcting text descriptions after a semantic change, as well as the time-consuming, labor-intensive, and inefficient nature of existing approaches. It achieves a good data expansion effect, high data expansion efficiency, and an improved training effect.
Examples
Embodiment 1
[0072] In view of the problems existing in the prior art, the present invention provides a multi-modal data expansion method, which can automatically perform data expansion without changing the semantic information of any modal data.
[0073] 1. Program Description
[0074] Assume a multimodal data set D = {(I_1, T_1), (I_2, T_2), ..., (I_n, T_n)}, where I_i is a picture, T_i is a piece of text corresponding to that picture, each (I_i, T_i) forms a sample pair, and the data set contains n sample pairs. For such data, the general process is to extract the features of I_i and the features of T_i separately, and then to model the relationship between the two sets of features with a multimodal machine learning model, so that in fact the image features and the text features constitute a pair of training samples. In particular, extracting the image features is divided into two steps. The first step uses a convolutional neural network target detection model to detect all the target objects in the picture I_i, a...
Embodiment 2
[0093] This example describes the implementation of one stitching operation. Taking the picture set I in the "COCO Caption train2014" data set as an example, let k=2 and m=10; that is, two pictures are stitched together, and the features of 10 detected targets are taken from each picture.
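As a rough illustration of what k means in practice, the sketch below picks the pictures that go into one stitched sample: the anchor picture plus k-1 partners drawn at random from the picture set. The function name `sample_group`, the uniform draw without replacement, and every ID other than 190141 and 202099 are assumptions made for illustration; the source does not specify the sampling rule.

```python
import random


def sample_group(anchor_id, all_ids, k, rng=random):
    """Return the anchor picture ID followed by k-1 distinct partner IDs.

    Assumption: partners are drawn uniformly, without replacement,
    from all pictures other than the anchor.
    """
    candidates = [pic_id for pic_id in all_ids if pic_id != anchor_id]
    partners = rng.sample(candidates, k - 1)
    return [anchor_id] + partners


# Hypothetical picture set; only 190141 and 202099 appear in the source.
picture_ids = [190141, 202099, 1, 2, 3]
group = sample_group(190141, picture_ids, k=2, rng=random.Random(0))
print(group)
```

Passing a seeded `random.Random` instance keeps the draw reproducible, which can help when comparing expansion runs.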
[0094] 1. Image stitching
[0095] Take the picture I_190141, labeled 000000190141 in I, as an example. Randomly fetch the picture collection {I_190141, I_202099}. For k=2, this embodiment splices the two pictures left and right into a single stitched picture. Stitching does not change the aspect ratio of either picture. Before splicing, the resolution of I_190141 is 640*423 and that of I_202099 is 640*480. Because the heights of I_190141 and I_202099 differ, the unaligned part is filled with 0 values during splicing, and the resolution after splicing is 1280*480. Figure 3 shows the images before and after stitching.
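The left-right stitching with zero filling can be sketched with NumPy as follows. This is a minimal illustration, not the patent's implementation: the function name `stitch_left_right` is hypothetical, the images are synthetic arrays with the resolutions quoted above, and placing the zero padding at the bottom of the shorter picture is an assumption, since the text does not say where the unaligned part sits.

```python
import numpy as np


def stitch_left_right(images):
    """Place images side by side on a zero-filled canvas.

    Each image keeps its original size, so aspect ratios are unchanged.
    Assumption: images are top-aligned, so the zero padding ends up
    below any image that is shorter than the tallest one.
    """
    height = max(img.shape[0] for img in images)
    width = sum(img.shape[1] for img in images)
    channels = images[0].shape[2]
    canvas = np.zeros((height, width, channels), dtype=images[0].dtype)
    x_offset = 0
    for img in images:
        h, w = img.shape[:2]
        canvas[:h, x_offset:x_offset + w] = img
        x_offset += w
    return canvas


# Synthetic stand-ins for I_190141 (640*423) and I_202099 (640*480).
# NumPy arrays are indexed (height, width, channels).
left = np.full((423, 640, 3), 255, dtype=np.uint8)
right = np.full((480, 640, 3), 128, dtype=np.uint8)
stitched = stitch_left_right([left, right])
print(stitched.shape)  # (480, 1280, 3)
```

Because the pictures are copied rather than resized, any detection box from the right-hand picture would keep its size and only shift horizontally by the width of the left-hand picture.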
[0096] 2. Get the collection of detection frames
[0097] In this embodiment, the Faste...