Visual question answering method based on multi-modal deep feature fusion, and model thereof
Multi-modal deep feature fusion, applied in the field of visual question answering, addresses two problems: cross-modal features cannot interact closely, and key feature information is easily lost. It thereby improves prediction accuracy and performance.
Embodiment Construction
[0090] The present invention will be clearly and completely described below in conjunction with the accompanying drawings. Those skilled in the art will be able to implement the present invention based on these descriptions. Before the present invention is described in conjunction with the accompanying drawings, it should be pointed out that:
[0091] The technical solutions and technical features provided in each part of the present invention, including the following description, can be combined with each other under the condition of no conflict.
[0092] In addition, the embodiments of the present invention referred to in the following description are generally only some embodiments of the present invention, not all of them. Therefore, based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts shall fall within the protection scope of the present invention.
[0093] The term "MLP...