Text increment dimension reduction method based on tensor decomposition

A text dimensionality reduction technique based on tensor decomposition, applied in text database indexing, unstructured text data retrieval, and special data processing applications. It addresses the low accuracy and semantic loss of existing dimensionality reduction methods, which severely harm big data applications, and achieves high accuracy, good scalability, and low complexity.

Active Publication Date: 2019-09-06
TONGJI UNIV

AI Technical Summary

Problems solved by technology

Existing dimensionality reduction methods, such as principal component analysis, linear discriminant analysis, and latent semantic analysis, are mostly based on statistical theory. They are quite effective on structured data but ignore the semantics contained in the data, which often leads to serious deviations in the reduction results and low accuracy.
If semantic preservation is not addressed, dimensionality reduction loses the semantics of the data, which is a fatal blow to big data applications.

Method used


Examples

Embodiment Construction

[0033] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.

[0034] As shown in Figure 1, this embodiment provides a text incremental dimensionality reduction method based on tensor decomposition, which specifically includes the following steps:

[0035] S1: Divide the input text data into multiple subsets, and construct a text feature map cluster for each subset;

[0036] S2: After obtaining multiple text feature map clusters, express each feature map cluster as a second-order tensor of "feature word-feature ...
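
Steps S1 and S2 can be illustrated with a small sketch. The excerpt does not say how a text feature map cluster is built, so the code below assumes, purely for illustration, that it is a word co-occurrence graph whose adjacency matrix serves as the second-order tensor. The function names (build_vocab, cooccurrence_matrix), the round-robin split, and the sample documents are all hypothetical and are not taken from the patent.

```python
from itertools import combinations

import numpy as np


def build_vocab(docs):
    """Map every distinct lowercase word to a row/column index."""
    words = sorted({w for d in docs for w in d.lower().split()})
    return {w: i for i, w in enumerate(words)}


def cooccurrence_matrix(docs, vocab):
    """Assumed stand-in for a feature graph: count, for each word pair,
    the number of documents in which both words appear."""
    n = len(vocab)
    matrix = np.zeros((n, n))
    for d in docs:
        ids = sorted({vocab[w] for w in d.lower().split()})
        for i, j in combinations(ids, 2):
            matrix[i, j] += 1
            matrix[j, i] += 1  # keep the graph undirected (symmetric)
    return matrix


# Hypothetical input text.
docs = [
    "tensor decomposition reduces text dimension",
    "text features form a graph",
    "tensor methods keep text semantics",
    "graph features keep semantics",
]

vocab = build_vocab(docs)                      # shared index for all subsets
subsets = [docs[i::2] for i in range(2)]       # S1: split into two subsets
matrices = [cooccurrence_matrix(s, vocab) for s in subsets]  # S2: one matrix each
```

Using one shared vocabulary keeps every matrix the same shape, so the per-subset matrices can later be combined along an additional dimension.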

Abstract

The invention relates to a text incremental dimensionality reduction method based on tensor decomposition. The method comprises the following steps: dividing text data into a plurality of subsets, and constructing a text feature graph cluster for each subset; expressing each feature graph cluster as a second-order tensor; adding a feature dimension to the second-order tensors to form third-order tensors; decomposing the third-order tensors; and determining, from the decomposed relation matrix, which feature words and feature word relations make up the dimension-reduced text features, thereby achieving incremental text dimensionality reduction. Compared with the prior art, the method offers efficient dimensionality reduction, simplicity, accuracy, and suitability for massive data.
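
The tensor part of this pipeline can be sketched as follows. The excerpt does not name the decomposition it uses, so this example substitutes a truncated SVD of the mode-1 unfolding (the first factor of a truncated higher-order SVD) as a stand-in, and ranks feature words by their weight in the retained components. The word list, the two small matrices, and all function names are hypothetical illustrations, not the patent's data or algorithm.

```python
import numpy as np


def stack_to_third_order(matrices):
    """Stack equally shaped second-order tensors along a new third axis."""
    return np.stack(matrices, axis=2)  # shape: (words, words, subsets)


def mode1_factor(tensor, rank):
    """Truncated SVD of the mode-1 unfolding (a stand-in decomposition)."""
    unfolded = tensor.reshape(tensor.shape[0], -1)
    u, s, _ = np.linalg.svd(unfolded, full_matrices=False)
    return u[:, :rank], s[:rank]


def top_words(factor, singular_values, words, keep):
    """Rank words by the norm of their rows in the scaled factor matrix."""
    scores = np.sqrt(((factor * singular_values) ** 2).sum(axis=1))
    order = np.argsort(-scores)[:keep]
    return [words[i] for i in order]


# Hypothetical feature words and per-subset word-word matrices.
words = ["tensor", "text", "graph", "noise"]
slice_a = np.array([[0, 3, 1, 0],
                    [3, 0, 2, 0],
                    [1, 2, 0, 0],
                    [0, 0, 0, 0]], dtype=float)
slice_b = np.array([[0, 2, 0, 0],
                    [2, 0, 3, 0],
                    [0, 3, 0, 1],
                    [0, 0, 1, 0]], dtype=float)

tensor = stack_to_third_order([slice_a, slice_b])
factor, singular_values = mode1_factor(tensor, rank=2)
kept = top_words(factor, singular_values, words, keep=3)
```

In this toy input the weakly connected word "noise" carries the least weight in the retained components, so it is the one dropped when three words are kept. A new subset would add one more slice along the third axis, which is one plausible reading of "incremental" here, though the excerpt does not spell out the update procedure.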

Description

Technical field

[0001] The invention relates to the fields of machine learning and natural language information processing, in particular to a text incremental dimensionality reduction method based on tensor decomposition.

Background art

[0002] With the development of information technologies such as the Internet, the Internet of Things, and cloud computing, data resources in cyberspace are growing and accumulating at an unprecedented rate, and the world has entered the era of networked big data. Beyond its sheer volume, big data is also discrete, diverse, and unstructured, which gives rise to the "curse of dimensionality" and seriously degrades the accuracy and efficiency of data analysis and decision support. To make better use of the data, its dimensionality must be reduced. Data dimensionality reduction is ...

Claims


Application Information

IPC(8): G06F16/31, G06F17/16
CPC: G06F17/16, G06F16/313
Inventors: 向阳, 丁玲
Owner: TONGJI UNIV