The invention discloses a multi-document automatic abstracting method and system based on
text clustering with an improved word vector model. The CBOW model with Hierarchical Softmax
is suited to large-scale model training. Building on this method, a TensorFlow
deep learning framework is introduced into word vector model training, and the
time-efficiency problem of large-scale training sets is addressed through stream
processing. For sentence vector representation, TF-IDF is introduced first, then the
semantic similarity of each semantic unit to be extracted is calculated, and weighting
parameters are set to combine the two into a semantically weighted sentence vector.
The beneficial effects are as follows. The advantages and disadvantages of
semantics, deep learning and machine learning are comprehensively considered, and density
clustering and convolutional neural network algorithms are applied. The method is highly
automated: statements highly relevant to the central content can be quickly extracted to
serve as the abstract of the text. Applying multiple machine learning algorithms to
automatic text abstracting achieves a better abstract, and this may become a main research
direction in the field. In addition, the system according to the invention provides a tool
for automatic extraction of document abstracts based on the method.
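
The semantically weighted sentence vector described above can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made for the example. The semantic unit is taken to be a word. Its semantic similarity is taken to be the cosine similarity to the mean word vector. The weighting parameter `alpha` linearly blends the TF-IDF weight with that similarity. The function names and the toy 2-D word vectors are hypothetical; in practice the vectors would come from the trained CBOW model.

```python
import math
from collections import Counter


def tf_idf(sentences):
    """Return one {word: tf-idf weight} dict per tokenized sentence."""
    n = len(sentences)
    doc_freq = Counter(w for sent in sentences for w in set(sent))
    result = []
    for sent in sentences:
        counts = Counter(sent)
        # Smoothed IDF keeps every weight strictly positive.
        result.append({
            w: (c / len(sent)) * (math.log((1 + n) / (1 + doc_freq[w])) + 1.0)
            for w, c in counts.items()
        })
    return result


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def weighted_sentence_vector(weights, word_vectors, center, alpha=0.5):
    """Blend TF-IDF and semantic similarity into one sentence vector.

    Assumption: per-word weight = alpha * tfidf + (1 - alpha) * similarity,
    where similarity is the (non-negative) cosine to `center`.
    """
    acc = [0.0] * len(center)
    total = 0.0
    for word, tfidf_weight in weights.items():
        vec = word_vectors.get(word)
        if vec is None:  # skip out-of-vocabulary words
            continue
        similarity = max(0.0, cosine(vec, center))
        weight = alpha * tfidf_weight + (1 - alpha) * similarity
        total += weight
        for i, x in enumerate(vec):
            acc[i] += weight * x
    return [x / total for x in acc] if total else acc


# Toy data: hypothetical 2-D word vectors standing in for CBOW output.
sentences = [["cat", "sat", "mat"], ["dog", "sat", "log"], ["stocks", "fell"]]
word_vectors = {
    "cat": [0.9, 0.1], "dog": [0.8, 0.2], "sat": [0.5, 0.5],
    "mat": [0.6, 0.3], "log": [0.7, 0.4],
    "stocks": [0.1, 0.9], "fell": [0.2, 0.8],
}
center = mean_vector(list(word_vectors.values()))
sentence_vectors = [
    weighted_sentence_vector(w, word_vectors, center, alpha=0.6)
    for w in tf_idf(sentences)
]
```

The resulting `sentence_vectors` would then be the input to the density clustering step. With `alpha=1.0` the sketch reduces to a plain TF-IDF weighted average, and with `alpha=0.0` it weights words by semantic similarity alone.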