Text word vector model training method and device, terminal and storage medium

A word vector model training method and related technology, applied in neural learning methods, biological neural network models, electrical digital data processing, etc., which addresses the problem that word vector models struggle to capture both semantic and grammatical complexity.

Status: Inactive · Publication Date: 2020-02-14
Applicant: GUANGDONG BOZHILIN ROBOT CO LTD
Cites: 3 · Cited by: 5

AI Technical Summary

Problems solved by technology

[0004] This application provides a training method, device, terminal and storage medium for a word vector model, to solve the problem that current word vector models struggle to capture both semantic and grammatical complexity.




Embodiment Construction

[0054] In order to enable those skilled in the art to better understand the solutions of the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below in conjunction with the drawings in the embodiments of the present application.

[0055] Some processes described in the specification, claims, and drawings of the present application include multiple operations that appear in a specific order, but it should be clearly understood that these operations need not be executed in the order in which they appear herein and may instead be executed in parallel. Operation sequence numbers such as S11 and S12 are only used to distinguish different operations; the sequence numbers themselves do not imply any execution order. Additionally, these processes may include more or fewer operations, and these operations may be performed sequentially or in parallel.



Abstract

The invention provides a text word vector model training method and device, a terminal and a storage medium. The training method comprises the following steps: acquiring text information of a text sample; splitting the characters of the text information into radicals ("etymons") based on the five-stroke (Wubi) input method, converting the five-stroke radicals into a numerical sequence, and establishing a relational dictionary mapping each radical to a numerical value; converting all radicals into their corresponding numerical values according to the relational dictionary, and one-hot coding those numerical values to obtain radical codes; inputting the radical codes into a recurrent neural network to generate a glyph code; one-hot coding each word of the text information to obtain a corresponding vocabulary code; and inputting the glyph code and the vocabulary code into a bidirectional recurrent neural network for model training to obtain a word vector model. Because the bidirectional recurrent neural network is trained on both glyph codes and vocabulary codes, the word vectors output by the model carry both glyph information and context information.
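The first two steps of the abstract, building a relational dictionary of radicals and one-hot coding each character's radical sequence, can be sketched as follows. This is a minimal illustration, not the patented implementation: the decomposition table below uses component characters as placeholder "radicals" rather than a real Wubi (five-stroke) decomposition table, and all names are hypothetical.

```python
import numpy as np

# Illustrative character-to-radical decompositions (placeholders, not real
# Wubi codes; a real system would use the five-stroke input-method table).
DECOMPOSITIONS = {
    "明": ["日", "月"],        # "bright" = sun + moon
    "晶": ["日", "日", "日"],  # "crystal" = three suns
    "朋": ["月", "月"],        # "friend" = two moons
}

def build_relational_dictionary(decompositions):
    """Assign each distinct radical a numeric id (the 'relational dictionary')."""
    radicals = sorted({r for seq in decompositions.values() for r in seq})
    return {r: i for i, r in enumerate(radicals)}

def radical_one_hot(char, decompositions, rel_dict):
    """Map a character's radical sequence to a (seq_len, n_radicals) one-hot matrix."""
    ids = [rel_dict[r] for r in decompositions[char]]
    mat = np.zeros((len(ids), len(rel_dict)), dtype=np.float32)
    mat[np.arange(len(ids)), ids] = 1.0
    return mat

rel_dict = build_relational_dictionary(DECOMPOSITIONS)
print(rel_dict)                                    # {'日': 0, '月': 1}
codes = radical_one_hot("明", DECOMPOSITIONS, rel_dict)
print(codes)                                       # [[1. 0.]
                                                   #  [0. 1.]]
```

The resulting one-hot matrices would then be fed, per the abstract, to a recurrent neural network that produces the glyph code for each character.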

Description

technical field

[0001] The present application relates to the technical field of natural language processing, and in particular to a training method, device, terminal and storage medium for a word vector model.

Background technique

[0002] In the development of natural language processing, the distributed representation method is a milestone in word representation technology. It represents a word as a multi-dimensional vector, which can capture the similarity between words along multiple dimensions and comes closer to a word's connotation in language.

[0003] Although the distributed representation method is a breakthrough compared with early text representation methods, it cannot effectively identify and distinguish unregistered (out-of-vocabulary) words or polysemous words, making it difficult for the trained text word vector model to capture both semantic and grammatical complexity.

Contents of the invention

[0004] The present application provides a training method, device, terminal and storage medium for a word vector model to solve the problem that the current word vector model is difficult to have both semantic and grammatical complex characteristics.
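The key property of distributed representations mentioned in [0002], that similarity between words can be read off the vectors themselves, is commonly measured with cosine similarity. The sketch below uses hand-picked toy vectors purely for illustration; real word vectors are learned from corpora and typically have hundreds of dimensions.

```python
import numpy as np

# Toy distributed representations (hand-picked 4-d vectors; illustrative only).
vectors = {
    "king":  np.array([0.9, 0.8, 0.1, 0.2]),
    "queen": np.array([0.9, 0.7, 0.2, 0.2]),
    "apple": np.array([0.1, 0.1, 0.2, 0.9]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means identical direction."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sim_related = cosine_similarity(vectors["king"], vectors["queen"])
sim_unrelated = cosine_similarity(vectors["king"], vectors["apple"])
print(sim_related > sim_unrelated)  # True: related words point in similar directions
```

This is exactly the capability that breaks down for unregistered or polysemous words ([0003]): a word absent from the vocabulary has no vector at all, which motivates the patent's use of sub-character (radical) information.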


Application Information

Patent Type & Authority: Application (China)
IPC (8): G06F40/279, G06F40/247, G06F40/126, G06N3/04, G06N3/08
CPC: G06N3/08, G06N3/045
Inventors: 胡盼盼, 佟博, 黄仲强, 谢晓婷, 严彦昌, 杨金辉, 余梓玲, 胡浩
Owner: GUANGDONG BOZHILIN ROBOT CO LTD