
SQL (Structured Query Language) conversion method and system based on language model coding and multi-task decoding

A language-model-based conversion method, applied in the fields of neural learning methods, biological neural network models, and special data processing applications. It addresses three problems of existing approaches: they cannot meet the joint encoding requirements of text and database schema, they scale poorly, and their model design is complex. The method achieves strong feature encoding ability and generalization, improves accuracy, and alleviates the shortage of labeled data.

Active Publication Date: 2021-06-18
ZHEJIANG UNIV

AI Technical Summary

Problems solved by technology

However, this method is too simple and scales poorly, and it is prone to parsing errors on even slightly more complex query conditions.
Another SQL parsing scheme, based on a syntax tree, has a complex model design and poor interpretability.
[0008] For text encoding, traditional word vectors are static encodings: the same word has the same feature vector in every context, which cannot meet the joint encoding requirements of text and database schemas.




Embodiment Construction

[0031] The present invention will be further elaborated and illustrated below in conjunction with the accompanying drawings and specific embodiments.

[0032] As shown in Figure 1, an SQL conversion method based on language model encoding and multi-task decoding comprises the following steps:

[0033] 1. Pre-train the language model encoder according to the type of the query database. The encoder comprises an Embedding layer and a Transformer network, and training yields the pre-trained language model encoder;

[0034] 2. Expand the query database by table name and column name in turn, converting the two-dimensional table into a one-dimensional text sequence; combine it with the user query statement to form the input sequence X, and provide the target SQL sequence corresponding to the user query statement;

[0035] 3. Use the sequence X as the input of the Embedding layer of the pre-trained language model encoder to obtain the initial en...
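Step 2 above, which flattens the schema and joins it with the user question, can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the `[CLS]` and `[SEP]` marker tokens, the whitespace tokenization, and the function names `linearize_schema` and `build_input_sequence` are all assumptions, since the source does not specify them.

```python
from typing import Dict, List

# Hypothetical marker tokens; the source does not name the ones it uses.
CLS, SEP = "[CLS]", "[SEP]"


def linearize_schema(schema: Dict[str, List[str]]) -> List[str]:
    """Flatten {table name: [column names]} into a one-dimensional token
    sequence, visiting each table name followed by its column names."""
    tokens: List[str] = []
    for table, columns in schema.items():
        tokens.extend([table, SEP])
        for column in columns:
            tokens.extend([column, SEP])
    return tokens


def build_input_sequence(question: str, schema: Dict[str, List[str]]) -> List[str]:
    """Concatenate the user question and the flattened schema into the
    input sequence X (whitespace tokenization is an assumption)."""
    return [CLS] + question.split() + [SEP] + linearize_schema(schema)


if __name__ == "__main__":
    # Toy schema and question, made up for illustration.
    example_schema = {"student": ["name", "age"]}
    print(build_input_sequence("how many students", example_schema))
```

A real system would pass this sequence through the encoder's own subword tokenizer rather than splitting on whitespace.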


Abstract

The invention discloses an SQL (Structured Query Language) conversion method and system based on language model encoding and multi-task decoding. The method comprises the following steps: pre-training a language model on the domain of the data set to improve feature extraction in that domain; expanding the query database by table name and column name in turn, converting the two-dimensional table into a one-dimensional text sequence, and concatenating that sequence with the user question to form an input sequence X; inputting the sequence X into the pre-trained language model and outputting the encoding result; decoding and restoring the SQL fragments with a multi-task decoder composed of nine different neural networks, and computing the cross-entropy loss of each; assigning different weights to the loss values of the different neural networks and summing them to obtain the total loss of the model, then optimizing the objective function with a gradient descent algorithm and updating the model's training parameters; and, after training is completed, saving the model parameters so that the corresponding SQL sequence is generated automatically from the user question and the target database.
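The weighted total loss described in the abstract can be illustrated with a small sketch. It assumes each decoder network produces a vector of logits over its own label set; the task names and weight values below are hypothetical, because the source states only that there are nine networks with different weights and does not list them here.

```python
import math
from typing import Dict, List


def cross_entropy(logits: List[float], target: int) -> float:
    """Cross-entropy of one prediction: -log softmax(logits)[target],
    computed with the max-subtraction trick for numerical stability."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def total_loss(task_losses: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the per-task losses, used as the training objective."""
    return sum(weights[name] * loss for name, loss in task_losses.items())


if __name__ == "__main__":
    # Hypothetical sub-tasks and weights; only two of the nine are shown.
    losses = {
        "select_column": cross_entropy([2.0, 0.5, -1.0], target=0),
        "where_operator": cross_entropy([0.1, 0.3, 0.2, 0.0], target=1),
    }
    weights = {"select_column": 1.0, "where_operator": 0.5}
    print(total_loss(losses, weights))
```

In training, this scalar would be minimized by gradient descent over the parameters of the encoder and all decoder networks together.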

Description

Technical field

[0001] The invention relates to Text-to-SQL, a sub-field of semantic parsing in natural language processing, and in particular to an SQL conversion method and system based on language model encoding and multi-task decoding.

Background art

[0002] With the rise of big data, real-world data is growing explosively. According to the report "Data Age 2025" released by IDC, the data generated globally each year will increase from 33 ZB in 2018 to 175 ZB by 2025, which is equivalent to 491 EB of data generated every day.

[0003] At the same time, the scale of structured data and database storage is also increasing. In the past, users who wanted to query the contents of a database first had to write a query in SQL, the structured database query language, and then interact with the database, which is inconvenient for ordinary users who are not computer professionals. SQL itself is powerful and flexible, but it has a learning threshold. Moreover, f...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/242, G06N3/04, G06N3/08
CPC: G06F16/2433, G06N3/08, G06N3/045
Inventors: 徐叶琛, 邹剑云, 贺一帆, 赵洲
Owner: ZHEJIANG UNIV