Long text-oriented semantic matching method and system

A semantic matching technology for long texts that addresses the unsatisfactory results of existing text semantic understanding methods, with the stated benefits of an optimized user experience and improved search speed.

Active Publication Date: 2020-02-21

SICHUAN CHANGHONG ELECTRIC CO LTD

19 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

[0004] The technical problem to be solved by the present invention is to provide a long text-oriented semantic matching method and system that addresses the unsatisfactory performance of text semantic understanding methods in the prior art.

Examples

Embodiment 1

[0038] Embodiment 1 provides a semantic matching method for long texts, which is mainly used in the field of semantic matching of long texts to find the TOP-K text entries most similar to the target text. As shown in Figure 1, the specific implementation steps are as follows:

[0039] Step s1: Perform data processing on the input text, including operations such as removing special characters, word segmentation, character segmentation, and other text preprocessing.

[0040] During the data processing in step s1, invalid characters in the input text can be removed, and then the input text can be converted into a text sequence in units of characters and a text sequence in units of words.
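
Step s1 can be illustrated with a minimal sketch. The function names are hypothetical, and the whitespace split is only a stand-in assumption for a real word segmenter, which the source does not name:

```python
import re
from typing import List, Tuple


def clean_text(text: str) -> str:
    """Remove special characters and collapse runs of whitespace."""
    # \w is Unicode-aware in Python 3, so letters, digits, and CJK characters survive.
    cleaned = re.sub(r"[^\w\s]", " ", text)
    cleaned = cleaned.replace("_", " ")  # the underscore is part of \w; treat it as special
    return re.sub(r"\s+", " ", cleaned).strip()


def to_char_sequence(text: str) -> List[str]:
    """Text sequence in units of characters (whitespace dropped)."""
    return [ch for ch in text if not ch.isspace()]


def to_word_sequence(text: str) -> List[str]:
    """Text sequence in units of words.

    Assumption: a whitespace split stands in for a proper word segmenter.
    """
    return text.split()


def preprocess(text: str) -> Tuple[List[str], List[str]]:
    """Return (character sequence, word sequence) for one input text."""
    cleaned = clean_text(text)
    return to_char_sequence(cleaned), to_word_sequence(cleaned)
```

For text in a language without whitespace word boundaries, `to_word_sequence` would need to be replaced by a dedicated segmenter; the rest of the sketch is unaffected.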

[0041] Step s2: Map the input text after data processing into a numerical sequence. Specifically, it may include:

[0042] Step s21: Perform word vector training based on the data in the database, and generate a dictionary to obtain a word vector model. Different sub-feature extraction modules have different word ...
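
The dictionary and the mapping to a numerical sequence in step s2 might look like the sketch below. It does not cover word vector training itself. The reserved tokens, the `min_count` threshold, and the fixed `max_len` padding are illustrative assumptions that the source does not specify:

```python
from collections import Counter
from typing import Dict, Iterable, List

PAD, UNK = "<pad>", "<unk>"  # hypothetical reserved tokens, not from the source


def build_dictionary(corpus: Iterable[List[str]], min_count: int = 1) -> Dict[str, int]:
    """Assign an integer id to every token seen at least min_count times."""
    counts = Counter(tok for seq in corpus for tok in seq)
    vocab = {PAD: 0, UNK: 1}
    # Sort by descending frequency, then alphabetically, so ids are deterministic.
    for tok, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if count >= min_count:
            vocab[tok] = len(vocab)
    return vocab


def to_numerical_sequence(tokens: List[str], vocab: Dict[str, int], max_len: int) -> List[int]:
    """Map tokens to ids, truncating or padding to exactly max_len entries."""
    ids = [vocab.get(tok, vocab[UNK]) for tok in tokens[:max_len]]
    return ids + [vocab[PAD]] * (max_len - len(ids))
```

The same two functions work for both the character-level and the word-level sequence; each would simply get its own dictionary.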

Embodiment 2

[0072] Embodiment 2 provides a long text-oriented semantic matching system, including:

[0073] The text processing module is used to perform data processing on the input text, including operations such as removing special characters, word segmentation, character segmentation, and other text preprocessing;

[0074] A numerical sequence generation module, which is used to map the input text after data processing into a numerical sequence in units of characters and a numerical sequence in units of words;

[0075] The feature vector extraction module is used to input the numerical sequence of the input text into the feature extraction model to obtain the feature vector of the input text. The feature extraction model includes multiple sub-feature extraction models, and the feature vector of the input text is the fusion of the output results of the multiple sub-feature extraction models;

[0076] The database processing module is used to pass each piece of data in the database through a text processing module, a nume...
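
The fusion performed by the feature vector extraction module can be sketched as follows. The source does not say how the sub-model outputs are fused, so normalizing each output and concatenating them is purely an assumption. The two sub-models are hypothetical placeholders, not the patent's feature extraction models:

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
SubModel = Callable[[Sequence[int]], Vector]


def l2_normalize(vec: Vector) -> Vector:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_features(sequence: Sequence[int], sub_models: Sequence[SubModel]) -> Vector:
    """Assumed fusion: concatenate the normalized output of every sub-model."""
    fused: Vector = []
    for model in sub_models:
        fused.extend(l2_normalize(model(sequence)))
    return fused


# Placeholder sub-models, used only to make the sketch runnable.
def histogram_model(sequence: Sequence[int], buckets: int = 4) -> Vector:
    """Count token ids per bucket (id modulo the number of buckets)."""
    hist = [0.0] * buckets
    for token_id in sequence:
        hist[token_id % buckets] += 1.0
    return hist


def length_model(sequence: Sequence[int]) -> Vector:
    """Sequence length and number of distinct ids."""
    return [float(len(sequence)), float(len(set(sequence)))]
```

Normalizing before concatenation keeps a sub-model with large output values from dominating a later similarity measurement; other fusion schemes, such as weighted sums, would fit the same interface.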

Abstract

The invention relates to the technical field of natural language understanding and discloses a long text-oriented semantic matching method and system for solving the problem of unsatisfactory effect of text semantic understanding methods in the prior art. The method comprises the following steps: performing data processing on an input text, wherein the data processing comprises removing special characters, segmenting words and segmenting characters; mapping the input text subjected to data processing into a numerical sequence; inputting the numerical sequence of the input text into a feature extraction model to obtain a feature vector of the input text; clustering based on the feature vectors; based on the clustered database, selecting TOP-N types of candidate data most similar to the input text from the database; and performing similarity measurement on the feature vector of the input text and the feature vector of the candidate data, and selecting TOP-K data most similar to the input text from the candidate data. The method is suitable for semantic matching of the long text.
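
The clustering and two-stage selection in the abstract can be sketched as below. The abstract names neither the clustering algorithm nor the similarity measure, so the toy k-means and the cosine similarity used here are assumptions, and all function names are hypothetical:

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assign(vectors: Sequence[Vector], centers: Sequence[Vector]) -> List[int]:
    """Label each vector with the index of its most similar center."""
    return [max(range(len(centers)), key=lambda c: cosine(v, centers[c])) for v in vectors]


def cluster(vectors: Sequence[Vector], k: int, iterations: int = 10):
    """Toy k-means seeded with the first k vectors (assumed algorithm)."""
    centers = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        labels = assign(vectors, centers)
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(vectors, centers), centers


def top_k_matches(query: Vector, vectors: Sequence[Vector], labels: Sequence[int],
                  centers: Sequence[Vector], top_n: int, top_k: int) -> List[Tuple[int, float]]:
    """Stage 1: keep the TOP-N clusters. Stage 2: rank their members, return TOP-K."""
    ranked = sorted(range(len(centers)), key=lambda c: cosine(query, centers[c]), reverse=True)
    best_clusters = set(ranked[:top_n])
    candidates = [i for i, label in enumerate(labels) if label in best_clusters]
    scored = sorted(((cosine(query, vectors[i]), i) for i in candidates), reverse=True)
    return [(index, score) for score, index in scored[:top_k]]
```

Restricting the similarity measurement to the TOP-N clusters means the query is compared against only part of the database, which is presumably where the claimed search speed improvement comes from.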

Description

Technical field

[0001] The invention relates to the technical field of natural language understanding, in particular to a long text-oriented semantic matching method and system.

Background art

[0002] As one of the important directions in the field of artificial intelligence, natural language understanding technology has always been a research hotspot for researchers in related fields. Especially in recent years, with the rapid development of mobile Internet technology and the increasing degree of informatization, people are increasingly eager to allow machines to understand natural language, so as to achieve the goals of reducing manual investment and sharing massive data.

[0003] In related technologies, mainstream methods are text semantic understanding methods based on recurrent neural networks and text semantic understanding methods based on convolutional neural networks. However, the usual recurrent neural network and convolutional neural network are difficult...

Application Information

Patent Type & Authority: Applications (China)

IPC(8): G06F16/33, G06F40/289, G06F40/30, G06K9/62, G06N3/04

CPC: G06F16/3344, G06N3/045, G06F18/23, G06F18/214

Inventor: 杨兰展华益孙锐周兴发饶璐谭斌

Owner: SICHUAN CHANGHONG ELECTRIC CO LTD
