A method and electronic device for long text retrieval in an open domain question answering task
An open-domain long-text retrieval technology, applicable to unstructured text data retrieval, text database querying, semantic analysis, and similar fields. It addresses problems such as poor generalization, error-proneness, and cumbersome processes, and achieves strong reusability, improved accuracy, and better handling of ambiguity.
Examples
Embodiment 1
[0050] As shown in Figure 1, an embodiment of the present invention provides a method for long text retrieval in an open domain question answering task, including:
[0051] S101. Use a pre-trained encoder to encode open domain documents and search requests into document dense vectors and request dense vectors, respectively; wherein the encoder is trained using historical search requests, positive samples, and negative samples as sample data;
[0052] S102. Calculate the similarity score between the search request and the open domain document according to the document dense vector and the request dense vector, and select the open domain document whose similarity score meets the requirements as a candidate document;
[0053] S103. Select a target document corresponding to the search request from the candidate documents.
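Steps S101 to S103 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `toy_encode` is a hypothetical stand-in for the pre-trained encoder (a hashed bag-of-words vector), the similarity score is assumed to be an inner product of normalized vectors, "meets the requirements" is interpreted as a top-k cut, and S103 is reduced to picking the highest-scoring candidate because the text above does not say how the target document is selected.

```python
import zlib

import numpy as np

DIM = 256  # hypothetical embedding size, chosen only for this sketch


def toy_encode(text: str, dim: int = DIM) -> np.ndarray:
    """Hypothetical stand-in for the pre-trained encoder.

    Builds a hashed bag-of-words vector and L2-normalizes it. A real
    system would call a trained neural encoder here instead.
    """
    vec = np.zeros(dim, dtype=np.float64)
    for token in text.lower().split():
        # crc32 is deterministic across runs, unlike the built-in hash().
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def retrieve_candidates(request: str, documents: list[str], top_k: int = 2):
    """S101 + S102: encode, score, and keep the top_k documents."""
    # S101: encode documents and the search request into dense vectors.
    doc_vecs = np.stack([toy_encode(d) for d in documents])
    req_vec = toy_encode(request)
    # S102: similarity score (assumed here to be an inner product).
    scores = doc_vecs @ req_vec
    order = np.argsort(-scores)[:top_k]
    return [(int(i), float(scores[i])) for i in order]


def select_target(candidates):
    """S103 placeholder: return the index of the best-scoring candidate."""
    return candidates[0][0]


documents = [
    "dense vectors encode long documents",
    "cooking pasta with tomato sauce",
    "question answering over open domain text",
]
candidates = retrieve_candidates("open domain question answering", documents)
print("candidates:", candidates)
print("target index:", select_target(candidates))
```

In practice the document vectors would be computed once and indexed ahead of time, so that only the search request needs to be encoded at query time.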
[0054] The above method can be described as:
[0055] Given a collection of historical search requests , the document collection where the answer co...
Embodiment 2
[0100] As shown in Figure 3, another aspect of the present invention includes a functional module architecture that corresponds fully to the aforementioned method flow; that is, an embodiment of the present invention also provides a device for long text retrieval in an open domain question answering task, including:
[0101] An encoding module 201, configured to encode open domain documents and search requests into document dense vectors and request dense vectors, respectively, using a pre-trained encoder; wherein the encoder is trained using historical search requests, positive samples, and negative samples as sample data;
[0102] A candidate document selection module 202, configured to calculate the similarity score between the search request and the open domain document according to the document dense vector and the request dense vector, and select the open domain document whose similarity score meets the requirements as a candidate document;
[0103] ...
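The module architecture can be sketched as two small classes, one per module described above. Only modules 201 and 202 are shown because the remaining paragraphs are elided here. All names are hypothetical: `letter_encoder` is a toy stand-in for the pre-trained encoder, and `min_score` is one assumed reading of "meets the requirements" (a score threshold rather than a top-k cut).

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

import numpy as np

Encoder = Callable[[str], np.ndarray]


def letter_encoder(text: str) -> np.ndarray:
    """Toy encoder (assumption): normalized counts of the letters a-z."""
    vec = np.zeros(26, dtype=np.float64)
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


@dataclass
class EncodingModule:
    """Sketch of encoding module 201."""

    encoder: Encoder

    def encode_documents(self, documents: Sequence[str]) -> np.ndarray:
        return np.stack([self.encoder(d) for d in documents])

    def encode_request(self, request: str) -> np.ndarray:
        return self.encoder(request)


@dataclass
class CandidateSelectionModule:
    """Sketch of candidate document selection module 202."""

    min_score: float  # hypothetical threshold for "meets the requirements"

    def select(
        self, doc_vecs: np.ndarray, req_vec: np.ndarray
    ) -> List[Tuple[int, float]]:
        scores = doc_vecs @ req_vec  # inner-product similarity (assumed)
        kept = [(i, float(s)) for i, s in enumerate(scores) if s >= self.min_score]
        return sorted(kept, key=lambda pair: pair[1], reverse=True)


encoding = EncodingModule(encoder=letter_encoder)
selection = CandidateSelectionModule(min_score=0.5)
doc_vecs = encoding.encode_documents(["aab", "xyz"])
req_vec = encoding.encode_request("ab")
print(selection.select(doc_vecs, req_vec))
```

Keeping the encoder as an injected callable means the same module wiring works whether the encoder is a toy function, as here, or a trained model.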