Interactive code searching method and device based on structured embedding
A search method using structured embedding technology, applicable to digital data information retrieval, instrumentation, computing and related fields, which addresses problems such as insufficient search performance.
Examples
Embodiment 1
[0093] This embodiment provides an interactive code search method based on structured embedding. Referring to Figure 1, the method includes:
[0094] Step S1: Collect the original data, extract the software repository and the model corpus of the code-description matching pair from the original data, and obtain the social attribute value of each code-description matching pair during the extraction process.
[0095] Specifically, the original data can come from different open source databases, and the software repositories can be in different programming languages. For example, software repositories covering C#, Java, SQL and Python, together with the model corpus of code-description matching pairs, are crawled from the software Q&A community StackOverflow.
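The extraction in step S1 can be pictured with a minimal sketch. It assumes each crawled post has already been parsed into a dictionary; the keys `language`, `title`, `code` and `votes`, the `CodeDescriptionPair` type, and the use of a vote count as the social attribute value are all hypothetical choices for illustration, since the text above does not specify them.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List

# Languages named in the example of paragraph [0095].
SUPPORTED_LANGUAGES = {"c#", "java", "sql", "python"}


@dataclass(frozen=True)
class CodeDescriptionPair:
    """One code-description matching pair plus its social attribute value."""
    language: str
    description: str
    code: str
    social_value: float  # hypothetical: e.g. a vote count from the Q&A site


def extract_pairs(raw_posts: Iterable[Dict[str, Any]]) -> List[CodeDescriptionPair]:
    """Keep posts that have a supported language, a description and code."""
    pairs: List[CodeDescriptionPair] = []
    for post in raw_posts:
        language = str(post.get("language", "")).lower()
        description = str(post.get("title", "")).strip()
        code = str(post.get("code", "")).strip()
        if language not in SUPPORTED_LANGUAGES or not description or not code:
            continue  # skip incomplete or out-of-scope posts
        # Record the social attribute value during extraction (assumed key: "votes").
        social_value = float(post.get("votes", 0))
        pairs.append(CodeDescriptionPair(language, description, code, social_value))
    return pairs
```

Recording the social attribute value alongside each pair at extraction time keeps it available for any later step that needs it, without a second pass over the raw data.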
[0096] Step S2: Perform structured word segmentation and preprocessing on the model corpus to obtain the processed corpus.
[0097] Specifically, step S2 performs word segmentation on the code repository and the model corpus. Sp...
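As a rough illustration of what word segmentation of code and descriptions can look like, the sketch below splits code into lexical tokens and breaks identifiers on underscores and camelCase boundaries. This is one common approach and an assumption here, not the segmentation rules of this embodiment, which are not visible in the text above; the function names are hypothetical.

```python
import re
from typing import List

# Identifiers, integer literals, or any single punctuation character.
_TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\d+|[^\sA-Za-z0-9_]")
# Sub-words inside an identifier: acronyms, capitalized words, lowercase runs, digits.
_CAMEL = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def split_identifier(identifier: str) -> List[str]:
    """Split snake_case and camelCase identifiers into lowercase sub-words."""
    parts: List[str] = []
    for chunk in identifier.split("_"):
        parts.extend(m.group(0).lower() for m in _CAMEL.finditer(chunk))
    return parts


def segment_code(code: str) -> List[str]:
    """Tokenize a code snippet, splitting identifiers into sub-words."""
    tokens: List[str] = []
    for tok in _TOKEN.findall(code):
        if tok[0].isalpha() or tok[0] == "_":
            tokens.extend(split_identifier(tok))
        else:
            tokens.append(tok)  # numbers and punctuation are kept as-is
    return tokens


def segment_description(text: str) -> List[str]:
    """Lowercase a natural-language description and split it into words."""
    return re.findall(r"[a-z0-9#+]+", text.lower())
```

For example, `segment_code("int maxValue = list_size(xs) + 10;")` yields `["int", "max", "value", "=", "list", "size", "(", "xs", ")", "+", "10", ";"]`, so that code sub-words such as "max" and "size" can line up with the same words in a description.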
Embodiment 2
[0178] Based on the same inventive concept, this embodiment provides an interactive code search device based on structured embedding. Referring to Figure 5, the device comprises:
[0179] The collection module 201 is used to collect the original data, extract the software repository and the model corpus of the code-description matching pair from the original data, and obtain the social attribute value of each code-description matching pair during the extraction process;
[0180] The structured word segmentation module 202 is used to perform structured word segmentation and preprocessing on the model corpus to obtain the processed corpus;
[0181] The structured word embedding module 203 is used to perform word embedding training on the processed corpus with a preset tool, so as to construct pre-trained structured word embeddings;
[0182] The high-quality corpus extraction and division module 204 is used to carry out structured word segmentation and preprocessing on the model corpus, and filter out a preset number of corpus acco...
Embodiment 3
[0188] Referring to Figure 6, based on the same inventive concept, the present application also provides a computer-readable storage medium 300 on which a computer program 311 is stored. When the program is executed, the method described in Embodiment 1 is implemented.
[0189] The computer-readable storage medium introduced in Embodiment 3 of the present invention is the one used to implement the interactive code search method based on structured embedding of Embodiment 1. Based on the description in Embodiment 1, those skilled in the art can understand the specific structure and variations of this computer-readable storage medium, so the details are not repeated here. All computer-readable storage media used by the method in Embodiment 1 of the present invention fall within the scope of protection intended by the present invention.