System and method for acquiring Web API knowledge based on Stack Overflow website

Technology for acquiring Web API knowledge from the Stack Overflow website, applied in character and pattern recognition, special data processing applications, instruments, etc., with the effect of improving prediction accuracy

Active Publication Date: 2020-08-14
SHANGHAI JIAO TONG UNIV

AI Technical Summary

Problems solved by technology

To date, there has been no research on how to obtain knowledge about Web APIs from the Stack Overflow website.

Examples

Embodiment 1

[0044] During data collection and filtering, the data file posts.xml is downloaded from the publicly available data dump of the Stack Overflow website; it contains all questions and answers posted between August 2008 and February 2019. For each Web API, the first step of data collection combines keyword search and tag search. For the YouTube API, for example, all questions containing the keyword "youtube" and tags related to the API are collected. Irrelevant data, in which the keyword appears only in code segments or HTML hyperlinks, is then removed. Finally, the tag most relevant to the Web API is selected and the corresponding data is used as positive samples. Together with the remaining unlabeled samples, the semi-supervised PUL (Positive and Unlabeled Learning) method is used to screen further positive samples out of the unlabeled samples, and all positive samples together form the data set of the Web API.
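The text names PUL but does not say which PU algorithm is used. The following is only a minimal sketch of one possible approach, a Rocchio-style prototype comparison: an unlabeled question is kept as positive when its bag-of-words vector is closer to the centroid of the positive samples than to the centroid of the unlabeled pool. The function names, the whitespace tokenizer, and the decision rule are all assumptions made for illustration, not the method claimed in the patent.

```python
import math
from collections import Counter


def bag(text):
    """Bag-of-words vector using a naive whitespace tokenizer (assumption)."""
    return Counter(text.lower().split())


def centroid(texts):
    """Sum of the bag-of-words vectors; scale is irrelevant for cosine."""
    total = Counter()
    for t in texts:
        total.update(bag(t))
    return total


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def select_positives(positive, unlabeled):
    """Return unlabeled texts that look more like the positive prototype.

    Hypothetical decision rule: keep a text when it is more similar to the
    positive centroid than to the centroid of the whole unlabeled pool.
    """
    pos_c = centroid(positive)
    unl_c = centroid(unlabeled)
    return [t for t in unlabeled
            if cosine(bag(t), pos_c) > cosine(bag(t), unl_c)]
```

In practice the tagged questions would supply `positive` and the remaining keyword matches would supply `unlabeled`; a real system would likely use stronger features and an iterative classifier.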

[0045] When classifying the proble...

Embodiment 2

[0048] The system for obtaining Web API knowledge from the Stack Overflow website provided by the present invention includes a data collection and filtering module, a question category classification module, and a performance measurement and prediction module, as shown in Figure 1.

[0049] First, the data collection and filtering module downloads the data file posts.xml from the public data dump of the Stack Overflow website, which contains all questions and answers posted within a specific time period; in this embodiment, it contains all questions and answers posted between August 2008 and February 2019. For each Web API, the first step of data collection combines keyword search and tag search. For the YouTube API, for example, all questions containing the keyword "youtube" and tags related to the API are collected; irrelevant data, in which the keyword appears only in code segments or HTML hyperlinks, is then removed; finally select...
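The keyword-and-tag collection step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: it assumes each question exposes a title, an HTML body, and a tags string (written either as `<a><b>` or `|a|b|`), it treats a tag match or a keyword match outside code and hyperlinks as sufficient, and it drops whole `<pre>`, `<code>`, and `<a>` elements before searching. All function names are hypothetical.

```python
import re
from html import unescape

# Elements whose content should not count as a keyword hit (assumption:
# the whole element is dropped, including hyperlink anchor text).
CODE_RE = re.compile(r"<pre\b.*?</pre>|<code\b.*?</code>", re.S | re.I)
LINK_RE = re.compile(r"<a\b.*?</a>", re.S | re.I)
TAG_RE = re.compile(r"<[^>]+>")


def visible_text(body_html):
    """Lower-cased body text with code blocks, hyperlinks, and markup removed."""
    text = CODE_RE.sub(" ", body_html)
    text = LINK_RE.sub(" ", text)
    text = TAG_RE.sub(" ", text)
    return unescape(text).lower()


def parse_tags(tags_attr):
    """Split a tags string such as '<youtube-api><php>' or '|a|b|' into a set."""
    return set(re.findall(r"[^<>|]+", tags_attr))


def is_candidate(title, body_html, tags_attr, keyword, api_tags):
    """True if the question carries an API tag or mentions the keyword in prose."""
    if parse_tags(tags_attr) & set(api_tags):
        return True
    text = title.lower() + " " + visible_text(body_html)
    return keyword.lower() in text
```

For a multi-gigabyte posts.xml, such a predicate would typically be applied while streaming the file element by element rather than loading it into memory at once.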

Abstract

The invention provides a system and a method for acquiring Web API knowledge based on the Stack Overflow website. The method includes: downloading a data file from the data dump published by the Stack Overflow website, selecting the data with the most relevant tag as positive samples, treating the other data as unlabeled samples, and screening positive samples out of the unlabeled samples by semi-supervised learning; dividing questions into different categories, segmenting the body of each question into sentences, classifying the segmented sentences with a deep learning model, counting the number of sentences of each question in each category according to the classification result, forming a training set and training a prediction model, and predicting the question category with the prediction model to obtain the category to which each question belongs; and, based on the category and the posting time of each question, measuring and predicting the performance of the Web API by time series analysis to form an opinion about the Web API.
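Two of the steps in the abstract, counting a question's sentences per category and arranging categorized questions by posting time, can be sketched as below. This is a minimal illustration under assumptions: the category names, the regex sentence splitter, and the `classify` callable (a stand-in for the deep learning sentence classifier) are hypothetical, and the time series model itself is not shown because the text does not specify one.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter on ., !, ? followed by whitespace (assumption)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def category_counts(body_text, classify, categories):
    """Feature vector: number of sentences assigned to each category.

    `classify` is any callable mapping a sentence to a category name; here it
    stands in for the deep learning sentence classifier.
    """
    counts = Counter(classify(s) for s in split_sentences(body_text))
    return [counts.get(c, 0) for c in categories]


def monthly_series(questions):
    """Count questions per (category, 'YYYY-MM') from (category, ISO date) pairs."""
    series = Counter()
    for category, created in questions:
        series[(category, created[:7])] += 1
    return series


# Hypothetical categories and a toy rule-based classifier for demonstration.
CATEGORIES = ["how-to", "error"]


def toy_classify(sentence):
    return "error" if "error" in sentence.lower() else "how-to"
```

The per-question count vectors would form the training set for the question-level prediction model, and the monthly counts per category would be the input to whatever time series method is chosen.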

Description

Technical field

[0001] The present invention relates to the field of network services and data mining, in particular to a system and method for acquiring Web API knowledge based on the Stack Overflow website.

Background

[0002] In recent years, Web services on the Internet have developed rapidly, and the Web API has become the main type of Web service. For a company or organization, packaging some of its functions, resources or data into services and publishing them on the Internet in the form of Web APIs has become a necessary strategy. This has led to an exponential increase in the number of Web APIs and the functions they provide. For example, on ProgrammableWeb (https://www.programmableweb.com), the largest Web API sharing website, more than 20,000 Web APIs have been released, divided into more than 480 categories.

[0003] When a developer wants to use a certain Web API, many factors have to be considered, such as functionality, quality and usability. Therefo...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/33, G06F16/35, G06K9/62
CPC: G06F16/3344, G06F16/35, G06F18/241, G06F18/24147, Y02D10/00
Inventor: 曹健, 王乃轩, 钱诗友
Owner: SHANGHAI JIAO TONG UNIV