Method and system for providing GPU service

A server-side and status description technology, applied in the field of providing GPU services, that addresses the lack of support for the Caffe deep learning framework and the extra video memory occupied by batch processing, so that client service requests can be met with high efficiency.

Active Publication Date: 2022-03-08
广东星舆科技有限公司

AI Technical Summary

Problems solved by technology

TensorFlow Serving does not support the Caffe deep learning framework.
[0013] Deploying a Caffe model is, in essence, deploying a GPU service. To improve GPU utilization, TensorFlow Serving uses a Batcher to combine multiple inference requests into a single batch request. This mode does not suit Caffe. A model under the Caffe framework can load several pieces of data at once, but its internal code logic still executes them sequentially rather than in parallel. Batch mode therefore occupies more video memory without improving processing speed, so a separate deployment method has to be considered for Caffe models.
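The reasoning in [0013] can be illustrated with a toy cost model. The sketch below is not from the patent: the function names, the linear cost assumptions, and the numbers are all hypothetical, and it models only the claim that a sequentially executed batch saves no time while keeping every input resident in video memory.

```python
def batched_cost(n_requests: int, per_item_ms: int, per_item_mem_mb: int) -> tuple:
    """Toy model of batch mode when the framework loops over items sequentially.

    Assumption: latency grows linearly with batch size (no parallel speed-up),
    and all inputs of the batch are held in video memory at the same time.
    """
    total_ms = n_requests * per_item_ms
    peak_mem_mb = n_requests * per_item_mem_mb
    return total_ms, peak_mem_mb


def per_request_cost(n_requests: int, per_item_ms: int, per_item_mem_mb: int) -> tuple:
    """Toy model of handling the same requests one at a time.

    Assumption: same total compute time, but only one input is resident at once.
    """
    total_ms = n_requests * per_item_ms
    peak_mem_mb = per_item_mem_mb
    return total_ms, peak_mem_mb


# Hypothetical workload: 8 requests, 20 ms and 150 MB per item.
batch = batched_cost(8, 20, 150)
single = per_request_cost(8, 20, 150)
print("batched:     total_ms=%d peak_mem_mb=%d" % batch)
print("per request: total_ms=%d peak_mem_mb=%d" % single)
```

Under these assumptions the two modes take the same total time, while batch mode multiplies peak memory by the batch size. Real measurements on a specific model and GPU may differ.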



Examples


Embodiment Construction

[0027] The technical means and technical effects involved in the present disclosure are further described below. The examples (or implementations) provided are only some of the implementations that the present disclosure intends to cover, not all of them. All other embodiments that those skilled in the art can obtain from the embodiments in the present disclosure, and from the explicit or implied content of its figures and text, fall within the protection scope of the present disclosure.

[0028] In general, this disclosure proposes a method for providing GPU services, including the following steps: start a container in a container cluster management system; read configuration information and load an inference server in the container according to the configuration information; receive request information from a client; according to the request information, send a calculation instruction to the inference server, and the calculation instruc...


Abstract

The disclosure relates to the technical field of deep learning services and discloses a method and system for providing GPU services. The method includes the following steps: starting a container in a container cluster management system; reading configuration information and loading an inference server in the container according to the configuration information; receiving request information from a client and sending a calculation instruction to the inference server according to the request information, where the calculation instruction instructs the inference server to invoke a model deployed on the Caffe framework to perform inference on the GPU; receiving the calculation result returned by the inference server; and sending the calculation result to the client. Some technical effects of the present disclosure are: by loading the inference server in the container and invoking the model deployed on the Caffe framework to perform inference on the GPU, the service requests of the client can be satisfied with high efficiency.
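The request flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: `InferenceServer`, `load_inference_server`, `serve_request`, the JSON configuration keys, and the placeholder model that doubles its inputs are all hypothetical. Starting the container and running an actual Caffe forward pass on a GPU are outside the sketch.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class InferenceServer:
    """Hypothetical stand-in for the inference server loaded in the container."""
    model_name: str
    device: str
    predict_fn: Callable[[List[float]], List[float]]

    def handle(self, instruction: Dict) -> Dict:
        # A real server would invoke the Caffe-deployed model on the GPU here.
        outputs = self.predict_fn(instruction["inputs"])
        return {"model": self.model_name, "device": self.device, "outputs": outputs}


def load_inference_server(config_text: str) -> InferenceServer:
    """Read configuration information and load an inference server from it."""
    config = json.loads(config_text)
    return InferenceServer(
        model_name=config["model_name"],
        device=config.get("device", "gpu:0"),
        # Placeholder model (assumption): doubles every input value.
        predict_fn=lambda xs: [2.0 * x for x in xs],
    )


def serve_request(server: InferenceServer, request: Dict) -> Dict:
    """Turn client request information into a calculation instruction,
    collect the calculation result, and shape the reply for the client."""
    instruction = {"op": "infer", "inputs": request["data"]}
    result = server.handle(instruction)
    return {"request_id": request["id"], "result": result["outputs"]}


# Hypothetical configuration and request.
CONFIG = '{"model_name": "demo_caffe_model", "device": "gpu:0"}'
server = load_inference_server(CONFIG)
response = serve_request(server, {"id": "r1", "data": [1.0, 2.5]})
print(response)
```

Keeping the request handler separate from the inference server mirrors the split the abstract describes between the component that talks to clients and the component that runs the model.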

Description

Technical field

[0001] The present disclosure relates to the technical field of deep learning services, in particular to a method and system for providing GPU services.

Background

[0002] Deep learning (DL) has developed extremely rapidly in recent years. Research based on convolutional neural networks (CNNs) is in full swing, and new academic literature appears in a steady stream. These research results are inseparable from hardware support: it is the computing power of the graphics processing unit (GPU) that has allowed shallow neural networks to develop rapidly into deep neural networks. The GPU is therefore an indispensable computing resource for deep learning research and application.

[0003] The application fields of deep learning are extremely wide, including image processing, speech recognition, and object detection. The proc...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F9/50, G06T1/20
CPC: G06F9/5027, G06T1/20
Inventor 谢盈
Owner 广东星舆科技有限公司