Method and system for uniformly managing AI models based on distributed file system

The method relates to distributed file systems and management modules and is applied to the unified management of AI models based on a distributed file system. It addresses deviations in model prediction values, cumbersome conversion processes, and increased deployment difficulty, and makes models easier to use, optimize, and update.

Active Publication Date: 2020-02-07
中电福富信息科技有限公司 +1

AI Technical Summary

Problems solved by technology

However, this method has many shortcomings. For example, converting a TensorFlow proprietary model to a PMML-format model is cumbersome; the converted model file is generally larger and requires the corresponding plug-ins to be installed before it can be read, which makes deployment harder. The unified PMML model also does not retain the optimizations specific to each framework, so it runs slowly.
In addition, the predictions of the converted model may deviate from those of the original model.

Embodiment Construction

[0031] At present, the industry lacks effective unified management of AI models built with different machine learning frameworks. Converting a Spark MLlib model into a PMML (Predictive Model Markup Language) model file is limited in its scope of application and also falls short in optimization and storage. ONNX, an open neural network exchange format, is applicable to models from frameworks such as TensorFlow and PyTorch, but it too unavoidably loses optimizations and introduces deviations in prediction results during model conversion. Publishing models through REST or gRPC requires building a serving environment, which involves a heavy deployment workload and does not support traditional machine learning frameworks such as scikit-learn, gensim, and xgboost. The present invention builds on a model warehouse in a distributed file system and adds a model iteration management module that can accept AI models constructed with various frameworks and can extract preset model file informatio...
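The registration step described in this paragraph might look roughly like the following sketch. It is an illustration under stated assumptions, not the patented implementation: a local directory stands in for the distributed file system, a plain dict stands in for the metadata table, and the names `ModelRecord`, `storage_path`, `register_model`, and the path layout `<root>/<name>/<version>/<filename>` are all hypothetical. Treating a freshly registered model as not dirty is also an assumption.

```python
import shutil
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass
class ModelRecord:
    """One row of the metadata table (field names are assumptions)."""
    name: str
    version: str
    created_at: str
    online: bool
    dirty: bool
    path: str


def storage_path(root, name, version, filename):
    # Hypothetical preset layout: <root>/<name>/<version>/<filename>
    return Path(root) / name / version / filename


def register_model(table, root, src, name, version, online=False):
    """Copy a model file into the warehouse and add a metadata record.

    The file is copied byte for byte in its native framework format,
    so no PMML or ONNX conversion is involved.
    """
    src = Path(src)
    dest = storage_path(root, name, version, src.name)
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dest)
    record = ModelRecord(
        name=name,
        version=version,
        created_at=datetime.now(timezone.utc).isoformat(),
        online=online,
        dirty=False,  # assumption: a newly registered model starts clean
        path=str(dest),
    )
    table[(name, version)] = record
    return record
```

Because the warehouse only copies bytes, the same call would accept a pickled scikit-learn estimator, an xgboost model file, or a TensorFlow export without the warehouse needing to understand any of them.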

Abstract

The invention discloses a method and system for unified management of AI models based on a distributed file system. On top of the distributed file system, a model iteration management module is added to extract preset model file information, which includes the model name, model version, model creation time, whether the model is online (public), and whether the model is dirty. A new AI model record is added to the metadata table, and the model is stored in a model warehouse according to a preset model storage path, so that the metadata table and the model warehouse together form an AI model management system. A newly added model reading module parses the data input by a user, extracts the model information, matches it against the records in the metadata table, extracts the metadata items from the table, and checks whether the model is online. If the model is online, the module retrieves the complete target model information, including the dirty state of the model, from the nodes of the distributed file system according to the metadata and returns it to the user. The user can use the model in real time, or optimize it and upload it again, which facilitates optimization and updating of the model.
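The read path in the abstract can be sketched as follows. This is only an illustration under assumptions: a dict keyed by `(name, version)` stands in for the metadata table, a local file stands in for a node of the distributed file system, and the `name:version` request format, the function names `parse_request` and `read_model`, and the metadata keys `online`, `dirty`, and `path` are hypothetical rather than taken from the patent.

```python
from pathlib import Path


class ModelUnavailable(Exception):
    """Raised when no matching record exists or the model is not online."""


def parse_request(text):
    # Hypothetical request format "name:version", for example "churn:v1".
    name, sep, version = text.strip().partition(":")
    if not name or not sep or not version:
        raise ValueError(f"expected 'name:version', got {text!r}")
    return name, version


def read_model(table, request):
    """Match a request against the metadata table and fetch the model.

    Steps: parse the user input, look up the metadata record, check the
    online flag, then read the model bytes from the stored path.
    """
    key = parse_request(request)
    meta = table.get(key)
    if meta is None:
        raise ModelUnavailable(f"no metadata record for {key}")
    if not meta["online"]:
        raise ModelUnavailable(f"model {key} is not online")
    payload = Path(meta["path"]).read_bytes()
    # Return the model together with its dirty state, as the abstract describes.
    return {
        "name": key[0],
        "version": key[1],
        "dirty": meta["dirty"],
        "model": payload,
    }
```

Checking the online flag before touching the file system keeps offline models from being served even if their files are still present in the warehouse.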

Description

Technical field

[0001] The invention relates to the technical field of artificial intelligence, in particular to a method and system for unified management of AI models based on a distributed file system.

Background technique

[0002] Currently, various methods are used to generate predictive models. Data scientists and engineers can choose from a variety of languages to build AI predictive models. For example, they can use Python to call the scikit-learn framework, or use Java or Scala to call the Spark MLlib framework. These numerous construction methods result in models that are specific to their own environments. Recently, with the widespread use of deep learning, frameworks such as TensorFlow and PyTorch support publishing machine learning models through REST or gRPC, but traditional machine learning frameworks, such as scikit-learn, gensim, and xgboost, do not yet support it. Also, tensorfl...

Application Information

IPC(8): G06F16/13, G06F16/172, G06F16/182
CPC: G06F16/13, G06F16/172, G06F16/182
Inventor: 连城, 张恩赐, 刘威
Owner: 中电福富信息科技有限公司