Deep reinforcement learning robot control method based on prioritized experience replay

A reinforcement learning and prioritization technology, applied to comprehensive factory control, program-controlled manipulators, manipulators, and related areas, to achieve good operating performance and improve learning efficiency and effect.

Active Publication Date: 2020-07-17

XI AN JIAOTONG UNIV

Problems solved by technology

However, designing priorities in a scientific and effective way is very difficult. At present, there is no priority design method for robot arm manipulation tasks.

Embodiment Construction

[0027] The overall flow chart of the algorithm is shown in Figure 1. The details are described below.

[0028] The prioritized experience replay method based on state changes of the target object, as described in this embodiment, uses prioritized experience replay during the interactive learning between the robot and the environment to speed up learning and improve the learning result. It includes the following steps:

[0029] S1. Construct a virtual environment and complete its initialization.

[0030] In this embodiment, the invention is based on a virtual environment, and training is completed within that environment.

[0031] The virtual environment is a simulation environment based on the real environment and robots, and mainly includes two parts: the simulated task environment and the simulated robot. The virtual environment is built based on GYM, and...
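The two-part structure described in S1 (a simulated task environment plus a simulated robot) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class name `ToyPushEnv`, the 2D positions, the contact threshold, and the reward are hypothetical and are not taken from the patent. The class only mimics the classic Gym-style `reset`/`step` convention and does not import the library itself.

```python
import math


class ToyPushEnv:
    """Hypothetical stand-in for a Gym-style robot task environment."""

    def __init__(self, goal=(1.0, 0.0), max_steps=50):
        self.goal = goal            # target position for the object (assumed)
        self.max_steps = max_steps  # episode length limit (assumed)
        self.reset()

    def reset(self):
        # Initialization: place the simulated robot gripper and the target object.
        self.gripper = [0.0, 0.0]
        self.obj = [0.5, 0.0]
        self.steps = 0
        return self._observation()

    def _observation(self):
        # Observation = gripper position followed by object position.
        return tuple(self.gripper) + tuple(self.obj)

    def step(self, action):
        dx, dy = action
        # Toy contact model: if the gripper starts the step touching the
        # object, the object is pushed along with the gripper.
        in_contact = math.dist(self.gripper, self.obj) < 0.05
        self.gripper[0] += dx
        self.gripper[1] += dy
        if in_contact:
            self.obj[0] += dx
            self.obj[1] += dy
        self.steps += 1
        distance = math.dist(self.obj, self.goal)
        reward = -distance  # closer to the goal is better
        done = distance < 0.05 or self.steps >= self.max_steps
        return self._observation(), reward, done, {}
```

Because the observation exposes the object position separately from the robot state, a training loop could compare the object position before and after each step, which is the kind of target-object state information the method relies on.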

Abstract

The invention discloses a deep reinforcement learning control algorithm based on a prioritized experience replay mechanism. The priority is calculated from the state information of the object manipulated by the robot, an end-to-end robot control model is built with a deep reinforcement learning method, and the deep reinforcement learning agent autonomously learns to complete a specified task in the environment. During training, the state information of the target object is collected in real time and used to calculate the experience replay priority; the reinforcement learning algorithm then samples and learns from the data in the experience replay pool according to that priority to obtain the control model. While preserving the robustness of the deep reinforcement learning algorithm, the method makes maximal use of environment information, improves the performance of the control model, and speeds up learning convergence.
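The sampling idea in the abstract can be sketched as a small replay pool whose priorities come from how much the target object moved during a transition. This is a minimal illustration under stated assumptions, not the patented formula: the source does not give the priority equation, so the displacement-plus-epsilon rule, the `alpha` exponent, and the class name `ObjectChangeReplayBuffer` are all hypothetical.

```python
import math
import random


class ObjectChangeReplayBuffer:
    """Replay pool sampled in proportion to target-object displacement.

    Assumption: priority = (||pos_after - pos_before|| + eps) ** alpha.
    The source does not state the exact formula.
    """

    def __init__(self, capacity, alpha=1.0, eps=1e-3, seed=None):
        self.capacity = capacity
        self.alpha = alpha  # how strongly priority skews sampling (hypothetical)
        self.eps = eps      # keeps zero-displacement transitions sampleable
        self.transitions = []
        self.priorities = []
        self.rng = random.Random(seed)

    def add(self, transition, obj_pos_before, obj_pos_after):
        # Priority grows with the distance the target object moved.
        displacement = math.dist(obj_pos_before, obj_pos_after)
        priority = (displacement + self.eps) ** self.alpha
        if len(self.transitions) >= self.capacity:
            # Drop the oldest transition when the pool is full.
            self.transitions.pop(0)
            self.priorities.pop(0)
        self.transitions.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        # Sample with replacement, probability proportional to priority.
        return self.rng.choices(
            self.transitions, weights=self.priorities, k=batch_size
        )
```

With this rule, transitions in which the robot actually moved the object are replayed far more often than transitions in which the object stayed still, while the small `eps` keeps the latter from being excluded entirely.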

Description

Technical field

[0001] The invention belongs to the field of robot control, and in particular relates to a robot control method based on a virtual environment, deep reinforcement learning, and a prioritized experience replay algorithm based on object position changes.

Background

[0002] At present, most robot spatial grasping technologies either pre-set the possible behaviors of the robot or rely on traditional 3D vision algorithms. However, as robot application fields keep expanding, the tasks robots face are becoming more and more complex. Complex tasks require complex visual calibration and visual modeling methods, and designers cannot make effective predictions for rapidly changing environments, which makes it difficult to make reasonable predictions about robot behavior.

[0003] Reinforcement learning is an important learning method in the field of machine learning. In the application of the field of robotics, the ...

Application Information

IPC(8): B25J9/16

CPC: B25J9/1602, Y02P90/02

Inventor: 田智强, 李根, 杨洋, 王丛, 司翔宇

Owner XI AN JIAOTONG UNIV
