Deep reinforcement learning robot control method based on prioritized experience replay

A reinforcement learning and prioritization technique, applied in the fields of integrated factory control, program-controlled manipulators, and manipulators in general, that achieves good operational performance and improves learning efficiency and effectiveness.

Active Publication Date: 2020-07-17
XI AN JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

However, designing priorities in a principled and effective way is difficult. There is currently no priority design method tailored to robot arm manipulation tasks.



Examples


Detailed Description of Embodiments

[0027] The overall flowchart of the algorithm is shown in Figure 1. The details are described below.

[0028] The prioritized experience replay method described in this embodiment is based on state changes of the target object. It speeds up learning and improves the learned result by applying prioritized experience replay while the robot learns through interaction with the environment. It includes the following steps:

[0029] S1. Construct a virtual environment and complete its initialization.

[0030] In this embodiment, the invention relies on a virtual environment, and training is completed entirely within it.

[0031] The virtual environment is a simulation of the real environment and the real robot, and mainly comprises two parts: the simulated task environment and the simulated robot. The virtual environment is built on Gym, and...
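As a rough illustration of the two parts named above (a simulated task environment and a simulated robot), the sketch below follows the reset/step convention popularized by Gym without importing the library. Everything in it is hypothetical: the class name `ToyPushEnv`, the 2-D point "gripper", the contact rule, and the sparse reward are assumptions made for illustration, since the source does not describe these details.

```python
import math
import random


class ToyPushEnv:
    """Hypothetical 2-D stand-in for a simulated robot plus task environment."""

    def __init__(self, seed=0, max_steps=50):
        self.rng = random.Random(seed)
        self.max_steps = max_steps

    def reset(self):
        # Initialization of the virtual environment (step S1 in the text).
        self.gripper = [0.0, 0.0]
        self.obj = [self.rng.uniform(-0.5, 0.5) for _ in range(2)]
        self.goal = [self.rng.uniform(-0.5, 0.5) for _ in range(2)]
        self.t = 0
        return self._obs()

    def _obs(self):
        # Observation: gripper position, object position, goal position.
        return tuple(self.gripper + self.obj + self.goal)

    def step(self, action):
        # Clip each action component to [-1, 1] and scale to a small move.
        move = [0.05 * max(-1.0, min(1.0, a)) for a in action]
        self.gripper = [g + m for g, m in zip(self.gripper, move)]
        # Toy contact rule (assumption): a nearby object moves with the gripper.
        if math.dist(self.gripper, self.obj) < 0.05:
            self.obj = [o + m for o, m in zip(self.obj, move)]
        self.t += 1
        # Sparse reward (assumption): 0 at the goal, -1 otherwise.
        reward = 0.0 if math.dist(self.obj, self.goal) < 0.05 else -1.0
        done = self.t >= self.max_steps
        # The object position is exposed so a caller can track its changes.
        return self._obs(), reward, done, {"object_pos": tuple(self.obj)}
```

Returning the object position in `info` is a design choice of this sketch: it gives the learner direct access to the target object's state at every step.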


Abstract

The invention discloses a deep reinforcement learning control algorithm based on a prioritized experience replay mechanism. The priority is calculated from state information of the object manipulated by the robot. An end-to-end robot control model is built with a deep reinforcement learning method, so that the deep reinforcement learning agent autonomously learns to complete a specified task in the environment. During training, the state information of the target object is collected in real time and used to calculate the replay priority; the reinforcement learning algorithm then samples data from the experience replay pool according to that priority and learns from it to obtain the control model. While preserving the robustness of the deep reinforcement learning algorithm, the method makes fuller use of environment information, improves the performance of the control model, and speeds up learning convergence.
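The sampling idea in the abstract can be sketched as follows. This is a minimal illustration, not the patented algorithm: the abstract says only that the target object's state information is used to compute the priority, so the specific rule here (the distance the object moved during a transition, plus a small constant `eps`, raised to an exponent `alpha`) is an assumption, as are the names `ObjectChangeReplayBuffer`, `add`, and `sample`.

```python
import math
import random
from collections import deque


class ObjectChangeReplayBuffer:
    """Replay pool that samples transitions in proportion to a priority.

    Assumed priority rule (not taken from the source):
        priority = (distance the object moved + eps) ** alpha
    """

    def __init__(self, capacity=10000, alpha=1.0, eps=1e-3, seed=0):
        # Both deques drop their oldest entry together once capacity is hit.
        self.transitions = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.alpha = alpha
        self.eps = eps
        self.rng = random.Random(seed)

    def add(self, transition, obj_before, obj_after):
        # Priority grows with the change in the object's position.
        moved = math.dist(obj_before, obj_after)
        self.transitions.append(transition)
        self.priorities.append((moved + self.eps) ** self.alpha)

    def sample(self, batch_size):
        # Weighted sampling with replacement, proportional to priority.
        return self.rng.choices(
            list(self.transitions), weights=list(self.priorities), k=batch_size
        )


# Usage: one transition where the object stayed still, one where it moved.
buffer = ObjectChangeReplayBuffer(seed=0)
buffer.add("object untouched", (0.0, 0.0), (0.0, 0.0))
buffer.add("object pushed", (0.0, 0.0), (0.1, 0.0))
batch = buffer.sample(1000)
```

With these weights, transitions in which the object moved are drawn far more often than ones in which it stayed still, while `eps` keeps every stored transition sampleable.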

Description

Technical field

[0001] The invention belongs to the field of robot control, and in particular relates to a robot control method based on a virtual environment, deep reinforcement learning, and a prioritized experience replay algorithm driven by changes in object position.

Background

[0002] At present, most robotic spatial grasping techniques either pre-program the robot's possible behaviors or rely on traditional 3D vision algorithms. However, as robot applications continue to expand, the tasks robots face are becoming increasingly complex. Complex tasks require elaborate visual calibration and visual modeling methods, and designers cannot anticipate rapidly changing environments, which makes it difficult to specify reasonable robot behavior in advance.

[0003] Reinforcement learning is an important learning method in the field of machine learning. In robotics applications, the ...


Application Information

IPC(8): B25J9/16
CPC: B25J9/1602, Y02P90/02
Inventor: 田智强, 李根, 杨洋, 王丛, 司翔宇
Owner XI AN JIAOTONG UNIV