
Reinforcement learning-based AUV behavior planning and motion control method

A reinforcement learning and motion control technology, applied to three-dimensional position/course control, biological neural network models, neural architectures, etc., that addresses over-reliance on manual experience, limited training experience, and a low level of intelligence.

Active Publication Date: 2019-10-15
HARBIN ENG UNIV

AI Technical Summary

Problems solved by technology

[0006] The present invention addresses two problems. First, when an underwater robot performs complex tasks, its level of intelligence is low and it relies too heavily on manual experience. Second, existing control methods for underwater robots that are designed on intelligent algorithms require an accurate environment model, which leaves training experience very limited and makes the methods difficult to apply in real-world settings.



Examples


Specific Embodiment 1

[0075] This embodiment is an AUV behavior planning and action control method based on reinforcement learning.

[0076] The present invention defines a three-layer structure for intelligent underwater robot tasks, namely the task layer, the behavior layer and the action layer. AUV behavior planning is performed when a sudden state is encountered, and a Deep Deterministic Policy Gradient (DDPG) controller is used for AUV motion control.

[0077] The implementation process includes the following three parts:

[0078] (1) Hierarchical design of intelligent underwater robot tasks;

[0079] (2) Behavior planning system construction;

[0080] (3) Design of the DDPG-based control algorithm;
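The three-layer structure (task, behavior, action) can be pictured as a small data model. The following is only an illustrative sketch: the class names, the `Action` fields and the `execute` helper are hypothetical and do not come from the patent, while the three behaviors are the ones named elsewhere in this document.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Behavior(Enum):
    """Behavior layer: the three behaviors named in the source."""
    APPROACH_TARGET = auto()
    WALL_TRACKING = auto()
    OBSTACLE_AVOIDANCE = auto()


@dataclass
class Action:
    """Action layer: one control command (field names are hypothetical)."""
    surge_force: float
    yaw_moment: float


@dataclass
class Task:
    """Task layer: the overall mission, e.g. tunnel detection."""
    name: str
    behaviors: tuple


def execute(task, plan_behavior, control):
    """Run one decision cycle: task -> behavior -> action.

    plan_behavior stands in for the behavior planner and control for the
    motion controller; both are passed in as plain callables here.
    """
    behavior = plan_behavior(task)
    action = control(behavior)
    return behavior, action
```

Passing the planner and the controller as callables keeps the layers separate, so a learned policy could replace either one without touching the other.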

[0081] Further, the process of part (1) is as follows:

[0082] In order to complete the stratification of underwater robot tunnel detection tasks, the concepts of intelligent underwater robot tunnel detection tasks, behaviors and actions are defined: the underwater robot tunnel detectio...

Specific Embodiment 2

[0165] The process of establishing an AUV model with fuzzy hydrodynamic parameters described in the first embodiment is a common AUV dynamic modeling process, which can be realized with existing technology in the field. For clarity, this embodiment describes that process. It should be noted that the present invention includes, but is not limited to, the following method of establishing an AUV model with fuzzy hydrodynamic parameters. The process comprises the following steps:

[0166] Establish the hydrodynamic equation of the underwater robot:

[0167]

[0168] where f is the random disturbance force; M is the system inertia coefficient matrix, satisfying M = M_RB + M_A ≥ 0; M_RB is the inertia matrix of the carrier; M_A is the added-mass coefficient matrix; and the Coriolis force-centripetal f...

Embodiment

[0184] The main purpose of the present invention is to let the underwater robot make behavior decisions and perform action control autonomously, according to the current environmental state in the underwater environment, so that people are freed from complicated programming. The specific implementation process is as follows:

[0185] 1) Use programming software to build a behavior planning simulation system for intelligent underwater robots based on deep reinforcement learning, and obtain the optimal decision-making strategy of the robot through simulation training. The specific steps are as follows:

[0186] 1.1) Establish an environment model, determine the initial position and target point, and initialize the algorithm parameters;

[0187] 1.2) Determine the current state of the environment and the robot task at time t, and decompose the task into behaviors: approaching the target, wall tracking, and obstacle avoidance;

[0188] 1.3) According to the curr...
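Step 1.2 decomposes the task into three behaviors, and the abstract names DQN as the behavior planner. As a rough illustration of how a Q-value-based planner could pick among those behaviors, here is a minimal epsilon-greedy sketch. The function name, the behavior labels and the choice of epsilon-greedy exploration are assumptions for illustration, not details taken from the patent.

```python
import random

# Labels for the three behaviors named in step 1.2 (spelling is hypothetical).
BEHAVIORS = ("approach_target", "wall_tracking", "obstacle_avoidance")


def select_behavior(q_values, epsilon, rng=random):
    """Epsilon-greedy choice given one Q-value per behavior.

    With probability epsilon a random behavior is explored; otherwise the
    behavior with the highest Q-value is chosen.
    """
    if len(q_values) != len(BEHAVIORS):
        raise ValueError("expected one Q-value per behavior")
    if rng.random() < epsilon:
        return rng.choice(BEHAVIORS)
    best = max(range(len(BEHAVIORS)), key=lambda i: q_values[i])
    return BEHAVIORS[best]
```

In training, epsilon would typically start high and decay, so the planner explores early and exploits its learned Q-values later.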


Abstract

The invention discloses a reinforcement learning-based AUV behavior planning and motion control method, which belongs to the technical field of underwater robots. It aims to solve two problems: complex task planning by the AUV depends heavily on manual experience, and control methods designed on intelligent algorithms need an accurate environment model, which limits training experience and makes application in a real environment difficult. AUV tunnel detection is defined as the overall task; the behaviors needed to complete the task comprise approaching the target, wall tracking and obstacle avoidance; and a control instruction generated by the underwater robot to complete a planned behavior is defined as an action. When the AUV executes a tunnel detection task, the deep reinforcement learning DQN algorithm is used for real-time behavior planning: a corresponding deep learning behavior network is constructed, and planning of the tunnel detection task is completed. The AUV action network is trained by the DDPG method, with the AUV treated as the environment model, to obtain a force-to-state mapping and thereby realize action control of the AUV.
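For readers unfamiliar with the two algorithms named above, the sketch below shows two of their standard building blocks in textbook form: the DQN temporal-difference target and the soft target-network update used in DDPG. The function names and the flat-list parameter representation are assumptions made for illustration; the patent's own network structures and hyperparameters are not shown in this extract.

```python
def td_target(reward, next_q_values, gamma, done):
    """Standard DQN target: r + gamma * max over next-state Q-values.

    No bootstrapping is applied when the episode has terminated.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def soft_update(target_params, online_params, tau):
    """Standard DDPG soft update: target <- tau * online + (1 - tau) * target.

    Parameters are plain lists of floats here for simplicity.
    """
    return [tau * o + (1.0 - tau) * t
            for t, o in zip(target_params, online_params)]
```

A small tau makes the target network track the online network slowly, which is the usual way DDPG stabilizes training.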

Description

Technical field

[0001] The invention belongs to the technical field of underwater robots, and in particular relates to an AUV behavior planning and action control method.

Background technology

[0002] The 21st century is the century of the ocean, and vigorously developing the ocean industry has become a broad consensus among countries around the world. China has also issued and implemented an important marine strategy. Because China is currently in a stage of rapid development and is a populous country with limited land resources, marine resources have become an important resource space for supporting sustainable development. The development and exploration of marine resources is an important prerequisite for implementing marine strategies. As key underwater technical equipment, autonomous underwater vehicles (AUVs) have become practical and effective development tools in the marine civil, military and scientific research fields. Tools are an important means of ocean develop...


Application Information

IPC(8): G05D1/10, G06N3/04
CPC: G05D1/10, G06N3/045, G05D1/0088
Inventor: 孙玉山, 冉祥瑞, 张国成, 李岳明, 曹建, 王力锋, 王相斌, 徐昊, 吴新雨, 马陈飞
Owner: HARBIN ENG UNIV