Robust optimal control method of trolley inverted pendulum system based on off-policy reinforcement learning

A reinforcement learning and optimal control technology, applied in the fields of adaptive control, general control systems, and control/regulation systems, with the effect of early convergence.

Pending Publication Date: 2021-06-18
CHINA JILIANG UNIV

AI Technical Summary

Problems solved by technology

The finite-time optimal control problem is harder to solve than the infinite-time one, mainly because the resulting HJB equation is time-varying.
Therefore, for the optimal control problem of the linear inverted pendulum system over a finite time horizon, the solution of the HJB equation depends explicitly on time t, which makes it harder to obtain.
Moreover, the controller must respect saturation constraints; exceeding the saturation limit can be fatal to the inverted pendulum control system.
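
To make the time dependence mentioned above concrete, the standard finite-horizon HJB equation for a generic affine system can be written as follows. This is a textbook form given only as a sketch; the symbols f, g, r, and the terminal cost ψ are generic placeholders (assumptions) and are not the patent's own notation.

```latex
% Generic system:  \dot{x} = f(x) + g(x)\,u
% Generic cost:    J = \psi\bigl(x(T)\bigr) + \int_{t}^{T} r\bigl(x(\tau),u(\tau)\bigr)\,d\tau
-\frac{\partial V^{*}(x,t)}{\partial t}
  = \min_{u}\left[\, r(x,u)
  + \left(\frac{\partial V^{*}(x,t)}{\partial x}\right)^{\!\top}
    \bigl(f(x) + g(x)\,u\bigr) \right],
\qquad
V^{*}\bigl(x(T),T\bigr) = \psi\bigl(x(T)\bigr).
```

In the infinite-horizon case with time-invariant dynamics and cost, the value function does not depend on t, so the left-hand side vanishes and no terminal condition is needed; the finite-horizon problem keeps both, which is the source of the added difficulty.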



Examples


Embodiment Construction

[0022] The robust optimal control method of the trolley inverted pendulum system based on off-policy reinforcement learning includes the following steps:

[0023] Step 1. For the tracking problem of the inverted pendulum system, a dynamic model is established, and the system is abstracted as a continuous-time affine nonlinear system that accounts for external disturbances and unmodeled dynamics. An augmented system, consisting of the tracking error system and the signal generation system of the trolley inverted pendulum, is then constructed by state augmentation.
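
As an illustration of what a continuous-time affine nonlinear model of this kind can look like, the sketch below writes a frictionless cart with a point-mass pendulum in the form x' = f(x) + g(x)u + k(x)d. The parameter values, the function names, and the choice k(x) = g(x) (a disturbance acting through the same force channel) are assumptions made for this example; the patent's actual model is not reproduced in this extract.

```python
import numpy as np

# Illustrative parameters (assumed, not taken from the patent)
M_CART = 1.0   # cart mass [kg]
M_POLE = 0.1   # pendulum point mass [kg]
LENGTH = 0.5   # massless rod length [m]
GRAV = 9.81    # gravitational acceleration [m/s^2]


def drift(x):
    """f(x) for x = [position, velocity, angle from upright, angular velocity]."""
    _, v, th, w = x
    s, c = np.sin(th), np.cos(th)
    den = M_CART + M_POLE * s**2
    v_dot = M_POLE * s * (LENGTH * w**2 - GRAV * c) / den
    w_dot = ((M_CART + M_POLE) * GRAV * s
             - M_POLE * LENGTH * w**2 * s * c) / (LENGTH * den)
    return np.array([v, v_dot, w, w_dot])


def input_gain(x):
    """g(x): how the horizontal force on the cart enters the dynamics."""
    th = x[2]
    den = M_CART + M_POLE * np.sin(th)**2
    return np.array([0.0, 1.0 / den, 0.0, -np.cos(th) / (LENGTH * den)])


def disturbance_gain(x):
    """k(x): assumed equal to g(x), i.e. a matched force disturbance."""
    return input_gain(x)


def dynamics(x, u, d=0.0):
    """State derivative of the affine model x' = f(x) + g(x) u + k(x) d."""
    return drift(x) + input_gain(x) * u + disturbance_gain(x) * d
```

The upright rest position is an equilibrium of the drift term, and a small positive tilt produces a positive angular acceleration, which reflects the open-loop instability the controller has to counter.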

[0024] Define the augmented state vector, where e(t) is the tracking error and y_m(t) is the reference signal; the tracking error system and the signal generation system then together form a time-invariant augmented system.
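
One common way to carry out this kind of state augmentation is sketched below, assuming full-state tracking with e = x - y_m, a reference generator y_m' = h(y_m), and augmented state X = [e, y_m]. These definitions and the helper name `make_augmented` are assumptions for illustration; the patent's own equations are not shown in this extract.

```python
import numpy as np


def make_augmented(f, g, k, h):
    """Return drift F, input map G, and disturbance map K for X = [e, y_m].

    Assumes x' = f(x) + g(x) u + k(x) d, y_m' = h(y_m), and e = x - y_m,
    with g(x) and k(x) returned as 2-D arrays that have one row per state.
    """
    def split(X):
        n = X.size // 2
        return X[:n], X[n:]

    def F(X):
        e, ym = split(X)
        # e' = f(x) - h(y_m) with x = e + y_m; the reference evolves on its own
        return np.concatenate([f(e + ym) - h(ym), h(ym)])

    def G(X):
        e, ym = split(X)
        gx = g(e + ym)
        # the control acts on the error dynamics only
        return np.vstack([gx, np.zeros_like(gx)])

    def K(X):
        e, ym = split(X)
        kx = k(e + ym)
        # the disturbance acts on the error dynamics only
        return np.vstack([kx, np.zeros_like(kx)])

    return F, G, K


# Usage with a toy two-state plant and a sinusoidal reference generator
A_REF = np.array([[0.0, 1.0], [-1.0, 0.0]])
F, G, K = make_augmented(
    f=lambda x: -x,
    g=lambda x: np.ones((2, 1)),
    k=lambda x: np.ones((2, 1)),
    h=lambda ym: A_REF @ ym,
)
X0 = np.array([1.0, 2.0, 3.0, 4.0])   # e = [1, 2], y_m = [3, 4]
X0_dot = F(X0)                        # drift of the augmented state at X0
```

Because neither the plant functions nor the reference generator depend explicitly on time, the resulting augmented dynamics are time-invariant in X even though the reference itself varies with time.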

[0025] Step 2. Using game theory, derive the solution of the time-...


Abstract

The invention discloses a robust optimal control method for a trolley inverted pendulum system based on off-policy reinforcement learning. An augmented system is built from the tracking error system and the signal generation system of the inverted pendulum, and the corresponding time-varying HJI equation is derived for this augmented system. An algorithm based on an actor-critic-disturbance network structure is then designed to obtain an approximate solution of the time-varying HJI equation, using neural networks whose activation functions are time-varying. To satisfy the finite-time terminal constraint of the inverted pendulum, an additional terminal error term is included in the design of the neural network weight update law. Finally, convergence of the angle and speed errors of the trolley inverted pendulum system and stability of the tracking error system are proved using Lyapunov stability theory. The invention thereby achieves finite-time optimal control of the linear inverted pendulum system under external disturbance.
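
The idea of a time-varying activation function combined with a terminal error term can be illustrated with a minimal critic sketch. The basis `phi`, the normalized gradient step, and the scalar `target` (a stand-in for whatever value target the learning algorithm supplies) are all assumptions made for this example and are not the patent's update law.

```python
import numpy as np


def phi(x, t, T):
    """Hypothetical time-varying basis: quadratic terms, also scaled by time-to-go."""
    tau = T - t
    quad = np.array([x[0]**2, x[0] * x[1], x[1]**2])
    return np.concatenate([quad, tau * quad])


def critic_update(W, x, t, target, x_T, terminal_value, T, lr=0.05):
    """One normalized gradient step on 0.5*e_b**2 + 0.5*e_T**2.

    e_b is the residual of the value estimate W @ phi at a sample (x, t);
    e_T is the terminal error, the mismatch with the terminal cost at time T.
    """
    p = phi(x, t, T)
    p_T = phi(x_T, T, T)
    e_b = W @ p - target
    e_T = W @ p_T - terminal_value
    grad = e_b * p + e_T * p_T
    norm = 1.0 + p @ p + p_T @ p_T
    return W - lr * grad / norm


# Usage: repeated updates pull the estimate toward both targets
T_END = 1.0
W_first = critic_update(np.zeros(6), np.array([1.0, 0.0]), 0.0, 1.0,
                        np.array([0.0, 1.0]), 2.0, T_END)
W_fit = np.zeros(6)
for _ in range(2000):
    W_fit = critic_update(W_fit, np.array([1.0, 0.0]), 0.0, 1.0,
                          np.array([0.0, 1.0]), 2.0, T_END)
```

Without the e_T term, nothing in the update would tie the estimate at t = T to the terminal cost, so the approximation could fit the interior residual while violating the terminal condition of the finite-horizon problem.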

Description

Technical field

[0001] The invention relates to a finite-time robust optimal control method for uncertain nonlinear systems, and in particular to a data-driven controller design method that reduces the influence of errors in the reconstructed system model on subsequent controller design.

Background

[0002] As the most classic system in the inverted pendulum family, the single-stage linear trolley inverted pendulum is a multivariable, strongly coupled, single-input multiple-output system, so its control is inherently complex. The inverted pendulum system demands highly real-time control. The accuracy achieved by traditional inverted pendulum control theory no longer meets current needs and must be improved. As a classic nonlinear plant, the single-stage inverted pendulum poses a complex nonlinear control problem. In the control process, not only the an...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G05B13/04
CPC: G05B13/042
Inventor: 崔小红, 陈家裕
Owner: CHINA JILIANG UNIV