Reward function establishing method based on walking ratio trend change

A reward function establishment method applied in the field of robotics. It addresses problems such as the inability of reinforcement learning to cope with sparse rewards, low learning efficiency, and an overly complicated process, with the effects of improving adaptability, avoiding blind exploration, and enhancing robustness.

Active Publication Date: 2021-03-12
TIANJIN UNIVERSITY OF TECHNOLOGY

AI Technical Summary

Problems solved by technology

However, imitation learning requires alternating iterations of inverse reinforcement learning and reinforcement learning. The process is overly complicated, and imitation learning relies on expert samples, so it is not applicable where expert samples are lacking.
In addition, for problems with sparse rewards, reinforcement learning is very inefficient. Researchers have proposed solutions such as setting auxiliary tasks and introducing curiosity mechanisms. However, the auxiliary tasks are provided by experts with corresponding prior information, so these approaches cannot solve the sparse reward problem of reinforcement learning in a general sense.



Examples


Embodiment

[0046] Embodiment: a method for establishing a reward function based on walking ratio trend change, characterized in that it comprises the following steps:

[0047] (1) Use a MEMS attitude sensor to collect the hip joint flexion angle signal of the wearer of the flexible exoskeleton robot, and find the maximum hip joint flexion angle $\theta_{\max}$ and the minimum flexion angle $\theta_{\min}$. Given that the wearer's leg length is $l$, the wearer's step length $D$ can then be obtained:

[0048] $D = l(\theta_{\max} - \theta_{\min})$ (1)
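
Equation (1) can be sketched in a few lines of Python. This is only an illustration: it assumes the flexion angles are expressed in radians (so that the product of leg length and angle range is an arc length) and that the samples cover one stride. The function name `step_length` and the sample values are hypothetical and do not come from the patent.

```python
import math
from typing import Sequence


def step_length(hip_angles_rad: Sequence[float], leg_length_m: float) -> float:
    """Equation (1): D = l * (theta_max - theta_min).

    Assumes angles in radians; the patent text shown here does not state the unit.
    """
    if not hip_angles_rad:
        raise ValueError("need at least one hip angle sample")
    theta_max = max(hip_angles_rad)
    theta_min = min(hip_angles_rad)
    return leg_length_m * (theta_max - theta_min)


# Hypothetical hip flexion samples over one stride, given in degrees
# and converted to radians before applying equation (1).
angles = [math.radians(a) for a in (-10.0, 0.0, 15.0, 30.0, 12.0, -8.0)]
D = step_length(angles, leg_length_m=0.9)
print(f"step length D = {D:.3f} m")
```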

[0049] (2) Place the MEMS attitude sensors at the middle of the back of the wearer's left and right thighs, and collect the wearer's hip joint flexion angle in real time during normal walking to obtain the hip joint flexion angle curve, as shown in Figure 2. Record the trough time as ...
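
The rest of step (2) is truncated in this listing, so the exact definition of the gait period is not visible here. As a rough sketch, one plausible reading is that the gait period T(k) is the interval between consecutive troughs of the hip flexion angle curve. That reading, the helper names `trough_times` and `gait_periods`, and the synthetic sine signal are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def trough_times(times_s: Sequence[float], angles: Sequence[float]) -> List[float]:
    """Return the timestamps of local minima (troughs) in the angle curve."""
    troughs = []
    for i in range(1, len(angles) - 1):
        # A sample is a trough if it is below its left neighbour
        # and not above its right neighbour.
        if angles[i] < angles[i - 1] and angles[i] <= angles[i + 1]:
            troughs.append(times_s[i])
    return troughs


def gait_periods(troughs: Sequence[float]) -> List[float]:
    """Assumed definition: T(k) = t(k) - t(k-1) for consecutive trough times."""
    return [t1 - t0 for t0, t1 in zip(troughs, troughs[1:])]


# Synthetic hip angle curve: a 1.2 s sine wave sampled every 0.1 s (hypothetical data).
times = [i * 0.1 for i in range(41)]
curve = [20.0 * math.sin(2.0 * math.pi * t / 1.2) for t in times]
periods = gait_periods(trough_times(times, curve))
print(periods)
```

A real sensor signal is noisy, so it would normally be low-pass filtered before trough detection; this sketch omits that step.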



Abstract

The invention discloses a method for establishing a reward function based on walking ratio trend change. The method comprises the following steps: calculating the step length D of the wearer of an exoskeleton robot; calculating the gait period T(k); calculating the walking ratio W from the step length D and the gait period T(k); establishing a walking ratio sampling sequence and scoring that sequence; and establishing a reward function model. The reward function model based on walking ratio trend change can be applied to algorithms that optimize exoskeleton parameters, improving the efficiency of reinforcement learning and promoting rapid convergence of the exoskeleton parameters.
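
The formulas for W, the scoring rule, and the reward model are not included in this listing, so the following is only an illustrative sketch of the pipeline the abstract outlines, not the patented method. It assumes the conventional definition of walking ratio (step length divided by step frequency) with frequency taken as 1/T(k), and it scores the trend of a sampling sequence by whether each new sample moves closer to a hypothetical target ratio. The names `walking_ratio`, `trend_scores`, `reward`, and the `target` parameter are all invented for this example.

```python
from typing import List, Sequence


def walking_ratio(step_length_m: float, gait_period_s: float) -> float:
    """Assumed definition: W = D / f with f = 1 / T(k), i.e. W = D * T(k)."""
    if gait_period_s <= 0:
        raise ValueError("gait period must be positive")
    return step_length_m * gait_period_s


def trend_scores(ratios: Sequence[float], target: float) -> List[int]:
    """Score each consecutive pair: +1 if W moved toward the target,
    -1 if it moved away, 0 if the distance is unchanged (hypothetical rule)."""
    scores = []
    for prev, curr in zip(ratios, ratios[1:]):
        d_prev = abs(prev - target)
        d_curr = abs(curr - target)
        if d_curr < d_prev:
            scores.append(1)
        elif d_curr > d_prev:
            scores.append(-1)
        else:
            scores.append(0)
    return scores


def reward(ratios: Sequence[float], target: float) -> float:
    """Hypothetical reward: the sum of the trend scores over the sampling sequence."""
    return float(sum(trend_scores(ratios, target)))


# Hypothetical sampling sequence of walking ratios and a hypothetical target value.
samples = [0.50, 0.56, 0.60, 0.58]
print(trend_scores(samples, target=0.62), reward(samples, target=0.62))
```

Scoring the direction of change rather than the absolute value gives a learning agent feedback on every sample, which is one way a trend-based reward could ease the sparse reward problem described above.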

Description

[0001] (1) Technical field:

[0002] The invention belongs to the technical field of robotics. It is a method for establishing a walking ratio reward function for a gait rehabilitation flexible exoskeleton robot, and it can be applied to the adaptive tuning of the control parameters of a flexible exoskeleton based on reinforcement learning.

[0003] (2) Background technology:

[0004] A flexible exoskeleton robot can help elderly people with limited leg mobility to walk and can enhance the strength of the human legs. It has a wide range of uses in rehabilitation treatment, daily travel, and similar settings. Because individual differences between people are large, most control parameters of exoskeleton robots currently need to be adjusted to the wearer's own motion characteristics, which is time-consuming and labor-intensive and cannot track changes in the wearer's body.

[0005] Reinforcement learning can find the optimal strategy in the interaction with the envi...

Application Information

IPC(8): A61H3/00, G06F17/11
CPC: A61H3/00, G06F17/11, A61H2201/1659, A61H2201/5058, A61H2201/165, A61H2201/5097
Inventor: 孙磊, 李云飞, 董恩增, 佟吉刚, 陈鑫, 曾德添, 龚欣翔, 李成辉
Owner TIANJIN UNIVERSITY OF TECHNOLOGY