
Fictitious self-play-based multi-person incomplete information game policy resolving method, device and system as well as storage medium

An incomplete-information, two-player game technology applied in the field of artificial intelligence

Active Publication Date: 2019-11-05
HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL
4 Cites, 19 Cited by

AI Technical Summary

Problems solved by technology

[0005] Many real-world decision-making problems can be abstracted as strategy optimization problems in incomplete-information games. However, current strategy optimization algorithms for incomplete-information games, such as Libratus ("Lengpu Dashi"), can only solve games with two players, discrete actions, and simple states, so they cannot be readily applied to real-world decision-making problems.



Examples


Embodiment Construction

[0024] 1.1 The present invention discloses a method for solving multiplayer incomplete-information game strategies based on fictitious self-play. Taking multiplayer no-limit Texas Hold'em as an example, the present invention is a strategy-solving algorithm for multiplayer no-limit Texas Hold'em. The invention is based on fictitious self-play, combined with deep learning, multi-agent reinforcement learning, and other technologies, and uses Texas Hold'em and the multi-agent particle environment as experimental platforms. Traditional methods for solving the incomplete-information game of Texas Hold'em rely on card abstraction and other domain-specific techniques to reduce the size of the game tree, so they transfer poorly. The present invention introduces the algorithmic framework of fictitious self-play, divides the strategy optimization process of Texas Hold'em into two parts, best-response strategy learning and average strategy learning, and uses imitation learning ...
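
The two-part split described above can be illustrated with a minimal sketch. It assumes the mixing scheme from the published Neural Fictitious Self-Play literature, where an agent plays its best response with a small probability `eta` and its average strategy otherwise; the excerpt here does not state that parameter, and the class name, method names, and stub policies are all hypothetical.

```python
import random


class FictitiousSelfPlayAgent:
    """Illustrative agent that mixes a best-response and an average policy.

    Hypothetical sketch: `eta` (probability of playing the best response)
    is an assumption taken from published NFSP work, not from this patent.
    """

    def __init__(self, best_response, average_policy, eta=0.1, seed=None):
        self.best_response = best_response    # callable: state -> action
        self.average_policy = average_policy  # callable: state -> action
        self.eta = eta
        self.rng = random.Random(seed)
        self.rl_memory = []  # transitions for best-response learning
        self.sl_memory = []  # (state, action) pairs for average-strategy learning

    def act(self, state):
        if self.rng.random() < self.eta:
            action = self.best_response(state)
            # Only best-response actions are imitated by the average policy.
            self.sl_memory.append((state, action))
        else:
            action = self.average_policy(state)
        return action

    def observe(self, state, action, reward, next_state):
        # Every transition is kept for reinforcement learning.
        self.rl_memory.append((state, action, reward, next_state))


# Stub policies standing in for trained networks.
agent = FictitiousSelfPlayAgent(
    best_response=lambda state: "raise",
    average_policy=lambda state: "call",
    eta=0.1,
    seed=0,
)
for step in range(1000):
    chosen = agent.act(step)
    agent.observe(step, chosen, 0.0, step + 1)
```

In a real system the two callables would be a reinforcement-learned network and a supervised network; here they are constants so that the data flow into the two memories is easy to follow.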


Abstract

The invention provides a fictitious self-play-based method, device, and system for solving multiplayer incomplete-information game strategies, as well as a storage medium. The method comprises the following steps: for the two-player game case, generating the average strategy using multinomial logistic regression and reservoir sampling, and generating the best-response strategy using a DQN (Deep Q-Network) and a circular buffer memory; for the multiplayer game case, implementing the best-response strategy using the multi-agent proximal policy optimization (MAPPO) algorithm, while adjusting agent training using multi-agent NFSP (Neural Fictitious Self-Play). The beneficial effects are that a fictitious self-play algorithm framework is introduced; the Texas Hold'em strategy optimization process is partitioned into average strategy learning and best-response strategy learning, which are implemented by imitation learning and deep reinforcement learning respectively; and a more general multi-agent optimal strategy learning method is designed.
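
The abstract names two kinds of memory: reservoir sampling for the average strategy and a circular buffer for the DQN. The sketch below contrasts how each one retains data. It is an illustrative assumption about how such memories are commonly built (classic reservoir sampling and a fixed-length deque), not the patent's implementation; the class names are hypothetical.

```python
import random
from collections import deque


class ReservoirBuffer:
    """Fixed-size uniform sample over every item ever offered."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # total number of items offered so far
        self.rng = random.Random(seed)

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            # The new item is kept with probability capacity / seen,
            # replacing a uniformly chosen stored item.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item


class CircularBuffer:
    """Fixed-size memory that overwrites the oldest entry when full."""

    def __init__(self, capacity):
        self.items = deque(maxlen=capacity)

    def add(self, item):
        self.items.append(item)


reservoir = ReservoirBuffer(capacity=100, seed=0)
ring = CircularBuffer(capacity=100)
for step in range(10_000):
    reservoir.add(step)
    ring.add(step)
```

After the loop, `ring` holds only the 100 most recent steps, while `reservoir` holds 100 steps drawn from the whole history. That difference matches the two roles: a best-response learner wants recent experience, and an average strategy should reflect all past behaviour.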

Description

Technical field

[0001] The present invention relates to the technical field of artificial intelligence, and in particular to a method, device, system, and storage medium for solving multiplayer incomplete-information game strategies based on fictitious self-play.

Background

[0002] Computer game playing and artificial intelligence are inextricably linked, and progress in the former is an important reflection of the development of the latter. Many well-known scholars in the computer field have conducted related research on computer game playing: von Neumann, the father of the computer, and the mathematician Aumann proposed the minimax method for games. Alan Turing, the father of artificial intelligence, provided the theoretical basis for developing computer chess programs. This theory was later used to design the world's first computer chess program on ENIAC. For more than half a century, many major research achievements in the field of computer game playing have been considered important mil...


Application Information

IPC(8): A63F13/67, G06N20/00
CPC: A63F13/67, G06N20/00, A63F2300/6027
Inventor: 王轩, 漆舒汉, 蒋琳, 胡书豪, 毛建博, 廖清, 李化乐, 张加佳, 刘洋, 夏文
Owner: HARBIN INST OF TECH SHENZHEN GRADUATE SCHOOL