Lengpudashi, the “Cold Poker Master,” arrived in Hainan to face the top Texas Hold’em players in China for a USD 290,000 winner-takes-all purse. His opponents were serious players – Team Dragons was led by Yue (Alan) Du, a venture capitalist and amateur player who was the first Chinese mainlander to win a World Series of Poker bracelet. But none of them was a match for Lengpudashi, an artificial intelligence bot developed by scientists at Carnegie Mellon University.
The event, organized by venture capitalist and former Google Greater China president Kai-fu Lee, followed an earlier showdown in Pittsburgh, Pennsylvania in January, where Lengpudashi’s predecessor, Libratus, beat four of the world’s top poker professionals.
Unlike Google AlphaGo’s triumph over 18-time world champion Lee Sedol last year, this event drew little fanfare. But the implications of the AI’s victory are just as interesting – computers have successfully learnt how to bluff, something previously thought to be uniquely human.
“Poker has been one of the hardest games for AI to crack, because of the large state space and the fact that you see only partial information,” said Andrew Ng, formerly a top AI expert at Google and Baidu. “There is no well-defined optimal move, but instead the player has to randomize its actions.”
Unlike chess and Go, where players can see everything on the board, in poker some information is hidden from the players – what game theorists call imperfect information. Players cannot directly judge whether a move is good or bad, so they have to rely on complex betting strategies – bluffing – and what we call intuition. Until now, scientists thought it would be hard for an AI to read a bluff or predict a strategy the way humans do.
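Why must a poker bot randomize at all? The principle shows up even in a toy game like matching pennies – far simpler than poker, but the logic is the same. A minimal Python sketch (illustrative only, not how Libratus works): any deterministic strategy can be exploited for a guaranteed loss, while a 50/50 mix guarantees at least a break-even expectation.

```python
# Matching pennies: player 1 wins +1 if both coins match, loses -1 otherwise.
# Toy illustration of why randomization is essential: a fixed choice is
# fully exploitable, while the 50/50 mixed strategy cannot be exploited.

CHOICES = ("heads", "tails")

def p1_payoff(p1_choice, p2_choice):
    """Payoff to player 1: +1 on a match, -1 on a mismatch."""
    return 1 if p1_choice == p2_choice else -1

# If player 1 always plays "heads", player 2's best response ("tails")
# wins every round, so player 1's guaranteed payoff drops to -1.
deterministic_value = min(p1_payoff("heads", b) for b in CHOICES)

# If player 1 mixes 50/50, no response by player 2 does better than
# break even, so player 1's guaranteed payoff rises to 0.
def value_vs_mix(p2_choice):
    return 0.5 * p1_payoff("heads", p2_choice) + 0.5 * p1_payoff("tails", p2_choice)

mixed_value = min(value_vs_mix(b) for b in CHOICES)

print(deterministic_value, mixed_value)  # -1 0.0
```

In poker the same reasoning applies to bluffing: a player who bluffs on a predictable schedule is exploitable, so the unexploitable strategy mixes bluffs in at random.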
“I am very excited to take this new kind of AI technology to China,” Tuomas Sandholm, the Carnegie Mellon computer science professor who created Libratus and Lengpudashi with Ph.D. student Noam Brown, said in a statement. “I want to explore various commercial opportunities for this in poker and a host of other application areas, ranging from recreational games to business strategy to strategic pricing to cybersecurity and medicine.”
Sandholm also explained that the AI did not learn by mimicking human poker players or analyzing historical data, but from game theory. The machine relied mainly on a trial-and-error form of AI known as reinforcement learning. Unlike AlphaGo, which analyzed millions of games played by human players, Libratus began from zero, learning from its own mistakes and developing its own strategies. The AI still needed some help from humans – its creators helped it randomize its bluffs, making it harder for opponents to read.
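The trial-and-error idea can be sketched with regret matching, a simple game-theoretic learning rule, shown here on rock–paper–scissors for brevity (Libratus’s actual algorithms, built on counterfactual regret minimization, are far more elaborate – this is only an illustration). Two copies of the learner play each other; each raises the probability of the actions it regrets not having taken, and their long-run average strategies drift toward the unexploitable uniform mix.

```python
import random

# Regret matching in self-play on rock-paper-scissors: a sketch of
# game-theoretic trial-and-error learning, not Libratus itself.
# Each player tracks how much it "regrets" not having played each action
# and plays in proportion to positive regret; the time-averaged strategy
# approaches the Nash equilibrium (1/3, 1/3, 1/3).

N = 3  # actions: 0=rock, 1=paper, 2=scissors
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # row action vs column action

def strategy_from_regrets(regrets):
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    return [p / total for p in positive] if total > 0 else [1.0 / N] * N

def sample(strategy, rng):
    r, acc = rng.random(), 0.0
    for action, prob in enumerate(strategy):
        acc += prob
        if r < acc:
            return action
    return N - 1

def train(iterations, seed=1):
    rng = random.Random(seed)
    regrets = [[0.0] * N, [0.0] * N]
    strategy_sums = [[0.0] * N, [0.0] * N]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(regrets[p]) for p in (0, 1)]
        actions = [sample(strategies[p], rng) for p in (0, 1)]
        for p in (0, 1):
            opponent_action = actions[1 - p]
            realized = PAYOFF[actions[p]][opponent_action]
            for a in range(N):
                # Regret = what playing `a` would have earned minus what we got.
                regrets[p][a] += PAYOFF[a][opponent_action] - realized
                strategy_sums[p][a] += strategies[p][a]
    total = sum(strategy_sums[0])
    return [s / total for s in strategy_sums[0]]  # player 0's average strategy

avg = train(100_000)
print([round(p, 2) for p in avg])  # each entry should be near 1/3
```

The learner is never shown a human game: it discovers the balanced strategy purely from its own accumulated regrets, the same “start from zero” spirit Sandholm describes.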
In the future, an AI that can crack games with imperfect information could become a valuable asset for predicting opponents’ moves and planning strategies, especially in economics and diplomacy. With stakes like these, we are betting that investments in game-playing AI will only rise.
The next big human-AI showdown will be held in May, when AlphaGo faces the world’s number one Go player, Ke Jie.
(Top photo from Baidu Images.)