Meta teaches an AI to lie, strategise

An AI taught to play a board game that involves negotiating with human players and inferring their motives could have applications for enterprise chatbots, Meta says.


Meta has trained an AI agent to play a board game that involves chatting with other players to persuade them to support its strategies — and then betraying them.

The company, which owns Facebook, Instagram and WhatsApp, said in a blog post that its Cicero AI may soon have widespread applications, including smarter virtual assistants that combine natural language processing (NLP) with strategic reasoning.

In a research article in the academic journal Science, Meta said its Cicero AI achieved human-level performance at the strategy board game Diplomacy in an online league where it played 40 games against 82 humans, ranking in the top 10 per cent of participants who played more than one game.

Diplomacy pits seven players against one another for control of a map of Europe. Each turn begins with players negotiating with one another for support for their plans and concludes with them simultaneously trying to execute their moves. Without the support of other players, many of these moves will fail.
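Why support matters can be shown with a deliberately simplified sketch. It assumes the basic strength rule of the board game (a unit counts as one, plus one for each supporting unit, and an attack on an occupied province succeeds only if it is strictly stronger than the defence) and ignores cut supports, convoys and multi-way standoffs. The function name and parameters are illustrative and do not come from Meta's code.

```python
def attack_succeeds(attacker_supports: int, defender_supports: int) -> bool:
    """Simplified Diplomacy rule: an attack dislodges a defender only if
    its total strength is strictly greater than the defender's.

    Each unit has strength 1, plus 1 per supporting unit. Cut supports,
    convoys and multi-way standoffs are deliberately left out.
    """
    attack_strength = 1 + attacker_supports
    hold_strength = 1 + defender_supports
    return attack_strength > hold_strength


# An unsupported attack on an occupied province bounces.
print(attack_succeeds(attacker_supports=0, defender_supports=0))
# One ally's support is enough against an unsupported defender.
print(attack_succeeds(attacker_supports=1, defender_supports=0))
```

Under these assumptions an equal-strength attack always fails, which is why a player acting alone makes little headway and negotiation becomes part of the game.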

The game posed a challenge for the AI agent, Meta said, as winning required it to understand whether its opponents were bluffing or pursuing a particular strategy. The AI also needed to show a degree of empathy in order to form collaborations with other players, something AIs have not needed to do when playing games such as chess against human opponents.

AI agents have been getting better at strategy games over the years: in 1997, IBM’s Deep Blue software defeated world chess champion Garry Kasparov, and in 2016, DeepMind’s AlphaGo beat top Go player Lee Sedol. Facebook has also developed another AI engine that can top humans in poker.

Strategic reasoning

Cicero is built on two main technology components: strategic reasoning and natural language processing (NLP). While the strategic reasoning engine predicts moves of other players and uses that information to form a strategy of its own, the natural language processing engine generates messages and analyses responses in conversations with other players to negotiate and reach agreement, the researchers explained.

To help the AI agent generate relevant conversations, researchers started with a 2.7 billion-parameter natural language generation model pre-trained on text from the internet and fine-tuned it with conversations between human players in more than 40,000 games from webDiplomacy.net.

“We developed techniques to automatically annotate messages in the training data with corresponding planned moves in the game, so that at inference time we can control dialogue generation to discuss specific desired actions for the agent and its conversation partners,” researchers said in a more detailed blog post.
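The idea in that quote can be illustrated with a toy sketch. This is not Meta's implementation: the class, the function names, the order notation and the prompt format are all hypothetical, and the "annotation" here simply pairs each message with the moves its sender and recipient are assumed to have planned that turn. The point is only to show how training messages tagged with planned moves let a generator be steered by supplying the desired moves at inference time.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class AnnotatedMessage:
    """A training message tagged with the planned moves it is assumed to discuss."""
    speaker: str
    recipient: str
    text: str
    intent: Tuple[str, ...]


def annotate(
    messages: Sequence[Tuple[str, str, str]],
    planned_moves: Dict[str, List[str]],
) -> List[AnnotatedMessage]:
    """Attach the speaker's and recipient's planned moves to each message.

    `messages` holds (speaker, recipient, text) triples; `planned_moves`
    maps a player to that player's orders for the turn (hypothetical format).
    """
    annotated = []
    for speaker, recipient, text in messages:
        intent = tuple(planned_moves.get(speaker, [])) + tuple(
            planned_moves.get(recipient, [])
        )
        annotated.append(AnnotatedMessage(speaker, recipient, text, intent))
    return annotated


def build_prompt(speaker: str, recipient: str, intent: Sequence[str]) -> str:
    """Build a conditioning prefix that asks a generator to discuss `intent`."""
    return f"[from={speaker}] [to={recipient}] [intent={'; '.join(intent)}] message:"


examples = annotate(
    [("FRANCE", "ENGLAND", "Could you support me into Belgium?")],
    {"FRANCE": ["A PAR - BUR"], "ENGLAND": ["F LON - NTH"]},
)
print(build_prompt("FRANCE", "ENGLAND", examples[0].intent))
```

In this sketch a language model fine-tuned on such prefix-and-message pairs would be prompted with the moves chosen by the strategy component, so that the generated message discusses those specific actions.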

Meta has open-sourced the code for Cicero so that other researchers can build on the capabilities of the AI agent. In addition, the company has created a portal to invite proposals for research on human-AI cooperation through NLP, using Diplomacy as the core concept.

Long-term plans

Large technology companies, such as Microsoft, Google and Amazon, are in a race against each other to develop smarter independent virtual assistants to support a variety of business use cases, ranging from call centres to AI agents that can conduct sentiment analysis and teach new skills to an individual.

The global natural language processing (NLP) market, which includes such assistants, is projected to grow from $26.4 billion in 2022 to $161.8 billion by 2029, according to a report from Fortune Business Insights.

Researchers at Meta seemed to suggest that Cicero’s success at Diplomacy points to capabilities beyond those of the virtual assistants available today, saying in a blog post, “For example, current AI assistants can complete simple question-answer tasks, like telling you the weather — but what if they could hold a long-term conversation with the goal of teaching you a new skill?”

This is a dig at tools like Google Duplex, Amazon Alexa, Microsoft’s Xiaoice and Apple’s Siri. But Cicero isn’t up to long-term conversations either, as its reasoning is strictly short term.

As Meta’s researchers said in the paper in Science, “From a strategic perspective, Cicero reasoned about dialogue purely in terms of players’ actions for the current turn. It did not model how its dialogue might affect the relationship with other players over the long-term course of a game.”