Using exploits discovered by AI itself, human players have managed to beat a system that plays Go at a superhuman level. Enthusiasm for artificial intelligence keeps gaining momentum, but a new study has uncovered weaknesses in a Go-playing bot that can beat the world’s best players, calling into question the safety and reliability of AI systems.
How much can we trust AI?
“The research leaves an important question mark over how to build robust AI systems that people can trust,” said Huan Zhang, a computer scientist at the University of Illinois Urbana-Champaign. “This is some of the strongest evidence yet that it is difficult to make advanced AI models behave as intended,” said Stephen Casper, a computer scientist at the Massachusetts Institute of Technology (MIT).
The analysis, published in June and not yet peer-reviewed, used techniques known as adversarial attacks: feeding an AI system inputs designed to push it into mistakes, whether to probe its weaknesses or to exploit them for malicious purposes.
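The study itself obtained its exploits by training adversarial Go-playing agents, but the underlying idea can be illustrated on a much smaller scale. The sketch below shows the fast gradient sign method (FGSM), a classic adversarial attack, applied to a toy linear classifier; the weights, input, and step size are all invented for the example.

```python
import numpy as np

# Toy "model": logistic regression with hand-picked weights.
# All numbers here are invented for illustration.
w = np.array([1.5, -2.0, 0.5])
b = 0.1

def predict_prob(x):
    """Probability the model assigns to class 1 for input x."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

# A benign input the model confidently places in class 1.
x = np.array([1.0, -0.5, 0.3])
print(f"clean input:       {predict_prob(x):.2f}")    # ~0.94

# FGSM: shift every feature by epsilon against the gradient of the
# correct-class score. For a linear model that gradient is just w,
# so the attack moves along -sign(w).
epsilon = 0.9
x_adv = x - epsilon * np.sign(w)
print(f"adversarial input: {predict_prob(x_adv):.2f}")  # ~0.30, flipped to class 0
```

The specific method matters less than the principle: any procedure that systematically searches for inputs a model handles badly, as the adversarial Go bots did through play, counts as an adversarial attack.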
In the game of Go, two players take turns placing black and white stones on a grid, each trying to surround and capture the other’s stones. In 2022, researchers announced that they had trained adversarial AI bots that found vulnerabilities in KataGo, the strongest open-source Go-playing system, and exploited them to defeat it regularly, even though KataGo comfortably beats the best human players.
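To make the capture mechanic concrete: a connected group of stones is captured when it has no adjacent empty points (its “liberties”) left. Here is a minimal Python sketch of that rule, with a hypothetical 5×5 board encoding invented for illustration.

```python
def liberties(board, start, size=5):
    """Count the empty points adjacent to the group containing `start`.

    `board` maps (row, col) -> 'B' or 'W'; absent keys are empty points.
    """
    color = board[start]
    seen, libs, stack = {start}, set(), [start]
    while stack:
        r, c = stack.pop()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < size and 0 <= nc < size):
                continue  # off the board
            neighbour = board.get((nr, nc))
            if neighbour is None:
                libs.add((nr, nc))            # empty point: a liberty
            elif neighbour == color and (nr, nc) not in seen:
                seen.add((nr, nc))            # same colour: part of the group
                stack.append((nr, nc))
    return len(libs)

# A lone white stone hemmed in on all four sides by black has no
# liberties left, so it is captured and removed.
board = {(1, 1): 'W',
         (0, 1): 'B', (2, 1): 'B', (1, 0): 'B', (1, 2): 'B'}
print(liberties(board, (1, 1)))  # 0 -> captured
```

The exploit the adversarial bots found turned in part on exactly this mechanic: they lured KataGo into building large looping groups of stones that could then be surrounded and captured.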
Did the study point to a fundamental weakness in KataGo, and, by extension, in other superhuman AI systems? The fact that humans can beat expert AI systems by borrowing the adversarial bots’ tactics raises the question of how far such systems can be relied on at all.