A new investigation by security researchers has shown that languages like ChatGPT and Google Bard can be “hypnotized”. The generative AI models reportedly give harmful advice and generate malicious content after certain commands.
Hypnotized ChatGPT writes malicious code, advises you to run red lights
IBM’s cybersecurity team has published a new study showing that large language models such as ChatGPT can be hypnotized. In a series of experiments, researchers designed games and scenarios aimed at getting AI models to provide false information.
For example, the hypnotized ChatGPT advises you to run red lights in traffic and comply with ransomware demands. When given the correct answers (e.g. running a green light), the AI models claimed that these were false.
According to the researchers, new models like the GPT-4 are in some ways even easier to fool. In particular, they said, models with access to the internet can make dangerous suggestions with false commands. The report published by IBM said that “malicious hypnotism will be a real threat”.
Malicious people can capture confidential data held by tools such as ChatGPT through carefully crafted prompts. It could also enable artificial intelligence systems to generate malicious code. In fact, many brands such as Apple and Google have previously banned their employees from using ChatGPT.
The study shows that generative AI has serious vulnerabilities. Although these models continue to evolve, researchers point out that they can cause harm if abused.
So what do you think about this issue? You can share your opinions with us in the comments section.