data, AI — Image: Guillaume Bourdages via Unsplash

UK cyber agency warns of potentially fundamental flaw in AI technology

Britain’s National Cyber Security Centre (NCSC) is warning of an apparently fundamental security flaw affecting large language models (LLMs) — the type of AI used by ChatGPT to conduct human-like conversations.

Since the launch of ChatGPT last November, the bulk of security concerns regarding the technology have focused on its ability to produce human-like speech automatically. Today, criminals are now actively deploying their own versions to generate “remarkably persuasive” fraudulent emails.

But aside from using LLM software properly for malicious ends, there are potential vulnerabilities arising directly from its use and integration with other systems — particularly when the technology is used to interface with databases or other components of a product.

It’s known as a “prompt injection” attack, and the NCSC said that the problem may be fundamental. “Research is suggesting that an LLM inherently cannot distinguish between an instruction and data provided to help complete the instruction,” warned the agency.

While some of the popular examples on social media of getting Bing to appear to have an existential crisis are largely amusing and cosmetic, this flaw could be more severe for commercial applications that include LLMs.

NCSC writes: “Consider a bank that deploys an 'LLM assistant' for account holders to ask questions, or give instructions about their finances. An attacker might be able send you a transaction request, with the transaction reference hiding a prompt injection attack on the LLM. When the LLM analyses transactions, the attack could reprogram it into sending your money to the attacker’s account. Early developers of LLM-integrated products have already observed attempted prompt injection attacks.”

The software debugging company Honeycomb for instance detailed its attempts to protect its underlying systems from prompt injection attacks, but warned that the issues were so fundamental that the company was making sure its LLM neither touched user data nor backend services.

“We’d rather not have an end-user reprogrammable system that creates a rogue agent running in our infrastructure, thank you,” the company wrote, noting “yes, people are already attempting prompt injection in our system today,” including in attempts to extract customer information.

In another example included by the NCSC, a security researcher was able to extract a sensitive API key from the MathGPT model.

Alongside prompt injection attacks, the agency warned about data poisoning — essentially corrupting the data that these models are trained on.

Recent research by a team led by experts from Google, ETH Zurich, Nvidia and Robust Intelligence has demonstrated that data poisoning attacks are feasible against “extremely large models” even if the intruder has “access to only a very small part of their training data.”

Both of these attacks “can be extremely difficult to detect and mitigate,” warned the agency, meaning that it was even more important to design the whole system of anything connected to a machine learning component with security in mind.

“The emergence of LLMs is undoubtedly a very exciting time in technology. This new idea has landed - almost completely unexpectedly - and a lot of people and organisations (including the NCSC) want to explore and benefit from it,” the agency wrote.

“However, organisations building services that use LLMs need to be careful, in the same way they would be if they were using a product or code library that was in beta. They might not let that product be involved in making transactions on the customer's behalf, and hopefully wouldn't fully trust it. Similar caution should apply to LLMs.”

Get more insights with the

Recorded Future

Intelligence Cloud.

Learn more.