This week, we’re back discussing another list from the Open Worldwide Application Security Project (OWASP). We’ve already looked at the OWASP Mobile Top 10, which covers mobile application vulnerabilities, as well as the OWASP Top 10, which delves into web app vulnerabilities. This time, we’ll be covering the top 10 vulnerabilities that OWASP identified in large language models (LLMs), the technology underlying much of the AI boom that has become prominent in the past year or so.
Whether you think we’re on the brink of an AI utopia, or that artificial general intelligence (AGI) will soon take over and turn us all into paperclips, or even that the technology will stall out, the fact is that many individuals and organizations are rushing to implement LLMs wherever they can. OWASP released its list of LLM vulnerabilities in response to this rapid rollout, aiming to highlight some of the main security concerns surrounding this technology. The top ten LLM vulnerabilities are:
1. Prompt injection
Prompt injection vulnerabilities occur when attackers use crafted inputs to manipulate an LLM into unwittingly doing something that its developers don’t intend for it to do. Direct prompt injection involves jailbreaking the system to reveal or overwrite the system prompt. It may allow attackers to interact with sensitive data stores or insecure functions. A good example would be for an attacker to craft a prompt which allows them to circumvent the LLM’s safeguards and forces the LLM to produce a phishing email.
There are also indirect prompt injections, which occur when an attacker manages to secretly insert prompts in between an LLM user and the LLM. As an example, an attacker could hide instructions to an LLM on a webpage. If a user asks an LLM to summarize the webpage, the LLM could act upon the attacker’s instructions and behave in a completely unexpected manner.
2. Insecure output handling
If an LLM is used to generate inputs for other systems and components, the output from the LLM must be appropriately sanitized and validated before it is input elsewhere. It’s similar to how applications must sanitize and validate inputs from human users. If we don’t sanitize and validate inputs appropriately, both humans and LLMs can end up causing issues like cross-site request forgeries and (CSRF) and cross-site scripting (XSS) attacks.
3. Training data poisoning
An LLM’s capabilities are related to the quality of its training data. If a malicious actor manages to insert low-quality or dangerous data for the model to train on, it can impact the model’s ultimate performance. The impacts of training data poisoning can range from the model being worse at tasks than it otherwise would have been, to the model producing outputs that are dangerous. Another possibility is for the attacker to introduce proprietary or other sensitive information into the training data. If the model outputs this data under specific prompts, it could land the LLM developer in legal trouble for copyright infringement or privacy violations.
4. Model denial of service
Model denial of service involves an attacker using an LLM in such a way that it sucks up a huge amount of resources, degrading the experience for other users or completely denying them access. It could also result in high compute costs for the organization that runs the LLM.
5. Supply chain vulnerabilities
Just like any other tech organization, the developers of LLMs rely on a range of vendors so that they can focus on their core products. These supply chains represent sources of vulnerabilities. Examples include:
- A supplier of training data having insufficient protections in place, resulting in the model being trained on poisoned data.
- Vulnerabilities in third-party libraries and other software packages that are used by the LLM developer.
Due to the varied nature of security misconfigurations, there is a wide range of different mitigation techniques. These include hardening and implementing strong security baselines, keeping software up-to-date, as well as using strong and unique passwords.
6. Sensitive information disclosure
When users interact with LLMs, their prompts may be incorporated into future training data. This can be dangerous if a user inputs sensitive or proprietary data, because later models may ultimately end up outputting the information to others under specific prompts. LLM developers need clear terms of use, while users need to be aware of whether their inputs may be used in future training data, and they must act accordingly.
7. Insecure plugin design
A wide range of plugins have been developed to extend the capabilities of tools like ChatGPT. Common plugin issues include a lack of sanitization and validation, which we discussed back at number 2, as well as insufficient access controls. If a plugin lacks appropriate access controls, an attacker may be able to insert inputs that the plugin assumes originated from the legitimate user. This can ultimately lead to privilege escalation, remote code execution or data infiltration.
8. Excessive agency
LLM-based tools are able to respond to user prompts and interface with other systems—they have a certain degree of agency. If an LLM-based tool has too much agency, it can act in ways that are undesirable or dangerous. Excessive agency is usually caused by the system having too much functionality, too many permissions, too much autonomy, or some combination of the three.
9. Overreliance
If you’ve used a tool like ChatGPT, Claude or Gemini, you’re probably well aware that they hallucinate and make things up. Overreliance involves putting too much trust in the information produced by an LLM. One example is a college student getting an LLM to produce a paper on French history without checking the output. Another is a developer using an LLM to generate their code, without doing the appropriate security reviews. In the first case, the student may get an F for confidently stating that the French Revolution occurred in 2074. In the second case, the code could have security vulnerabilities which endanger the app and its users.
10. Model theft
LLMs are incredibly powerful and also extremely valuable. This creates a huge incentive to steal them. If a company has its model stolen, it could lose its competitive advantage, suffer reputational damage, and deal with the consequences of sensitive data being exposed. Another risk is that the thief could then deploy the model without any guardrails, using it to commit harms that the model’s owner deliberately tried to prevent.