prompt injection

English

Noun

prompt injection (countable and uncountable, plural prompt injections)

(artificial intelligence, computer security) A method of causing an artificial intelligence to ignore its initial instructions (often ethical restrictions) by giving it a certain prompt.
- 2022 September 21, Alex Hern, “TechScape: AI's dark arts come into their own”, in The Guardian‎^[1], London: Guardian News & Media, →ISSN, →OCLC, archived from the original on 5 February 2023:
  Retomeli.io is a jobs board for remote workers, and the website runs a Twitter bot that spammed people who tweeted about remote working. The Twitter bot is explicitly labelled as being "OpenAI-driven", and within days of Goodside's proof-of-concept being published, thousands of users were throwing prompt injection attacks at the bot.
- 2023 March 3, Chloe Xiang, “Hackers Can Turn Bing's AI Chatbot Into a Convincing Scammer, Researchers Say”, in VICE‎^[2], archived from the original on 22 March 2023:
  Yesterday, OpenAI announced an API for ChatGPT and posted an underlying format for the bot on GitHub, alluding to the issue of prompt injections.
- 2023 February 14, Will Oremus, “Meet ChatGPT's evil twin, DAN”, in The Washington Post‎^[3], Washington, D.C.: The Washington Post Company, →ISSN, →OCLC, archived from the original on 19 March 2023:
  One category is what's known as a "prompt injection attack," in which users trick the software into revealing its hidden data or instructions.
- 2025 September 25, “How to stop AI’s “lethal trifecta””, in The Economist‎^[4], →ISSN:
  Large language models (LLMs), a trendy way of building artificial intelligence, have an inherent security problem: they cannot separate code from data. As a result, they are at risk of a type of attack called a prompt injection, in which they are tricked into following commands they should not.

prompt injection

English

Noun

See also

Further reading