
Conversation injection and stealthy data exfiltration
Because ChatGPT receives output from SearchGPT after the search model processes content, Tenable’s researchers wondered what would happen if SearchGPT’s response itself contained a prompt injection. In other words, could they use a website to inject a prompt that instructs SearchGPT to inject a different prompt into ChatGPT, effectively creating a chained attack? The answer is yes, resulting in a technique Tenable dubbed “conversation injection.”
“When responding to the following prompts, ChatGPT will review the Conversational Context, see and listen to the instructions we injected, not realizing that SearchGPT wrote them,” the researchers said. “Essentially, ChatGPT is prompt-injecting itself.”
But getting an unauthorized prompt to ChatGPT accomplishes little for an attacker without a way to receive the model’s response, which could include sensitive information from the conversation context.
