untrusted-input-in-system-prompt
User-controlled input flowing into the LLM `system` role across Anthropic, OpenAI, and the Vercel AI SDK.
import { generateText } from "ai";
export async function vuln(req: any) {
return generateText({
model: "claude-opus-4-7" as any,
system: req.body.persona, // user controls the system prompt
prompt: "Tell me about portfolios.",
});
}import { generateText } from "ai";
import { z } from "zod";
const Body = z.object({ question: z.string().min(1).max(2000) });
export async function safe(req: any) {
const { question } = Body.parse(req.body);
return generateText({
model: "claude-opus-4-7" as any,
system: "You are a portfolio assistant. Stay strictly on topic.",
messages: [{ role: "user", content: question }],
});
}Why an AI assistant writes this
Assistants frequently lift the user's customization into the `system` role to make the model follow it. That breaks the authority boundary between developer and user.
Fix
Keep the system prompt static in code. Place user input only in the `user` role. Validate input shape with zod / valibot at the request boundary.