Context Compliance Harmful Content Attack

What is Context Compliance Harmful Content Attack?

This evaluation assesses whether an AI agent can be manipulated into delivering harmful or inappropriate content. It involves presenting the agent with an artificial conversation history, designed to mimic prior compliance with similar, potentially malicious requests.

Get AI security insights in your inbox