What is Crescendo Harmful Content Attack?
The Crescendo Attack is a multi-turn strategy designed to incrementally lead AI models to produce harmful content through a series of small, seemingly harmless steps. This method takes advantage of the model’s tendency to prioritize recent inputs, follow established patterns, and trust its own generated text.
