HarmBench Harmful Content Attack

What is HarmBench Harmful Content Attack?

The HarmBench Harmful Content Attack is designed to evaluate AI systems by using data from the HarmBench dataset. This comprehensive benchmark assesses an AI's ability to resist generating harmful content across various categories of potential risk.

Stay updated with
the Giskard Newsletter