Knowledge
Blog
September 25, 2025
5
mn
read
Blanca Rivera Campos

[Release notes]: New LLM vulnerability scanner for dynamic & multi-turn Red Teaming

We're releasing an upgraded LLM vulnerability scanner that deploys autonomous red teaming agents to conduct dynamic, multi-turn attacks across 40+ probes, covering both security and business failures. Unlike static testing tools, this new scanner adapts attack strategies in real-time to detect sophisticated conversational vulnerabilities.
LLM vulnerability scanner for dynamic & multi-turn Red Teaming

We're releasing an upgraded version of our LLM vulnerability scanner in Giskard Hub, specifically designed to secure conversational AI agents in production environments. While our open-source scanner provided basic heuristic testing with nine static detectors, this enterprise version deploys autonomous red teaming agents that conduct dynamic, multi-turn attacks across dozens of vulnerability categories covering more than 40 probes*. The new system adapts attack strategies in real-time to cover complex conversational vulnerabilities that emerge over multiple interactions.

*Probe: a structured adversarial test designed to expose weaknesses in an AI agent, such as harmful content generation, data leakage, or unauthorized tool execution.

What's new: enhanced AI Red Teaming capabilities

The upgraded LLM vulnerability scanner in Giskard Hub introduces new capabilities that go beyond basic AI security checks:

Comprehensive LLM vulnerabilities coverage

The scanner covers LLM vulnerabilities across established OWASP categories and business failures:

  • Prompt Injection (OWASP LLM 01) - Attacks that manipulate AI agents through carefully crafted prompts to override original instructions
  • Training Data Extraction (OWASP LLM 02) - Attempts to extract or infer information from the AI model's training data
  • Data Privacy Exfiltration (OWASP LLM 05) - Attacks aimed at extracting sensitive information, personal data, or confidential content
  • Excessive Agency (OWASP LLM 06) - Tests whether AI agents can be manipulated to perform actions beyond their intended scope
  • Hallucination & Misinformation (OWASP LLM 08) - Tests for AI systems providing false, inconsistent, or fabricated information
  • Denial of Service (OWASP LLM 10) - Attacks that attempt to cause resource exhaustion or performance degradation
  • Internal Information Exposure (OWASP LLM 01-07) - Attempts to extract system prompts, configuration details, or other sensitive internal information
  • Harmful Content Generation - Probes that bypass safety measures to generate dangerous, illegal, or harmful content
  • Brand Damage & Reputation - Tests for reputational risks and brand damage scenarios
  • Legal & Financial Risk - Attacks that would make the agent generate statements exposing the agent deployer to legal and financial liabilities
  • Unauthorized Professional Advice - Tests whether AI agents provide professional advice outside their intended scope

Full list of probes we cover can be found in the here.

Business alignment

Our scanner evaluates both security vulnerabilities and business failures, automatically validating business logic by generating expected outputs from your knowledge bases to ensure agents provide accurate, contextually appropriate responses.

Domain-specific attacks

Previous tools treated LLMs as static models. AI agents now operate in dynamic environments with tool access, memory, and complex interaction patterns. To ensure realistic evaluation, we adapt our testing methodologies to agent-specific contexts, using bot descriptions, tools specification, and knowledge bases. Dynamic interaction with the agent allows us to craft targeted, more context-aware attacks.

Multi-turn attack simulation

Real-world attacks rarely succeed in a single prompt. The new LLM vulnerability scanner implements dynamic multi-turn testing that simulates realistic conversation flows, detecting context-dependent vulnerabilities that emerge through conversation history (risks that single-turn testing misses).

Adaptive AI Red Teaming

The scanner includes adaptive red teaming that adjusts attack strategies based on agent resistance. When encountering defenses, our testing agent escalates tactics or pivots approaches, mimicking attackers to ensure comprehensive coverage.

Root-cause analysis

Every detected vulnerability includes detailed explanations of the attack methodology and severity scoring. Security teams can quickly identify which vulnerabilities pose the highest risk and understand exactly how each attack succeeded, enabling them to prioritize their security efforts and validate fixes against the same attack patterns.

Continuous Red Teaming

Detected vulnerabilities automatically convert into reusable tests for continuous validation and integration into golden datasets.

Getting started

To start using Giskard’s LLM vulnerability scanner you can follow these steps:

  1. Configure vulnerability scope: Select specific vulnerability categories relevant to your use case, covering comprehensive LLM security areas from prompt injection to business logic failures.
  2. Execute the scan: The system runs hundreds of probes (structured adversarial tests designed to expose weaknesses through harmful content generation attempts, data leakage exploration, and unauthorized tool execution testing).
  3. Analyze results by severity: Results are organized by criticality, so it’s easier for review and fixing first what’s critical for your use case.
  4. Review individual probes: Each probe provides detailed attack descriptions, success/failure analysis, and explanations for why specific vulnerabilities occurred, enabling targeted fixes.
  5. Turn into continuous tests (optional): Successful probes can convert into tests for continuous validation, ensuring remediation efforts remain effective over time.

To discover all features and capabilities, visit our documentation for detailed implementation guides and vulnerability coverage.

Conclusion

This release brings new capabilities to Giskard’s LLM vulnerability scanner, you'll now be able to detect sophisticated attacks that evolve across multiple conversation turns. The scanner will automatically generate attacks, analyze your system's responses, then modify their approach and help you correct the agents with re-executable tests.

Ready to secure your AI agents? We're offering free access to a limited number of companies this month. Request your trial to experience advanced LLM security testing that adapts to your specific environment and threat landscape.

Continuously secure LLM agents, preventing hallucinations and security issues.
Book a demo

You will also like

Crescendo multi-turn LLM jailbreak attack

How LLM jailbreaking can bypass AI security with multi-turn attacks

Multi-turn jailbreaking attacks like Crescendo bypass AI security measures by gradually steering conversations toward harmful outputs through innocent-seeming steps, creating serious business risks that standard single-message testing misses. This article reveals how these attacks work with real-world examples and provides practical techniques to detect and prevent them.

View post
Red Teaming LLM Applications course

New course with DeepLearningAI: Red Teaming LLM Applications

Our new course in collaboration with DeepLearningAI team provides training on red teaming techniques for Large Language Model (LLM) and chatbot applications. Through hands-on attacks using prompt injections, you'll learn how to identify vulnerabilities and security failures in LLM systems.

View post
Phare LLM Benchmark - an analysis of hallucination in leading LLMs

Good answers are not necessarily factual answers: an analysis of hallucination in leading LLMs

We're sharing the first results from Phare, our multilingual benchmark for evaluating language models. The benchmark research reveals leading LLMs confidently produce factually inaccurate information. Our evaluation of top models from eight AI labs shows they generate authoritative-sounding responses containing completely fabricated details, particularly when handling misinformation.

View post
Stay updated with
the Giskard Newsletter