G

Matteo A. D'Alessandro

LLM jailbreaking
Blog

Defending LLMs against Jailbreaking: Definition, examples and prevention

Jailbreaking refers to maliciously manipulating Large Language Models (LLMs) to bypass their ethical constraints and produce unauthorized outputs. This emerging threat arises from combining the models' high adaptability with inherent vulnerabilities that attackers can exploit through techniques like prompt injection. Mitigating jailbreaking risks requires a holistic approach involving robust security measures, adversarial testing, red teaming, and ongoing vigilance to safeguard the integrity and reliability of AI systems.

Matteo A. D'Alessandro
Matteo A. D'Alessandro
View post
Data poisoning attacks
Blog

Data Poisoning attacks on Enterprise LLM applications: AI risks, detection, and prevention

Data poisoning is a real threat to enterprise AI systems like Large Language Models (LLMs), where malicious data tampering can skew outputs and decision-making processes unnoticed. This article explores the mechanics of data poisoning attacks, real-world examples across industries, and best practices to mitigate risks through red teaming, and automated evaluation tools.

Matteo A. D'Alessandro
Matteo A. D'Alessandro
View post