June 7, 2024
5 min read
Alex Combessie

Partnership announcement: Bringing Giskard LLM evaluation to Databricks

Giskard has integrated with Databricks MLflow to enhance LLM testing and deployment. This collaboration allows AI teams to automatically identify vulnerabilities, generate domain-specific tests, and log comprehensive reports directly into MLflow. The integration aims to streamline the development of secure, reliable, and compliant LLM applications, addressing key risks like prompt injection, hallucinations, and unintended data disclosures.
Giskard + Databricks integration

At Giskard, we are committed to helping enterprises safely develop and deploy Large Language Models (LLMs) at scale. That's why we're excited to announce our integration with MLflow, an open-source platform developed by Databricks for managing end-to-end machine learning workflows.

As companies increasingly adopt LLMs, particularly for applications like customer support chatbots, knowledge bases, and question answering with Retrieval-Augmented Generation (RAG), it's crucial to comprehensively evaluate model outputs. According to OWASP, key vulnerabilities that must be addressed include prompt injection attacks, hallucinations and misinformation, unintended data disclosures, and the generation of harmful or unethical content.

These issues can lead to regulatory penalties, reputational damage, ethical missteps, and erosion of public trust in AI systems. The Giskard-MLflow integration provides a practical way to build secure and compliant LLM applications while mitigating these risks.

Giskard - Databricks MLflow workflow
“By combining Giskard's open-source LLM evaluation capabilities with Databricks MLflow's model management features, we're making it easier for AI teams to incorporate comprehensive testing into their ML pipelines, and deploy LLM applications with confidence”, said Alex Combessie, CEO and co-founder of Giskard.

With this integration, AI teams can now use Giskard's open-source scan feature to automatically identify vulnerabilities in ML models and LLMs, instantly generate domain-specific tests, and leverage the quality-assurance best practices of the open-source community.
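
In practice, a scan can look roughly like the following sketch. The prediction function, model name, and description below are placeholders for your own LLM application, and the LLM-assisted detectors assume an LLM client is configured (for example, an OpenAI API key):

```python
import pandas as pd
import giskard

def answer_question(question: str) -> str:
    # Placeholder for your actual LLM call (e.g. a RAG chain or API endpoint).
    return f"Stub answer to: {question}"

def predict(df: pd.DataFrame) -> list[str]:
    # Giskard calls this with a DataFrame of inputs, one row per example.
    return [answer_question(q) for q in df["question"]]

# Wrap the application so the scanner knows how to query it and what it does.
giskard_model = giskard.Model(
    model=predict,
    model_type="text_generation",
    name="Customer support assistant",
    description="Answers customer questions based on product documentation.",
    feature_names=["question"],
)

# Automatically probe the model for vulnerabilities such as prompt injection,
# hallucination, and harmful content, then export the findings.
report = giskard.scan(giskard_model)
report.to_html("scan_report.html")
```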

Giskard's vulnerability reports and metrics are automatically logged into Databricks MLflow, so AI teams can easily compare model performance across different versions and experiments. This makes it possible to compare the issues detected across model versions, with vulnerability reports that describe the source and reasoning behind each issue, illustrated with examples. With the Giskard and Databricks partnership, you can ensure your models are safe, reliable, and compliant from development to deployment.
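
A minimal sketch of what this logging can look like, assuming the giskard plugin is installed alongside mlflow and registers an MLflow evaluator named "giskard"; the pyfunc wrapper and the one-row dataset below are hypothetical stand-ins for your own model and evaluation data:

```python
import mlflow
import pandas as pd

class QAModel(mlflow.pyfunc.PythonModel):
    def predict(self, context, model_input: pd.DataFrame) -> list[str]:
        # Placeholder for your actual LLM call.
        return [f"Stub answer to: {q}" for q in model_input["question"]]

eval_df = pd.DataFrame({"question": ["How do I reset my password?"]})

with mlflow.start_run(run_name="giskard-llm-scan"):
    # Log the model so MLflow can resolve it by URI during evaluation.
    model_info = mlflow.pyfunc.log_model(
        artifact_path="model",
        python_model=QAModel(),
    )

    # With the giskard plugin installed, "giskard" is available as an evaluator;
    # the scan report and metrics are attached to this MLflow run.
    mlflow.evaluate(
        model=model_info.model_uri,
        model_type="text",
        data=eval_df,
        evaluators="giskard",
    )
```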

LLM Scan results

At Giskard, our goal is to build a holistic platform that covers the quality, security, and compliance risks of AI models. Our solution helps AI teams automatically create tests, so they can efficiently validate models, generate comprehensive reports, and streamline review processes.
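
For instance, scan findings can be turned into a reusable test suite with the open-source library. A short sketch, reusing the `report` and `giskard_model` objects from the scan example above (the suite name is arbitrary):

```python
# Turn the issues found by the scan into a reusable test suite
# that can be versioned and re-run against future model versions.
test_suite = report.generate_test_suite("Customer support assistant tests")

# Run the suite against a model version to catch regressions.
results = test_suite.run(model=giskard_model)
print("Suite passed:", results.passed)
```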

Reach out to us today to learn more about how we can help you make the most of your AI investments.


You will also like


New course with DeepLearningAI: Red Teaming LLM Applications

Our new course, created in collaboration with the DeepLearningAI team, provides training on red-teaming techniques for Large Language Model (LLM) and chatbot applications. Through hands-on attacks using prompt injection, you'll learn how to identify vulnerabilities and security failures in LLM systems.


How to find the best Open-Source LLM for your Customer Service Chatbot

Explore how to use open-source Large Language Models (LLMs) to build AI customer service chatbots. We guide you through building chatbots with the LangChain and HuggingFace libraries, and through evaluating their performance and safety using Giskard's testing framework.


[Release notes] LLM app vulnerability scanner for Mistral, OpenAI, Ollama, and Custom Local LLMs

We're releasing an upgraded version of Giskard's LLM scan for comprehensive vulnerability assessments of LLM applications. New features include more accurate detectors through optimized prompts and expanded multi-model compatibility, supporting OpenAI, Mistral, Ollama, and custom local LLMs. This article also includes an initial setup guide for evaluating LLM apps.
