Giskard-AI/giskard-oss - OSS Alternative - Discover Top Open Source Alternatives to Popular Software

AI/ML Testing and Evaluation Framework

5.3k 2026-04-26

Giskard-AI/giskard-oss

An open-source Python library for comprehensive testing, evaluation, and red teaming of LLM agents and AI systems, designed for dynamic, multi-turn interactions.

Core Features

Modular architecture for testing LLMs, black-box agents, and multi-step pipelines.

Advanced evaluation capabilities including scenario API, built-in checks, and LLM-as-judge.

Agent vulnerability scanning and red teaming for robust AI safety.

Support for RAG evaluation and synthetic data generation (in progress).

Designed for non-deterministic outputs and multi-turn conversational agent testing.

GitHub Repo Documentation

Quick Start

pip install giskard

Detailed Introduction

Giskard is an open-source Python library specifically engineered for the rigorous testing and evaluation of agentic AI systems. Its v3 architecture is a lightweight, modular rewrite, enabling dynamic, multi-turn testing of LLMs, black-box agents, and complex pipelines. It addresses the unique challenges of non-deterministic AI outputs, offering tools for catching regressions, validating RAG quality, enforcing safety rules, and performing red teaming to ensure AI system reliability and security.