Giskard, a French startup, aims to make AI models more reliable, fair, and ethical. It offers an open-source testing framework for large language models (LLMs) that lets developers identify potential risks and problems before deploying their models. The framework covers multiple quality-assurance facets: performance, bias, security, misinformation, data leakage, and harmful content generation.
Compatible with popular ML tools such as Hugging Face, PyTorch, TensorFlow, and Langchain, Giskard’s framework also fosters collaboration between AI and business teams on model validation and testing through automated tests and comprehensive feedback.
Founded in 2021 by former Dataiku employees Alex Combessie, Jean-Marie John-Mathews, and Andrey Avtomonov, Giskard raised initial funding of 1.5 million euros in 2022 in a round led by Elaia, with participation from Bessemer and prominent angel investors.
In 2023, it received a strategic investment of 3 million euros from the European Commission, aimed at building a SaaS platform for automated compliance with the forthcoming EU AI regulations.
Giskard’s ultimate goal is to enable responsible AI that promotes business performance while respecting citizens’ rights. Its mission is to give AI professionals a quality-assurance platform for evaluating and testing all AI models, guided by values such as passion, customer focus, learning, consciousness, and empathy.
How Giskard’s framework works:
Giskard’s framework is built around an open-source Python library that integrates into any LLM project. The library lets developers automatically generate and run a comprehensive test suite covering the quality-assurance aspects listed above.
The library is compatible with popular ML ecosystem tools such as Hugging Face, PyTorch, TensorFlow, and Langchain, and it also supports retrieval-augmented generation (RAG) models, which enrich LLM outputs with external knowledge sources.
The framework extends these capabilities with the AI Quality Hub, a web application for collaborative AI quality assurance at scale. Through dashboards and visual debugging tools, ML and business teams can validate and test models via automated processes and feedback loops. The hub also helps ensure compliance with the forthcoming EU AI regulations by providing a quality management system for AI models.
Moreover, the framework introduces a beta feature called LLMon, which monitors LLM outputs in real time to flag potential risks such as hallucinations, incorrect responses, and toxicity. LLMon works with any LLM API and can be deployed on-premises or used as a SaaS solution.