evalkit

The TypeScript LLM Evaluation Library

Stars: 70

EvalKit is an open-source TypeScript library for evaluating and improving the performance of large language models (LLMs). It helps developers ensure the reliability, accuracy, and trustworthiness of their AI models. The library provides various metrics such as Bias Detection, Coherence, Faithfulness, Hallucination, Intent Detection, and Semantic Similarity. EvalKit is designed to be user-friendly with detailed documentation, tutorials, and recipes for different use cases and LLM providers. It requires Node.js 18+ and an OpenAI API Key for installation and usage. Contributions from the community are welcome under the Apache 2.0 License.

README:

EvalKit

The TypeScript LLM Evaluations Library


EvalKit is an open-source library designed for TypeScript developers to evaluate and improve the performance of large language models (LLMs) with confidence. Ensure your AI models are reliable, accurate, and trustworthy.

🚀 Features, Metrics and Docs

Click here to navigate to the Official EvalKit Documentation

The documentation explains how to use EvalKit, describes its architecture, and includes tutorials and recipes for various use cases and LLM providers.

Feature                      Availability   Docs
Bias Detection Metric        ✅             🔗
Coherence Metric             ✅             🔗
Dynamic Metric (G-Eval)      ✅             🔗
Faithfulness Metric          ✅             🔗
Hallucination Metric         ✅             🔗
Intent Detection Metric      ✅             🔗
Semantic Similarity Metric   ✅             🔗
Reporting                    🚧             🚧

Looking for a metric/feature that's not listed here? Open an issue and let us know!

Getting Started - Quickstart

Prerequisites

  • Node.js 18+
  • OpenAI API Key
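
The two prerequisites can be checked up front before running any evaluation. The snippet below is a minimal sketch, not part of EvalKit: `checkPrerequisites` is a hypothetical helper, and it assumes the API key is supplied through an `OPENAI_API_KEY` environment variable, which is the conventional name but is not confirmed by this README.

```typescript
// Hypothetical helper (not an EvalKit API): lists unmet prerequisites.
// Assumption: the OpenAI API key is provided via the OPENAI_API_KEY variable.
function checkPrerequisites(
  nodeVersion: string,
  env: Record<string, string | undefined>,
): string[] {
  const problems: string[] = [];

  // Accept both "18.17.0" and "v18.17.0"; only the major version matters.
  const majorPart = nodeVersion.replace(/^v/, "").split(".")[0] ?? "";
  const major = Number.parseInt(majorPart, 10);
  if (Number.isNaN(major) || major < 18) {
    problems.push(`Node.js 18+ is required, found "${nodeVersion}"`);
  }

  // Treat a missing or whitespace-only key as not set.
  const key = env["OPENAI_API_KEY"];
  if (key === undefined || key.trim() === "") {
    problems.push("OPENAI_API_KEY is not set");
  }

  return problems;
}
```

In a Node.js script, calling `checkPrerequisites(process.versions.node, process.env)` returns an empty array when both prerequisites are met, and one message per unmet prerequisite otherwise.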

Installation

EvalKit currently exports a single core package that includes all evaluation-related functionality. Install it by running the following command:

npm install --save-dev @evalkit/core

Contributing

We welcome contributions from the community! Please feel free to submit pull requests or create issues for bugs or feature suggestions.

License

This repository's source code is available under the Apache 2.0 License.
