Caissa
Strong chess engine
Stars: 70
Caissa is a strong, UCI command-line chess engine optimized for regular chess, FRC, and DFRC. It features its own neural network trained with self-play games, supports various UCI options, and provides different EXE versions for different CPU architectures. The engine uses advanced search algorithms, neural network evaluation, and endgame tablebases. It offers outstanding performance in ultra-short games and is written in C++ with modules for backend, frontend, and utilities like neural network trainer and self-play data generator.
README:
(image generated with DALL·E 2)
Strong, UCI command-line chess engine, written from scratch in C++ in development since early 2021. Optimized for regular chess, FRC (Fischer Random Chess) and DFRC (Double Fischer Random Chess).
Caissa is listed on many chess engines ranking lists:
- CCRL 40/2 FRC - 3965 (#5) (version 1.17)
- CCRL Chess324 - 3701 (#5) (version 1.18)
- CCRL 40/15 - 3601 (#5) (version 1.17 4CPU)
- CCRL Blitz - 3733 (#8) (version 1.16 8CPU)
- SPCC UHO-Top15 - 3671 (#8) (version 1.18)
- IpMan Chess 10+1 (R9-7945HX) - 3496 (#13) (version 1.18 avx512)
- IpMan Chess 10+1 (i9-7980XE) - 3478 (#12) (version 1.17 avx512)
- IpMan Chess 10+1 (i9-13700H) - 3543 (#13) (version 1.18 avx2-bmi2)
- IpMan Chess 5+0 - 3381 (#28) (version 1.8)
- CEGT 40/20 - 3516 (#10) (version 1.16)
- CEGT 40/4 - 3547 (#8) (version 1.15)
- CEGT 5+3 - 3530 (#9) (version 1.13.1)
The engine has been written from the ground up. In early versions it used a simple PeSTO evaluation, which was replaced by the Stockfish NNUE for a short time. Since version 0.7, Caissa uses it's own efficiently updated neural network, trained with Caissa self-play games using a custom trainer. In a way, the first own Caissa network is based on Stockfish's network, but it was much weaker because of the small data set used back then (a few million positions). Currently (as of version 1.18) over 12 billion newly generated positions are used. Also, the old self-play games are successively purged, so that the newer networks are trained only on the most recent games generated by the most recent engine, and so on.
The runtime neural network evaluation code is located in PackedNeuralNetwork.cpp and was inspired by nnue.md document. The neural network trainer is written completely from scratch and is located in NetworkTrainer.cpp, NeuralNetwork.cpp and other NeuralNetwork* files. The trainer is purely CPU-based and is heavily optimized to take advantage of many threads and AVX instructions as well as it exploits the sparse nature of the nets.
The games are generated with the utility SelfPlay.cpp, which generates games with a fixed number of nodes/depth and saves them in a custom binary game format to save space. The opening books used are either Stefan's Pohl UHO books or DFRC openings with few random moves played at the beginning.
- Hash (int) Sets the size of the transposition table in megabytes.
- MultiPV (int) Sets the number of PV lines to search and print.
- MoveOverhead (int) Sets move overhead in milliseconds. Should be increased if the engine loses time.
- Threads (int) Sets the number of threads used for searching.
- Ponder (bool) Enables pondering.
- EvalFile (string) Neural network evaluation file.
- EvalRandomization (int) Allows introducing non-determinism and weakens the engine.
- StaticContempt (int) Static contempt value used throughout whole game.
- DynamicContempt (int) Dynamic contempt value used in the opening/middlegame stage.
- SyzygyPath (string) Semicolon-separated list of paths to Syzygy endgame tablebases.
- SyzygyProbeLimit (int) Maximum number of pieces on the board where Syzygy tablebases can be used.
- UCI_AnalyseMode (bool) Enables analysis mode: search full PV lines and disable any depth constraints.
- UCI_Chess960 (bool) Enables chess 960 mode: castling moves are printed as "king captures rook".
- UCI_ShowWDL (bool) Print win/draw/loss probabilities along with classic centipawn evaluation.
- UseSAN (bool) Enables short algebraic notation output (FIDE standard) instead of default long algebraic notation.
- ColorConsoleOutput (bool) Enables colored console output for better readability.
- AVX-512 - Fastest, requires a x64 CPU with AVX-512 instruction set support. May not be supported on consumer-grade CPUs.
- BMI2 - Fast, requires a x64 CPU with AVX2 and BMI2 instruction set support. Supported by majority of modern CPUs.
- AVX2 - Fast, requires a x64 CPU with AVX2 instruction set support. May be faster than BMI2 on some older CPUs (e.g. Intel Haswell processors).
- POPCNT - Slower, requires a x64 CPU with SSE4 and POPCNT instruction set support. For older CPUs.
- Legacy - Slowest, requires any x64 CPU. For very old x64 CPUs.
- UCI protocol
- Neural network evaluation
- Syzygy and Gaviota endgame tablebases support
- Chess960 (Fischer Random) support
- Negamax with alpha-beta pruning
- Iterative Deepening with Aspiration Windows
- Principal Variation Search (PVS)
- Quiescence Search
- Transposition Table
- Multi-PV search
- Multithreaded search via shared transposition table
- Neural network evaluation
- (5x768→1024)x2→1 architecture
- effectively updated first layer
- manually vectorized code supporting SSE2, AVX2, AVX-512 and ARM NEON instructions
- clipped-ReLU activation function
- 8 variants of last layer weights selected based on piece count
- input features: absolute piece coordinates with horizontal symmetry, 11 king buckets
- Special endgame evaluation routines
- Custom CPU-based trainer using Adam algorithm
- Heavily optimized using AVX instructions, multithreading, and exploiting sparsity of the first layer input
- Network trained on data generated purely from self-play games
- Large Pages Support for Transposition Table
- Magic Bitboards
- Handling non-standard chess positions (e.g. 64 pieces on the board, etc.)
- Outstanding performance at ultra-short games (sub-second for whole game).
The projects comprises following modules:
- backend (library) - engine's core
- frontend (executable) - UCI wrapper for the backend
- utils (executable) - various utilities, such as unit tests, neural network trainer, self-play data generator, etc.
To compile for Linux just call make
in src
directory:
cd src
make -j
NOTE: This will compile the default AVX2/BMI2 version.
To compile for Linux using CMake:
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Final ..
make -j
NOTE: Currently, the project compiles with AVX2/BMI2 support by default.
There are three configurations supported:
- Final - final version, without asserts, etc.
- Release - development version with asserts enabled and with optimizations enabled for better performance
- Debug - development version with asserts enabled and optimizations disabled
To compile for Windows, use GenerateVisualStudioSolution.bat
to generate Visual Studio solution. The only tested Visual Studio version is 2022. Using CMake directly in Visual Studio was not tested.
NOTE: After compilation make sure you copy appropriate neural net file from data/neuralNets
directory to location where executable file is generated (build/bin
on Linux or build\bin\x64\<Configuration>
on Windows).
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for Caissa
Similar Open Source Tools
Caissa
Caissa is a strong, UCI command-line chess engine optimized for regular chess, FRC, and DFRC. It features its own neural network trained with self-play games, supports various UCI options, and provides different EXE versions for different CPU architectures. The engine uses advanced search algorithms, neural network evaluation, and endgame tablebases. It offers outstanding performance in ultra-short games and is written in C++ with modules for backend, frontend, and utilities like neural network trainer and self-play data generator.
llms-interview-questions
This repository contains a comprehensive collection of 63 must-know Large Language Models (LLMs) interview questions. It covers topics such as the architecture of LLMs, transformer models, attention mechanisms, training processes, encoder-decoder frameworks, differences between LLMs and traditional statistical language models, handling context and long-term dependencies, transformers for parallelization, applications of LLMs, sentiment analysis, language translation, conversation AI, chatbots, and more. The readme provides detailed explanations, code examples, and insights into utilizing LLMs for various tasks.
llms-learning
A repository sharing literatures and resources about Large Language Models (LLMs) and beyond. It includes tutorials, notebooks, course assignments, development stages, modeling, inference, training, applications, study, and basics related to LLMs. The repository covers various topics such as language models, transformers, state space models, multi-modal language models, training recipes, applications in autonomous driving, code, math, embodied intelligence, and more. The content is organized by different categories and provides comprehensive information on LLMs and related topics.
flower
Flower is a framework for building federated learning systems. It is designed to be customizable, extensible, framework-agnostic, and understandable. Flower can be used with any machine learning framework, for example, PyTorch, TensorFlow, Hugging Face Transformers, PyTorch Lightning, scikit-learn, JAX, TFLite, MONAI, fastai, MLX, XGBoost, Pandas for federated analytics, or even raw NumPy for users who enjoy computing gradients by hand.
erag
ERAG is an advanced system that combines lexical, semantic, text, and knowledge graph searches with conversation context to provide accurate and contextually relevant responses. This tool processes various document types, creates embeddings, builds knowledge graphs, and uses this information to answer user queries intelligently. It includes modules for interacting with web content, GitHub repositories, and performing exploratory data analysis using various language models.
PPTist
PPTist is a web-based presentation application that replicates most features of Microsoft Office PowerPoint. It supports various elements like text, images, shapes, charts, tables, videos, audio, and formulas. Users can edit and present slides directly in a web browser. It offers easy development with Vue 3.x and TypeScript, user-friendly experience with context menu and keyboard shortcuts, and feature-rich functionalities including AI-generated PPTs and mobile editing. PPTist aims to provide a desktop application-level experience for creating presentations.
awesome-hallucination-detection
This repository provides a curated list of papers, datasets, and resources related to the detection and mitigation of hallucinations in large language models (LLMs). Hallucinations refer to the generation of factually incorrect or nonsensical text by LLMs, which can be a significant challenge for their use in real-world applications. The resources in this repository aim to help researchers and practitioners better understand and address this issue.
LLM101n
LLM101n is a course focused on building a Storyteller AI Large Language Model (LLM) from scratch in Python, C, and CUDA. The course covers various topics such as language modeling, machine learning, attention mechanisms, tokenization, optimization, device usage, precision training, distributed optimization, datasets, inference, finetuning, deployment, and multimodal applications. Participants will gain a deep understanding of AI, LLMs, and deep learning through hands-on projects and practical examples.
ExtractThinker
ExtractThinker is a library designed for extracting data from files and documents using Language Model Models (LLMs). It offers ORM-style interaction between files and LLMs, supporting multiple document loaders such as Tesseract OCR, Azure Form Recognizer, AWS TextExtract, and Google Document AI. Users can customize extraction using contract definitions, process documents asynchronously, handle various document formats efficiently, and split and process documents. The project is inspired by the LangChain ecosystem and focuses on Intelligent Document Processing (IDP) using LLMs to achieve high accuracy in document extraction tasks.
chatbox
Chatbox is a desktop client for ChatGPT, Claude, and other LLMs, providing a user-friendly interface for AI copilot assistance on Windows, Mac, and Linux. It offers features like local data storage, multiple LLM provider support, image generation with Dall-E-3, enhanced prompting, keyboard shortcuts, and more. Users can collaborate, access the tool on various platforms, and enjoy multilingual support. Chatbox is constantly evolving with new features to enhance the user experience.
APOLLO
APOLLO is a memory-efficient optimizer designed for large language model (LLM) pre-training and full-parameter fine-tuning. It offers SGD-like memory cost with AdamW-level performance. The optimizer integrates low-rank approximation and optimizer state redundancy reduction to achieve significant memory savings while maintaining or surpassing the performance of Adam(W). Key contributions include structured learning rate updates for LLM training, approximated channel-wise gradient scaling in a low-rank auxiliary space, and minimal-rank tensor-wise gradient scaling. APOLLO aims to optimize memory efficiency during training large language models.
indexify
Indexify is an open-source engine for building fast data pipelines for unstructured data (video, audio, images, and documents) using reusable extractors for embedding, transformation, and feature extraction. LLM Applications can query transformed content friendly to LLMs by semantic search and SQL queries. Indexify keeps vector databases and structured databases (PostgreSQL) updated by automatically invoking the pipelines as new data is ingested into the system from external data sources. **Why use Indexify** * Makes Unstructured Data **Queryable** with **SQL** and **Semantic Search** * **Real-Time** Extraction Engine to keep indexes **automatically** updated as new data is ingested. * Create **Extraction Graph** to describe **data transformation** and extraction of **embedding** and **structured extraction**. * **Incremental Extraction** and **Selective Deletion** when content is deleted or updated. * **Extractor SDK** allows adding new extraction capabilities, and many readily available extractors for **PDF**, **Image**, and **Video** indexing and extraction. * Works with **any LLM Framework** including **Langchain**, **DSPy**, etc. * Runs on your laptop during **prototyping** and also scales to **1000s of machines** on the cloud. * Works with many **Blob Stores**, **Vector Stores**, and **Structured Databases** * We have even **Open Sourced Automation** to deploy to Kubernetes in production.
veScale
veScale is a PyTorch Native LLM Training Framework. It provides a set of tools and components to facilitate the training of large language models (LLMs) using PyTorch. veScale includes features such as 4D parallelism, fast checkpointing, and a CUDA event monitor. It is designed to be scalable and efficient, and it can be used to train LLMs on a variety of hardware platforms.
WatermarkRemover-AI
WatermarkRemover-AI is an advanced application that utilizes AI models for precise watermark detection and seamless removal. It leverages Florence-2 for watermark identification and LaMA for inpainting. The tool offers both a command-line interface (CLI) and a PyQt6-based graphical user interface (GUI), making it accessible to users of all levels. It supports dual modes for processing images, advanced watermark detection, seamless inpainting, customizable output settings, real-time progress tracking, dark mode support, and efficient GPU acceleration using CUDA.
kheish
Kheish is an open-source, multi-role agent designed for complex tasks that require structured, step-by-step collaboration with Large Language Models (LLMs). It acts as an intelligent agent that can request modules on demand, integrate user feedback, switch between specialized roles, and deliver refined results. By harnessing multiple 'sub-agents' within one framework, Kheish tackles tasks like security audits, file searches, RAG-based exploration, and more.
opendataeditor
The Open Data Editor (ODE) is a no-code application to explore, validate and publish data in a simple way. It is an open source project powered by the Frictionless Framework. The ODE is currently available for download and testing in beta.
For similar tasks
Minic
Minic is a chess engine developed for learning about chess programming and modern C++. It is compatible with CECP and UCI protocols, making it usable in various software. Minic has evolved from a one-file code to a more classic C++ style, incorporating features like evaluation tuning, perft, tests, and more. It has integrated NNUE frameworks from Stockfish and Seer implementations to enhance its strength. Minic is currently ranked among the top engines with an Elo rating around 3400 at CCRL scale.
Caissa
Caissa is a strong, UCI command-line chess engine optimized for regular chess, FRC, and DFRC. It features its own neural network trained with self-play games, supports various UCI options, and provides different EXE versions for different CPU architectures. The engine uses advanced search algorithms, neural network evaluation, and endgame tablebases. It offers outstanding performance in ultra-short games and is written in C++ with modules for backend, frontend, and utilities like neural network trainer and self-play data generator.
eleeye
ElephantEye is a free Chinese Chess program that follows the GNU Lesser General Public Licence. It is designed for chess enthusiasts and programmers to use freely. The program works as a XiangQi engine for XQWizard with strong AI capabilities. ElephantEye supports UCCI 3.0 protocol and offers various parameter settings for users to customize their experience. The program uses brute-force chess algorithms and static position evaluation techniques to search for optimal moves. ElephantEye has participated in computer chess competitions and has been tested on various online chess platforms. The source code of ElephantEye is available on SourceForge for developers to explore and improve.
BetaML.jl
The Beta Machine Learning Toolkit is a package containing various algorithms and utilities for implementing machine learning workflows in multiple languages, including Julia, Python, and R. It offers a range of supervised and unsupervised models, data transformers, and assessment tools. The models are implemented entirely in Julia and are not wrappers for third-party models. Users can easily contribute new models or request implementations. The focus is on user-friendliness rather than computational efficiency, making it suitable for educational and research purposes.
For similar jobs
sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.
teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.
ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.
classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.
chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.
BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students
uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.
griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.