cassio

A framework-agnostic Python library to seamlessly integrate Cassandra with ML/LLM/genAI workloads

Stars: 103

cassIO is a framework-agnostic Python library that seamlessly integrates Apache Cassandra with ML/LLM/genAI workloads. It provides an easy-to-use interface for developers to connect their Cassandra databases to machine learning models, allowing them to perform complex data analysis and AI-powered tasks directly on their Cassandra data. cassIO is designed to be flexible and extensible, making it suitable for a wide range of use cases, from data exploration and visualization to predictive modeling and natural language processing.

README:

cassIO

A framework-agnostic Python library to seamlessly integrate Apache Cassandra with ML/LLM/genAI workloads.

Note: this is currently an alpha release.

Users

Installation is as simple as:

pip install cassio

For example usages and integration with higher-level LLM frameworks such as LangChain, please visit cassio.org.

CassIO developers

Setup

To develop cassio, we use Poetry:

pip install poetry

Use Poetry to install the dependencies:

poetry install

Use the current cassio code in other Poetry-based projects

If the integration is Poetry-based (e.g. LangChain itself), you should have this in your pyproject.toml:

cassio = {path = "../../cassio", develop = true}

Then run:

poetry remove cassio                                      # if necessary
poetry lock --no-update
poetry install -E all --with dev --with test_integration  # or similar, this is for langchain

Inspired by this. You also need a recent version of Poetry for this to work.

Versioning

We are still at 0.*. Occasional breaking changes are to be expected, but please think carefully. Later, a stronger versioning model will be adopted.

Style and typing

Style is enforced through black, linting with ruff, and typechecking with mypy. The code should run through make format without issues.

Python version coverage

At the moment we run the tests under Python 3.8 and Python 3.10 to catch version-specific issues (for example the newer typing syntax typeA | typeB, which is illegal on 3.8).
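As an illustration of the kind of version-specific issue mentioned above, here is a minimal sketch of an annotation written so that it also runs on Python 3.8. The function name describe_ttl is hypothetical and not part of cassio.

```python
from typing import Optional, Union


# Hypothetical helper, not part of cassio: it only illustrates 3.8-safe annotations.
# On Python 3.10+ the parameter could be annotated as `int | float | None`,
# but on 3.8 that expression raises a TypeError when the function is defined.
def describe_ttl(ttl_seconds: Optional[Union[int, float]] = None) -> str:
    if ttl_seconds is None:
        return "no expiry"
    return f"expires after {ttl_seconds} seconds"
```

Spelling the union with typing.Optional and typing.Union keeps the same meaning while staying importable on both tested interpreters.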

Publishing

  • Bump version in pyproject.toml
  • Add to CHANGES.txt
  • Commit the very code that will be built, then tag, build and publish:

git tag v<x.y.z>; git push origin v<x.y.z>
make build
poetry publish  # (login to PyPI ...)
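
Since the tag name has to match the version bumped in pyproject.toml, a small check can help avoid a mismatch. The following is only a sketch: tag_for_pyproject is a hypothetical helper (not part of cassio or its Makefile), and the sample text with version 0.0.0 is a placeholder, not the real project file.

```python
import re


def tag_for_pyproject(pyproject_text: str) -> str:
    """Return the git tag name (v<x.y.z>) for the version found in pyproject.toml text."""
    # Find the first line of the form: version = "..."
    match = re.search(r'^version\s*=\s*"([^"]+)"', pyproject_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no version field found")
    return "v" + match.group(1)


# Placeholder content, assumed for illustration only.
sample = '[tool.poetry]\nname = "cassio"\nversion = "0.0.0"\n'
print(tag_for_pyproject(sample))
```

A regular expression is used instead of a TOML parser so that the sketch also runs on Python 3.8, where tomllib is not available in the standard library.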

Testing

Please run the tests (and add some coverage for new features). This is not enforced other than by your conscience. Type make to list the available tests.

To run the full tests (except specific tests targeting Cassandra), there's make test-all.

Unit testing

make test-unit

Integration with the DB

Ensure the required environment variables are set (see for instance the provided TEMPLATE.testing.env). You need at least one of Astra DB or a Cassandra (5+) cluster.

Launch the tests with either of:

make test-integration

make test-astra-integration
make test-cassandra-integration
make test-testcontainerscassandra-integration

The last three above set TEST_DB_MODE to ASTRA_DB, LOCAL_CASSANDRA or TESTCONTAINERS_CASSANDRA, respectively. Refer to TEMPLATE.testing.env for the environment variables required in each case.
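
To make the role of TEST_DB_MODE concrete, here is a minimal sketch of reading and validating such a variable. The three mode names come from the text above; the function resolve_db_mode and the constant VALID_DB_MODES are hypothetical and do not describe how cassio's test suite actually reads the variable.

```python
import os
from typing import Mapping

# The three mode names are taken from the README; the rest is illustrative.
VALID_DB_MODES = ("LOCAL_CASSANDRA", "TESTCONTAINERS_CASSANDRA", "ASTRA_DB")


def resolve_db_mode(env: Mapping[str, str] = os.environ) -> str:
    """Read TEST_DB_MODE from an environment mapping and validate it."""
    mode = env.get("TEST_DB_MODE", "")
    if mode not in VALID_DB_MODES:
        raise ValueError(f"TEST_DB_MODE must be one of {VALID_DB_MODES}, got {mode!r}")
    return mode
```

Passing the environment as a parameter keeps the check easy to exercise without touching the real process environment, for instance resolve_db_mode({"TEST_DB_MODE": "ASTRA_DB"}).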

Note: Ideally you should test with both Astra DB and a Cassandra cluster, since some tests are skipped in either case.
