
hash
🚀 The open-source, multi-tenant, self-building knowledge graph

HASH is a self-building, open-source database which grows, structures and checks itself. With it, we're creating a platform for decision-making, which helps you integrate, understand and use data in a variety of different ways.
README:
This is HASH's public monorepo which contains our public code, docs, and other key resources.
HASH is a self-building, open-source database which grows, structures and checks itself. HASH integrates data in (near-)realtime, and provides a powerful set of interfaces so that information can be understood and used in any context. Intelligent, autonomous agents can be deployed to grow, check, and maintain the database, integrating and structuring information from the public internet as well as your own connected private sources. And users, including those who are non-technical, are able to visually browse and manage both entities (data) and types (schemas). HASH acts as a source of truth for critical data, no matter its source, and provides a platform for high-trust, safety-assured decision-making. Read our blog post →
In the future... we plan on growing HASH into an all-in-one workspace, or complete operating system, with AI-generated interfaces known as "blocks" created at the point of need, on top of your strongly-typed data (addressing the data quality and integrity challenges inherent in the current generation of generative AI interfaces).
🚀 Quick-start (<5 mins): use the hosted app
The only officially supported way of trying HASH right now is to sign up for and use the hosted platform at app.hash.ai
Create an account to get started.
Sign in to access your account.
When you first create an account you may be placed on a waitlist. To jump the queue, once signed in, follow the instructions shown in your HASH dashboard. All submissions are reviewed by a member of the team.
Running HASH locally
Running HASH locally is not yet officially supported. We plan on publishing a comprehensive guide to running your own instance of HASH shortly (2025Q2). In the meantime, you may try the instructions below.
- Make sure you have Git, Rust, Docker, and Protobuf installed. Building the Docker containers requires Docker Buildx. Run each of these version commands and make sure the output is as expected:
  git --version            ## ≥ 2.17
  rustup --version         ## ≥ 1.27.1 (required to match the toolchain specified in rust-toolchain.toml; lower versions will most likely work as well)
  docker --version         ## ≥ 20.10
  docker compose version   ## ≥ 2.17.2
  docker buildx version    ## ≥ 0.10.4
  If you have difficulties with git --version on macOS, you may need to install the Xcode Command Line Tools first: xcode-select --install
  If you use Docker for macOS or Windows, go to Preferences → Resources and ensure that Docker can use at least 4 GB of RAM (8 GB is recommended).
- Clone this repository and navigate to the root of the repository folder in your terminal.
- We use mise-en-place to manage tool versions consistently across our codebase. We recommend using mise to automatically install and manage the required development tools:
  mise install
  It's also possible to install them manually; use the correct versions for these tools as specified in .config/mise. After installing mise you will also need to set it to activate automatically in your shell (see the sketch below).
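  For example, a minimal sketch of activating mise in zsh (this assumes you use zsh and ~/.zshrc; adapt for bash or fish as needed – see the mise documentation for your shell):

```sh
# Hook mise into the shell so the tools pinned in .config/mise are
# activated automatically when you work inside the repository
echo 'eval "$(mise activate zsh)"' >> ~/.zshrc
exec zsh   # reload the shell so the hook takes effect
```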
- Install dependencies:
  yarn install
- Ensure Docker is running. If you are on Windows or macOS, you should see the Docker icon in the system tray or the menu bar. Alternatively, you can use this command to check Docker:
  docker run hello-world
- If you need to test or develop AI-related features, you will need to create an .env.local file in the repository root with the following values:
  OPENAI_API_KEY=your-open-ai-api-key                                       # required for most AI features
  ANTHROPIC_API_KEY=your-anthropic-api-key                                  # required for most AI features
  HASH_TEMPORAL_WORKER_AI_AWS_ACCESS_KEY_ID=your-aws-access-key-id          # required for most AI features
  HASH_TEMPORAL_WORKER_AI_AWS_SECRET_ACCESS_KEY=your-aws-secret-access-key  # required for most AI features
  E2B_API_KEY=your-e2b-api-key                                              # only required for the question-answering flow action
  Note on environment files: .env.local is not committed to the repo – put any secrets that should remain secret here. The default environment variables are taken from .env, extended by .env.development, and finally by .env.local. If you want to overwrite values specified in .env or .env.development, you can add them to .env.local. Do not change any other .env files unless you intend to change the defaults for development or testing.
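  As an illustration of that precedence, a value set in .env.local wins over the same key in .env or .env.development. LOG_LEVEL is one of the variables documented under "Environment variables" below; the override value here is only an example:

```sh
# .env (committed defaults)
LOG_LEVEL=info

# .env.local (not committed; your local overrides)
LOG_LEVEL=debug

# Effective value when the app loads its environment: debug
```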
- Launch external services (Postgres, the graph query layer, Kratos, Redis, and OpenSearch) as Docker containers:
  yarn external-services up --wait
  - You can optionally force a rebuild of the Docker containers by adding the --build argument (this is necessary if changes have been made to the graph query layer). It's recommended to do this whenever updating your branch from upstream.
  - You can keep external services running between app restarts by adding the --detach argument to run the containers in the background. It is possible to tear down the external services with yarn external-services down.
  - When using yarn external-services:offline up, the Graph service does not try to connect to https://blockprotocol.org to fetch required schemas. This is useful for development when the internet connection is slow or unreliable.
  - You can also run the Graph API and AI Temporal worker outside of Docker – this is useful if they are changing frequently and you want to avoid rebuilding the Docker containers. To do so, stop them in Docker and then run yarn dev:graph and yarn workspace @apps/hash-ai-worker-ts dev respectively, in separate terminals.
- Launch app services:
  yarn start
  This will start the backend and frontend in a single terminal. Once you see http://localhost:3000 in the terminal, the frontend is ready to visit there. The API is online once you see localhost:5001 in the terminal. Both must be online for the frontend to function.
  You can also launch parts of the app in separate terminals, e.g.:
  yarn start:graph
  yarn start:backend
  yarn start:frontend
  See package.json → scripts for details and more options.
- Log in
  When the HASH API is started, three users are automatically seeded for development purposes. Their passwords are all password.
  - [email protected], [email protected] – regular users
  - [email protected] – an admin
- If you need to run the browser plugin locally, see the README.md in the apps/plugin-browser directory.
If you need to reset the local database, to clear out test data or because it has become corrupted during development, you have two options:
- The slow option – rebuild in Docker
  - In the Docker UI (or via the CLI, at your preference), stop and delete the hash-external-services container
  - In 'Volumes', search 'hash-external-services' and delete the volumes shown
  - Run yarn external-services up --wait to rebuild the services
- The fast option – reset the database via the Graph API
  - Run the Graph API in test mode by running yarn dev:graph:test-server
  - Run yarn graph:reset-database to reset the database
  - If you need to use the frontend, you will also need to delete the rows in the identities table in the dev_kratos database, or sign-in will not work. You can do so via any Postgres UI or CLI (see the sketch below). The db connection and user details are in .env
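A minimal sketch of that Kratos clean-up step using psql, assuming the default local Postgres credentials documented under "Environment variables" below (postgres/postgres on localhost:5432) and the dev_kratos database named above – check your .env first, as this irreversibly deletes all local Kratos identities:

```sh
# Delete every locally seeded/registered user identity from Kratos
PGPASSWORD=postgres psql -h localhost -p 5432 -U postgres -d dev_kratos \
  -c 'DELETE FROM identities;'
```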
The external services can be started in 'test mode' to avoid polluting the development database. This is useful when running tests that modify the database without cleaning up afterwards. To make use of this test mode, start the external services as follows:
yarn external-services:test up
Deploying HASH to the cloud
Email-sending in HASH is handled either by Kratos (in the case of authentication-related emails) or through the HASH API Email Transport (for everything else).
To use the AwsSesEmailTransporter, run export HASH_EMAIL_TRANSPORTER=aws_ses in your terminal before starting the app. Valid AWS credentials are required for this email transporter to work.
Transactional email templates are located in the following locations:
- Kratos emails in ./../../apps/hash-external-services/kratos/templates/. This directory contains the following templates:
  - recovery_code – email templates for the account recovery flow using a code for the UI.
    - When an email belongs to a registered HASH user, the valid template is used; otherwise the invalid template is used.
  - verification_code – email verification templates for the account registration flow using a code for the UI.
    - When an email belongs to a registered HASH user, the valid template is used; otherwise the invalid template is used.
- HASH emails in ../hash-api/src/email/index.ts
Support for running HASH in the cloud is coming soon. We plan on publishing a comprehensive guide to deploying HASH on AWS/GCP/Azure in the near future. In the meantime, the instructions contained in the root /infra directory might help in getting started.
Coming soon: we'll be collecting examples in the Awesome HASH repository.
Browse the HASH development roadmap for more information about currently in-flight and upcoming features.
Repository structure
This repository's contents are divided across several primary sections:
- /apps contains the primary code powering our runnable applications
  - The HASH application itself is divided into various services which can be found in this directory.
- /blocks contains our public Block Protocol blocks
- /infra houses deployment scripts, utilities and other infrastructure useful in running our apps
- /libs contains libraries including npm packages and Rust crates
- /tests contains end-to-end and integration tests that span one or more apps, blocks or libs
Environment variables
Here's a list of possible environment variables. Everything that's necessary already has a default value.
You do not need to set any environment variables to run the application.
- NODE_ENV: ("development" or "production") the runtime environment. Controls default logging levels and output formatting.
- PORT: the port number the API will listen on.
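For example, a hypothetical one-off override, assuming the API reads PORT from the environment at startup (start:backend is one of the package.json scripts mentioned above; the default API port is 5001, per API_ORIGIN below):

```sh
# Run the API on port 5002 instead of the default 5001 for this invocation only
PORT=5002 yarn start:backend
```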
If you want to use AWS for file uploads or emails, you will need to have it configured:
- AWS_REGION: The region, e.g. us-east-1
- AWS_ACCESS_KEY_ID: Your AWS access key
- AWS_SECRET_ACCESS_KEY: Your AWS secret key
- AWS_S3_UPLOADS_BUCKET: The name of the bucket to use for file uploads (if you want to use S3 for file uploads), e.g. my_uploads_bucket
- AWS_S3_UPLOADS_ACCESS_KEY_ID: (optional) the AWS access key ID to use for file uploads. Must be provided along with the secret access key if the API is not otherwise authorized to access the bucket (e.g. via an IAM role).
- AWS_S3_UPLOADS_SECRET_ACCESS_KEY: (optional) the AWS secret access key to use for file uploads.
- AWS_S3_UPLOADS_ENDPOINT: (optional) the endpoint to use for S3 operations. If not set, the AWS S3 default for the given region is used. Useful if you are using a different S3-compatible storage provider.
- AWS_S3_UPLOADS_FORCE_PATH_STYLE: (optional) set to true if your S3 setup requires path-style rather than virtual-hosted-style S3 requests.
For some in-browser functionality (e.g. document previewing), you must configure an Access-Control-Allow-Origin header on your bucket to be something other than '*'.
By default, files are uploaded locally, which is not recommended for production use. It is also possible to upload files to AWS S3.
- FILE_UPLOAD_PROVIDER: Which type of provider is used for file uploads. Possible values: LOCAL_FILE_SYSTEM or AWS_S3. If choosing S3, then you need to configure the AWS_S3_UPLOADS_ variables above (see the sketch below).
- LOCAL_FILE_UPLOAD_PATH: Relative path to store uploaded files if using the local file system storage provider. Default is var/uploads (the var folder is the folder normally used for application data).
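A minimal sketch of switching uploads to S3, combining the variables above. The bucket name, region and keys are placeholders; put real values in .env.local:

```sh
FILE_UPLOAD_PROVIDER=AWS_S3
AWS_REGION=us-east-1
AWS_S3_UPLOADS_BUCKET=my_uploads_bucket
AWS_S3_UPLOADS_ACCESS_KEY_ID=your-access-key-id          # only needed if not using an IAM role
AWS_S3_UPLOADS_SECRET_ACCESS_KEY=your-secret-access-key  # only needed if not using an IAM role
# AWS_S3_UPLOADS_ENDPOINT=https://s3.example.com         # uncomment for S3-compatible providers
# AWS_S3_UPLOADS_FORCE_PATH_STYLE=true                    # uncomment if your provider needs path-style requests
```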
During development, the dummy email transporter writes emails to a local folder.
- HASH_EMAIL_TRANSPORTER: dummy or aws. If set to dummy, the local dummy email transporter will be used during development instead of AWS (default: dummy)
- DUMMY_EMAIL_TRANSPORTER_FILE_PATH: Default is var/api/dummy-email-transporter/email-dumps.yml
- DUMMY_EMAIL_TRANSPORTER_USE_CLIPBOARD: true or false (default: true)
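To watch what the dummy transporter writes (e.g. verification codes) during local development, assuming the default DUMMY_EMAIL_TRANSPORTER_FILE_PATH above:

```sh
# Follow dummy emails as they are dumped to disk during development
tail -f var/api/dummy-email-transporter/email-dumps.yml
```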
NOTE: OpenSearch is disabled by default, and is presently unmaintained.
- HASH_OPENSEARCH_ENABLED: whether OpenSearch is used or not. true or false. (default: false)
- HASH_OPENSEARCH_HOST: the hostname of the OpenSearch cluster to connect to. (default: localhost)
- HASH_OPENSEARCH_PASSWORD: the password to use when making the connection. (default: admin)
- HASH_OPENSEARCH_PORT: the port number that the cluster accepts connections on. (default: 9200)
- HASH_OPENSEARCH_USERNAME: the username to connect to the cluster as. (default: admin)
- HASH_OPENSEARCH_HTTPS_ENABLED: (optional) set to "1" to connect to the cluster over an HTTPS connection.
- POSTGRES_PORT (default: 5432)
Various services also have their own configuration.
The Postgres superuser is configured through:
- POSTGRES_USER (default: postgres)
- POSTGRES_PASSWORD (default: postgres)
The Postgres information for Kratos is configured through:
- HASH_KRATOS_PG_USER (default: kratos)
- HASH_KRATOS_PG_PASSWORD (default: kratos)
- HASH_KRATOS_PG_DATABASE (default: kratos)
The Postgres information for Temporal is configured through:
- HASH_TEMPORAL_PG_USER (default: temporal)
- HASH_TEMPORAL_PG_PASSWORD (default: temporal)
- HASH_TEMPORAL_PG_DATABASE (default: temporal)
- HASH_TEMPORAL_VISIBILITY_PG_DATABASE (default: temporal_visibility)
The Postgres information for the graph query layer is configured through:
- HASH_GRAPH_PG_USER (default: graph)
- HASH_GRAPH_PG_PASSWORD (default: graph)
- HASH_GRAPH_PG_DATABASE (default: graph)
- HASH_REDIS_HOST (default: localhost)
- HASH_REDIS_PORT (default: 6379)
If the service should report metrics to a StatsD server, the following variables must be set.
- STATSD_ENABLED: Set to "1" if the service should report metrics to a StatsD server.
- STATSD_HOST: the hostname of the StatsD server.
- STATSD_PORT: the port number the StatsD server is listening on. (default: 8125)
- HASH_TELEMETRY_ENABLED: whether Snowplow is used or not. true or false. (default: false)
- HASH_TELEMETRY_HTTPS: set to "1" to connect to Snowplow over an HTTPS connection. true or false. (default: false)
- HASH_TELEMETRY_DESTINATION: the hostname of the Snowplow tracker endpoint to connect to. (required)
- HASH_TELEMETRY_APP_ID: ID used to differentiate the application by. Can be any string. (default: hash-workspace-app)
- FRONTEND_URL: URL of the frontend website for links (default: http://localhost:3000)
- NOTIFICATION_POLL_INTERVAL: the interval in milliseconds at which the frontend will poll for new notifications, or 0 for no polling. (default: 10_000)
- HASH_INTEGRATION_QUEUE_NAME: the name of the Redis queue to which entity updates are published
- HASH_REALTIME_PORT: realtime service listening port. (default: 3333)
- HASH_SEARCH_LOADER_PORT: (default: 3838)
- HASH_SEARCH_QUEUE_NAME: the name of the queue to which changes for the search loader service are pushed (default: search)
- API_ORIGIN: the origin that the API service can be reached on (default: http://localhost:5001)
- SESSION_SECRET: the secret used to sign sessions (default: secret)
- LOG_LEVEL: the level of runtime logs that should be emitted; set to debug, info, warn, or error (default: info)
- BLOCK_PROTOCOL_API_KEY: the API key for fetching blocks from the Þ Hub. Generate a key at https://blockprotocol.org/settings/api-keys.
Please see CONTRIBUTING if you're interested in getting involved in the design or development of HASH.
We're also hiring for a number of key roles. Rather than accepting applications for engineering roles in the way a normal company might, we exclusively headhunt (using HASH as a tool to help us find the best people). Contributing to our public monorepo, even in a small way, is one way of guaranteeing you end up on our radar, as every PR is reviewed by a human as well as by AI.
We also provide repo-specific example configuration files you can use for popular IDEs, including VSCode and Zed.
The vast majority of this repository is published as free, open-source software. Please see LICENSE for more information about the specific licenses under which the different parts are available.
Please see SECURITY for instructions around reporting issues, and details of which package versions we actively support.
Find us on 𝕏 at @hashintel, email [email protected], create a discussion, or open an issue for quick help and community support.
Project permalink: https://github.com/hashintel/hash