llm-graph-builder

Neo4j graph construction from unstructured data using LLMs

Stars: 3247


Knowledge Graph Builder App is a tool designed to convert PDF documents into a structured knowledge graph stored in Neo4j. It utilizes OpenAI's GPT/Diffbot LLM to extract nodes, relationships, and properties from PDF text content. Users can upload files from local machine or S3 bucket, choose LLM model, and create a knowledge graph. The app integrates with Neo4j for easy visualization and querying of extracted information.

README:

Knowledge Graph Builder App

Creating knowledge graphs from unstructured data

LLM Graph Builder

Overview

This application turns unstructured data (PDFs, docs, txt, YouTube videos, web pages, etc.) into a knowledge graph stored in Neo4j. It uses large language models (OpenAI, Gemini, etc.) to extract nodes, relationships, and their properties from the text, and creates a structured knowledge graph using the LangChain framework.

Upload your files from a local machine, a GCS or S3 bucket, or web sources, choose your LLM model, and generate a knowledge graph.

Key Features

Knowledge Graph Creation: Transform unstructured data into structured knowledge graphs using LLMs.
Providing Schema: Provide your own custom schema or use an existing schema in settings to generate the graph.
View Graph: View the graph for one source or multiple sources at a time in Bloom.
Chat with Data: Interact with your data in a Neo4j database through conversational queries, and retrieve metadata about the sources behind each response. For a dedicated chat interface, use the standalone Chat-Only application, which provides a focused chat experience for querying your data.

Getting started

⚠️ You need a Neo4j Database 5.23 or later with APOC installed to use this Knowledge Graph Builder. You can use any Neo4j Aura database (including the free database). If you are using Neo4j Desktop, you cannot use docker-compose and must instead follow the section on running the backend and frontend separately. ⚠️
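
A minimal sketch of how the version requirement above could be checked before deploying. The helper names are hypothetical and not part of this project; the two Cypher strings are standard Neo4j/APOC calls that you would run yourself (for example in Neo4j Browser) to obtain the version text and to confirm APOC is installed.

```python
import re

# Minimum Neo4j version stated in this README.
MIN_VERSION = (5, 23)

# Cypher you could run to collect the inputs (not executed by this sketch).
VERSION_QUERY = "CALL dbms.components() YIELD versions RETURN versions[0] AS version"
APOC_QUERY = "RETURN apoc.version() AS apoc_version"  # fails if APOC is missing


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the leading numeric components, e.g. '5.23.0' -> (5, 23, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"not a version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(version: str, minimum: tuple[int, ...] = MIN_VERSION) -> bool:
    """Return True when `version` is at least `minimum` (hypothetical helper)."""
    return parse_version(version)[: len(minimum)] >= minimum
```

Pass the string returned by the version query to `meets_minimum`; suffixes such as `-aura` are ignored by the parser.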

Deployment

Local deployment

Running through docker-compose

By default, only OpenAI and Diffbot are enabled, since Gemini requires extra GCP configuration. The available models are configured per environment through the VITE_LLM_MODELS_PROD variable, which you can set according to your needs.

Example:

VITE_LLM_MODELS_PROD="openai_gpt_4o,openai_gpt_4o_mini,diffbot,gemini_1.5_flash"

Additional configs

By default, the input sources are local files, YouTube, Wikipedia, AWS S3, and web pages, because this default config is applied:

VITE_REACT_APP_SOURCES="local,youtube,wiki,s3,web"

If however you want the Google GCS integration, add gcs and your Google client ID:

VITE_REACT_APP_SOURCES="local,youtube,wiki,s3,gcs,web"
VITE_GOOGLE_CLIENT_ID="xxxx"

You can of course combine all (local, youtube, wikipedia, s3 and gcs) or remove any you don't want/need.

Chat Modes

By default, all of the chat modes are available: vector, graph_vector, graph, fulltext, graph_vector_fulltext, entity_vector, and global_vector.

If no mode is listed in the chat modes variable, all modes are available:

VITE_CHAT_MODES=""

If, however, you want to restrict the chat to specific modes, for example only vector and graph, list them in the env:

VITE_CHAT_MODES="vector,graph"
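
The rule described above (an empty value enables every mode, a comma-separated value restricts them) can be sketched as follows. This is an illustration only, not the frontend's actual code; `resolve_chat_modes` is a hypothetical name, and silently dropping unknown mode names is an assumption.

```python
# Mode names as listed in this README.
ALL_CHAT_MODES = [
    "vector", "graph_vector", "graph", "fulltext",
    "graph_vector_fulltext", "entity_vector", "global_vector",
]


def resolve_chat_modes(raw: str) -> list[str]:
    """Turn a VITE_CHAT_MODES-style string into the list of enabled modes."""
    requested = [mode.strip() for mode in raw.split(",") if mode.strip()]
    if not requested:
        # Empty variable: every mode stays available.
        return list(ALL_CHAT_MODES)
    # Assumption: names that are not known modes are ignored.
    return [mode for mode in requested if mode in ALL_CHAT_MODES]
```

For example, `resolve_chat_modes("vector,graph")` keeps only those two modes, while `resolve_chat_modes("")` returns the full list.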

Running Backend and Frontend separately (dev environment)

Alternatively, you can run the backend and frontend separately:

For the frontend:

Create the frontend/.env file by copy/pasting the frontend/example.env.
Change values as needed
```
cd frontend
yarn
yarn run dev
```

For the backend:

Create the backend/.env file by copy/pasting the backend/example.env. To streamline the initial setup and testing of the application, you can preconfigure user credentials directly within the backend .env file. This bypasses the login dialog and allows you to immediately connect with a predefined user.
- NEO4J_URI:
- NEO4J_USERNAME:
- NEO4J_PASSWORD:
- NEO4J_DATABASE:
Change values as needed

cd backend
python -m venv envName
source envName/bin/activate 
pip install -r requirements.txt
uvicorn score:app --reload
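
Before starting the backend with preconfigured credentials, it can help to confirm that the four connection variables listed above are actually set. The following is a small sketch under that assumption; `missing_neo4j_settings` is a hypothetical helper, not something shipped with the backend.

```python
import os
from typing import Mapping

# Connection variables named in the backend .env instructions.
NEO4J_KEYS = ("NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD", "NEO4J_DATABASE")


def missing_neo4j_settings(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the connection variables that are unset or blank."""
    return [key for key in NEO4J_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_neo4j_settings()
    if missing:
        print("Missing settings (the login dialog will be needed):", ", ".join(missing))
    else:
        print("All Neo4j connection settings are present.")
```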

Deploy in Cloud

To deploy the app and packages on Google Cloud Platform, run the following commands for Google Cloud Run:

# Frontend deploy 
gcloud run deploy dev-frontend 
source location current directory > Frontend
region : 32 [us-central 1]
Allow unauthenticated request : Yes

# Backend deploy 
gcloud run deploy --set-env-vars "OPENAI_API_KEY = " --set-env-vars "DIFFBOT_API_KEY = " --set-env-vars "NEO4J_URI = " --set-env-vars "NEO4J_PASSWORD = " --set-env-vars "NEO4J_USERNAME = "
source location current directory > Backend
region : 32 [us-central 1]
Allow unauthenticated request : Yes
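
The interactive answers above can also be passed as flags. The sketch below assembles a non-interactive `gcloud run deploy` command; the `--set-env-vars` flag accepts a comma-separated list of KEY=VALUE pairs. The service name `dev-backend` and the helper `build_deploy_command` are assumptions for illustration, and values containing commas are rejected here rather than escaped.

```python
def build_deploy_command(service: str, env: dict[str, str]) -> list[str]:
    """Return an argv list for `gcloud run deploy` with env vars as KEY=VALUE pairs."""
    for key, value in env.items():
        if "," in value or " " in key:
            raise ValueError(f"unsupported characters in setting {key!r}")
    pairs = ",".join(f"{key}={value}" for key, value in env.items())
    return [
        "gcloud", "run", "deploy", service,
        "--source", ".",              # deploy from the current directory
        "--region", "us-central1",    # region chosen in the prompts above
        "--allow-unauthenticated",    # "Allow unauthenticated request: Yes"
        f"--set-env-vars={pairs}",
    ]


if __name__ == "__main__":
    # Hypothetical service name and placeholder values.
    command = build_deploy_command(
        "dev-backend",
        {"NEO4J_URI": "neo4j+s://example", "NEO4J_USERNAME": "neo4j"},
    )
    print(" ".join(command))
```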

ENV

Env Variable Name	Mandatory/Optional	Default Value	Description

BACKEND ENV
OPENAI_API_KEY	Mandatory		An OpenAI API key is required to use OpenAI LLM models and to authenticate and track requests
DIFFBOT_API_KEY	Mandatory		An API key is required to use Diffbot's NLP service to extract entities and relationships from unstructured data
BUCKET	Mandatory		Bucket name for storing uploaded files on GCS
NEO4J_USER_AGENT	Optional	llm-graph-builder	Name of the user agent to track neo4j database activity
ENABLE_USER_AGENT	Optional	true	Boolean value to enable/disable neo4j user agent
DUPLICATE_TEXT_DISTANCE	Mandatory	5	This value is used to find the distance for all node pairs in the graph and is calculated based on node properties
DUPLICATE_SCORE_VALUE	Mandatory	0.97	Node score value used to match duplicate nodes
EFFECTIVE_SEARCH_RATIO	Mandatory	1
GRAPH_CLEANUP_MODEL	Optional	0.97	Model name to clean-up graph in post processing
MAX_TOKEN_CHUNK_SIZE	Optional	10000	Maximum token size to process file content
YOUTUBE_TRANSCRIPT_PROXY	Optional		Proxy key to process youtube video for getting transcript
EMBEDDING_MODEL	Optional	all-MiniLM-L6-v2	Model for generating the text embedding (all-MiniLM-L6-v2 , openai , vertexai)
IS_EMBEDDING	Optional	true	Flag to enable text embedding
KNN_MIN_SCORE	Optional	0.94	Minimum score for KNN algorithm
GEMINI_ENABLED	Optional	False	Flag to enable Gemini
GCP_LOG_METRICS_ENABLED	Optional	False	Flag to enable Google Cloud logs
NUMBER_OF_CHUNKS_TO_COMBINE	Optional	5	Number of chunks to combine when processing embeddings
UPDATE_GRAPH_CHUNKS_PROCESSED	Optional	20	Number of chunks processed before updating progress
NEO4J_URI	Optional	neo4j://database:7687	URI for Neo4j database
NEO4J_USERNAME	Optional	neo4j	Username for Neo4j database
NEO4J_PASSWORD	Optional	password	Password for Neo4j database
LANGCHAIN_API_KEY	Optional		API key for Langchain
LANGCHAIN_PROJECT	Optional		Project for Langchain
LANGCHAIN_TRACING_V2	Optional	true	Flag to enable Langchain tracing
GCS_FILE_CACHE	Optional	False	If set to True, will save the files to process into GCS. If set to False, will save the files locally
LANGCHAIN_ENDPOINT	Optional	https://api.smith.langchain.com	Endpoint for Langchain API
ENTITY_EMBEDDING	Optional	False	If set to True, embeddings are added for each entity in the database
LLM_MODEL_CONFIG_ollama_<model_name>	Optional		Set the Ollama config as model_name,model_local_url for local deployments
RAGAS_EMBEDDING_MODEL	Optional	openai	Embedding model used by the Ragas evaluation framework
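
As a rough illustration of how the defaults in the table above might be applied when a variable is not set, here is a sketch that reads a handful of the backend variables. The `BackendSettings` class and `load_backend_settings` function are hypothetical and do not reflect how the backend actually loads its configuration; the default values are copied from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings such as 'true' or 'True'."""
    return value.strip().lower() in {"1", "true", "yes"}


@dataclass(frozen=True)
class BackendSettings:
    embedding_model: str
    is_embedding: bool
    knn_min_score: float
    max_token_chunk_size: int
    duplicate_score_value: float
    duplicate_text_distance: int


def load_backend_settings(env: Mapping[str, str] = os.environ) -> BackendSettings:
    """Read a subset of backend variables, falling back to the documented defaults."""
    return BackendSettings(
        embedding_model=env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        is_embedding=_as_bool(env.get("IS_EMBEDDING", "true")),
        knn_min_score=float(env.get("KNN_MIN_SCORE", "0.94")),
        max_token_chunk_size=int(env.get("MAX_TOKEN_CHUNK_SIZE", "10000")),
        duplicate_score_value=float(env.get("DUPLICATE_SCORE_VALUE", "0.97")),
        duplicate_text_distance=int(env.get("DUPLICATE_TEXT_DISTANCE", "5")),
    )
```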

FRONTEND ENV
VITE_BACKEND_API_URL	Optional	http://localhost:8000	URL for backend API
VITE_BLOOM_URL	Optional	https://workspace-preview.neo4j.io/workspace/explore?connectURL={CONNECT_URL}&search=Show+me+a+graph&featureGenAISuggestions=true&featureGenAISuggestionsInternal=true	URL for Bloom visualization
VITE_REACT_APP_SOURCES	Mandatory	local,youtube,wiki,s3	List of input sources that will be available
VITE_CHAT_MODES	Mandatory	vector,graph+vector,graph,hybrid	Chat modes available for Q&A
VITE_ENV	Mandatory	DEV or PROD	Environment variable for the app
VITE_TIME_PER_PAGE	Optional	50	Time per page for processing
VITE_CHUNK_SIZE	Optional	5242880	Size of each chunk of file for upload
VITE_GOOGLE_CLIENT_ID	Optional		Client ID for Google authentication
VITE_LLM_MODELS_PROD	Optional	openai_gpt_4o,openai_gpt_4o_mini,diffbot,gemini_1.5_flash	Models to offer depending on the environment (PROD or DEV)
VITE_LLM_MODELS	Optional	'diffbot,openai_gpt_3.5,openai_gpt_4o,openai_gpt_4o_mini,gemini_1.5_pro,gemini_1.5_flash,azure_ai_gpt_35,azure_ai_gpt_4o,ollama_llama3,groq_llama3_70b,anthropic_claude_3_5_sonnet'	Supported Models For the application
VITE_AUTH0_CLIENT_ID	Mandatory if you are enabling authentication, otherwise optional		Okta OAuth client ID for authentication
VITE_AUTH0_DOMAIN	Mandatory if you are enabling authentication, otherwise optional		Okta OAuth client domain
VITE_SKIP_AUTH	Optional	true	Flag to skip the authentication
VITE_CHUNK_OVERLAP	Optional	20	Variable to configure chunk overlap
VITE_TOKENS_PER_CHUNK	Optional	100	Variable to configure the token count per chunk. This gives flexibility to users who require different chunk sizes for various tokenization tasks, especially when working with large datasets or specific language models.
VITE_CHUNK_TO_COMBINE	Optional	1	Variable to configure the number of chunks to combine for parallel processing.
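
To make the interaction between VITE_TOKENS_PER_CHUNK and VITE_CHUNK_OVERLAP more concrete, here is a generic sliding-window sketch. It assumes the overlap is measured in tokens and uses a plain list of strings as stand-in tokens; the application's real splitter may behave differently, and `chunk_tokens` is a hypothetical name.

```python
def chunk_tokens(
    tokens: list[str],
    tokens_per_chunk: int = 100,  # default of VITE_TOKENS_PER_CHUNK
    overlap: int = 20,            # default of VITE_CHUNK_OVERLAP (assumed to be in tokens)
) -> list[list[str]]:
    """Split tokens into windows where each window repeats the tail of the previous one."""
    if tokens_per_chunk <= 0:
        raise ValueError("tokens_per_chunk must be positive")
    if not 0 <= overlap < tokens_per_chunk:
        raise ValueError("overlap must be non-negative and smaller than tokens_per_chunk")
    step = tokens_per_chunk - overlap
    chunks: list[list[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + tokens_per_chunk])
        if start + tokens_per_chunk >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks
```

With the defaults, each new chunk starts 80 tokens after the previous one, so consecutive chunks share 20 tokens.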

LLMs Supported

OpenAI
Gemini
Diffbot
Azure OpenAI (dev deployed version)
Anthropic (dev deployed version)
Fireworks (dev deployed version)
Groq (dev deployed version)
Amazon Bedrock (dev deployed version)
Ollama (dev deployed version)
Deepseek (dev deployed version)
Other OpenAI-compatible base URL models (dev deployed version)

For local LLMs (Ollama)

Pull the Ollama Docker image

docker pull ollama/ollama

Run the ollama docker image

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

Pull specific ollama model.

ollama pull llama3

Execute any LLM model, e.g. llama3

docker exec -it ollama ollama run llama3

Configure env variable in docker compose.

LLM_MODEL_CONFIG_ollama_<model_name>
#example
LLM_MODEL_CONFIG_ollama_llama3=${LLM_MODEL_CONFIG_ollama_llama3-llama3,http://host.docker.internal:11434}
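
The value of this variable follows the `model_name,model_local_url` format. A minimal sketch of splitting such a value is shown below; `parse_ollama_config` is a hypothetical helper for illustration and not the backend's own parsing code.

```python
def parse_ollama_config(value: str) -> tuple[str, str]:
    """Split 'model_name,model_local_url' into its two parts."""
    model_name, separator, base_url = value.partition(",")
    model_name, base_url = model_name.strip(), base_url.strip()
    if not separator or not model_name or not base_url:
        raise ValueError("expected 'model_name,model_local_url'")
    return model_name, base_url


if __name__ == "__main__":
    name, url = parse_ollama_config("llama3,http://host.docker.internal:11434")
    print(f"model={name} url={url}")
```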

Configure the backend API url

VITE_BACKEND_API_URL=${VITE_BACKEND_API_URL-backendurl}

Open the application in browser and select the ollama model for the extraction.
Enjoy Graph Building.

Usage

Connect to a Neo4j Aura instance, which can be either AURA DS or AURA DB, by passing the URI and password through the backend env, filling in the login dialog, or dragging and dropping the Neo4j credentials file.
To differentiate the two, different icons are shown right under the Neo4j Connection details label: a database icon for AURA DB and a scientific molecule icon for AURA DS.
Choose your source from the list of unstructured sources to create a graph.
Change the LLM (if required) from the drop-down; it will be used to generate the graph.
Optionally, define a schema (node and relationship labels) in the entity graph extraction settings.
Either select multiple files and click 'Generate Graph', or all files in 'New' status will be processed for graph creation.
Have a look at the graph for individual files using 'View' in the grid, or select one or more files and click 'Preview Graph'.
Ask the chatbot questions related to the processed/completed sources, and get detailed information about the answers generated by the LLM.

Links

LLM Knowledge Graph Builder Application

Neo4j Workspace

Reference

Demo of application

Contact

For any inquiries or support, feel free to raise a GitHub issue.

Happy Graph Building!

