ChatOpsLLM
To simplify and streamline LLM operations, empowering developers and organizations to harness the full potential of large language models with ease.
Stars: 87
ChatOpsLLM is a project designed to empower chatbots with effortless DevOps capabilities. It provides an intuitive interface and streamlined workflows for managing and scaling language models. The project incorporates robust MLOps practices, including CI/CD pipelines with Jenkins and Ansible, monitoring with Prometheus and Grafana, and centralized logging with the ELK stack. Developers can find detailed documentation and instructions on the project's website.
README:
ChatOpsLLM: Empowering Chatbots with Effortless DevOps.
ChatOpsLLM is a project built with Open WebUI that can be deployed on Google Kubernetes Engine (GKE) for managing and scaling language models. It offers both Terraform and manual deployment methods, and incorporates robust MLOps practices. This includes CI/CD pipelines with Jenkins and Ansible for automation, monitoring with Prometheus and Grafana for performance insights, and centralized logging with the ELK stack for troubleshooting and analysis. Developers can find detailed documentation and instructions on the project's website.
https://github.com/user-attachments/assets/cf84a434-0dae-47b9-a93d-49a37965d968
- Ease of Use: ChatOpsLLM provides an intuitive interface and streamlined workflows that make managing LLMs simple and efficient, regardless of your experience level.
- Scalability & Flexibility: Scale your LLM deployments effortlessly, adapt to evolving needs, and integrate seamlessly with your existing infrastructure.
- Reduced Complexity: Eliminate the hassle of complex configurations and infrastructure management, allowing you to focus on building and deploying powerful LLM applications.
- Enhanced Productivity: Accelerate your LLM development lifecycle, optimize performance, and maximize the impact of your language models.
- Developers building and deploying LLM-powered applications.
- Data scientists and machine learning engineers working with LLMs.
- DevOps teams responsible for managing LLM infrastructure.
- Organizations looking to integrate LLMs into their operations.
- Introduction
- Features
- Target Audience
- Getting Started
- Quick Start
- Using Terraform for Google Kubernetes Engine (GKE)
- Manual Deployment to GKE
- Continuous Integration/Continuous Deployment (CI/CD) with Jenkins and Ansible
- Monitoring with Prometheus and Grafana
- Logging with Filebeat + Logstash + Elasticsearch + Kibana
- Optimize Cluster with Cast AI
- Log and Trace with Langfuse and Supabase
- Contributing
- License
- Citation
- Contact
If you don't want to spend much time, run this script and enjoy your coffee:
chmod +x ./cluster.sh
./cluster.sh
Remember to authenticate with GCP before using Terraform:
gcloud auth application-default login
This section provides a very quick start guide to get the application up and running as soon as possible. Please refer to the following sections for more detailed instructions.
If you're deploying the application to GKE, you can use Terraform to automate the setup of your Kubernetes cluster. Navigate to the iac/terraform directory and initialize Terraform:
cd iac/terraform
terraform init
1. Plan and Apply Configuration:
Generate an execution plan to verify the resources that Terraform will create or modify, and then apply the configuration to set up the cluster:
terraform plan
terraform apply
2. Retrieve Cluster Information:
To interact with your GKE cluster, you'll need to retrieve its configuration. You can view the current cluster configuration with the following command:
cat ~/.kube/config
https://github.com/user-attachments/assets/3133c2a8-8475-45c6-8900-96c2af8c5ad5
Ensure your kubectl context is set correctly to manage the cluster.
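If kubectl does not yet point at the new cluster, you can fetch credentials with gcloud and double-check the active context. This is a minimal sketch; the cluster name, zone, and project are placeholders for whatever your Terraform configuration created:
gcloud container clusters get-credentials <CLUSTER_NAME> --zone <ZONE> --project <PROJECT_ID>
kubectl config current-context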
For a more hands-on deployment process, follow these steps:
1. Deploy Nginx Ingress Controller:
The Nginx Ingress Controller manages external access to services in your Kubernetes cluster. Create a namespace and install the Ingress Controller using Helm:
kubectl create ns nginx-system
kubens nginx-system
helm upgrade --install nginx-ingress ./deployments/nginx-ingress
Please store the Nginx Ingress Controller's IP address, as you'll need it later.
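One way to look up that IP address (assuming the chart exposes a LoadBalancer service in the nginx-system namespace) is to list the services and read the EXTERNAL-IP column:
kubectl get svc -n nginx-system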
https://github.com/user-attachments/assets/f329a8ee-cd4d-44e8-bb12-d1ff39dce4b8
2. Store Environment Variables in Kubernetes Secrets:
Store your environment variables, such as API keys, securely in Kubernetes secrets. Create a namespace for model serving and create a secret from your .env file:
kubectl create ns model-serving
kubens model-serving
kubectl delete secret ChatOpsLLM-env
kubectl create secret generic ChatOpsLLM-env --from-env-file=.env -n model-serving
kubectl describe secret ChatOpsLLM-env -n model-serving
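For reference, the .env file is a plain list of KEY=VALUE pairs; the keys below are purely hypothetical placeholders, not the variable names this project requires (check .env.example for those):
# hypothetical example only -- use the keys defined in .env.example
OPENAI_API_KEY=<your-openai-api-key>
REDIS_URL=<your-redis-connection-string>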
https://github.com/user-attachments/assets/fab6aa93-2f68-4f36-a4d8-4a1d955596f2
3. Apply Permissions:
Kubernetes resources often require specific permissions. Apply the necessary roles and bindings:
cd deployments/infrastructure
kubectl apply -f role.yaml
kubectl apply -f rolebinding.yaml
https://github.com/user-attachments/assets/9c1aa6e1-6b8c-4332-ab11-513428ef763b
4. Deploy caching service using Redis:
Now, deploy the semantic caching service using Redis:
cd ./deployments/redis
helm dependency build
helm upgrade --install redis .
https://github.com/user-attachments/assets/ef37626a-9a98-473e-a7e0-effcaa262ad5
5. Deploy the LiteLLM service:
kubens model-serving
helm upgrade --install litellm ./deployments/litellm
https://github.com/user-attachments/assets/0c98fe90-f958-42fc-9fa6-224dcf417e29
6. Deploy the Open WebUI:
Next, deploy the web UI to your GKE cluster:
cd open-webui
kubectl apply -f ./kubernetes/manifest/base -n model-serving
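Before moving on, you can check that the Open WebUI pods and services have come up in the namespace:
kubectl get pods -n model-serving
kubectl get svc -n model-serving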
https://github.com/user-attachments/assets/60ad30e3-e8f8-49a6-ab96-d895fe7986cb
7. Play around with the Application:
Open your browser and navigate to the IP address of your Nginx Ingress Controller from step 1 (e.g. http://172.0.0.0), then add .nip.io to the end of the URL (e.g. http://172.0.0.0.nip.io). You should see the Open WebUI:
https://github.com/user-attachments/assets/4115a1f0-e513-4c58-a359-1d49683905a8
For automated CI/CD pipelines, use Jenkins and Ansible as follows:
First, create a Service Account and assign it the Compute Admin role. Then create a JSON key file for the Service Account and store it in the iac/ansible/secrets directory.
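If you prefer the CLI over the console, a rough gcloud equivalent looks like this (the service account name, the key filename, and <PROJECT_ID> are placeholders):
gcloud iam service-accounts create jenkins-deployer --display-name="Jenkins deployer"
gcloud projects add-iam-policy-binding <PROJECT_ID> \
  --member="serviceAccount:jenkins-deployer@<PROJECT_ID>.iam.gserviceaccount.com" \
  --role="roles/compute.admin"
gcloud iam service-accounts keys create iac/ansible/secrets/compute-sa-key.json \
  --iam-account=jenkins-deployer@<PROJECT_ID>.iam.gserviceaccount.com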
Next, create a Google Compute Engine instance named "jenkins-server" running Ubuntu 22.04 with a firewall rule allowing traffic on ports 8081 and 50000:
ansible-playbook iac/ansible/deploy_jenkins/create_compute_instance.yaml
Deploy Jenkins on the server by installing prerequisites, pulling a Docker image, and creating a privileged container with access to the Docker socket and exposed ports 8081 and 50000:
ansible-playbook -i iac/ansible/inventory iac/ansible/deploy_jenkins/deploy_jenkins.yaml
https://github.com/user-attachments/assets/35dae326-aa8f-4779-bf67-2b8d9f71487b
To access the Jenkins server through SSH, we need to create a public/private key pair. Run the following command to create a key pair:
ssh-keygen
Then open the instance's Metadata settings in the GCP console and add your public key to the ssh-keys value.
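The value to paste into ssh-keys is the contents of the public key file; assuming you accepted ssh-keygen's default path above, that is:
cat ~/.ssh/id_rsa.pub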
https://github.com/user-attachments/assets/8fd956be-d2db-4d85-aa7c-f78df160c00c
We need to find the Jenkins server password to be able to access the server. First, access the Jenkins server:
ssh <USERNAME>@<EXTERNAL_IP>
Then run the following command to get the password:
sudo docker exec -it jenkins-server bash
cat /var/jenkins_home/secrets/initialAdminPassword
https://github.com/user-attachments/assets/08cb4183-a383-4dd2-89e3-da6e74b92d04
Once Jenkins is deployed, access it via your browser:
http://<EXTERNAL_IP>:8081
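If you don't have the external IP handy, you can look it up with gcloud (the instance name matches the one created earlier):
gcloud compute instances list --filter="name=jenkins-server"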
https://github.com/user-attachments/assets/4f0d3287-39ec-40e7-b333-9287ee37f9fc
Install the following plugins to integrate Jenkins with Docker, Kubernetes, and GKE:
- Docker
- Docker Pipeline
- Kubernetes
- Google Kubernetes Engine
After installing the plugins, restart Jenkins.
sudo docker restart jenkins-server
https://github.com/user-attachments/assets/923f7aff-3983-4b3d-8ef5-17d2285aed63
4.1. Add webhooks to your GitHub repository to trigger Jenkins builds.
Go to the GitHub repository and click on Settings. Click on Webhooks and then click on Add Webhook. Enter the URL of your Jenkins server (e.g. http://<EXTERNAL_IP>:8081/github-webhook/). Then click on Let me select individual events, select Push and Pull Request, and click on Add Webhook.
https://github.com/user-attachments/assets/d6ec020a-3e93-4ce8-bf80-b9f63b227635
4.2. Add the GitHub repository as a Jenkins source code repository.
Go to the Jenkins dashboard and click on New Item. Enter a name for your project (e.g. easy-llmops) and select Multibranch Pipeline. Click on OK. Click on Configure and then click on Add Source. Select GitHub and click on Add. Enter the URL of your GitHub repository (e.g. https://github.com/bmd1905/ChatOpsLLM). In the Credentials field, select Add and select Username with password. Enter your GitHub username and password (or use a personal access token). Click on Test Connection and then click on Save.
https://github.com/user-attachments/assets/57c97866-caf3-4864-92c9-b91863822591
4.3. Set up Docker Hub credentials.
First, create a Docker Hub account. Go to the Docker Hub website and click on Sign Up. Enter your username and password. Click on Sign Up. Click on Create Repository. Enter a name for your repository (e.g. easy-llmops) and click on Create.
From the Jenkins dashboard, go to Manage Jenkins > Credentials. Click on Add Credentials. Select Username with password and click on Add. Enter your Docker Hub username and access token, and set ID to dockerhub.
https://github.com/user-attachments/assets/3df2f7e2-d284-4da9-82fb-cc65ebb6240b
4.4. Set up Kubernetes credentials.
First, create a Service Account for the Jenkins server to access the GKE cluster. Go to the GCP console and navigate to IAM & Admin > Service Accounts. Create a new service account with the Kubernetes Engine Admin role. Give the service account a name and description. Click on the service account and then click on the Keys tab. Click on Add Key and select JSON as the key type. Click on Create and download the JSON file.
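The same steps can be scripted with gcloud if you prefer (the account name, key path, and <PROJECT_ID> are placeholders; roles/container.admin is the Kubernetes Engine Admin role):
gcloud iam service-accounts create jenkins-gke --display-name="Jenkins GKE access"
gcloud projects add-iam-policy-binding <PROJECT_ID> \
  --member="serviceAccount:jenkins-gke@<PROJECT_ID>.iam.gserviceaccount.com" \
  --role="roles/container.admin"
gcloud iam service-accounts keys create jenkins-gke-key.json \
  --iam-account=jenkins-gke@<PROJECT_ID>.iam.gserviceaccount.com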
https://github.com/user-attachments/assets/d294a5a3-8a3d-4271-b20c-3ebf237f4005
Then, from the Jenkins dashboard, go to Manage Jenkins > Cloud. Click on New cloud. Select Kubernetes. Enter the name of your cluster (e.g. gke-easy-llmops-cluster-1), and enter the URL and Certificate from your GKE cluster. In the Kubernetes Namespace field, enter the namespace of your cluster (e.g. model-serving). In the Credentials field, select Add and select Google Service Account from private key. Enter your project ID and the path to the JSON file.
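The cluster URL and CA certificate that this form asks for can be pulled from gcloud (cluster name and zone are placeholders; the URL is https:// followed by the endpoint):
gcloud container clusters describe <CLUSTER_NAME> --zone <ZONE> --format='value(endpoint)'
gcloud container clusters describe <CLUSTER_NAME> --zone <ZONE> --format='value(masterAuth.clusterCaCertificate)' | base64 -d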
https://github.com/user-attachments/assets/489ce405-a31f-4f56-94bb-faebe1edd849
Push a new commit to your GitHub repository. You should see a new build in Jenkins.
https://github.com/user-attachments/assets/7f4d9286-b41f-4218-a970-fd45c8ecd01c
1. Create a Discord Webhook:
First, create a Discord webhook. Go to the Discord website and click on Server Settings. Click on Integrations. Click on Create Webhook. Enter a name for your webhook (e.g. easy-llmops-discord-webhook) and click on Create. Copy the webhook URL.
https://github.com/user-attachments/assets/2f1258f0-b3c7-4b3b-8cc4-802034600a82
2. Configure Helm Repositories
First, we need to add the necessary Helm repositories for Prometheus and Grafana:
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
These commands add the official Prometheus and Grafana Helm repositories and update your local Helm chart information.
Prometheus requires certain dependencies that can be managed with Helm. Navigate to the monitoring directory and build these dependencies:
helm dependency build ./deployments/monitoring/kube-prometheus-stack
Now, we'll deploy Prometheus and its associated services using Helm:
kubectl create namespace monitoring
helm upgrade --install -f deployments/monitoring/kube-prometheus-stack.expanded.yaml kube-prometheus-stack deployments/monitoring/kube-prometheus-stack -n monitoring
This command does the following:
- helm upgrade --install: installs Prometheus if it doesn't exist, or upgrades it if it does.
- -f deployments/monitoring/kube-prometheus-stack.expanded.yaml: specifies a custom values file for configuration.
- kube-prometheus-stack: the release name for the Helm installation.
- deployments/monitoring/kube-prometheus-stack: the chart to use for installation.
- -n monitoring: the namespace to install into.
https://github.com/user-attachments/assets/6828527c-9561-42bc-a221-fbbaf9097233
By default, the services are not exposed externally. To access them, you can use port-forwarding:
For Prometheus:
kubectl port-forward -n monitoring svc/kube-prometheus-stack-prometheus 9090:9090
Then access Prometheus at http://localhost:9090
For Grafana:
kubectl port-forward -n monitoring svc/kube-prometheus-stack-grafana 3000:80
Then access Grafana at http://localhost:3000
The default credentials for Grafana are usually:
- Username: admin
- Password: prom-operator (you should change this immediately)
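If the chart was configured to generate a random admin password instead, it usually lands in the release's Grafana secret; a lookup under the release name used above (secret name assumed) is:
kubectl get secret -n monitoring kube-prometheus-stack-grafana -o jsonpath="{.data.admin-password}" | base64 -d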
https://github.com/user-attachments/assets/a9a2e7f7-0a88-4e21-ba63-7a3f993d1c78
First, we need to create a sample alert. Navigate to the monitoring directory and run the following command:
kubectl port-forward -n monitoring svc/alertmanager-operated 9093:9093
Then, in a new terminal, run the following command:
curl -XPOST -H "Content-Type: application/json" -d '[
{
"labels": {
"alertname": "DiskSpaceLow",
"severity": "critical",
"instance": "server02",
"job": "node_exporter",
"mountpoint": "/data"
},
"annotations": {
"summary": "Disk space critically low",
"description": "Server02 has only 5% free disk space on /data volume"
},
"startsAt": "2023-09-01T12:00:00Z",
"generatorURL": "http://prometheus.example.com/graph?g0.expr=node_filesystem_free_bytes+%2F+node_filesystem_size_bytes+%2A+100+%3C+5"
},
{
"labels": {
"alertname": "HighMemoryUsage",
"severity": "warning",
"instance": "server03",
"job": "node_exporter"
},
"annotations": {
"summary": "High memory usage detected",
"description": "Server03 is using over 90% of its available memory"
},
"startsAt": "2023-09-01T12:05:00Z",
"generatorURL": "http://prometheus.example.com/graph?g0.expr=node_memory_MemAvailable_bytes+%2F+node_memory_MemTotal_bytes+%2A+100+%3C+10"
}
]' http://localhost:9093/api/v2/alerts
This command creates a sample alert. You can verify that the alert was created by running the following command:
curl http://localhost:9093/api/v2/status
Or, you can manually check the Discord channel.
https://github.com/user-attachments/assets/a5716e8c-ecd1-4457-80e9-27f23518bd1b
This setup provides comprehensive monitoring capabilities for your Kubernetes cluster. With Prometheus collecting metrics and Grafana visualizing them, you can effectively track performance, set up alerts for potential issues, and gain valuable insights into your infrastructure and applications.
Centralized logging is essential for monitoring and troubleshooting applications deployed on Kubernetes. This section guides you through setting up an ELK stack (Elasticsearch, Logstash, Kibana) with Filebeat for logging your GKE cluster.
You can use this single helmfile script to kick off the ELK stack:
cd deployments/ELK
helmfile sync
1. Install ELK Stack with Helm
We will use Helm to deploy the ELK stack components:
- Elasticsearch: Stores the logs.
- Logstash: Processes and filters the logs.
- Kibana: Provides a web UI for visualizing and searching logs.
- Filebeat: Collects logs from your pods and forwards them to Logstash.
First, create a namespace for the logging components:
kubectl create ns logging
kubens logging
Next, install Elasticsearch:
helm install elk-elasticsearch elastic/elasticsearch -f deployments/ELK/elastic.expanded.yaml --namespace logging --create-namespace
Wait for Elasticsearch to be ready:
echo "Waiting for Elasticsearch to be ready..."
kubectl wait --for=condition=ready pod -l app=elasticsearch-master --timeout=300s
Create a secret for Logstash to access Elasticsearch:
kubectl create secret generic logstash-elasticsearch-credentials \
--from-literal=username=elastic \
--from-literal=password=$(kubectl get secrets --namespace=logging elasticsearch-master-credentials -ojsonpath='{.data.password}' | base64 -d)
Install Kibana:
helm install elk-kibana elastic/kibana -f deployments/ELK/kibana.expanded.yaml
Install Logstash:
helm install elk-logstash elastic/logstash -f deployments/ELK/logstash.expanded.yaml
Install Filebeat:
helm install elk-filebeat elastic/filebeat -f deployments/ELK/filebeat.expanded.yaml
https://github.com/user-attachments/assets/75dbde44-6ce4-432d-9851-143e13a60fce
Expose Kibana using a service and access it through your browser:
kubectl port-forward -n logging svc/elk-kibana-kibana 5601:5601
Please use this script to get the Kibana password:
kubectl get secrets --namespace=logging elasticsearch-master-credentials -ojsonpath='{.data.password}' | base64 -d
Open your browser and navigate to http://localhost:5601.
You should now be able to see logs from your Kubernetes pods in Kibana. You can create dashboards and visualizations to analyze your logs and gain insights into your application's behavior.
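If logs are not showing up, a quick sanity check is to confirm that every ELK component and the Filebeat daemonset are running in the logging namespace:
kubectl get pods -n logging
kubectl get daemonset -n logging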
https://github.com/user-attachments/assets/a767e143-4fd2-406c-bf9f-9c5714b7404d
Please go to Cast AI to sign up for a free account and get the TOKEN.
Then run this line to connect to GKE:
curl -H "Authorization: Token <TOKEN>" "https://api.cast.ai/v1/agent.yaml?provider=gke" | kubectl apply -f -
Hit I ran this script on Cast AI's UI, then copy the configuration code and paste it into the terminal:
CASTAI_API_TOKEN=<API_TOKEN> CASTAI_CLUSTER_ID=<CASTAI_CLUSTER_ID> CLUSTER_NAME=easy-llmops-gke INSTALL_AUTOSCALER=true INSTALL_POD_PINNER=true INSTALL_SECURITY_AGENT=true LOCATION=asia-southeast1-b PROJECT_ID=easy-llmops /bin/bash -c "$(curl -fsSL 'https://api.cast.ai/v1/scripts/gke/onboarding.sh')"
Hit I ran this script again and wait for the installation to complete.
Then you can see your dashboards on Cast AI's UI:
It's time to optimize your cluster with Cast AI! Go to the Available savings section and click the Rebalance button.
- Langfuse is an open source LLM engineering platform - LLM observability, metrics, evaluations, prompt management.
- Supabase is an open source Firebase alternative. Start your project with a Postgres database, Authentication, instant APIs, Edge Functions, Realtime subscriptions, Storage, and Vector embeddings.
Please go to Langfuse and Supabase to sign up for free accounts and get your API keys, then replace the placeholders in the .env.example file with your API keys.
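The variable names below are illustrative placeholders only; use whatever keys .env.example actually defines:
# illustrative placeholders -- align with the keys in .env.example
LANGFUSE_PUBLIC_KEY=<your-langfuse-public-key>
LANGFUSE_SECRET_KEY=<your-langfuse-secret-key>
SUPABASE_URL=<your-supabase-project-url>
SUPABASE_KEY=<your-supabase-api-key>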
We welcome contributions to ChatOpsLLM! Please see our CONTRIBUTING.md for more information on how to get started.
ChatOpsLLM is released under the MIT License. See the LICENSE file for more details.
If you use ChatOpsLLM in your research, please cite it as follows:
@software{ChatOpsLLM2024,
author = {Minh-Duc Bui},
title = {ChatOpsLLM: Effortless MLOps for Powerful Language Models.},
year = {2024},
url = {https://github.com/bmd1905/ChatOpsLLM}
}
For questions, issues, or collaborations, please open an issue on our GitHub repository or contact the maintainers directly.
Alternative AI tools for ChatOpsLLM
Similar Open Source Tools
langstream
LangStream is a tool for natural language processing tasks, providing a CLI for easy installation and usage. Users can try sample applications like Chat Completions and create their own applications using the developer documentation. It supports running on Kubernetes for production-ready deployment, with support for various Kubernetes distributions and external components like Apache Kafka or Apache Pulsar cluster. Users can deploy LangStream locally using minikube and manage the cluster with mini-langstream. Development requirements include Docker, Java 17, Git, Python 3.11+, and PIP, with the option to test local code changes using mini-langstream.
holmesgpt
HolmesGPT is an open-source DevOps assistant powered by OpenAI or any tool-calling LLM of your choice. It helps in troubleshooting Kubernetes, incident response, ticket management, automated investigation, and runbook automation in plain English. The tool connects to existing observability data, is compliance-friendly, provides transparent results, supports extensible data sources, runbook automation, and integrates with existing workflows. Users can install HolmesGPT using Brew, prebuilt Docker container, Python Poetry, or Docker. The tool requires an API key for functioning and supports OpenAI, Azure AI, and self-hosted LLMs.
shellChatGPT
ShellChatGPT is a shell wrapper for OpenAI's ChatGPT, DALL-E, Whisper, and TTS, featuring integration with LocalAI, Ollama, Gemini, Mistral, Groq, and GitHub Models. It provides text and chat completions, vision, reasoning, and audio models, voice-in and voice-out chatting mode, text editor interface, markdown rendering support, session management, instruction prompt manager, integration with various service providers, command line completion, file picker dialogs, color scheme personalization, stdin and text file input support, and compatibility with Linux, FreeBSD, MacOS, and Termux for a responsive experience.
patchwork
PatchWork is an open-source framework designed for automating development tasks using large language models. It enables users to automate workflows such as PR reviews, bug fixing, security patching, and more through a self-hosted CLI agent and preferred LLMs. The framework consists of reusable atomic actions called Steps, customizable LLM prompts known as Prompt Templates, and LLM-assisted automations called Patchflows. Users can run Patchflows locally in their CLI/IDE or as part of CI/CD pipelines. PatchWork offers predefined patchflows like AutoFix, PRReview, GenerateREADME, DependencyUpgrade, and ResolveIssue, with the flexibility to create custom patchflows. Prompt templates are used to pass queries to LLMs and can be customized. Contributions to new patchflows, steps, and the core framework are encouraged, with chat assistants available to aid in the process. The roadmap includes expanding the patchflow library, introducing a debugger and validation module, supporting large-scale code embeddings, parallelization, fine-tuned models, and an open-source GUI. PatchWork is licensed under AGPL-3.0 terms, while custom patchflows and steps can be shared using the Apache-2.0 licensed patchwork template repository.
llm-vscode
llm-vscode is an extension designed for all things LLM, utilizing llm-ls as its backend. It offers features such as code completion with 'ghost-text' suggestions, the ability to choose models for code generation via HTTP requests, ensuring prompt size fits within the context window, and code attribution checks. Users can configure the backend, suggestion behavior, keybindings, llm-ls settings, and tokenization options. Additionally, the extension supports testing models like Code Llama 13B, Phind/Phind-CodeLlama-34B-v2, and WizardLM/WizardCoder-Python-34B-V1.0. Development involves cloning llm-ls, building it, and setting up the llm-vscode extension for use.
distilabel
Distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency. It helps you synthesize data and provide AI feedback to improve the quality of your AI models. With Distilabel, you can: * **Synthesize data:** Generate synthetic data to train your AI models. This can help you to overcome the challenges of data scarcity and bias. * **Provide AI feedback:** Get feedback from AI models on your data. This can help you to identify errors and improve the quality of your data. * **Improve your AI output quality:** By using Distilabel to synthesize data and provide AI feedback, you can improve the quality of your AI models and get better results.
letta
Letta is an open source framework for building stateful LLM applications. It allows users to build stateful agents with advanced reasoning capabilities and transparent long-term memory. The framework is white box and model-agnostic, enabling users to connect to various LLM API backends. Letta provides a graphical interface, the Letta ADE, for creating, deploying, interacting, and observing with agents. Users can access Letta via REST API, Python, Typescript SDKs, and the ADE. Letta supports persistence by storing agent data in a database, with PostgreSQL recommended for data migrations. Users can install Letta using Docker or pip, with Docker defaulting to PostgreSQL and pip defaulting to SQLite. Letta also offers a CLI tool for interacting with agents. The project is open source and welcomes contributions from the community.
ai-starter-kit
SambaNova AI Starter Kits is a collection of open-source examples and guides designed to facilitate the deployment of AI-driven use cases for developers and enterprises. The kits cover various categories such as Data Ingestion & Preparation, Model Development & Optimization, Intelligent Information Retrieval, and Advanced AI Capabilities. Users can obtain a free API key using SambaNova Cloud or deploy models using SambaStudio. Most examples are written in Python but can be applied to any programming language. The kits provide resources for tasks like text extraction, fine-tuning embeddings, prompt engineering, question-answering, image search, post-call analysis, and more.
WindowsAgentArena
Windows Agent Arena (WAA) is a scalable Windows AI agent platform designed for testing and benchmarking multi-modal, desktop AI agents. It provides researchers and developers with a reproducible and realistic Windows OS environment for AI research, enabling testing of agentic AI workflows across various tasks. WAA supports deploying agents at scale using Azure ML cloud infrastructure, allowing parallel running of multiple agents and delivering quick benchmark results for hundreds of tasks in minutes.
expo-stable-diffusion
The `expo-stable-diffusion` repository provides a tool for generating images using Stable Diffusion natively on iOS devices within Expo and React Native apps. Users can install and configure the module to create images based on prompts. The repository includes information on updating iOS deployment targets, enabling increased memory limits, and building iOS apps. Additionally, users can obtain Stable Diffusion models from various sources. The repository also addresses troubleshooting tips related to model load times and image generation durations. The developer seeks sponsorship to further enhance the project, including adding Android support.
codepair
CodePair is an open-source real-time collaborative markdown editor with AI intelligence, allowing users to collaboratively edit documents, share documents with external parties, and utilize AI intelligence within the editor. It is built using React, NestJS, and LangChain. The repository contains frontend and backend code, with detailed instructions for setting up and running each part. Users can choose between Frontend Development Only Mode or Full Stack Development Mode based on their needs. CodePair also integrates GitHub OAuth for Social Login feature. Contributors are welcome to submit patches and follow the contribution workflow.
chatgpt-cli
ChatGPT CLI provides a powerful command-line interface for seamless interaction with ChatGPT models via OpenAI and Azure. It features streaming capabilities, extensive configuration options, and supports various modes like streaming, query, and interactive mode. Users can manage thread-based context, sliding window history, and provide custom context from any source. The CLI also offers model and thread listing, advanced configuration options, and supports GPT-4, GPT-3.5-turbo, and Perplexity's models. Installation is available via Homebrew or direct download, and users can configure settings through default values, a config.yaml file, or environment variables.
humanoid-gym
Humanoid-Gym is a reinforcement learning framework designed for training locomotion skills for humanoid robots, focusing on zero-shot transfer from simulation to real-world environments. It integrates a sim-to-sim framework from Isaac Gym to Mujoco for verifying trained policies in different physical simulations. The codebase is verified with RobotEra's XBot-S and XBot-L humanoid robots. It offers comprehensive training guidelines, step-by-step configuration instructions, and execution scripts for easy deployment. The sim2sim support allows transferring trained policies to accurate simulated environments. The upcoming features include Denoising World Model Learning and Dexterous Hand Manipulation. Installation and usage guides are provided along with examples for training PPO policies and sim-to-sim transformations. The code structure includes environment and configuration files, with instructions on adding new environments. Troubleshooting tips are provided for common issues, along with a citation and acknowledgment section.
NeoGPT
NeoGPT is an AI assistant that transforms your local workspace into a powerhouse of productivity from your CLI. With features like code interpretation, multi-RAG support, vision models, and LLM integration, NeoGPT redefines how you work and create. It supports executing code seamlessly, multiple RAG techniques, vision models, and interacting with various language models. Users can run the CLI to start using NeoGPT and access features like Code Interpreter, building vector database, running Streamlit UI, and changing LLM models. The tool also offers magic commands for chat sessions, such as resetting chat history, saving conversations, exporting settings, and more. Join the NeoGPT community to experience a new era of efficiency and contribute to its evolution.
crewAI-tools
This repository provides a guide for setting up tools for crewAI agents to enhance functionality. It offers steps to equip agents with ready-to-use tools and create custom ones. Tools are expected to return strings for generating responses. Users can create tools by subclassing BaseTool or using the tool decorator. Contributions are welcome to enrich the toolset, and guidelines are provided for contributing. The development setup includes installing dependencies, activating virtual environment, setting up pre-commit hooks, running tests, static type checking, packaging, and local installation. The goal is to empower AI solutions through advanced tooling.
For similar tasks
sfdx-hardis
sfdx-hardis is a toolbox for Salesforce DX, developed by Cloudity, that simplifies tasks which would otherwise take minutes or hours to complete manually. It enables users to define complete CI/CD pipelines for Salesforce projects, backup metadata, and monitor any Salesforce org. The tool offers a wide range of commands that can be accessed via the command line interface or through a Visual Studio Code extension. Additionally, sfdx-hardis provides Docker images for easy integration into CI workflows. The tool is designed to be natively compliant with various platforms and tools, making it a versatile solution for Salesforce developers.
omnia
Omnia is a deployment tool designed to turn servers with RPM-based Linux images into functioning Slurm/Kubernetes clusters. It provides an Ansible playbook-based deployment for Slurm and Kubernetes on servers running an RPM-based Linux OS. The tool simplifies the process of setting up and managing clusters, making it easier for users to deploy and maintain their infrastructure.
tt-metal
TT-NN is a python & C++ Neural Network OP library. It provides a low-level programming model, TT-Metalium, enabling kernel development for Tenstorrent hardware.
mscclpp
MSCCL++ is a GPU-driven communication stack for scalable AI applications. It provides a highly efficient and customizable communication stack for distributed GPU applications. MSCCL++ redefines inter-GPU communication interfaces, delivering a highly efficient and customizable communication stack for distributed GPU applications. Its design is specifically tailored to accommodate diverse performance optimization scenarios often encountered in state-of-the-art AI applications. MSCCL++ provides communication abstractions at the lowest level close to hardware and at the highest level close to application API. The lowest level of abstraction is ultra light weight which enables a user to implement logics of data movement for a collective operation such as AllReduce inside a GPU kernel extremely efficiently without worrying about memory ordering of different ops. The modularity of MSCCL++ enables a user to construct the building blocks of MSCCL++ in a high level abstraction in Python and feed them to a CUDA kernel in order to facilitate the user's productivity. MSCCL++ provides fine-grained synchronous and asynchronous 0-copy 1-sided abstracts for communication primitives such as `put()`, `get()`, `signal()`, `flush()`, and `wait()`. The 1-sided abstractions allows a user to asynchronously `put()` their data on the remote GPU as soon as it is ready without requiring the remote side to issue any receive instruction. This enables users to easily implement flexible communication logics, such as overlapping communication with computation, or implementing customized collective communication algorithms without worrying about potential deadlocks. Additionally, the 0-copy capability enables MSCCL++ to directly transfer data between user's buffers without using intermediate internal buffers which saves GPU bandwidth and memory capacity. MSCCL++ provides consistent abstractions regardless of the location of the remote GPU (either on the local node or on a remote node) or the underlying link (either NVLink/xGMI or InfiniBand). This simplifies the code for inter-GPU communication, which is often complex due to memory ordering of GPU/CPU read/writes and therefore, is error-prone.
mlir-air
This repository contains tools and libraries for building AIR platforms, runtimes and compilers.
free-for-life
A massive list including a huge amount of products and services that are completely free! ⭐ Star on GitHub • 🤝 Contribute # Table of Contents * APIs, Data & ML * Artificial Intelligence * BaaS * Code Editors * Code Generation * DNS * Databases * Design & UI * Domains * Email * Font * For Students * Forms * Linux Distributions * Messaging & Streaming * PaaS * Payments & Billing * SSL
AIMr
AIMr is an AI aimbot tool written in Python that leverages modern technologies to achieve an undetected system with a pleasing appearance. It works on any game that uses human-shaped models. To optimize its performance, users should build OpenCV with CUDA. For Valorant, additional perks in the Discord and an Arduino Leonardo R3 are required.
For similar jobs
flux-aio
Flux All-In-One is a lightweight distribution optimized for running the GitOps Toolkit controllers as a single deployable unit on Kubernetes clusters. It is designed for bare clusters, edge clusters, clusters with restricted communication, clusters with egress via proxies, and serverless clusters. The distribution follows semver versioning and provides documentation for specifications, installation, upgrade, OCI sync configuration, Git sync configuration, and multi-tenancy configuration. Users can deploy Flux using Timoni CLI and a Timoni Bundle file, fine-tune installation options, sync from public Git repositories, bootstrap repositories, and uninstall Flux without affecting reconciled workloads.
paddler
Paddler is an open-source load balancer and reverse proxy designed specifically for optimizing servers running llama.cpp. It overcomes typical load balancing challenges by maintaining a stateful load balancer that is aware of each server's available slots, ensuring efficient request distribution. Paddler also supports dynamic addition or removal of servers, enabling integration with autoscaling tools.
DaoCloud-docs
DaoCloud Enterprise 5.0 Documentation provides detailed information on using DaoCloud, a Certified Kubernetes Service Provider. The documentation covers current and legacy versions, workflow control using GitOps, and instructions for opening a PR and previewing changes locally. It also includes naming conventions, writing tips, references, and acknowledgments to contributors. Users can find guidelines on writing, contributing, and translating pages, along with using tools like MkDocs, Docker, and Poetry for managing the documentation.
ztncui-aio
This repository contains a Docker image with ZeroTier One and ztncui to set up a standalone ZeroTier network controller with a web user interface. It provides features like Golang auto-mkworld for generating a planet file, supports local persistent storage configuration, and includes a public file server. Users can build the Docker image, set up the container with specific environment variables, and manage the ZeroTier network controller through the web interface.
devops-gpt
DevOpsGPT is a revolutionary tool designed to streamline your workflow and empower you to build systems and automate tasks with ease. Tired of spending hours on repetitive DevOps tasks? DevOpsGPT is here to help! Whether you're setting up infrastructure, speeding up deployments, or tackling any other DevOps challenge, our app can make your life easier and more productive. With DevOpsGPT, you can expect faster task completion, simplified workflows, and increased efficiency. Ready to experience the DevOpsGPT difference? Visit our website, sign in or create an account, start exploring the features, and share your feedback to help us improve. DevOpsGPT will become an essential tool in your DevOps toolkit.
weave
Weave is a toolkit for developing Generative AI applications, built by Weights & Biases. With Weave, you can log and debug language model inputs, outputs, and traces; build rigorous, apples-to-apples evaluations for language model use cases; and organize all the information generated across the LLM workflow, from experimentation to evaluations to production. Weave aims to bring rigor, best-practices, and composability to the inherently experimental process of developing Generative AI software, without introducing cognitive overhead.
agentcloud
AgentCloud is an open-source platform that enables companies to build and deploy private LLM chat apps, empowering teams to securely interact with their data. It comprises three main components: Agent Backend, Webapp, and Vector Proxy. To run this project locally, clone the repository, install Docker, and start the services. The project is licensed under the GNU Affero General Public License, version 3 only. Contributions and feedback are welcome from the community.