
pluto
Pluto provides a unified programming interface that allows you to seamlessly tap into cloud capabilities and develop your cloud and AI applications.
Stars: 90

Pluto is a development tool dedicated to helping developers **build cloud and AI applications more conveniently** , resolving issues such as the challenging deployment of AI applications and open-source models. Developers are able to write applications in familiar programming languages like **Python and TypeScript** , **directly defining and utilizing the cloud resources necessary for the application within their code base** , such as AWS SageMaker, DynamoDB, and more. Pluto automatically deduces the infrastructure resource needs of the app through **static program analysis** and proceeds to create these resources on the specified cloud platform, **simplifying the resources creation and application deployment process**.
README:
Pluto is a development tool dedicated to helping developers build cloud and AI applications more conveniently, resolving issues such as the challenging deployment of AI applications and open-source models.
Developers are able to write applications in familiar programming languages like Python and TypeScript, directly defining and utilizing the cloud resources necessary for the application within their code base, such as AWS SageMaker, DynamoDB, and more. Pluto automatically deduces the infrastructure resource needs of the app through static program analysis and proceeds to create these resources on the specified cloud platform, simplifying the resources creation and application deployment process.
Let's develop a text generation application based on GPT2, where the user input is processed by the GPT2 model to generate and return text. Below is how the development process with Pluto looks:
AWS SageMaker is utilized as the model deployment platform, and AWS Api Gateway and Lambda support the application's HTTP services. The deployed application architecture, as shown in the top right graphic
The top left graphic new SageMaker()
, you can directly interact with the SageMaker model using methods like sagemaker.invoke()
and obtain the model endpoint URL using sagemaker.endpointUrl()
. Establishing an Api Gateway requires only creating a new variable router
with new Router()
, and the function arguments within the methods of router
, such as router.get()
, router.post()
, etc., will automatically be converted into Lambda functions. The same application could be implemented in Python as well.
Once the application code has been written, executing pluto deploy
allows Pluto to deduce the application's infrastructure needs and automatically provision around 30 cloud resources, which includes instances such as SageMaker, Lambda, Api Gateway, along with setups like triggers, IAM roles, and policy permissions.
Finally, Pluto hands back the URL of the Api Gateway, providing direct access to use the application.
Interested in exploring more examples?
- TypeScript applications:
- Python applications:
Online Experience:CodeSandbox offers an online development environmentwhere we've set up Pluto templates in both Python and TypeScript. This allows you to explore Pluto directly in your browser. To create your own project, simply open the template and click the "Fork" button in the top right corner. The environment comes pre-configured with AWS CLI, Pulumi, and Pluto's basic dependencies, adhering to the README for operations.
Alternatively, you can also enjoy an online development experience by clicking
Container Experience: We offer a container image plutolang/pluto:latest
for application development, which contains essential dependencies like AWS CLI, Pulumi, and Pluto, along with Node.js 20.x and Python 3.10 environments pre-configured. If you are interested in developing only TypeScript applications, you can use the plutolang/pluto:latest-typescript
image. You can partake in Pluto development within a container using the following command:
docker run -it --name pluto-app plutolang/pluto:latest bash
Local Experience: For local use, please follow these steps for setup:
npm install -g @plutolang/cli
pluto new # Interactively create a new project, allowing selection of TypeScript or Python
cd <project_dir> # Enter your project directory
npm install # Download dependencies
# If it's a Python project, in addition to npm install, Python dependencies must also be installed.
pip install -r requirements.txt
pluto deploy # Deploy with one click!
- If the target platform is AWS, Pluto attempts to read your AWS configuration file to acquire the default AWS Region, or alternatively, tries to fetch it from the environment variable
AWS_REGION
. Deployment will fail if neither is set. - If the target platform is Kubernetes, Knative must firstly be installed within K8s and the scale-to-zero feature should be deactivated (as Pluto doesn't yet support Ingress forwarding to Knative serving). You can configure the required Kubernetes environment following this document.
For detailed steps, refer to the Getting Started Guide.
Currently, Pluto only supports single-file configurations. Inside each handler function, access is provided to literal constants and plain functions outside of the handler's scope; however, Python allows direct access to classes, interfaces, etc., outside of the scope, whereas TypeScript requires encapsulating these within functions for access.
Here you can find out why Pluto was created. To put it simply, we aim to address several pain points you might often encounter:
- High learning curve: Developing a cloud application requires mastery of both the business and infrastructure skills, and it often demands significant efforts in testing and debugging. Thus, developers spend a considerable amount of energy on aspects beyond writing the core business logic.
- High cognitive load: With cloud service providers offering hundreds of capabilities and Kubernetes offering nearly limitless possibilities, average developers often lack a deep understanding of cloud infrastructure, making it challenging to choose the proper architecture for their particular needs.
- Poor programming experience: Developers must maintain separate codebases for infrastructure and business logic or intertwine infrastructure configuration within the business logic, leading to a sub-optimal programming experience that falls short of the simplicity of creating a local standalone program.
- Vendor lock-in: Coding for a specific cloud provider can lead to poor flexibility in the resulting code. When it becomes necessary to migrate to another cloud platform due to cost or other factors, adapting the existing code to the new environment can require substantial changes.
- No learning curve: The programming interface is fully compatible with TypeScript, Python, and supports the majority of dependency libraries such as LangChain, LangServe, FastAPI, etc.
- Focus on pure business logic: Developers only need to write the business logic. Pluto, via static analysis, automatically deduces the infrastructure requirements of the application.
- One-click cloud deployment: The CLI provides basic capabilities such as compilation and deployment. Beyond coding and basic configuration, everything else is handled automatically by Pluto.
- Support for various runtime environments: With a unified abstraction based on the SDK, it allows developers to migrate between different runtime environments without altering the source code.
Overall, the Pluto deployment process comprises three stages—deduction, generation, and deployment:
- Deduction Phase: The deducer analyzes the application code to derive the required cloud resources and their interdependencies, resulting in an architecture reference. It also splits user business code into business modules, which, along with the dependent SDK, form the business bundle.
- Generation Phase: The generator creates IaC code that is independent of user code, guided by the architecture reference.
- Deployment Phase: Depending on the IaC code type, Pluto invokes the corresponding adapter, which, in turn, works with the respective IaC engine to execute the IaC code, managing infrastructure configuration and application deployment.
Components such as the deducer, generator, and adapter are extendable, which allows support for a broader range of programming languages and platform integration methods. Currently, Pluto provides deducers for Python and TypeScript, and a generator and adapter for Pulumi. Learn more about Pluto's processes in detail in this document.
Pluto distinguishes itself from other offerings by leveraging static program analysis techniques to infer resource dependencies directly from application code and generate infrastructure code that remains separate from business logic. This approach ensures infrastructure configuration does not intrude into business logic, providing developers with a development experience free from infrastructure concerns.
- Compared to BaaS (Backend as a Service) products like Supabase or Appwrite, Pluto assists developers in creating the necessary infrastructure environment within their own cloud account, rather than offering managed components.
- Differing from PaaS (Platform as a Service) offerings like Fly.io, Render, Heroku, or LeptonAI, Pluto does not handle application hosting. Instead, it compiles application into finely-grained compute modules, and integrates with rich cloud platform capabilities like FaaS, GPU instances, and message queues, enabling deployment to cloud platforms without requiring developers to write extra configurations.
- In contrast to scaffolding tools such as the Serverless Framework or Serverless Devs, Pluto does not impose an application programming framework specific to particular cloud providers or frameworks, but instead offers a uniform programming interface.
- Unlike IfC (Infrastructure from Code) products based purely on annotations like Klotho, Pluto infers resource dependencies directly from user code, eliminating the need for extra annotations.
- Different from other IfC products that rely on dynamic analysis, like Shuttle, Nitric, and Winglang, Pluto employs static program analysis to identify application resource dependencies, generating independent infrastructure code without having to execute user code.
You can learn more about the differences with other projects in this document.
Pluto is still in its infancy, and we warmly welcome contributions from those who are interested. Any suggestions or ideas about the issues Pluto aims to solve, the features it offers, or its code implementation can be shared and contributed to the community. Please refer to our project contribution guide for more information.
- Complete implementation of the resource static deduction process
- 🚧 Resource type checking
- ❌ Conversion of local variables into cloud resources
- SDK development
- 🚧 Client SDK development
- 🚧 Infra SDK development
- ❌ Support for additional resources and more platforms
- Engine extension support
- 🚧 Pulumi
- ❌ Terraform
- 🚧 Local simulation and testing functionality
Please see the Issue list for further details.
✅: Indicates that all user-visible interfaces are available
🚧: Indicates that some of the user-visible interfaces are available
❌: Indicates not yet supported
Resource Type | AWS | Kubernetes | Alibaba Cloud | Simulation |
---|---|---|---|---|
Router | ✅ | 🚧 | 🚧 | 🚧 |
Queue | ✅ | ✅ | ❌ | ✅ |
KVStore | ✅ | ✅ | ❌ | ✅ |
Function | ✅ | ✅ | ✅ | ✅ |
Schedule | ✅ | ✅ | ❌ | ❌ |
Tester | ✅ | ❌ | ❌ | ✅ |
SageMaker | ✅ | ❌ | ❌ | ❌ |
Resource Type | AWS | Kubernetes | Alibaba Cloud | Simulation |
---|---|---|---|---|
Router | ✅ | ❌ | ❌ | ❌ |
Queue | ✅ | ❌ | ❌ | ❌ |
KVStore | ✅ | ❌ | ❌ | ❌ |
Function | ✅ | ❌ | ❌ | ❌ |
Schedule | ✅ | ❌ | ❌ | ❌ |
Tester | ❌ | ❌ | ❌ | ❌ |
SageMaker | ✅ | ❌ | ❌ | ❌ |
Join our Slack community to communicate and contribute ideas.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for pluto
Similar Open Source Tools

pluto
Pluto is a development tool dedicated to helping developers **build cloud and AI applications more conveniently** , resolving issues such as the challenging deployment of AI applications and open-source models. Developers are able to write applications in familiar programming languages like **Python and TypeScript** , **directly defining and utilizing the cloud resources necessary for the application within their code base** , such as AWS SageMaker, DynamoDB, and more. Pluto automatically deduces the infrastructure resource needs of the app through **static program analysis** and proceeds to create these resources on the specified cloud platform, **simplifying the resources creation and application deployment process**.

Conversation-Knowledge-Mining-Solution-Accelerator
The Conversation Knowledge Mining Solution Accelerator enables customers to leverage intelligence to uncover insights, relationships, and patterns from conversational data. It empowers users to gain valuable knowledge and drive targeted business impact by utilizing Azure AI Foundry, Azure OpenAI, Microsoft Fabric, and Azure Search for topic modeling, key phrase extraction, speech-to-text transcription, and interactive chat experiences.

humanlayer
HumanLayer is a Python toolkit designed to enable AI agents to interact with humans in tool-based and asynchronous workflows. By incorporating humans-in-the-loop, agentic tools can access more powerful and meaningful tasks. The toolkit provides features like requiring human approval for function calls, human as a tool for contacting humans, omni-channel contact capabilities, granular routing, and support for various LLMs and orchestration frameworks. HumanLayer aims to ensure human oversight of high-stakes function calls, making AI agents more reliable and safe in executing impactful tasks.

leapfrogai
LeapfrogAI is a self-hosted AI platform designed to be deployed in air-gapped resource-constrained environments. It brings sophisticated AI solutions to these environments by hosting all the necessary components of an AI stack, including vector databases, model backends, API, and UI. LeapfrogAI's API closely matches that of OpenAI, allowing tools built for OpenAI/ChatGPT to function seamlessly with a LeapfrogAI backend. It provides several backends for various use cases, including llama-cpp-python, whisper, text-embeddings, and vllm. LeapfrogAI leverages Chainguard's apko to harden base python images, ensuring the latest supported Python versions are used by the other components of the stack. The LeapfrogAI SDK provides a standard set of protobuffs and python utilities for implementing backends and gRPC. LeapfrogAI offers UI options for common use-cases like chat, summarization, and transcription. It can be deployed and run locally via UDS and Kubernetes, built out using Zarf packages. LeapfrogAI is supported by a community of users and contributors, including Defense Unicorns, Beast Code, Chainguard, Exovera, Hypergiant, Pulze, SOSi, United States Navy, United States Air Force, and United States Space Force.

hass-ollama-conversation
The Ollama Conversation integration adds a conversation agent powered by Ollama in Home Assistant. This agent can be used in automations to query information provided by Home Assistant about your house, including areas, devices, and their states. Users can install the integration via HACS and configure settings such as API timeout, model selection, context size, maximum tokens, and other parameters to fine-tune the responses generated by the AI language model. Contributions to the project are welcome, and discussions can be held on the Home Assistant Community platform.

llm-twin-course
The LLM Twin Course is a free, end-to-end framework for building production-ready LLM systems. It teaches you how to design, train, and deploy a production-ready LLM twin of yourself powered by LLMs, vector DBs, and LLMOps good practices. The course is split into 11 hands-on written lessons and the open-source code you can access on GitHub. You can read everything and try out the code at your own pace.

rag-time
RAG Time is a 5-week AI learning series focusing on Retrieval-Augmented Generation (RAG) concepts. The repository contains code samples, step-by-step guides, and resources to help users master RAG. It aims to teach foundational and advanced RAG concepts, demonstrate real-world applications, and provide hands-on samples for practical implementation.

CogVideo
CogVideo is an open-source repository that provides pretrained text-to-video models for generating videos based on input text. It includes models like CogVideoX-2B and CogVideo, offering powerful video generation capabilities. The repository offers tools for inference, fine-tuning, and model conversion, along with demos showcasing the model's capabilities through CLI, web UI, and online experiences. CogVideo aims to facilitate the creation of high-quality videos from textual descriptions, catering to a wide range of applications.

OpenContracts
OpenContracts is an Apache-2 licensed enterprise document analytics tool that supports multiple formats, including PDF and txt-based formats. It features multiple document ingestion pipelines with a pluggable architecture for easy format and ingestion engine support. Users can create custom document analytics tools with beautiful result displays, support mass document data extraction with a LlamaIndex wrapper, and manage document collections, layout parsing, automatic vector embeddings, and human annotation. The tool also offers pluggable parsing pipelines, human annotation interface, LlamaIndex integration, data extraction capabilities, and custom data extract pipelines for bulk document querying.

AI-Gateway
The AI-Gateway repository explores the AI Gateway pattern through a series of experimental labs, focusing on Azure API Management for handling AI services APIs. The labs provide step-by-step instructions using Jupyter notebooks with Python scripts, Bicep files, and APIM policies. The goal is to accelerate experimentation of advanced use cases and pave the way for further innovation in the rapidly evolving field of AI. The repository also includes a Mock Server to mimic the behavior of the OpenAI API for testing and development purposes.

extractous
Extractous offers a fast and efficient solution for extracting content and metadata from various document types such as PDF, Word, HTML, and many other formats. It is built with Rust, providing high performance, memory safety, and multi-threading capabilities. The tool eliminates the need for external services or APIs, making data processing pipelines faster and more efficient. It supports multiple file formats, including Microsoft Office, OpenOffice, PDF, spreadsheets, web documents, e-books, text files, images, and email formats. Extractous provides a clear and simple API for extracting text and metadata content, with upcoming support for JavaScript/TypeScript. It is free for commercial use under the Apache 2.0 License.

XLearning
XLearning is a scheduling platform for big data and artificial intelligence, supporting various machine learning and deep learning frameworks. It runs on Hadoop Yarn and integrates frameworks like TensorFlow, MXNet, Caffe, Theano, PyTorch, Keras, XGBoost. XLearning offers scalability, compatibility, multiple deep learning framework support, unified data management based on HDFS, visualization display, and compatibility with code at native frameworks. It provides functions for data input/output strategies, container management, TensorBoard service, and resource usage metrics display. XLearning requires JDK >= 1.7 and Maven >= 3.3 for compilation, and deployment on CentOS 7.2 with Java >= 1.7 and Hadoop 2.6, 2.7, 2.8.

ai-prompts
Instructa AI Prompts is an open-source repository dedicated to collecting and sharing AI prompts, best practices, and curated rules for developers. The goal is to help users quickly set up and refine their workflow with ready-to-use prompts. Users can dynamically include prompts in AI-assisted coding tools like Cursor, GitHub Copilot, Zed, Windsurf, and Cline to adhere to project-specific coding standards, best practices, and automation workflows.

second-brain-ai-assistant-course
This open-source course teaches how to build an advanced RAG and LLM system using LLMOps and ML systems best practices. It helps you create an AI assistant that leverages your personal knowledge base to answer questions, summarize documents, and provide insights. The course covers topics such as LLM system architecture, pipeline orchestration, large-scale web crawling, model fine-tuning, and advanced RAG features. It is suitable for ML/AI engineers and data/software engineers & data scientists looking to level up to production AI systems. The course is free, with minimal costs for tools like OpenAI's API and Hugging Face's Dedicated Endpoints. Participants will build two separate Python applications for offline ML pipelines and online inference pipeline.

katib
Katib is a Kubernetes-native project for automated machine learning (AutoML). Katib supports Hyperparameter Tuning, Early Stopping and Neural Architecture Search. Katib is the project which is agnostic to machine learning (ML) frameworks. It can tune hyperparameters of applications written in any language of the users’ choice and natively supports many ML frameworks, such as TensorFlow, Apache MXNet, PyTorch, XGBoost, and others. Katib can perform training jobs using any Kubernetes Custom Resources with out of the box support for Kubeflow Training Operator, Argo Workflows, Tekton Pipelines and many more.

langkit
LangKit is an open-source text metrics toolkit for monitoring language models. It offers methods for extracting signals from input/output text, compatible with whylogs. Features include text quality, relevance, security, sentiment, toxicity analysis. Installation via PyPI. Modules contain UDFs for whylogs. Benchmarks show throughput on AWS instances. FAQs available.
For similar tasks

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.

AI-in-a-Box
AI-in-a-Box is a curated collection of solution accelerators that can help engineers establish their AI/ML environments and solutions rapidly and with minimal friction, while maintaining the highest standards of quality and efficiency. It provides essential guidance on the responsible use of AI and LLM technologies, specific security guidance for Generative AI (GenAI) applications, and best practices for scaling OpenAI applications within Azure. The available accelerators include: Azure ML Operationalization in-a-box, Edge AI in-a-box, Doc Intelligence in-a-box, Image and Video Analysis in-a-box, Cognitive Services Landing Zone in-a-box, Semantic Kernel Bot in-a-box, NLP to SQL in-a-box, Assistants API in-a-box, and Assistants API Bot in-a-box.

spring-ai
The Spring AI project provides a Spring-friendly API and abstractions for developing AI applications. It offers a portable client API for interacting with generative AI models, enabling developers to easily swap out implementations and access various models like OpenAI, Azure OpenAI, and HuggingFace. Spring AI also supports prompt engineering, providing classes and interfaces for creating and parsing prompts, as well as incorporating proprietary data into generative AI without retraining the model. This is achieved through Retrieval Augmented Generation (RAG), which involves extracting, transforming, and loading data into a vector database for use by AI models. Spring AI's VectorStore abstraction allows for seamless transitions between different vector database implementations.

ragstack-ai
RAGStack is an out-of-the-box solution simplifying Retrieval Augmented Generation (RAG) in GenAI apps. RAGStack includes the best open-source for implementing RAG, giving developers a comprehensive Gen AI Stack leveraging LangChain, CassIO, and more. RAGStack leverages the LangChain ecosystem and is fully compatible with LangSmith for monitoring your AI deployments.

breadboard
Breadboard is a library for prototyping generative AI applications. It is inspired by the hardware maker community and their boundless creativity. Breadboard makes it easy to wire prototypes and share, remix, reuse, and compose them. The library emphasizes ease and flexibility of wiring, as well as modularity and composability.

cloudflare-ai-web
Cloudflare-ai-web is a lightweight and easy-to-use tool that allows you to quickly deploy a multi-modal AI platform using Cloudflare Workers AI. It supports serverless deployment, password protection, and local storage of chat logs. With a size of only ~638 kB gzip, it is a great option for building AI-powered applications without the need for a dedicated server.

app-builder
AppBuilder SDK is a one-stop development tool for AI native applications, providing basic cloud resources, AI capability engine, Qianfan large model, and related capability components to improve the development efficiency of AI native applications.

cookbook
This repository contains community-driven practical examples of building AI applications and solving various tasks with AI using open-source tools and models. Everyone is welcome to contribute, and we value everybody's contribution! There are several ways you can contribute to the Open-Source AI Cookbook: Submit an idea for a desired example/guide via GitHub Issues. Contribute a new notebook with a practical example. Improve existing examples by fixing issues/typos. Before contributing, check currently open issues and pull requests to avoid working on something that someone else is already working on.
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.