surfkit

A toolkit for building computer use AI agents

Stars: 144

Surfkit is a toolkit for building and sharing AI agents that operate on devices. Users can create multimodal agents, share them with the community, run them locally or in the cloud, manage agent tasks at scale, and track and observe agent actions. The toolkit covers creating agents and devices, solving tasks, managing devices, tracking tasks, and publishing agents. It integrates with the MLLM, Taskara, Skillpacks, and Threadmem libraries. Surfkit aims to simplify developing and deploying AI agents across different environments.

README:

Surfkit

A toolkit for building and sharing AI agents that operate on devices

Features

Build multimodal agents that can operate on devices
Share agents with the community
Run agents and devices locally or in the cloud
Manage agent tasks at scale
Track and observe agent actions

Demo

https://github.com/agentsea/surfkit/assets/5533189/98b7714d-9692-4369-8fbf-88aff61e741c

Installation

pip install surfkit

Quickstart

Prerequisites

Docker
Python >= 3.10
MacOS or Linux
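
The prerequisites above can be checked before installing. The following is a minimal sketch using only the Python standard library; the function name `check_prerequisites` is hypothetical and not part of Surfkit, and the check only confirms that a `docker` executable is on the PATH, not that the Docker daemon is running.

```python
import platform
import shutil
import sys


def check_prerequisites() -> dict[str, bool]:
    """Return a pass/fail flag for each Quickstart prerequisite."""
    return {
        # A docker executable must be discoverable on PATH.
        "docker": shutil.which("docker") is not None,
        # Python 3.10 or newer.
        "python": sys.version_info >= (3, 10),
        # platform.system() reports "Darwin" on macOS.
        "os": platform.system() in {"Darwin", "Linux"},
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```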

Python

Use an agent to solve a task

from surfkit import solve

task = solve(
    "Search for the most common variety of french duck",
    agent_type="pbarker/SurfPizza",
    device_type="desktop",
)

task.wait_for_done()

result = task.result
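
To run several tasks one after another, the same three calls (`solve`, `wait_for_done`, `result`) can be wrapped in a loop. This is a sketch under assumptions: `solve_all` is a hypothetical helper, not part of Surfkit, and `wait_for_done()` is presumed to block until the task finishes. The solver is passed in as a parameter so the helper can be exercised without Surfkit installed.

```python
from typing import Any, Callable, Iterable


def solve_all(
    descriptions: Iterable[str],
    solve_fn: Callable[..., Any],
    agent_type: str = "pbarker/SurfPizza",
    device_type: str = "desktop",
) -> dict[str, Any]:
    """Run each task description to completion and collect its result."""
    results: dict[str, Any] = {}
    for description in descriptions:
        # Same call shape as the Quickstart snippet above.
        task = solve_fn(description, agent_type=agent_type, device_type=device_type)
        task.wait_for_done()
        results[description] = task.result
    return results


# Intended usage (requires surfkit and Docker):
#   from surfkit import solve
#   results = solve_all(["Search for the most common variety of french duck"], solve)
```

Whether each `solve` call provisions a fresh agent and device is not stated here, so treat the loop as illustrative rather than as a recommended way to batch work.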

CLI

Create an Agent

Find available agents on the Hub

surfkit find

Create a new agent

surfkit create agent -t pbarker/SurfPizza -n agent01

List running agents

surfkit list agents

Create a Device

Create an Ubuntu desktop for our agent to use.

surfkit create device --provider docker -n desktop01

List running devices

surfkit list devices

Solve a task

Use the agent to solve a task on the device

surfkit solve "Search for the most common variety of french duck" \
  --agent agent01 \
  --device desktop01
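
The CLI steps above can also be scripted. The sketch below builds the same three commands with the flags shown in this section and prints them; it only executes them when `dry_run=False`. The helper names `quickstart_commands` and `run_all` are hypothetical, and whether each command returns before the next one can safely start is an assumption not confirmed by this README.

```python
import shlex
import subprocess


def quickstart_commands(
    task: str,
    agent: str = "agent01",
    device: str = "desktop01",
    agent_type: str = "pbarker/SurfPizza",
) -> list[list[str]]:
    """Build the create-agent, create-device, and solve commands as argv lists."""
    return [
        ["surfkit", "create", "agent", "-t", agent_type, "-n", agent],
        ["surfkit", "create", "device", "--provider", "docker", "-n", device],
        ["surfkit", "solve", task, "--agent", agent, "--device", device],
    ]


def run_all(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command; execute it too unless dry_run is set."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_all(quickstart_commands("Search for the most common variety of french duck"))
```

Passing argv lists instead of a single shell string avoids quoting problems when the task description contains spaces or quotes.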

Documentation

View our documentation for more in-depth information.

Usage

Building Agents

Initialize a new project

surfkit new

Build a docker container for the agent

surfkit build

Running Agents

Create an agent locally

surfkit create agent --name foo -t pbarker/SurfPizza

Create an agent on kubernetes

surfkit create agent --runtime kube -t pbarker/SurfPizza

List running agents

surfkit list agents

Get details about a specific agent

surfkit get agent foo

Fetch logs for a specific agent

surfkit logs foo

Delete an agent

surfkit delete agent foo

Managing Devices

Create a device

surfkit create device --type desktop --provider gce --name bar

List devices

surfkit list devices

View device in UI

surfkit view device bar

Delete a device

surfkit delete device bar

Tracking Tasks

Create a tracker

surfkit create tracker

List trackers

surfkit list trackers

Delete a tracker

surfkit delete tracker foo

Solving Tasks

Solve a task with an existing setup

surfkit solve "search for common french ducks" --agent foo --device bar

Solve a task creating the agent ad hoc

surfkit solve "search for alpaca sweaters" \
  --device bar --agent-file ./agent.yaml

List tasks

surfkit list tasks

Publishing Agents

Log in

surfkit login

Publish the agent

surfkit publish

List published agent types

surfkit find

Integrations

Surfkit is integrated with:

MLLM: A prompt management, routing, and schema validation library for multimodal LLMs
Taskara: A task management library for AI agents
Skillpacks: A library to fine-tune AI agents on tasks
Threadmem: A thread management library for AI agents

Community

Come join us on Discord.

Developing

Add the following function to your ~/.zshrc (or similar)

function sk() {
  local project_dir="/path/to/surfkit/repo"
  local venv_dir="$project_dir/.venv"
  local ssh_auth_sock="$SSH_AUTH_SOCK"
  local ssh_agent_pid="$SSH_AGENT_PID"

  export SSH_AUTH_SOCK="$ssh_auth_sock"
  export SSH_AGENT_PID="$ssh_agent_pid"

  # Add the Poetry environment's bin directory to the PATH
  export PATH="$venv_dir/bin:$PATH"

  # Run surfkit, resolved via the PATH set above, passing all arguments through
  surfkit "$@"
}

Replace /path/to/surfkit/repo with the absolute path to your local repo.

Calling sk will then execute the working code in your repo from any location.
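
The same PATH trick can be reused from a Python script, for example in a test harness. This is a sketch, not part of Surfkit: `dev_env` is a hypothetical helper that mirrors the `export PATH="$venv_dir/bin:$PATH"` line of the shell function and assumes the virtual environment lives in `.venv` inside the repo.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def dev_env(project_dir: str, base_env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Copy the environment and put <project_dir>/.venv/bin first on PATH."""
    env = dict(os.environ if base_env is None else base_env)
    venv_bin = Path(project_dir) / ".venv" / "bin"
    # Same effect as: export PATH="$venv_dir/bin:$PATH"
    env["PATH"] = f"{venv_bin}{os.pathsep}{env.get('PATH', '')}"
    return env
```

On POSIX systems the result can be passed to `subprocess.run(["surfkit", "list", "agents"], env=dev_env("/path/to/surfkit/repo"), check=True)` so that the repo's copy of the CLI is found first.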

For Tasks:

solve task, create agent, manage device, track task, publish agent

For Jobs:

AI engineer, machine learning engineer, data scientist, software developer, research scientist

Alternative AI tools for surfkit

Similar Open Source Tools

llm-functions

LLM Functions is a project for extending large language models (LLMs) with custom tools and agents written in Bash, JavaScript, and Python. Users can create tools that let their LLM execute system commands, access web APIs, or perform other complex tasks triggered by natural language prompts. The project provides a framework for building tools and agents: tools are functions written in the user's preferred language, with JSON declarations generated automatically from comments, and agents combine prompts, function calling, and knowledge (RAG) to create conversational AI agents. The project is designed to be user-friendly and lets users easily extend the capabilities of their language models.

GitHub stars: 263

elasticsearch-labs

This repository contains executable Python notebooks, sample apps, and resources for testing out the Elastic platform. Users can learn how to use Elasticsearch as a vector database for storing embeddings, build use cases like retrieval augmented generation (RAG), summarization, and question answering (QA), and test Elastic's leading-edge capabilities like the Elastic Learned Sparse Encoder and reciprocal rank fusion (RRF). It also allows integration with projects like OpenAI, Hugging Face, and LangChain to power LLM-powered applications. The repository enables modern search experiences powered by AI/ML.

GitHub stars: 801

tiledesk-dashboard

Tiledesk is an open-source live chat platform with integrated chatbots written in Node.js and Express. It is designed to be a multi-channel platform for web, Android, and iOS, and it can be used to increase sales or provide post-sales customer service. Tiledesk's chatbot technology allows for automation of conversations, and it also provides APIs and webhooks for connecting external applications. Additionally, it offers a marketplace for apps and features such as CRM, ticketing, and data export.

GitHub stars: 258

BuildCLI

GitHub stars: 104

pacha

Pacha is an AI tool designed for retrieving context for natural language queries using a SQL interface and Python programming environment. It is optimized for working with Hasura DDN for multi-source querying. Pacha is used in conjunction with language models to produce informed responses in AI applications, agents, and chatbots.

GitHub stars: 75

MindSearch

MindSearch is an open-source AI Search Engine Framework that mimics human minds to provide deep AI search capabilities. It allows users to deploy their own search engine using either closed-source or open-source language models. MindSearch offers features such as answering any question using web knowledge, in-depth knowledge discovery, detailed solution paths, optimized UI experience, and dynamic graph construction process.

GitHub stars: 4.8k

clickclickclick

ClickClickClick is a framework designed to enable autonomous Android and computer use using various LLMs, both locally and remotely. It supports tasks such as drafting emails, opening browsers, and starting games, with current support for local models via Ollama, Gemini, and GPT-4o. The tool is highly experimental and evolving, with the best results achieved using specific model combinations. Users need prerequisites like `adb` installation and USB debugging enabled on Android phones. The tool can be installed by cloning the repository, setting up a virtual environment, and installing dependencies. It can be used as a CLI tool or script, allowing users to configure planner and finder models for different tasks. Additionally, it can be used as an API to execute tasks based on provided prompts, platform, and models.

GitHub stars: 314

cursor-talk-to-figma-mcp

This project implements a Model Context Protocol (MCP) integration between Cursor AI and Figma, allowing Cursor to communicate with Figma for reading designs and modifying them programmatically. It provides tools for interacting with Figma such as creating elements, modifying text content, styling, layout & organization, components & styles, export & advanced features, and connection management. The project structure includes a TypeScript MCP server for Figma integration, a Figma plugin for communicating with Cursor, and a WebSocket server for facilitating communication between the MCP server and Figma plugin.

GitHub stars: 1.4k

galxe-aio

Galxe AIO is a versatile tool designed to automate various tasks on platforms like Twitter, email, and Discord. It supports tasks such as following, retweeting, liking, and quoting on Twitter, as well as solving quizzes, submitting surveys, and more. Users can link their Twitter accounts, email accounts (IMAP or mail3.me), and Discord accounts to the tool to streamline their activities. Additionally, the tool offers features like claiming rewards and managing referral links and account statistics. It also supports different types of rewards like points, mystery boxes, gas-less OATs, gas OATs and NFTs, and participation in raffles. The tool provides settings for managing EVM wallets, proxies, and Twitter, email, and Discord accounts, along with custom configurations in the `config.toml` file. Users can run the tool using Python 3.11 and install dependencies using `pip` and `playwright`. The tool generates results and logs in specific folders and allows users to donate using TRC-20 or ERC-20 tokens.

GitHub stars: 264

rclip

rclip is a command-line photo search tool powered by OpenAI's CLIP neural network. It allows users to search for images with text queries, find similar images, and combine multiple queries. The tool extracts features from photos to enable searching and indexing, with options for previewing results in supported terminals or custom viewers. Users can install rclip on Linux, macOS, and Windows using different installation methods. The repository follows the Conventional Commits standard and welcomes contributions from the community.

GitHub stars: 781

frontend

A frontend for Trading Strategy protocol.

GitHub stars: 134

Discord-AI-Chatbot

Discord AI Chatbot is a versatile tool that seamlessly integrates into your Discord server, offering a wide range of capabilities to enhance your communication and engagement. With its advanced language model, the bot excels at imaginative generation, providing endless possibilities for creative expression. Additionally, it offers secure credential management, ensuring the privacy of your data. The bot's hybrid command system combines the best of slash and normal commands, providing flexibility and ease of use. It also features mention recognition, ensuring prompt responses whenever you mention it or use its name. The bot's message handling capabilities prevent confusion by recognizing when you're replying to others. You can customize the bot's behavior by selecting from a range of pre-existing personalities or creating your own. The bot's web access feature unlocks a new level of convenience, allowing you to interact with it from anywhere. With its open-source nature, you have the freedom to modify and adapt the bot to your specific needs.

GitHub stars: 1.3k

cursor-tools

cursor-tools is a CLI tool designed to enhance AI agents with advanced skills, such as web search, repository context, documentation generation, GitHub integration, Xcode tools, and browser automation. It provides features like Perplexity for web search, Gemini 2.0 for codebase context, and Stagehand for browser operations. The tool requires API keys for Perplexity AI and Google Gemini, and supports global installation for system-wide access. It offers various commands for different tasks and integrates with Cursor Composer for AI agent usage.

GitHub stars: 3.5k

claude-task-master

GitHub stars: 616

raycast_api_proxy

The Raycast AI Proxy is a tool that acts as a proxy for the Raycast AI application, allowing users to utilize the application without subscribing. It intercepts and forwards Raycast requests to various AI APIs, then reformats the responses for Raycast. The tool supports multiple AI providers and allows for custom model configurations. Users can generate self-signed certificates, add them to the system keychain, and modify DNS settings to redirect requests to the proxy. The tool is designed to work with providers like OpenAI, Azure OpenAI, Google, and more, enabling tasks such as AI chat completions, translations, and image generation.

GitHub stars: 317

For similar tasks

nagato-ai

Nagato-AI is an intuitive AI Agent library that supports multiple LLMs including OpenAI's GPT, Anthropic's Claude, Google's Gemini, and Groq LLMs. Users can create agents from these models and combine them to build an effective AI Agent system. The library is named after the powerful ninja Nagato from the anime Naruto, who can control multiple bodies with different abilities. Nagato-AI acts as a linchpin to summon and coordinate AI Agents for specific missions. It provides flexibility in programming and supports tools like Coordinator, Researcher, Critic agents, and HumanConfirmInputTool.

GitHub stars: 76

labo

LABO is a time series forecasting and analysis framework that integrates pre-trained and fine-tuned LLMs with multi-domain agent-based systems. It allows users to create and tune agents easily for various scenarios, such as stock market trend prediction and web public opinion analysis. LABO requires a specific runtime environment setup, including system requirements, Python environment, dependency installations, and configurations. Users can fine-tune their own models using LABO's Low-Rank Adaptation (LoRA) for computational efficiency and continuous model updates. Additionally, LABO provides a Python library for building model training pipelines and customizing agents for specific tasks.

GitHub stars: 160

intentkit

IntentKit is an autonomous agent framework that enables the creation and management of AI agents with capabilities including blockchain interactions, social media management, and custom skill integration. It supports multiple agents, autonomous agent management, blockchain integration, social media integration, extensible skill system, and plugin system. The project is in alpha stage and not recommended for production use. It provides quick start guides for Docker and local development, integrations with Twitter and Coinbase, configuration options using environment variables or AWS Secrets Manager, project structure with core application code, entry points, configuration management, database models, skills, skill sets, and utility functions. Developers can add new skills by creating, implementing, and registering them in the skill directory.

GitHub stars: 5.3k

For similar jobs

sweep

Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

GitHub stars: 7.1k

teams-ai

The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

GitHub stars: 502

ai-guide

This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

GitHub stars: 159

classifai

Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

GitHub stars: 620

chatbot-ui

Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

GitHub stars: 27.7k

BricksLLM

BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use case. Here are some use cases for BricksLLM:

* Set LLM usage limits for users on different pricing tiers
* Track LLM usage on a per user and per organization basis
* Block or redact requests containing PII
* Improve LLM reliability with failovers, retries and caching
* Distribute API keys with rate limits and cost limits for internal development/production use cases
* Distribute API keys with rate limits and cost limits for students

GitHub stars: 953

uAgents

uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

GitHub stars: 1.3k

griptape

Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.

GitHub stars: 2.2k