pocketgroq
PocketGroq is a powerful Python library that simplifies integration with the Groq API, offering advanced features for natural language processing, web scraping, and autonomous agent capabilities. Key features include seamless integration with the Groq API for text generation and completion, Chain of Thought (CoT) reasoning for complex problem-solving, and more.
Stars: 178
PocketGroq is a tool that provides advanced functionality for text generation, web scraping, web search, and AI response evaluation. It includes an Autonomous Agent for answering questions, web crawling and scraping capabilities, enhanced web search, and flexible integration with an Ollama server. Users can customize the agent's behavior, evaluate responses using AI, and use a range of methods for text generation, conversation management, and Chain of Thought reasoning. The tool offers comprehensive methods for tasks such as RAG initialization, error handling, and tool management, and is designed to streamline development and make it easy to build AI-powered applications.
README:
PocketGroq now includes powerful vision analysis capabilities, allowing you to process both images and screen content:
```python
from pocketgroq import GroqProvider

groq = GroqProvider()

# Analyze an image from a URL
image_url = "https://example.com/image.jpg"
response = groq.process_image(
    prompt="What do you see in this image?",
    image_source=image_url
)
print(f"Analysis: {response}")

# Analyze your screen
screen_analysis = groq.process_image_desktop(
    prompt="What applications are open on my screen?"
)
print(f"Screen analysis: {screen_analysis}")

# Analyze a specific screen region
region_analysis = groq.process_image_desktop_region(
    prompt="What's in this part of the screen?",
    x1=0,    # Top-left corner x
    y1=0,    # Top-left corner y
    x2=400,  # Bottom-right corner x (here equal to the region width, since x1 = 0)
    y2=300   # Bottom-right corner y (here equal to the region height, since y1 = 0)
)
print(f"Region analysis: {region_analysis}")
```
You can also have multi-turn conversations about images:
```python
# Start a conversation about an image
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What do you see in this image?"
            },
            {
                "type": "image_url",
                "image_url": {"url": "https://example.com/image.jpg"}
            }
        ]
    }
]
response1 = groq.process_image_conversation(messages=messages)
print(f"First response: {response1}")

# Add a follow-up question
messages.append({
    "role": "assistant",
    "content": response1
})
messages.append({
    "role": "user",
    "content": "What colors are most prominent?"
})
response2 = groq.process_image_conversation(messages=messages)
print(f"Second response: {response2}")
```
PocketGroq now supports advanced speech processing with transcription and translation capabilities:
```python
from pocketgroq import GroqProvider

groq = GroqProvider()

# Transcribe audio
response = groq.transcribe_audio(
    audio_file="recording.wav",
    language="en",
    model="distil-whisper-large-v3-en"  # Fastest for English
)
print(f"Transcription: {response}")

# Translate audio to English
translation = groq.translate_audio(
    audio_file="french_speech.wav",
    model="whisper-large-v3",  # Required for translation
    prompt="This is a French conversation about cooking."
)
print(f"Translation: {translation}")
```
PocketGroq offers three Whisper models with different capabilities:

- `whisper-large-v3`: Best for multilingual tasks and translation ($0.111/hour)
- `whisper-large-v3-turbo`: Fast multilingual transcription without translation ($0.04/hour)
- `distil-whisper-large-v3-en`: Fastest English-only transcription ($0.02/hour)

Choose your model based on your needs:

- For translation: use `whisper-large-v3`
- For fast multilingual transcription: use `whisper-large-v3-turbo`
- For English-only transcription: use `distil-whisper-large-v3-en`
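For a concrete sense of the trade-offs, here is a minimal sketch (our own helper, not part of the PocketGroq API) that applies these selection rules and estimates cost from the per-hour rates listed above:

```python
# Illustrative helper only: model names and rates come from the list above.
RATES_PER_HOUR = {
    "whisper-large-v3": 0.111,
    "whisper-large-v3-turbo": 0.04,
    "distil-whisper-large-v3-en": 0.02,
}

def choose_whisper_model(needs_translation: bool, english_only: bool) -> str:
    """Apply the selection rules from the list above."""
    if needs_translation:
        return "whisper-large-v3"
    if english_only:
        return "distil-whisper-large-v3-en"
    return "whisper-large-v3-turbo"

model = choose_whisper_model(needs_translation=False, english_only=True)
audio_hours = 1.5
print(f"{model}: about ${RATES_PER_HOUR[model] * audio_hours:.3f} for {audio_hours} hours of audio")
```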
Fine-tune your speech processing:
```python
# Transcription with advanced options
response = groq.transcribe_audio(
    audio_file="recording.wav",
    language="en",             # Specify language
    prompt="Technical terms",  # Context for better accuracy
    response_format="json",    # 'json' or 'text'
    temperature=0.3            # Control variation
)

# Translation with custom settings
translation = groq.translate_audio(
    audio_file="speech.wav",
    prompt="Medical terminology",  # Context for accuracy
    response_format="json",        # Structured output
    temperature=0                  # Maximum accuracy
)
```
PocketGroq now includes an AutonomousAgent class that can autonomously research and answer questions:
```python
from pocketgroq import GroqProvider
from pocketgroq.autonomous_agent import AutonomousAgent

groq = GroqProvider()
agent = AutonomousAgent(groq)

request = "What is the current temperature in Sheboygan, Wisconsin?"
response = agent.process_request(request)
print(f"Final response: {response}")
```
The AutonomousAgent:
- Attempts to answer the question using its initial knowledge.
- If unsuccessful, it uses web search tools to find relevant information.
- Evaluates each potential response for accuracy and completeness.
- Keeps the user informed of its progress throughout the process.
- Handles rate limiting and errors gracefully.
You can customize the agent's behavior:
```python
# Set a custom maximum number of sources to check
agent = AutonomousAgent(groq, max_sources=10)

# Or specify it for a single request
response = agent.process_request(request, max_sources=8)
```
The agent will search up to the specified number of sources, waiting at least 2 seconds between requests to avoid overwhelming the search services.
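The method reference later in this document also lists a `verify` flag on `process_request`; a usage sketch, assuming the flag behaves as documented there:

```python
# Sketch only: `verify` is taken from the method reference below, where
# process_request(request, max_sources=None, verify=False) is listed.
response = agent.process_request(
    "What is the population of Sheboygan, Wisconsin?",
    max_sources=5,
    verify=True,  # ask the agent to verify candidate answers before responding
)
print(response)
```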
PocketGroq now includes a method to evaluate whether a response satisfies a given request using AI:
```python
from pocketgroq import GroqProvider

groq = GroqProvider()

request = "What is the current temperature in Sheboygan?"
response1 = "58 degrees"
response2 = "As a large language model, I do not have access to current temperature data"

is_satisfactory1 = groq.evaluate_response(request, response1)
is_satisfactory2 = groq.evaluate_response(request, response2)
print(f"Response 1 is satisfactory: {is_satisfactory1}")  # Expected: True
print(f"Response 2 is satisfactory: {is_satisfactory2}")  # Expected: False
```
This method uses an LLM to analyze the request-response pair and determine whether the response is satisfactory based on informativeness, correctness, and lack of uncertainty.
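One natural use, sketched below under the assumption that `generate()` and `evaluate_response()` behave as shown in this README, is a generate-and-check loop that retries until the evaluator accepts an answer:

```python
# Sketch: regenerate until the evaluator accepts the answer (at most 3 tries).
request = "Summarize the key features of PocketGroq in one sentence."
for attempt in range(3):
    candidate = groq.generate(request)
    if groq.evaluate_response(request, candidate):
        print(f"Accepted on attempt {attempt + 1}: {candidate}")
        break
else:
    print("No satisfactory response after 3 attempts.")
```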
PocketGroq v0.4.8 brings significant enhancements to web-related functionalities and improves the flexibility of Ollama integration:
- Advanced Web Scraping: Improved capabilities for crawling websites and extracting content.
- Flexible Ollama Integration: PocketGroq now operates more flexibly with or without an active Ollama server.
- Enhanced Web Search: Upgraded web search functionality with more robust result parsing.
- Improved Error Handling: Better management of web-related errors and Ollama server status.
- Updated Test Suite: Comprehensive tests for new web capabilities and Ollama integration.
PocketGroq now offers advanced web crawling capabilities:
```python
from pocketgroq import GroqProvider

groq = GroqProvider()

# Crawl a website
results = groq.crawl_website(
    "https://example.com",
    formats=["markdown", "html"],
    max_depth=2,
    max_pages=5
)

for page in results:
    print(f"URL: {page['url']}")
    print(f"Title: {page['metadata']['title']}")
    print(f"Markdown content: {page['markdown'][:100]}...")  # First 100 characters
    print("---")
```
Extract content from a single URL in various formats:
```python
import json  # needed for json.dumps below

url = "https://example.com"
result = groq.scrape_url(url, formats=["markdown", "html", "structured_data"])
print(f"Markdown content length: {len(result['markdown'])}")
print(f"HTML content length: {len(result['html'])}")
if 'structured_data' in result:
    print("Structured data:", json.dumps(result['structured_data'], indent=2))
```
Perform web searches with improved result parsing:
```python
query = "Latest developments in AI"
search_results = groq.web_search(query)

for result in search_results:
    print(f"Title: {result['title']}")
    print(f"URL: {result['url']}")
    print(f"Description: {result['description']}")
    print("---")
```
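Continuing the example above, search results can be combined with `get_web_content()` (listed in the method reference below) to fetch the full text of a hit; a brief sketch:

```python
# Sketch: fetch the full text of the top search result.
if search_results:
    top = search_results[0]
    content = groq.get_web_content(top["url"])
    print(f"Fetched {len(content)} characters from {top['url']}")
```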
PocketGroq v0.4.8 introduces more flexible integration with Ollama:
- Optional Ollama: Core features of PocketGroq now work without requiring an active Ollama server.
- Graceful Degradation: When Ollama is not available, PocketGroq provides clear error messages for Ollama-dependent features.
- Persistent Features: Ollama is still required for certain persistence features, including RAG functionality.
```python
from pocketgroq import GroqProvider, OllamaServerNotRunningError

groq = GroqProvider()

try:
    groq.initialize_rag()
    print("RAG initialized successfully with Ollama.")
except OllamaServerNotRunningError:
    print("Ollama server is not running. RAG features will be limited.")
    # Proceed with non-RAG features
```
PocketGroq v0.4.8 introduces a new exception for Ollama-related errors:
```python
from pocketgroq import GroqProvider, OllamaServerNotRunningError

groq = GroqProvider()

try:
    groq.initialize_rag()
    # Use RAG features
except OllamaServerNotRunningError:
    print("Ollama server is not running. Proceeding with limited functionality.")
    # Use non-RAG features
```
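Alternatively, you can check availability up front with `is_ollama_server_running()` from the method reference below; a sketch, using a hypothetical document path for illustration:

```python
# Sketch: gate RAG setup on server availability instead of catching the exception.
if groq.is_ollama_server_running():
    groq.initialize_rag()
    groq.load_documents("docs/manual.txt")  # hypothetical path, for illustration
    print(groq.query_documents("What does the manual say about setup?"))
else:
    print("Ollama unavailable; skipping RAG features.")
```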
The test suite has been expanded to cover the new web capabilities and Ollama integration. To run the tests:
- Navigate to the PocketGroq directory.
- Run the test script: `python test.py`
- You will see an updated menu with options to run individual tests or groups of tests:
```
PocketGroq Test Menu:
1. Basic Chat Completion
2. Streaming Chat Completion
3. Override Default Model
4. Chat Completion with Stop Sequence
5. Asynchronous Generation
6. Streaming Async Chat Completion
7. JSON Mode
8. Tool Usage
9. Vision
10. Chain of Thought Problem Solving
11. Chain of Thought Step Generation
12. Chain of Thought Synthesis
13. Test RAG Initialization
14. Test Document Loading
15. Test Document Querying
16. Test RAG Error Handling
17. Test Persistent Conversation
18. Test Disposable Conversation
19. Web Search
20. Get Web Content
21. Crawl Website
22. Scrape URL
23. Run All Web Tests
24. Run All RAG Tests
25. Run All Conversation Tests
26. Run All Tests
0. Exit
```
- Select the desired option by entering the corresponding number.
PocketGroq uses environment variables for configuration. Set `GROQ_API_KEY` in your environment or in a `.env` file in your project root. This API key is essential for authenticating with the Groq API. Additionally, you may need to set a `USER_AGENT` environment variable for certain web-related functionality. Here are a couple of ways to set these variables:
- Using a `.env` file:

```
GROQ_API_KEY=your_api_key_here
USER_AGENT=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36
```
- Setting environment variables in your script:

```python
import os

os.environ['GROQ_API_KEY'] = 'your_api_key_here'
os.environ['USER_AGENT'] = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
```
Make sure to keep your API key confidential and never commit it to version control.
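If you use a `.env` file, here is a minimal sketch of loading and validating the key. It assumes the `python-dotenv` package is installed, which is our assumption rather than a stated PocketGroq dependency:

```python
import os

from dotenv import load_dotenv  # assumption: python-dotenv is installed

load_dotenv()  # reads GROQ_API_KEY and USER_AGENT from .env, if present

api_key = os.getenv("GROQ_API_KEY")
if not api_key:
    raise RuntimeError("GROQ_API_KEY is not set; see the configuration notes above.")
```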
Here's a comprehensive list of all the methods/functions available in PocketGroq, grouped logically by function:
Initialization:

- `__init__(api_key: str = None, rag_persistent: bool = True, rag_index_path: str = "faiss_index.pkl")`: Initializes the GroqProvider with API key and RAG settings.
- `set_api_key(api_key: str)`: Updates the API key and reinitializes the Groq clients.

Text generation:

- `generate(prompt: str, session_id: Optional[str] = None, **kwargs) -> Union[str, AsyncIterator[str]]`: Generates text based on the given prompt.
- `_create_completion(messages: List[Dict[str, str]], **kwargs) -> Union[str, AsyncIterator[str]]`: Internal method for the API call to Groq for text generation.
- `_sync_create_completion(**kwargs) -> Union[str, AsyncIterator[str]]`: Synchronous version of completion creation.
- `_async_create_completion(**kwargs) -> Union[str, AsyncIterator[str]]`: Asynchronous version of completion creation.

Vision:

- `process_image(prompt: str, image_source: str) -> str`: Analyzes an image with the given prompt.
- `process_image_desktop(prompt: str, region=None) -> str`: Analyzes screen content.
- `process_image_desktop_region(prompt: str, x1: int, y1: int, x2: int, y2: int) -> str`: Analyzes a specific screen region.
- `process_image_conversation(messages: List[Dict[str, Any]], model: str = None, **kwargs) -> str`: Handles multi-turn conversations about images.

Speech:

- `transcribe_audio(audio_file: str, language: Optional[str] = None, model: str = "distil-whisper-large-v3-en", **kwargs) -> str`: Transcribes audio to text.
- `translate_audio(audio_file: str, model: str = "whisper-large-v3", **kwargs) -> str`: Translates non-English audio to English text.

Conversation management:

- `start_conversation(session_id: str)`: Initializes a new conversation session.
- `reset_conversation(session_id: str)`: Resets an existing conversation session.
- `end_conversation(conversation_id: str)`: Ends and removes a conversation session.
- `get_conversation_history(session_id: str) -> List[Dict[str, str]]`: Retrieves conversation history.

Web tools:

- `web_search(query: str, num_results: int = 10) -> List[Dict[str, Any]]`: Performs a web search.
- `get_web_content(url: str) -> str`: Retrieves the content of a web page.
- `is_url(text: str) -> bool`: Checks whether the given text is a valid URL.
- `crawl_website(url: str, formats: List[str] = ["markdown"], max_depth: int = 3, max_pages: int = 100) -> List[Dict[str, Any]]`: Crawls a website.
- `scrape_url(url: str, formats: List[str] = ["markdown"]) -> Dict[str, Any]`: Scrapes a single URL.

Chain of Thought:

- `solve_problem_with_cot(problem: str, **kwargs) -> str`: Solves a problem using Chain of Thought reasoning.
- `generate_cot(problem: str, **kwargs) -> List[str]`: Generates Chain of Thought steps.
- `synthesize_cot(cot_steps: List[str], **kwargs) -> str`: Synthesizes a final answer from CoT steps.

RAG:

- `initialize_rag(ollama_base_url: str = "http://localhost:11434", model_name: str = "nomic-embed-text", index_path: str = "faiss_index.pkl")`: Initializes the RAG system.
- `load_documents(source: str, chunk_size: int = 1000, chunk_overlap: int = 200, progress_callback: Callable[[int, int], None] = None, timeout: int = 300, persistent: bool = None)`: Loads and processes documents for RAG.
- `query_documents(query: str, session_id: Optional[str] = None, **kwargs) -> str`: Queries loaded documents using RAG.

Tool management:

- `register_tool(name: str, func: callable)`: Registers a custom tool for use in text generation (see the usage sketch after this list).

Utilities:

- `is_ollama_server_running() -> bool`: Checks if the Ollama server is running.
- `ensure_ollama_server_running`: Decorator to ensure the Ollama server is running for functions that require it.
- `get_available_models() -> List[Dict[str, Any]]`: Retrieves the list of available models.
- `evaluate_response(request: str, response: str) -> bool`: Evaluates response quality.

Web search helpers:

- `search(query: str) -> List[Dict[str, Any]]`: Performs a web search and returns filtered, deduplicated results.
- `get_web_content(url: str) -> str`: Retrieves and processes the content of a web page.
- `is_url(text: str) -> bool`: Checks if the given text is a valid URL.

Crawling helpers:

- `crawl(start_url: str, formats: List[str] = ["markdown"]) -> List[Dict[str, Any]]`: Crawls a website and returns its content in the specified formats.
- `scrape_page(url: str, formats: List[str]) -> Dict[str, Any]`: Scrapes a single page and returns its content in the specified formats.

RAG helpers:

- `load_and_process_documents(source: str, chunk_size: int = 1000, chunk_overlap: int = 200, progress_callback: Callable[[int, int], None] = None, timeout: int = 300)`: Loads, processes, and indexes documents for RAG.
- `query_documents(llm, query: str) -> Dict[str, Any]`: Queries the indexed documents using the provided language model.

Chain of Thought helpers:

- `generate_cot(problem: str) -> List[str]`: Generates Chain of Thought steps for a given problem.
- `synthesize_response(cot_steps: List[str]) -> str`: Synthesizes a final answer from Chain of Thought steps.
- `solve_problem(problem: str) -> str`: Completes the entire Chain of Thought process to solve a problem.

Autonomous agent:

- `process_request(request: str, max_sources: int = None, verify: bool = False) -> str`: Processes a request autonomously.
- `_select_best_response(verified_sources: List[tuple], verify: bool) -> str`: Selects the best response from verified sources.
- `_generate_search_query(request: str) -> str`: Generates an optimized search query.
- `_evaluate_response(request: str, response: str) -> bool`: Evaluates response quality.
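For example, a minimal `register_tool` sketch. The tool name and function are illustrative assumptions, and how registered tools are invoked during generation is not shown in this README, so the final call is a guess based on the "Tool Usage" test option above:

```python
from pocketgroq import GroqProvider

groq = GroqProvider()

def get_word_count(text: str) -> int:
    """Hypothetical tool: count the words in a string."""
    return len(text.split())

# Register the tool under a name the model can refer to.
groq.register_tool("get_word_count", get_word_count)

# Assumption: generate() can then draw on registered tools via the tool-usage
# mechanism exercised by test option 8 ("Tool Usage") in the menu above.
print(groq.generate("How many words are in the phrase 'hello brave new world'?"))
```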
This project is licensed under the MIT License. When using PocketGroq in your projects, please include a mention of J. Gravelle in your code and/or documentation.
Thank you for using PocketGroq! We hope this tool enhances your development process and enables you to create amazing AI-powered applications with ease. If you have any questions or need further assistance, don't hesitate to reach out to the community or check the documentation. Happy coding!