pocketgroq
PocketGroq is a powerful Python library that simplifies integration with the Groq API, offering advanced features for natural language processing, web scraping, and autonomous agent capabilities. Key features include seamless integration with the Groq API for text generation and completion, Chain of Thought (CoT) reasoning for complex problem-solving, and more.
Stars: 178
PocketGroq is a tool that provides advanced functionality for text generation, web scraping, web search, and AI response evaluation. It includes an Autonomous Agent for answering questions, web crawling and scraping capabilities, enhanced web search, and flexible integration with an Ollama server. Users can customize the agent's behavior, evaluate responses using AI, and use a range of methods for text generation, conversation management, and Chain of Thought reasoning. The tool offers comprehensive methods for tasks such as RAG initialization, error handling, and tool management, and is designed to streamline development and make it easy to build AI-powered applications.
README:
PocketGroq now includes powerful vision analysis capabilities, allowing you to process both images and screen content:
```python
from pocketgroq import GroqProvider

groq = GroqProvider()

# Analyze an image from a URL
image_url = "https://example.com/image.jpg"
response = groq.process_image(
    prompt="What do you see in this image?",
    image_source=image_url
)
print(f"Analysis: {response}")

# Analyze your screen
screen_analysis = groq.process_image_desktop(
    prompt="What applications are open on my screen?"
)
print(f"Screen analysis: {screen_analysis}")

# Analyze a specific screen region
region_analysis = groq.process_image_desktop_region(
    prompt="What's in this part of the screen?",
    x1=0,    # Top-left corner x
    y1=0,    # Top-left corner y
    x2=400,  # Bottom-right corner x (here equal to the region width, since x1 = 0)
    y2=300   # Bottom-right corner y (here equal to the region height, since y1 = 0)
)
print(f"Region analysis: {region_analysis}")
```
You can also have multi-turn conversations about images:
```python
# Start a conversation about an image
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What do you see in this image?"
            },
            {
                "type": "image_url",
                "image_url": {"url": "https://example.com/image.jpg"}
            }
        ]
    }
]
response1 = groq.process_image_conversation(messages=messages)
print(f"First response: {response1}")

# Add a follow-up question
messages.append({
    "role": "assistant",
    "content": response1
})
messages.append({
    "role": "user",
    "content": "What colors are most prominent?"
})
response2 = groq.process_image_conversation(messages=messages)
print(f"Second response: {response2}")
```
PocketGroq now supports advanced speech processing with transcription and translation capabilities:
```python
from pocketgroq import GroqProvider

groq = GroqProvider()

# Transcribe audio
response = groq.transcribe_audio(
    audio_file="recording.wav",
    language="en",
    model="distil-whisper-large-v3-en"  # Fastest for English
)
print(f"Transcription: {response}")

# Translate audio to English
translation = groq.translate_audio(
    audio_file="french_speech.wav",
    model="whisper-large-v3",  # Required for translation
    prompt="This is a French conversation about cooking."
)
print(f"Translation: {translation}")
```
PocketGroq offers three Whisper models with different capabilities:

- `whisper-large-v3`: Best for multilingual tasks and translation ($0.111/hour)
- `whisper-large-v3-turbo`: Fast multilingual transcription without translation ($0.04/hour)
- `distil-whisper-large-v3-en`: Fastest English-only transcription ($0.02/hour)

Choose your model based on your needs:

- For translation: use `whisper-large-v3`
- For fast multilingual transcription: use `whisper-large-v3-turbo`
- For English-only transcription: use `distil-whisper-large-v3-en`
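For a concrete sense of the trade-offs, here is a minimal sketch (our own helper, not part of the PocketGroq API) that applies these selection rules and estimates cost from the per-hour rates listed above:

```python
# Illustrative helper only: model names and rates come from the list above.
RATES_PER_HOUR = {
    "whisper-large-v3": 0.111,
    "whisper-large-v3-turbo": 0.04,
    "distil-whisper-large-v3-en": 0.02,
}

def choose_whisper_model(needs_translation: bool, english_only: bool) -> str:
    """Apply the selection rules from the list above."""
    if needs_translation:
        return "whisper-large-v3"
    if english_only:
        return "distil-whisper-large-v3-en"
    return "whisper-large-v3-turbo"

model = choose_whisper_model(needs_translation=False, english_only=True)
audio_hours = 1.5
print(f"{model}: about ${RATES_PER_HOUR[model] * audio_hours:.3f} for {audio_hours} hours of audio")
```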
Fine-tune your speech processing:
```python
# Transcription with advanced options
response = groq.transcribe_audio(
    audio_file="recording.wav",
    language="en",             # Specify language
    prompt="Technical terms",  # Context for better accuracy
    response_format="json",    # 'json' or 'text'
    temperature=0.3            # Control variation
)

# Translation with custom settings
translation = groq.translate_audio(
    audio_file="speech.wav",
    prompt="Medical terminology",  # Context for accuracy
    response_format="json",        # Structured output
    temperature=0                  # Maximum accuracy
)
```
PocketGroq now includes an AutonomousAgent class that can autonomously research and answer questions:
```python
from pocketgroq import GroqProvider
from pocketgroq.autonomous_agent import AutonomousAgent

groq = GroqProvider()
agent = AutonomousAgent(groq)

request = "What is the current temperature in Sheboygan, Wisconsin?"
response = agent.process_request(request)
print(f"Final response: {response}")
```
The AutonomousAgent:
- Attempts to answer the question using its initial knowledge.
- If unsuccessful, it uses web search tools to find relevant information.
- Evaluates each potential response for accuracy and completeness.
- Keeps the user informed of its progress throughout the process.
- Handles rate limiting and errors gracefully.
You can customize the agent's behavior:
```python
# Set a custom maximum number of sources to check
agent = AutonomousAgent(groq, max_sources=10)

# Or specify it for a single request
response = agent.process_request(request, max_sources=8)
```
The agent will search up to the specified number of sources, waiting at least 2 seconds between requests to avoid overwhelming the search services.
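The method reference later in this document also lists a `verify` flag on `process_request`; a usage sketch, assuming the flag behaves as documented there:

```python
# Sketch only: `verify` is taken from the method reference below, where
# process_request(request, max_sources=None, verify=False) is listed.
response = agent.process_request(
    "What is the population of Sheboygan, Wisconsin?",
    max_sources=5,
    verify=True,  # ask the agent to verify candidate answers before responding
)
print(response)
```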
PocketGroq now includes a method to evaluate whether a response satisfies a given request using AI:
```python
from pocketgroq import GroqProvider

groq = GroqProvider()

request = "What is the current temperature in Sheboygan?"
response1 = "58 degrees"
response2 = "As a large language model, I do not have access to current temperature data"

is_satisfactory1 = groq.evaluate_response(request, response1)
is_satisfactory2 = groq.evaluate_response(request, response2)
print(f"Response 1 is satisfactory: {is_satisfactory1}")  # Expected: True
print(f"Response 2 is satisfactory: {is_satisfactory2}")  # Expected: False
```
This method uses an LLM to analyze the request-response pair and determine whether the response is satisfactory based on informativeness, correctness, and lack of uncertainty.
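One natural use, sketched below under the assumption that `generate()` and `evaluate_response()` behave as shown in this README, is a generate-and-check loop that retries until the evaluator accepts an answer:

```python
# Sketch: regenerate until the evaluator accepts the answer (at most 3 tries).
request = "Summarize the key features of PocketGroq in one sentence."
for attempt in range(3):
    candidate = groq.generate(request)
    if groq.evaluate_response(request, candidate):
        print(f"Accepted on attempt {attempt + 1}: {candidate}")
        break
else:
    print("No satisfactory response after 3 attempts.")
```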
PocketGroq v0.4.8 brings significant enhancements to web-related functionalities and improves the flexibility of Ollama integration:
- Advanced Web Scraping: Improved capabilities for crawling websites and extracting content.
- Flexible Ollama Integration: PocketGroq now operates more flexibly with or without an active Ollama server.
- Enhanced Web Search: Upgraded web search functionality with more robust result parsing.
- Improved Error Handling: Better management of web-related errors and Ollama server status.
- Updated Test Suite: Comprehensive tests for new web capabilities and Ollama integration.
PocketGroq now offers advanced web crawling capabilities:
```python
from pocketgroq import GroqProvider

groq = GroqProvider()

# Crawl a website
results = groq.crawl_website(
    "https://example.com",
    formats=["markdown", "html"],
    max_depth=2,
    max_pages=5
)

for page in results:
    print(f"URL: {page['url']}")
    print(f"Title: {page['metadata']['title']}")
    print(f"Markdown content: {page['markdown'][:100]}...")  # First 100 characters
    print("---")
```
Extract content from a single URL in various formats:
```python
import json  # needed for json.dumps below

url = "https://example.com"
result = groq.scrape_url(url, formats=["markdown", "html", "structured_data"])
print(f"Markdown content length: {len(result['markdown'])}")
print(f"HTML content length: {len(result['html'])}")
if 'structured_data' in result:
    print("Structured data:", json.dumps(result['structured_data'], indent=2))
```
Perform web searches with improved result parsing:
```python
query = "Latest developments in AI"
search_results = groq.web_search(query)

for result in search_results:
    print(f"Title: {result['title']}")
    print(f"URL: {result['url']}")
    print(f"Description: {result['description']}")
    print("---")
```
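Continuing the example above, search results can be combined with `get_web_content()` (listed in the method reference below) to fetch the full text of a hit; a brief sketch:

```python
# Sketch: fetch the full text of the top search result.
if search_results:
    top = search_results[0]
    content = groq.get_web_content(top["url"])
    print(f"Fetched {len(content)} characters from {top['url']}")
```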
PocketGroq v0.4.8 introduces more flexible integration with Ollama:
- Optional Ollama: Core features of PocketGroq now work without requiring an active Ollama server.
- Graceful Degradation: When Ollama is not available, PocketGroq provides clear error messages for Ollama-dependent features.
- Persistent Features: Ollama is still required for certain persistence features, including RAG functionality.
```python
from pocketgroq import GroqProvider, OllamaServerNotRunningError

groq = GroqProvider()

try:
    groq.initialize_rag()
    print("RAG initialized successfully with Ollama.")
except OllamaServerNotRunningError:
    print("Ollama server is not running. RAG features will be limited.")
    # Proceed with non-RAG features
```
PocketGroq v0.4.8 introduces a new exception for Ollama-related errors:
```python
from pocketgroq import GroqProvider, OllamaServerNotRunningError

groq = GroqProvider()

try:
    groq.initialize_rag()
    # Use RAG features
except OllamaServerNotRunningError:
    print("Ollama server is not running. Proceeding with limited functionality.")
    # Use non-RAG features
```
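Alternatively, you can check availability up front with `is_ollama_server_running()` from the method reference below; a sketch, using a hypothetical document path for illustration:

```python
# Sketch: gate RAG setup on server availability instead of catching the exception.
if groq.is_ollama_server_running():
    groq.initialize_rag()
    groq.load_documents("docs/manual.txt")  # hypothetical path, for illustration
    print(groq.query_documents("What does the manual say about setup?"))
else:
    print("Ollama unavailable; skipping RAG features.")
```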
The test suite has been expanded to cover the new web capabilities and Ollama integration. To run the tests:
- Navigate to the PocketGroq directory.
- Run the test script: `python test.py`
- You will see an updated menu with options to run individual tests or groups of tests:
```
PocketGroq Test Menu:
1. Basic Chat Completion
2. Streaming Chat Completion
3. Override Default Model
4. Chat Completion with Stop Sequence
5. Asynchronous Generation
6. Streaming Async Chat Completion
7. JSON Mode
8. Tool Usage
9. Vision
10. Chain of Thought Problem Solving
11. Chain of Thought Step Generation
12. Chain of Thought Synthesis
13. Test RAG Initialization
14. Test Document Loading
15. Test Document Querying
16. Test RAG Error Handling
17. Test Persistent Conversation
18. Test Disposable Conversation
19. Web Search
20. Get Web Content
21. Crawl Website
22. Scrape URL
23. Run All Web Tests
24. Run All RAG Tests
25. Run All Conversation Tests
26. Run All Tests
0. Exit
```
- Select the desired option by entering the corresponding number.
PocketGroq uses environment variables for configuration. Set `GROQ_API_KEY` in your environment or in a `.env` file in your project root. This API key is essential for authenticating with the Groq API. Additionally, you may need to set a `USER_AGENT` environment variable for certain web-related functionality. Here are a couple of ways to set these variables:
- Using a `.env` file:

```
GROQ_API_KEY=your_api_key_here
USER_AGENT=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36
```
- Setting environment variables in your script:

```python
import os

os.environ['GROQ_API_KEY'] = 'your_api_key_here'
os.environ['USER_AGENT'] = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
```
Make sure to keep your API key confidential and never commit it to version control.
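If you use a `.env` file, here is a minimal sketch of loading and validating the key. It assumes the `python-dotenv` package is installed, which is our assumption rather than a stated PocketGroq dependency:

```python
import os

from dotenv import load_dotenv  # assumption: python-dotenv is installed

load_dotenv()  # reads GROQ_API_KEY and USER_AGENT from .env, if present

api_key = os.getenv("GROQ_API_KEY")
if not api_key:
    raise RuntimeError("GROQ_API_KEY is not set; see the configuration notes above.")
```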
Here's a comprehensive list of all the methods/functions available in PocketGroq, grouped logically by function:
Initialization:

- `__init__(api_key: str = None, rag_persistent: bool = True, rag_index_path: str = "faiss_index.pkl")`: Initializes the GroqProvider with API key and RAG settings.
- `set_api_key(api_key: str)`: Updates the API key and reinitializes the Groq clients.

Text generation:

- `generate(prompt: str, session_id: Optional[str] = None, **kwargs) -> Union[str, AsyncIterator[str]]`: Generates text based on the given prompt.
- `_create_completion(messages: List[Dict[str, str]], **kwargs) -> Union[str, AsyncIterator[str]]`: Internal method for the API call to Groq for text generation.
- `_sync_create_completion(**kwargs) -> Union[str, AsyncIterator[str]]`: Synchronous version of completion creation.
- `_async_create_completion(**kwargs) -> Union[str, AsyncIterator[str]]`: Asynchronous version of completion creation.

Vision:

- `process_image(prompt: str, image_source: str) -> str`: Analyzes an image with the given prompt.
- `process_image_desktop(prompt: str, region=None) -> str`: Analyzes screen content.
- `process_image_desktop_region(prompt: str, x1: int, y1: int, x2: int, y2: int) -> str`: Analyzes a specific screen region.
- `process_image_conversation(messages: List[Dict[str, Any]], model: str = None, **kwargs) -> str`: Handles multi-turn conversations about images.

Speech:

- `transcribe_audio(audio_file: str, language: Optional[str] = None, model: str = "distil-whisper-large-v3-en", **kwargs) -> str`: Transcribes audio to text.
- `translate_audio(audio_file: str, model: str = "whisper-large-v3", **kwargs) -> str`: Translates non-English audio to English text.

Conversation management:

- `start_conversation(session_id: str)`: Initializes a new conversation session.
- `reset_conversation(session_id: str)`: Resets an existing conversation session.
- `end_conversation(conversation_id: str)`: Ends and removes a conversation session.
- `get_conversation_history(session_id: str) -> List[Dict[str, str]]`: Retrieves conversation history.

Web tools:

- `web_search(query: str, num_results: int = 10) -> List[Dict[str, Any]]`: Performs a web search.
- `get_web_content(url: str) -> str`: Retrieves the content of a web page.
- `is_url(text: str) -> bool`: Checks whether the given text is a valid URL.
- `crawl_website(url: str, formats: List[str] = ["markdown"], max_depth: int = 3, max_pages: int = 100) -> List[Dict[str, Any]]`: Crawls a website.
- `scrape_url(url: str, formats: List[str] = ["markdown"]) -> Dict[str, Any]`: Scrapes a single URL.

Chain of Thought:

- `solve_problem_with_cot(problem: str, **kwargs) -> str`: Solves a problem using Chain of Thought reasoning.
- `generate_cot(problem: str, **kwargs) -> List[str]`: Generates Chain of Thought steps.
- `synthesize_cot(cot_steps: List[str], **kwargs) -> str`: Synthesizes a final answer from CoT steps.

RAG:

- `initialize_rag(ollama_base_url: str = "http://localhost:11434", model_name: str = "nomic-embed-text", index_path: str = "faiss_index.pkl")`: Initializes the RAG system.
- `load_documents(source: str, chunk_size: int = 1000, chunk_overlap: int = 200, progress_callback: Callable[[int, int], None] = None, timeout: int = 300, persistent: bool = None)`: Loads and processes documents for RAG.
- `query_documents(query: str, session_id: Optional[str] = None, **kwargs) -> str`: Queries loaded documents using RAG.

Tool management:

- `register_tool(name: str, func: callable)`: Registers a custom tool for use in text generation (see the usage sketch after this list).

Utilities:

- `is_ollama_server_running() -> bool`: Checks if the Ollama server is running.
- `ensure_ollama_server_running`: Decorator to ensure the Ollama server is running for functions that require it.
- `get_available_models() -> List[Dict[str, Any]]`: Retrieves the list of available models.
- `evaluate_response(request: str, response: str) -> bool`: Evaluates response quality.

Web search helpers:

- `search(query: str) -> List[Dict[str, Any]]`: Performs a web search and returns filtered, deduplicated results.
- `get_web_content(url: str) -> str`: Retrieves and processes the content of a web page.
- `is_url(text: str) -> bool`: Checks if the given text is a valid URL.

Crawling helpers:

- `crawl(start_url: str, formats: List[str] = ["markdown"]) -> List[Dict[str, Any]]`: Crawls a website and returns its content in the specified formats.
- `scrape_page(url: str, formats: List[str]) -> Dict[str, Any]`: Scrapes a single page and returns its content in the specified formats.

RAG helpers:

- `load_and_process_documents(source: str, chunk_size: int = 1000, chunk_overlap: int = 200, progress_callback: Callable[[int, int], None] = None, timeout: int = 300)`: Loads, processes, and indexes documents for RAG.
- `query_documents(llm, query: str) -> Dict[str, Any]`: Queries the indexed documents using the provided language model.

Chain of Thought helpers:

- `generate_cot(problem: str) -> List[str]`: Generates Chain of Thought steps for a given problem.
- `synthesize_response(cot_steps: List[str]) -> str`: Synthesizes a final answer from Chain of Thought steps.
- `solve_problem(problem: str) -> str`: Completes the entire Chain of Thought process to solve a problem.

Autonomous agent:

- `process_request(request: str, max_sources: int = None, verify: bool = False) -> str`: Processes a request autonomously.
- `_select_best_response(verified_sources: List[tuple], verify: bool) -> str`: Selects the best response from verified sources.
- `_generate_search_query(request: str) -> str`: Generates an optimized search query.
- `_evaluate_response(request: str, response: str) -> bool`: Evaluates response quality.
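For example, a minimal `register_tool` sketch. The tool name and function are illustrative assumptions, and how registered tools are invoked during generation is not shown in this README, so the final call is a guess based on the "Tool Usage" test option above:

```python
from pocketgroq import GroqProvider

groq = GroqProvider()

def get_word_count(text: str) -> int:
    """Hypothetical tool: count the words in a string."""
    return len(text.split())

# Register the tool under a name the model can refer to.
groq.register_tool("get_word_count", get_word_count)

# Assumption: generate() can then draw on registered tools via the tool-usage
# mechanism exercised by test option 8 ("Tool Usage") in the menu above.
print(groq.generate("How many words are in the phrase 'hello brave new world'?"))
```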
This project is licensed under the MIT License. When using PocketGroq in your projects, please include a mention of J. Gravelle in your code and/or documentation.
Thank you for using PocketGroq! We hope this tool enhances your development process and enables you to create amazing AI-powered applications with ease. If you have any questions or need further assistance, don't hesitate to reach out to the community or check the documentation. Happy coding!