Google_GenerativeAI
Unofficial C# .Net Google Generative AI SDK (Google Gemini) with function calls support and much more!
Stars: 71
Google GenerativeAI (Gemini) is an unofficial C# .Net SDK based on REST APIs for accessing Google Gemini models. It is a complete rewrite of the previous SDK with improved performance, flexibility, and ease of use, and it integrates seamlessly with LangChain.net, providing easy methods for JSON-based interactions and function calling with Google Gemini models. Features include enhanced JSON mode handling, function calling with a source-generator, multimodal capabilities, Vertex AI support, the Multimodal Live API, image generation and captioning, and retrieval-augmented generation with the Vertex RAG Engine and Google AQA.
README:
- Google GenerativeAI (Gemini)
  - Introduction
  - Usage
  - Quick Start
  - Chat Mode
  - Streaming
  - Multimodal Capabilities with Overloaded GenerateContentAsync Methods
  - Easy JSON Handling
  - Gemini Tools and Function Calling
  - Image Generation and Captioning
  - Multimodal Live API
  - Retrieval-Augmented Generation
  - Coming Soon
  - Credits
  - Explore the Wiki
  - API Reference
Unofficial C# .Net Google GenerativeAI (Gemini Pro) SDK based on REST APIs.
This new version is a complete rewrite of the previous SDK, designed to improve performance, flexibility, and ease of
use. It seamlessly integrates with LangChain.net, providing easy methods for
JSON-based interactions and function calling with Google Gemini models.
Highlights of this release include:
- Complete Rewrite – The SDK has been entirely rebuilt for improved reliability and maintainability.
- LangChain.net Support – Enables you to directly use this SDK within LangChain.net workflows.
- Enhanced JSON Mode – Includes straightforward methods to handle Google Gemini’s JSON mode.
- Function Calling with Code Generator – Simplifies function calling by providing a source generator that creates argument classes and extension methods automatically.
- Multi-Modal Functionality – Provides methods to easily incorporate text, images, and other data for multimodal operations with Google Gemini.
- Vertex AI Support – Introducing direct support for Vertex AI, including multiple authentication methods such as OAuth, Service Account, and ADC (Application Default Credentials).
- Multimodal Live API - Enables real-time interaction with multimodal content (text, images, audio) for dynamic and responsive applications.
- Grounded AI – Simple APIs for RAG with the Vertex RAG Engine and Google AQA.
- NativeAOT/Trimming – The SDK is fully NativeAOT/Trimming compatible since v2.4.0.
- New Packages – Modular packages let you tailor the SDK to your needs.
By merging the best of the old version with these new capabilities, the SDK provides a smoother developer experience and a wide range of features to leverage Google Gemini.
Use this library to access Google Gemini (Generative AI) models easily. You can start by installing the NuGet package and obtaining the necessary API key from your Google account.
Below are two common ways to initialize and use the SDK. For a full list of supported approaches, please refer to our Wiki Page.
To use the SDK with the Google AI API:
- Obtain an API Key
  Visit Google AI Studio and generate your API key.
- Install the NuGet Package
  Via the NuGet Package Manager:
  Install-Package Google_GenerativeAI
  Or using the .NET CLI:
  dotnet add package Google_GenerativeAI
- Initialize GoogleAI
  Provide the API key when creating an instance of the GoogleAI class:
  var googleAI = new GoogleAI("Your_API_Key");
- Obtain a GenerativeModel
  Create a generative model using a model name (for example, "models/gemini-1.5-flash"):
  var model = googleAI.CreateGenerativeModel("models/gemini-1.5-flash");
- Generate Content
  Call the GenerateContentAsync method to get a response:
  var response = await model.GenerateContentAsync("How is the weather today?");
  Console.WriteLine(response.Text());
- Full Code at a Glance
  var apiKey = "YOUR_GOOGLE_API_KEY";
  var googleAI = new GoogleAI(apiKey);
  var googleModel = googleAI.CreateGenerativeModel("models/gemini-1.5-flash");
  var googleResponse = await googleModel.GenerateContentAsync("How is the weather today?");
  Console.WriteLine("Google AI Response:");
  Console.WriteLine(googleResponse.Text());
To use the SDK with Vertex AI:
- Install the Google Cloud SDK (CLI)
  By default, Vertex AI uses Application Default Credentials (ADC). Follow Google’s official instructions to install and set up the Google Cloud CLI.
- Initialize VertexAI
  Once the SDK is set up locally, create an instance of the VertexAI class:
  var vertexAI = new VertexAI();
- Obtain a GenerativeModel
  Just like with GoogleAI, choose a model name and create the generative model:
  var vertexModel = vertexAI.CreateGenerativeModel("models/gemini-1.5-flash");
- Generate Content
  Use the GenerateContentAsync method to produce text:
  var response = await vertexModel.GenerateContentAsync("Hello from Vertex AI!");
  Console.WriteLine(response.Text());
- Full Code at a Glance
  var vertexAI = new VertexAI(); // uses the Google Cloud CLI's ADC to get the access token
  var vertexModel = vertexAI.CreateGenerativeModel("models/gemini-1.5-flash");
  var vertexResponse = await vertexModel.GenerateContentAsync("Hello from Vertex AI!");
  Console.WriteLine("Vertex AI Response:");
  Console.WriteLine(vertexResponse.Text());
For multi-turn, conversational use cases, you can start a chat session by calling the StartChat method on an instance
of GenerativeModel. You can use any of the previously mentioned initialization methods (environment variables, direct
constructor, configuration files, ADC, service accounts, etc.) to set up credentials for your AI service first. Then you
would:
- Create a GenerativeModel instance (e.g., via googleAI.CreateGenerativeModel(...) or vertexAI.CreateGenerativeModel(...)).
- Call StartChat() on the model to initialize a conversation.
- Use GenerateContentAsync(...) to exchange messages in the conversation.
Below is an example using the model name "gemini-1.5-flash":
// Example: Starting a chat session with a Google AI GenerativeModel
// 1) Initialize your AI instance (GoogleAI) with credentials or environment variables
var googleAI = new GoogleAI("YOUR_GOOGLE_API_KEY");
// 2) Create a GenerativeModel using the model name "gemini-1.5-flash"
var generativeModel = googleAI.CreateGenerativeModel("models/gemini-1.5-flash");
// 3) Start a chat session from the GenerativeModel
var chatSession = generativeModel.StartChat();
// 4) Send and receive messages
var firstResponse = await chatSession.GenerateContentAsync("Welcome to the Gemini 1.5 Flash chat!");
Console.WriteLine("First response: " + firstResponse.Text());
// Continue the conversation
var secondResponse = await chatSession.GenerateContentAsync("How can you help me with my AI development?");
Console.WriteLine("Second response: " + secondResponse.Text());
The same approach applies if you’re using Vertex AI:
// Example: Starting a chat session with a Vertex AI GenerativeModel
// 1) Initialize your AI instance (VertexAI) using one of the available authentication methods
var vertexAI = new VertexAI();
// 2) Create a GenerativeModel using "gemini-1.5-flash"
var generativeModel = vertexAI.CreateGenerativeModel("models/gemini-1.5-flash");
// 3) Start a chat
var chatSession = generativeModel.StartChat();
// 4) Send a chat message and read the response
var response = await chatSession.GenerateContentAsync("Hello from Vertex AI Chat using Gemini 1.5 Flash!");
Console.WriteLine(response.Text());
Each conversation preserves the context from previous messages, making it ideal for multi-turn or multi-step reasoning tasks. For more details, please check our Wiki.
The GenerativeAI SDK supports streaming responses, allowing you to receive and process parts of the model's output as they become available, rather than waiting for the entire response to be generated. This is particularly useful for long-running generation tasks or for creating more responsive user interfaces.
- StreamContentAsync(): Use this method for streaming text responses. It returns an IAsyncEnumerable<GenerateContentResponse>, which you can iterate over using await foreach.
Example (StreamContentAsync()):
using GenerativeAI;
// ... (Assume model is already initialized) ...
var prompt = "Write a long story about a cat.";
await foreach (var chunk in model.StreamContentAsync(prompt))
{
Console.Write(chunk.Text); // Print each chunk as it arrives
}
Console.WriteLine(); // Newline after the complete response
Google Gemini models can work with more than just text – they can handle images, audio, and videos too! This opens up a lot of possibilities for developers. The GenerativeAI SDK makes it super easy to use these features.
Below are several examples showcasing how to incorporate files into your AI prompts:
- Directly providing a local file path.
- Referencing a remote file with its MIME type.
- Creating a request object to add multiple files (local or remote).
If you have a file available locally, simply pass in the file path:
// Generate content from a local file (e.g., an image)
var response = await geminiModel.GenerateContentAsync(
"Describe the details in this uploaded image",
@"C:\path\to\local\image.jpg"
);
Console.WriteLine(response.Text());
When your file is hosted remotely, provide the file URI and its corresponding MIME type:
// Generate content from a remote file (e.g., a PDF)
var response = await geminiModel.GenerateContentAsync(
"Summarize the information in this PDF document",
"https://example.com/path/to/sample.pdf",
"application/pdf"
);
Console.WriteLine(response.Text());
For granular control, you can create a GenerateContentRequest, set a prompt, and attach one or more files (local or
remote) before calling GenerateContentAsync:
// Create a request with a text prompt
var request = new GenerateContentRequest();
request.AddText("Describe what's in this document");
// Attach a local file
request.AddInlineFile(@"C:\files\example.png");
// Attach a remote file with its MIME type
request.AddRemoteFile("https://example.com/path/to/sample.pdf", "application/pdf");
// Generate the content with attached files
var response = await geminiModel.GenerateContentAsync(request);
Console.WriteLine(response.Text());
With these overloads and request-based approaches, you can seamlessly integrate additional file-based context into your prompts, enabling richer answers and unlocking more advanced AI-driven workflows.
The GenerativeAI SDK makes it simple to work with JSON data from Gemini. There are several ways to do this:
1. Automatic JSON Handling:
- Use GenerateObjectAsync<T> to directly get the deserialized object:
  var myObject = await model.GenerateObjectAsync<SampleJsonClass>(request);
- Use GenerateContentAsync and then ToObject<T> to deserialize the response:
  var response = await model.GenerateContentAsync<SampleJsonClass>(request);
  var myObject = response.ToObject<SampleJsonClass>();
- Request: Use the UseJsonMode<T> extension method when creating your GenerateContentRequest. This tells the SDK to expect a JSON response of the specified type.
  var request = new GenerateContentRequest();
  request.UseJsonMode<SampleJsonClass>();
  request.AddText("Give me a really good response.");
2. Manual JSON Parsing:
- Request: Create a standard GenerateContentRequest.
  var request = new GenerateContentRequest();
  request.AddText("Give me some JSON.");
  or
  var request = new GenerateContentRequest();
  request.GenerationConfig = new GenerationConfig()
  {
      ResponseMimeType = "application/json",
      ResponseSchema = new SampleJsonClass()
  };
  request.AddText("Give me a really good response.");
- Response: Use ExtractJsonBlocks() to get the raw JSON blocks from the response, and then use ToObject<T> to deserialize them.
  var response = await model.GenerateContentAsync(request);
  var jsonBlocks = response.ExtractJsonBlocks();
  var myObjects = jsonBlocks.Select(block => block.ToObject<SampleJsonClass>());
These options give you flexibility in how you handle JSON data with the GenerativeAI SDK.
Read the wiki for more options.
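The JSON snippets above deserialize into a SampleJsonClass type that the SDK does not define for you; it is any plain C# class whose shape matches the JSON you want back. A minimal illustrative sketch (property names here are hypothetical, not part of the SDK):

```csharp
// Hypothetical POCO used by the JSON-mode snippets above.
// The SDK derives the expected JSON schema from these properties.
public class SampleJsonClass
{
    public string Title { get; set; }
    public string Summary { get; set; }
    public int Rating { get; set; }
}
```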
The GenerativeAI SDK provides built-in tools to enhance Gemini's capabilities, including Google Search, Google Search Retrieval, and Code Execution. These tools allow Gemini to interact with the outside world and perform actions beyond generating text.
1. Inbuilt Tools (GoogleSearch, GoogleSearchRetrieval, and Code Execution):
You can easily enable or disable these tools by setting the corresponding properties on the GenerativeModel:
- UseGoogleSearch: Enables or disables the Google Search tool.
- UseGrounding: Enables or disables the Google Search Retrieval tool (often used for grounding responses in factual information).
- UseCodeExecutionTool: Enables or disables the Code Execution tool.
// Example: Enabling Google Search and Code Execution
var model = new GenerativeModel(apiKey: "YOUR_API_KEY");
model.UseGoogleSearch = true;
model.UseCodeExecutionTool = true;
// Example: Disabling all inbuilt tools.
var model = new GenerativeModel(apiKey: "YOUR_API_KEY");
model.UseGoogleSearch = false;
model.UseGrounding = false;
model.UseCodeExecutionTool = false;
2. Function Calling
Function calling lets you integrate custom functionality with Gemini by defining functions it can call. This requires
the GenerativeAI.Tools package.
- Setup:
  - Define an interface for your functions, using the [GenerateJsonSchema()] attribute.
  - Implement the interface.
  - Create tools and calls using AsTools() and AsCalls().
  - Create a GenericFunctionTool instance.
  - Add the tool to your GenerativeModel with AddFunctionTool().
- FunctionCallingBehaviour: Customize behavior (e.g., auto-calling, error handling) using the GenerativeModel's FunctionCallingBehaviour property:
  - FunctionEnabled (default: true): Enables/disables function calling.
  - AutoCallFunction (default: true): Gemini automatically calls functions.
  - AutoReplyFunction (default: true): Gemini automatically generates responses after function calls.
  - AutoHandleBadFunctionCalls (default: false): Attempts to handle errors from incorrect calls.
// Install-Package GenerativeAI.Tools
using GenerativeAI;
using GenerativeAI.Tools;
[GenerateJsonSchema()]
public interface IWeatherFunctions // Simplified Interface
{
[Description("Get the current weather")]
Weather GetCurrentWeather(string location);
}
public class WeatherService : IWeatherFunctions
{ // ... (Implementation - see full example in wiki) ...
public Weather GetCurrentWeather(string location)
=> new Weather
{
Location = location,
Temperature = 30.0,
Unit = Unit.Celsius,
Description = "Sunny",
};
}
// --- Usage ---
var service = new WeatherService();
var tools = service.AsTools();
var calls = service.AsCalls();
var tool = new GenericFunctionTool(tools, calls);
var model = new GenerativeModel(apiKey: "YOUR_API_KEY");
model.AddFunctionTool(tool);
//Example for FunctionCallingBehaviour
model.FunctionCallingBehaviour = new FunctionCallingBehaviour { AutoCallFunction = false }; // Example
var result = await model.GenerateContentAsync("Weather in SF?");
Console.WriteLine(result.Text);
For more details and options, see the wiki.
The Google_GenerativeAI SDK enables seamless integration with the Google Imagen image generator and the Image Text Model for tasks such as image captioning and visual question answering. It provides two model classes:
- ImagenModel – For creating and generating entirely new images from text prompts.
- ImageTextModel – For image captioning and visual question answering (VQA).
Below is a snippet demonstrating how to initialize an image generation model and generate an image:
// 1. Create a Google AI client
var googleAi = new GoogleAi(apiKey);
// 2. Create the Imagen model instance with your chosen model name.
var imageModel = googleAi.CreateImageModel("imagen-3.0-generate-002");
// 3. Generate images by providing a text prompt.
var response = await imageModel.GenerateImagesAsync("A peaceful forest clearing at sunrise");
// The response contains the generated image(s).
For captioning or visual QA tasks:
// 1. Create a Vertex AI client (example shown here).
var vertexAi = new VertexAI(projectId, region);
// 2. Instantiate the ImageTextModel.
var imageTextModel = vertexAi.CreateImageTextModel();
// 3. Generate captions or perform visual QA.
var captionResult = await imageTextModel.GenerateImageCaptionFromLocalFileAsync("path/to/local/image.jpg");
var vqaResult = await imageTextModel.VisualQuestionAnsweringFromLocalFileAsync("What is in the picture?", "path/to/local/image.jpg");
// Results now contain the model's captions or answers.
The Google_GenerativeAI SDK now conveniently supports the Google Multimodal Live API through the
Google_GenerativeAI.Live package. This module enables real-time, interactive conversations with Gemini models by
leveraging WebSockets for text and audio data exchange. It’s ideally suited for building live, multimodal
experiences, such as chat or voice-enabled applications.
The Google_GenerativeAI.Live package provides a comprehensive implementation of the Multimodal Live API, offering:
- Real-time Communication: Enables two-way transmission of text and audio data for live conversational experiences.
- Modality Support: Allows model responses in multiple formats, including text and audio, depending on your configuration.
- Asynchronous Operations: Fully asynchronous API ensures non-blocking calls for data transmission and reception.
- Event-driven Design: Exposes events for key stages of interaction, including connection status, message reception, and audio streaming.
- Audio Handling: Built-in support for streaming audio, with configurability for sample rates and headers.
- Custom Tool Integration: Allows extending functionality by integrating custom tools directly into the interaction.
- Robust Error Handling: Manages errors gracefully, along with reconnection support.
- Flexible Configuration: Supports customizing generation configurations, safety settings, and system instructions before establishing a connection.
To leverage the Multimodal Live API in your project, you’ll need to install the Google_GenerativeAI.Live NuGet package
and create a MultiModalLiveClient. Here’s a quick overview:
Install the Google_GenerativeAI.Live package via NuGet:
Install-Package Google_GenerativeAI.Live
With the MultiModalLiveClient, interacting with the Multimodal Live API is simple:
using GenerativeAI.Live;
public async Task RunLiveConversationAsync()
{
var client = new MultiModalLiveClient(
platformAdapter: new GoogleAIPlatformAdapter(),
modelName: "gemini-1.5-flash-exp",
generationConfig: new GenerationConfig { ResponseModalities = { Modality.TEXT, Modality.AUDIO } },
safetySettings: null,
systemInstruction: "You are a helpful assistant."
);
client.Connected += (s, e) => Console.WriteLine("Connected!");
client.TextChunkReceived += (s, e) => Console.WriteLine($"Text chunk: {e.TextChunk}");
client.AudioChunkReceived += (s, e) => Console.WriteLine($"Audio received: {e.Buffer.Length} bytes");
await client.ConnectAsync();
await client.SentTextAsync("Hello, Gemini! What's the weather like?");
await client.SendAudioAsync(audioData: new byte[] { /* audio bytes */ }, audioContentType: "audio/pcm; rate=16000");
Console.ReadKey();
await client.DisconnectAsync();
}
The MultiModalLiveClient provides various events to plug into for real-time updates during interaction:
- Connected: Triggered when the connection is successfully established.
- Disconnected: Triggered when the connection ends gracefully or abruptly.
- MessageReceived: Raised when any data (text or audio) is received.
- TextChunkReceived: Triggered when chunks of text are received in real time.
- AudioChunkReceived: Triggered when audio chunks are streamed from Gemini.
- AudioReceiveCompleted: Triggered when a complete audio response is received.
- ErrorOccurred: Raised when an error occurs during interaction or connection.
For more details and examples, refer to the wiki.
The Google_GenerativeAI library makes implementing Retrieval-Augmented Generation (RAG) incredibly easy. RAG
combines the strengths of Large Language Models (LLMs) with the precision of information retrieval. Instead of relying
solely on the LLM's pre-trained knowledge, a RAG system first retrieves relevant information from a knowledge base (
a "corpus" of documents) and then uses that information to augment the LLM's response. This allows the LLM to generate
more accurate, factual, and context-aware answers.
Enhance your Gemini applications with the power of the Vertex RAG Engine. This integration enables your applications to provide more accurate and contextually relevant responses by leveraging your existing knowledge bases.
Benefits:
- Improved Accuracy: Gemini can now access and utilize your corpora and vector databases for more grounded responses.
- Scalable Knowledge: Supports various backends (Pinecone, Weaviate, etc.) and data sources (Slack, Drive, etc.) for flexible knowledge management.
- Simplified RAG Implementation: Seamlessly integrate RAG capabilities into your Gemini workflows.
Code Example:
// Initialize VertexAI with your platform configuration.
var vertexAi = new VertexAI(GetTestVertexAIPlatform());
// Create an instance of the RAG manager for corpus operations.
var ragManager = vertexAi.CreateRagManager();
// Create a new corpus for your knowledge base.
// Optional: Use overload methods to specify a vector database (Pinecone, Weaviate, etc.).
// If no specific vector database is provided, a default one will be used.
var corpus = await ragManager.CreateCorpusAsync("My New Corpus", "My description");
// Import data into the corpus from a specified source.
// Replace GcsSource with the appropriate source (Jira, Slack, SharePoint, etc.) and configure it.
var fileSource = new GcsSource() { /* Configure your GcsSource here */ };
await ragManager.ImportFilesAsync(corpus.Name, fileSource);
// Create a Gemini generative model configured to use the created corpus for RAG.
// The corpusIdForRag parameter links the model to your knowledge base.
var model = vertexAi.CreateGenerativeModel(VertexAIModels.Gemini.Gemini2Flash, corpusIdForRag: corpus.Name);
// Generate content by querying the model.
// The model will retrieve relevant information from the corpus to provide a grounded response.
var result = await model.GenerateContentAsync("query related to the corpus");
Learn More:
For a deeper dive into using the Vertex RAG Engine with the Google_GenerativeAI SDK, please visit the wiki page.
This library integrates Google's Attributed Question Answering (AQA) model to enhance Retrieval-Augmented Generation (RAG) through powerful semantic search and question answering. AQA excels at understanding the intent behind a question and retrieving the most relevant passages from your corpus.
Key Features:
- Deep Semantic Understanding: AQA moves beyond keyword matching, capturing the nuanced meaning of queries and documents for more accurate retrieval.
- Answer Confidence with Attribution: AQA provides an "Answerable Probability" score, giving you insight into the model's confidence in the retrieved answer.
- Simplified RAG Integration: The Google_GenerativeAI library offers a straightforward API for corpus creation, document ingestion, and semantic search execution.
Get Started with Google AQA for RAG:
For a comprehensive guide on implementing semantic search retrieval with Google AQA, refer to the wiki page.
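To make the shape of an AQA call concrete, here is a rough, hypothetical sketch only: the model name "models/aqa", the GenerateAnswerAsync method, and the AnswerableProbability property below are illustrative assumptions modeled on the underlying Gemini generateAnswer API, not confirmed SDK signatures; consult the wiki for the supported surface.

```csharp
// HYPOTHETICAL sketch – names below may differ from the actual SDK.
var googleAI = new GoogleAI("YOUR_GOOGLE_API_KEY");

// AQA is exposed as a dedicated model for attributed question answering.
var aqaModel = googleAI.CreateGenerativeModel("models/aqa");

// Ask a question grounded in a previously created corpus.
var answer = await aqaModel.GenerateAnswerAsync(
    "What does the onboarding document say about security training?");

Console.WriteLine(answer.Text());

// AQA also reports how confident it is that the question
// is answerable from the retrieved passages.
Console.WriteLine($"Answerable probability: {answer.AnswerableProbability}");
```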
The following features are planned for future releases of the GenerativeAI SDK:
- [x] Semantic Search Retrieval (RAG): Use Gemini as a Retrieval-Augmented Generation (RAG) system, allowing it to incorporate information from external sources into its responses. (Released on 20th Feb, 2025)
- [x] Image Generation: Generate images with imagen from text prompts, expanding Gemini's capabilities beyond text and code. (Added on 24th Feb, 2025)
- [x] Multimodal Live API: Bidirectional Multimodal Live Chat with Gemini 2.0 Flash (Added on 22nd Feb, 2025)
- [ ] Model Tuning: Customize Gemini models to better suit your specific needs and data.
Thanks to HavenDV for the LangChain.net SDK.
Dive deeper into the GenerativeAI SDK! The wiki is your comprehensive resource for:
- Detailed Guides: Step-by-step tutorials on various features and use cases.
- Advanced Usage: Learn about advanced configuration options, error handling, and best practices.
- Complete Code Examples: Find ready-to-run code snippets and larger project examples.
We encourage you to explore the wiki to unlock the full potential of the GenerativeAI SDK!
Feel free to open an issue or submit a pull request if you encounter any problems or want to propose improvements! Your feedback helps us continue to refine and expand this SDK.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for Google_GenerativeAI
Similar Open Source Tools
Google_GenerativeAI
Google GenerativeAI (Gemini) is an unofficial C# .Net SDK based on REST APIs for accessing Google Gemini models. It offers a complete rewrite of the previous SDK with improved performance, flexibility, and ease of use. The SDK seamlessly integrates with LangChain.net, providing easy methods for JSON-based interactions and function calling with Google Gemini models. It includes features like enhanced JSON mode handling, function calling with code generator, multi-modal functionality, Vertex AI support, multimodal live API, image generation and captioning, retrieval-augmented generation with Vertex RAG Engine and Google AQA, easy JSON handling, Gemini tools and function calling, multimodal live API, and more.
langgraph4j
Langgraph4j is a Java library for language processing tasks such as text classification, sentiment analysis, and named entity recognition. It provides a set of tools and algorithms for analyzing text data and extracting useful information. The library is designed to be efficient and easy to use, making it suitable for both research and production applications.
open-webui-tools
Open WebUI Tools Collection is a set of tools for structured planning, arXiv paper search, Hugging Face text-to-image generation, prompt enhancement, and multi-model conversations. It enhances LLM interactions with academic research, image generation, and conversation management. Tools include arXiv Search Tool and Hugging Face Image Generator. Function Pipes like Planner Agent offer autonomous plan generation and execution. Filters like Prompt Enhancer improve prompt quality. Installation and configuration instructions are provided for each tool and pipe.
tinystruct
Tinystruct is a simple Java framework designed for easy development with better performance. It offers a modern approach with features like CLI and web integration, built-in lightweight HTTP server, minimal configuration philosophy, annotation-based routing, and performance-first architecture. Developers can focus on real business logic without dealing with unnecessary complexities, making it transparent, predictable, and extensible.
OllamaSharp
OllamaSharp is a .NET binding for the Ollama API, providing an intuitive API client to interact with Ollama. It offers support for all Ollama API endpoints, real-time streaming, progress reporting, and an API console for remote management. Users can easily set up the client, list models, pull models with progress feedback, stream completions, and build interactive chats. The project includes a demo console for exploring and managing the Ollama host.
sdialog
SDialog is an MIT-licensed open-source toolkit for building, simulating, and evaluating LLM-based conversational agents end-to-end. It aims to bridge agent construction, user simulation, dialog generation, and evaluation in a single reproducible workflow, enabling the generation of reliable, controllable dialog systems or data at scale. The toolkit standardizes a Dialog schema, offers persona-driven multi-agent simulation with LLMs, provides composable orchestration for precise control over behavior and flow, includes built-in evaluation metrics, and offers mechanistic interpretability. It allows for easy creation of user-defined components and interoperability across various AI platforms.
py-llm-core
PyLLMCore is a light-weighted interface with Large Language Models with native support for llama.cpp, OpenAI API, and Azure deployments. It offers a Pythonic API that is simple to use, with structures provided by the standard library dataclasses module. The high-level API includes the assistants module for easy swapping between models. PyLLMCore supports various models including those compatible with llama.cpp, OpenAI, and Azure APIs. It covers use cases such as parsing, summarizing, question answering, hallucinations reduction, context size management, and tokenizing. The tool allows users to interact with language models for tasks like parsing text, summarizing content, answering questions, reducing hallucinations, managing context size, and tokenizing text.
semantic-kernel
Semantic Kernel is an SDK that integrates Large Language Models (LLMs) like OpenAI, Azure OpenAI, and Hugging Face with conventional programming languages like C#, Python, and Java. Semantic Kernel achieves this by allowing you to define plugins that can be chained together in just a few lines of code. What makes Semantic Kernel _special_ , however, is its ability to _automatically_ orchestrate plugins with AI. With Semantic Kernel planners, you can ask an LLM to generate a plan that achieves a user's unique goal. Afterwards, Semantic Kernel will execute the plan for the user.
deep-research
Deep Research is a lightning-fast tool that uses powerful AI models to generate comprehensive research reports in just a few minutes. It leverages advanced 'Thinking' and 'Task' models, combined with an internet connection, to provide fast and insightful analysis on various topics. The tool ensures privacy by processing and storing all data locally. It supports multi-platform deployment, offers support for various large language models, web search functionality, knowledge graph generation, research history preservation, local and server API support, PWA technology, multi-key payload support, multi-language support, and is built with modern technologies like Next.js and Shadcn UI. Deep Research is open-source under the MIT License.
sre
SmythOS is an operating system designed for building, deploying, and managing intelligent AI agents at scale. It provides a unified SDK and resource abstraction layer for various AI services, making it easy to scale and flexible. With an agent-first design, developer-friendly SDK, modular architecture, and enterprise security features, SmythOS offers a robust foundation for AI workloads. The system is built with a philosophy inspired by traditional operating system kernels, ensuring autonomy, control, and security for AI agents. SmythOS aims to make shipping production-ready AI agents accessible and open for everyone in the coming Internet of Agents era.
infinite-image-browsing
Infinite Image Browsing (IIB) is a versatile tool that offers excellent performance in displaying images, supports image search and favorite functionalities, allows viewing images/videos with various features like full-screen preview and sending to other tabs, provides multiple usage methods including extension installation, standalone Python usage, and desktop application, supports TikTok-style view, walk mode for automatic loading of folders, preview based on file tree structure, image comparison, topic/tag analysis, smart file organization, multilingual support, privacy and security features, packaging/batch download, keyboard shortcuts, and AI integration. The tool also offers natural language categorization and search capabilities, with API endpoints for embedding, clustering, and prompt retrieval. It supports caching and incremental updates for efficient processing and offers various configuration options through environment variables.
RainbowGPT
RainbowGPT is a versatile tool that offers a range of functionalities, including Stock Analysis for financial decision-making, MySQL Management for database navigation, and integration of AI technologies like GPT-4 and ChatGlm3. It provides a user-friendly interface suitable for all skill levels, ensuring seamless information flow and continuous expansion of emerging technologies. The tool enhances adaptability, creativity, and insight, making it a valuable asset for various projects and tasks.
OpenAI-Api-Unreal
The OpenAIApi Plugin provides access to the OpenAI API in Unreal Engine, allowing users to generate images, transcribe speech, and power NPCs using advanced AI models. It offers blueprint nodes for making API calls, setting parameters, and accessing completion values. Users can authenticate using an API key directly or as an environment variable. The plugin supports various tasks such as generating images, transcribing speech, and interacting with NPCs through chat endpoints.
fastagency
FastAgency is a powerful tool that leverages the AutoGen framework to quickly build applications with multi-agent workflows. It supports various interfaces like ConsoleUI and MesopUI, allowing users to create interactive applications. The tool enables defining workflows between agents, such as students and teachers, and summarizing conversations. FastAgency aims to expand its capabilities by integrating with additional agentic frameworks like CrewAI, providing more options for workflow definition and AI tool integration.
dive
Dive is an AI toolkit for Go that enables the creation of specialized teams of AI agents and seamless integration with leading LLMs. It offers a CLI and APIs for easy integration. Features include specialized agents and hierarchical agent systems, declarative configuration, support for multiple LLM providers behind a unified interface, extended reasoning, the Model Context Protocol, advanced model settings, a tool system with annotations and custom tool creation, streaming, thread management, a confirmation system, deep research, and semantic diff analysis. Dive supports a range of verified models and is designed for developers building AI-powered applications with rich agent capabilities and tool integrations.
Scrapling
Scrapling is a high-performance, intelligent web scraping library for Python that automatically adapts to website changes while significantly outperforming popular alternatives. Suitable for both beginners and experts, Scrapling provides powerful features while maintaining simplicity. It offers fast and stealthy HTTP requests, adaptive scraping with smart element tracking and flexible selection, high performance with fast speed and memory efficiency, and a developer-friendly navigation API with rich text processing. It also includes advanced parsing features such as smart navigation, content-based selection, handling of structural changes, and finding similar elements. Scrapling is designed to handle anti-bot protections and website changes effectively, making it a versatile tool for web scraping tasks.
For similar tasks
floneum
Floneum is a graph editor that makes it easy to develop your own AI workflows. It uses large language models (LLMs) to run AI models locally, without any external dependencies or even a GPU. This makes it easy to use LLMs with your own data, without worrying about privacy. Floneum also has a plugin system that allows you to improve the performance of LLMs and make them work better for your specific use case. Plugins can be used in any language that supports web assembly, and they can control the output of LLMs with a process similar to JSONformer or guidance.
llm-answer-engine
This repository contains the code and instructions needed to build a sophisticated answer engine that leverages the capabilities of Groq, Mistral AI's Mixtral, Langchain.JS, Brave Search, Serper API, and OpenAI. Designed to efficiently return sources, answers, images, videos, and follow-up questions based on user queries, this project is an ideal starting point for developers interested in natural language processing and search technologies.
discourse-ai
Discourse AI is a plugin for the Discourse forum software that uses artificial intelligence to improve the user experience. It can automatically generate content, moderate posts, and answer questions. This can free up moderators and administrators to focus on other tasks, and it can help to create a more engaging and informative community.
Gemini-API
Gemini-API is a reverse-engineered asynchronous Python wrapper for the Google Gemini web app (formerly Bard). It provides features like persistent cookies, ImageFx support, extension support, classified outputs, official flavor, and asynchronous operation. Users can generate content from text or images, hold conversations across multiple turns, retrieve images in responses, generate images with ImageFx, save images to local files, use Gemini extensions, check and switch reply candidates, and control the log level.
genai-for-marketing
This repository provides a deployment guide for utilizing Google Cloud's Generative AI tools in marketing scenarios. It includes step-by-step instructions, examples of crafting marketing materials, and supplementary Jupyter notebooks. The demos cover marketing insights, audience analysis, trendspotting, content search, content generation, and workspace integration. Users can access and visualize marketing data, analyze trends, improve search experience, and generate compelling content. The repository structure includes backend APIs, frontend code, sample notebooks, templates, and installation scripts.
generative-ai-dart
The Google Generative AI SDK for Dart enables developers to utilize cutting-edge Large Language Models (LLMs) for creating language applications. It provides access to the Gemini API for generating content using state-of-the-art models. Developers can integrate the SDK into their Dart or Flutter applications to leverage powerful AI capabilities. It is recommended to use the SDK for server-side API calls to ensure the security of API keys and protect against potential key exposure in mobile or web apps.
Dough
Dough is a tool for crafting videos with AI, allowing users to guide video generations with precision using images and example videos. Users can create guidance frames, assemble shots, and animate them by defining parameters and selecting guidance videos. The tool aims to help users make beautiful and unique video creations, providing control over the generation process. Setup instructions are available for Linux and Windows platforms, with detailed steps for installation and running the app.
ChaKt-KMP
ChaKt is a multiplatform app built using Kotlin and Compose Multiplatform to demonstrate the use of Generative AI SDK for Kotlin Multiplatform to generate content using Google's Generative AI models. It features a simple chat based user interface and experience to interact with AI. The app supports mobile, desktop, and web platforms, and is built with Kotlin Multiplatform, Kotlin Coroutines, Compose Multiplatform, Generative AI SDK, Calf - File picker, and BuildKonfig. Users can contribute to the project by following the guidelines in CONTRIBUTING.md. The app is licensed under the MIT License.
For similar jobs
sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.
teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.
ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.
classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.
chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.
BricksLLM
BricksLLM is a cloud-native AI gateway written in Go. It currently provides native support for OpenAI, Anthropic, Azure OpenAI, and vLLM, and aims to provide enterprise-level infrastructure that can power any LLM production use case. Typical use cases include: setting LLM usage limits for users on different pricing tiers; tracking LLM usage on a per-user and per-organization basis; blocking or redacting requests containing PII; improving LLM reliability with failovers, retries, and caching; and distributing API keys with rate limits and cost limits for internal development, production, or student use.
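Because BricksLLM exposes an OpenAI-compatible proxy, clients talk to the gateway exactly as they would to the upstream provider, just with a gateway-issued key. Below is a minimal stdlib Python sketch of that pattern; the localhost URL, port, path, and key value are assumptions for illustration, not the project's guaranteed defaults.

```python
import json
import urllib.request

# Assumed local gateway endpoint for this sketch; check your BricksLLM
# deployment for the actual host, port, and provider path.
GATEWAY_URL = "http://localhost:8002/api/providers/openai/v1/chat/completions"

def build_gateway_request(bricks_key: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat request addressed to the gateway.

    The gateway-issued key (not the provider's key) goes in the
    Authorization header; the gateway enforces rate and cost limits
    before forwarding the call upstream.
    """
    body = json.dumps({
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        GATEWAY_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {bricks_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_gateway_request("my-bricks-key", "Say hello")
```

Sending the request with `urllib.request.urlopen(req)` would return a standard OpenAI-shaped JSON response, so existing OpenAI client code needs only the base URL and key swapped.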
uAgents
uAgents is a Python library developed by Fetch.ai for creating autonomous AI agents. These agents can perform tasks on a schedule or react to events. uAgents are easy to create and manage, connect to a fast-growing network of other uAgents, and are secure, with cryptographically secured messages and wallets.
griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.