python-aiplatform
A Python SDK for Vertex AI, a fully managed, end-to-end platform for data science and machine learning.
Stars: 855
The Vertex AI SDK for Python is a library that provides a convenient way to use the Vertex AI API. It offers a high-level interface for creating and managing Vertex AI resources, such as datasets, models, and endpoints. The SDK also provides support for training and deploying custom models, as well as using AutoML models. With the Vertex AI SDK for Python, you can quickly and easily build and deploy machine learning models on Vertex AI.
README:
|GA| |pypi| |versions| |unit-tests| |system-tests| |sample-tests|
`Vertex AI`_: Google Vertex AI is an integrated suite of machine learning tools and services for building and using ML models with AutoML or custom code. It offers both novices and experts the best workbench for the entire machine learning development lifecycle.

- `Client Library Documentation`_
- `Product Documentation`_
.. |GA| image:: https://img.shields.io/badge/support-ga-gold.svg
   :target: https://github.com/googleapis/google-cloud-python/blob/main/README.rst#general-availability
.. |pypi| image:: https://img.shields.io/pypi/v/google-cloud-aiplatform.svg
   :target: https://pypi.org/project/google-cloud-aiplatform/
.. |versions| image:: https://img.shields.io/pypi/pyversions/google-cloud-aiplatform.svg
   :target: https://pypi.org/project/google-cloud-aiplatform/
.. |unit-tests| image:: https://storage.googleapis.com/cloud-devrel-public/python-aiplatform/badges/sdk-unit-tests.svg
   :target: https://storage.googleapis.com/cloud-devrel-public/python-aiplatform/badges/sdk-unit-tests.html
.. |system-tests| image:: https://storage.googleapis.com/cloud-devrel-public/python-aiplatform/badges/sdk-system-tests.svg
   :target: https://storage.googleapis.com/cloud-devrel-public/python-aiplatform/badges/sdk-system-tests.html
.. |sample-tests| image:: https://storage.googleapis.com/cloud-devrel-public/python-aiplatform/badges/sdk-sample-tests.svg
   :target: https://storage.googleapis.com/cloud-devrel-public/python-aiplatform/badges/sdk-sample-tests.html
.. _Vertex AI: https://cloud.google.com/vertex-ai/docs
.. _Client Library Documentation: https://cloud.google.com/python/docs/reference/aiplatform/latest
.. _Product Documentation: https://cloud.google.com/vertex-ai/docs
Installation
------------
.. code-block:: console
pip install google-cloud-aiplatform
With :code:`uv`:
.. code-block:: console
uv pip install google-cloud-aiplatform
Generative AI in the Vertex AI SDK
----------------------------------
To use Gen AI features from the Vertex AI SDK, you can instantiate a Vertex SDK client with the following:
.. code-block:: Python
import vertexai
from vertexai import types
# Instantiate GenAI client from Vertex SDK
# Replace with your project ID and location
client = vertexai.Client(project='my-project', location='us-central1')
See the examples below for guidance on how to use specific features supported by the Vertex SDK client.
Gen AI Evaluation
^^^^^^^^^^^^^^^^^
To run evaluation, first generate model responses from a set of prompts.
.. code-block:: Python
import pandas as pd
prompts_df = pd.DataFrame({
"prompt": [
"What is the capital of France?",
"Write a haiku about a cat.",
"Write a Python function to calculate the factorial of a number.",
"Translate 'How are you?' to French.",
],
"reference": [
"Paris",
"Sunbeam on the floor,\nA furry puddle sleeping,\nTwitching tail tells tales.",
"def factorial(n):\n if n < 0:\n return 'Factorial does not exist for negative numbers'\n elif n == 0:\n return 1\n else:\n fact = 1\n i = 1\n while i <= n:\n fact *= i\n i += 1\n return fact",
"Comment ça va ?",
]
})
inference_results = client.evals.run_inference(
model="gemini-2.5-flash-preview-05-20",
src=prompts_df
)
Then run evaluation by providing the inference results and specifying the metric types.
.. code-block:: Python
eval_result = client.evals.evaluate(
dataset=inference_results,
metrics=[
types.Metric(name='exact_match'),
types.Metric(name='rouge_l_sum'),
types.RubricMetric.TEXT_QUALITY,
]
)
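In a notebook environment you can then render the results inline; this sketch assumes the returned evaluation result object provides a :code:`show()` helper for interactive display:

.. code-block:: Python

    # Display a summary of the evaluation results (notebook environments).
    eval_result.show()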
Agent Engine with Agent Development Kit (ADK)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
First, define a function that looks up the exchange rate:
.. code-block:: Python
def get_exchange_rate(
currency_from: str = "USD",
currency_to: str = "EUR",
currency_date: str = "latest",
):
"""Retrieves the exchange rate between two currencies on a specified date.
Uses the Frankfurter API (https://api.frankfurter.app/) to obtain
exchange rate data.
Returns:
dict: A dictionary containing the exchange rate information.
Example: {"amount": 1.0, "base": "USD", "date": "2023-11-24",
"rates": {"EUR": 0.95534}}
"""
import requests
response = requests.get(
f"https://api.frankfurter.app/{currency_date}",
params={"from": currency_from, "to": currency_to},
)
return response.json()
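Because this tool is a plain Python function that calls a public HTTP API, you can sanity-check it locally before handing it to an agent (this standalone call is illustrative; it requires the :code:`requests` package and network access to api.frankfurter.app):

.. code-block:: Python

    # Quick local smoke test of the tool function.
    rate_info = get_exchange_rate(currency_from="USD", currency_to="SEK")
    print(rate_info)  # e.g. {'amount': 1.0, 'base': 'USD', 'date': ..., 'rates': {'SEK': ...}}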
Next, define an ADK Agent:
.. code-block:: Python
from google.adk.agents import Agent
from vertexai.agent_engines import AdkApp
app = AdkApp(agent=Agent(
model="gemini-2.0-flash", # Required.
name='currency_exchange_agent', # Required.
tools=[get_exchange_rate], # Optional.
))
Test the agent locally using US dollars and Swedish Krona:
.. code-block:: Python
async for event in app.async_stream_query(
user_id="user-id",
message="What is the exchange rate from US dollars to SEK today?",
):
print(event)
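The loop above assumes a running event loop, as in a notebook. In a plain Python script, one way to drive the async generator is to wrap it in a coroutine and hand it to :code:`asyncio` (a minimal sketch):

.. code-block:: Python

    import asyncio

    async def main():
        async for event in app.async_stream_query(
            user_id="user-id",
            message="What is the exchange rate from US dollars to SEK today?",
        ):
            print(event)

    asyncio.run(main())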
To deploy the agent to Agent Engine:
.. code-block:: Python
remote_app = client.agent_engines.create(
agent=app,
config={
"requirements": ["google-cloud-aiplatform[agent_engines,adk]"],
},
)
You can also run queries against the deployed agent:
.. code-block:: Python
async for event in remote_app.async_stream_query(
user_id="user-id",
message="What is the exchange rate from US dollars to SEK today?",
):
print(event)
Prompt optimization
^^^^^^^^^^^^^^^^^^^
To run zero-shot prompt optimization, use the :code:`optimize` method.
.. code-block:: Python
prompt = "Generate system instructions for a question-answering assistant"
response = client.prompts.optimize(prompt=prompt)
print(response.raw_text_response)
if response.parsed_response:
print(response.parsed_response.suggested_prompt)
For data-driven prompt optimization, call the :code:`launch_optimization_job` method. In this case you must provide a VAPO (Vertex AI Prompt Optimizer) config, which needs the config path plus either a service account or a project number.
Please refer to the tutorial for more details on the config parameter.
.. code-block:: Python
import logging

from vertexai import types

project_number = PROJECT_NUMBER  # replace with your project number
service_account = f"{project_number}-compute@developer.gserviceaccount.com"
vapo_config = types.PromptOptimizerVAPOConfig(
    config_path="gs://your-bucket/config.json",
    service_account=service_account,
    wait_for_completion=False,
)
# Set up logging to see the progress of the optimization job
logging.basicConfig(encoding='utf-8', level=logging.INFO, force=True)
result = client.prompts.launch_optimization_job(
    method=types.PromptOptimizerMethod.VAPO,
    config=vapo_config,
)
If you want to use the project number instead of the service account, you can instead use the following config:
.. code-block:: Python
vapo_config = vertexai.types.PromptOptimizerVAPOConfig(
config_path="gs://your-bucket/config.json",
service_account_project_number=project_number,
wait_for_completion=False
)
We can also call the launch_optimization_job method asynchronously.
.. code-block:: Python
await client.aio.prompts.launch_optimization_job(method=types.PromptOptimizerMethod.VAPO, config=vapo_config)
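As with any awaitable, if you are not already inside an async context you can drive the call with :code:`asyncio` (a minimal sketch reusing the :code:`vapo_config` defined above):

.. code-block:: Python

    import asyncio

    async def launch():
        return await client.aio.prompts.launch_optimization_job(
            method=types.PromptOptimizerMethod.VAPO,
            config=vapo_config,
        )

    result = asyncio.run(launch())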
Prompt Management
^^^^^^^^^^^^^^^^^
First, define your prompt as a dictionary or a :code:`types.Prompt` object, then create it with :code:`client.prompts.create`.
.. code-block:: Python
prompt = {
"prompt_data": {
"contents": [{"parts": [{"text": "Hello, {name}! How are you?"}]}],
"system_instruction": {"parts": [{"text": "Please answer in a short sentence."}]},
"variables": [
{"name": {"text": "Alice"}},
],
"model": "gemini-2.5-flash",
},
}
prompt_resource = client.prompts.create(
prompt=prompt,
)
Note that you can also use the types.Prompt object to define your prompt. Some of the types used to do this are from the Gen AI SDK.
.. code-block:: Python
from vertexai import types
from google.genai import types as genai_types
prompt = types.Prompt(
prompt_data=types.PromptData(
contents=[genai_types.Content(parts=[genai_types.Part(text="Hello, {name}! How are you?")])],
system_instruction=genai_types.Content(parts=[genai_types.Part(text="Please answer in a short sentence.")]),
variables=[
{"name": genai_types.Part(text="Alice")},
],
model="gemini-2.5-flash",
),
)
Retrieve a prompt by calling get() with the prompt_id.
.. code-block:: Python
retrieved_prompt = client.prompts.get(
prompt_id=prompt_resource.prompt_id,
)
After creating or retrieving a prompt, you can call generate_content() with that prompt using the Gen AI SDK.
The following uses a utility function available on Prompt objects to transform a Prompt object into a list of Content objects for use with generate_content. To run this you need to have the Gen AI SDK installed, which you can do via pip install google-genai.
.. code-block:: Python
from google import genai
from google.genai import types as genai_types
# Create a Client in the Gen AI SDK
genai_client = genai.Client(vertexai=True, project="your-project", location="your-location")
# Call generate_content() with the prompt
response = genai_client.models.generate_content(
model=retrieved_prompt.prompt_data.model,
contents=retrieved_prompt.assemble_contents(),
)
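The returned object is a standard Gen AI SDK response, so the generated text can be read directly:

.. code-block:: Python

    print(response.text)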
.. note::
The following Generative AI modules in the Vertex AI SDK are deprecated as of June 24, 2025 and will be removed on June 24, 2026:
vertexai.generative_models, vertexai.language_models, vertexai.vision_models, vertexai.tuning, vertexai.caching. Please use the
Google Gen AI SDK to access these features. See
the migration guide for details.
You can continue using all other Vertex AI SDK modules, as they are the recommended way to use the API.
In order to use this library, you first need to go through the following steps:
- `Select or create a Cloud Platform project.`_
- `Enable billing for your project.`_
- `Enable the Vertex AI API.`_
- `Setup Authentication.`_

.. _Select or create a Cloud Platform project.: https://console.cloud.google.com/project
.. _Enable billing for your project.: https://cloud.google.com/billing/docs/how-to/modify-project#enable_billing_for_a_project
.. _Enable the Vertex AI API.: https://cloud.google.com/vertex-ai/docs/start/use-vertex-ai-python-sdk
.. _Setup Authentication.: https://googleapis.dev/python/google-api-core/latest/auth.html
Supported Python Versions
^^^^^^^^^^^^^^^^^^^^^^^^^

Python >= 3.9

Deprecated Python Versions
^^^^^^^^^^^^^^^^^^^^^^^^^^

Python <= 3.8.
The last version of this library compatible with Python 3.8 is google-cloud-aiplatform==1.90.0.
The last version of this library compatible with Python 3.7 is google-cloud-aiplatform==1.31.1.
The last version of this library compatible with Python 3.6 is google-cloud-aiplatform==1.12.1.
Overview
--------
This section provides a brief overview of the Vertex AI SDK for Python. You can also reference the notebooks in `vertex-ai-samples`_ for examples.
.. _vertex-ai-samples: https://github.com/GoogleCloudPlatform/vertex-ai-samples/tree/main/notebooks/community/sdk
All publicly available SDK features can be found in the :code:`google/cloud/aiplatform` directory.
Under the hood, the Vertex SDK builds on top of GAPIC, which stands for Google API CodeGen.
The GAPIC library code sits in :code:`google/cloud/aiplatform_v1` and :code:`google/cloud/aiplatform_v1beta1`,
and it is auto-generated from Google's service proto files.
For most programmatic needs, developers can follow these steps to figure out which libraries to import:

1. Look through :code:`google/cloud/aiplatform` first -- the Vertex SDK's APIs are almost always easier to use and more concise compared with GAPIC.
2. If the feature you are looking for cannot be found there, look through :code:`aiplatform_v1` to see if it's available in GAPIC.
3. If the feature is still in the beta phase, it will be available in :code:`aiplatform_v1beta1`.

If none of the above help you find the right tools for your task, please feel free to open a GitHub issue and send us a feature request.
Importing
^^^^^^^^^
Vertex AI SDK resource-based functionality can be used by importing the following namespace:
.. code-block:: Python
from google.cloud import aiplatform
Initialization
^^^^^^^^^^^^^^
Initialize the SDK to store common configurations that you use with the SDK.
.. code-block:: Python
aiplatform.init(
# your Google Cloud Project ID or number
# environment default is used if not set
project='my-project',
# the Vertex AI region you will use
# defaults to us-central1
location='us-central1',
# Google Cloud Storage bucket in same region as location
# used to stage artifacts
staging_bucket='gs://my_staging_bucket',
# custom google.auth.credentials.Credentials
# environment default credentials used if not set
credentials=my_credentials,
# customer managed encryption key resource name
# will be applied to all Vertex AI resources if set
encryption_spec_key_name=my_encryption_key_name,
# the name of the experiment to use to track
# logged metrics and parameters
experiment='my-experiment',
# description of the experiment above
experiment_description='my experiment description'
)
Datasets
^^^^^^^^
Vertex AI provides managed tabular, text, image, and video datasets. In the SDK, datasets can be used downstream to
train models.
To create a tabular dataset:
.. code-block:: Python
my_dataset = aiplatform.TabularDataset.create(
display_name="my-dataset", gcs_source=['gs://path/to/my/dataset.csv'])
You can also create and import a dataset in separate steps:
.. code-block:: Python
from google.cloud import aiplatform
my_dataset = aiplatform.TextDataset.create(
display_name="my-dataset")
my_dataset.import_data(
gcs_source=['gs://path/to/my/dataset.csv'],
import_schema_uri=aiplatform.schema.dataset.ioformat.text.multi_label_classification
)
To get a previously created Dataset:
.. code-block:: Python
dataset = aiplatform.ImageDataset('projects/my-project/locations/us-central1/datasets/{DATASET_ID}')
Vertex AI supports a variety of dataset schemas. References to these schemas are available under the
:code:`aiplatform.schema.dataset` namespace. For more information on the supported dataset schemas please refer to the
`Preparing data docs`_.
.. _Preparing data docs: https://cloud.google.com/ai-platform-unified/docs/datasets/prepare
Training
^^^^^^^^
The Vertex AI SDK for Python allows you to train custom and AutoML models.
You can train custom models using a custom Python script, custom Python package, or container.
**Preparing Your Custom Code**
Vertex AI custom training enables you to train on Vertex AI datasets and produce Vertex AI models. To do so your
script must adhere to the following contract:
It must read datasets from the environment variables populated by the training service:
.. code-block:: Python
os.environ['AIP_DATA_FORMAT'] # provides format of data
os.environ['AIP_TRAINING_DATA_URI'] # uri to training split
os.environ['AIP_VALIDATION_DATA_URI'] # uri to validation split
os.environ['AIP_TEST_DATA_URI'] # uri to test split
Please visit `Using a managed dataset in a custom training application`_ for a detailed overview.
.. _Using a managed dataset in a custom training application: https://cloud.google.com/vertex-ai/docs/training/using-managed-datasets
It must write the model artifact to the environment variable populated by the training service:
.. code-block:: Python
os.environ['AIP_MODEL_DIR']
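Putting the two parts of the contract together, a minimal training script might look like the sketch below; the pandas-based loading and the placeholder training step are illustrative assumptions, not part of the contract itself:

.. code-block:: Python

    import os

    import pandas as pd

    # Read the data splits provided by the training service.
    data_format = os.environ["AIP_DATA_FORMAT"]      # e.g. "csv"
    train_uri = os.environ["AIP_TRAINING_DATA_URI"]

    # Illustrative: read CSV training data for the single-file case
    # (assumes gcsfs is installed so pandas can read gs:// URIs; sharded
    # wildcard URIs would need to be expanded first).
    train_df = pd.read_csv(train_uri)

    # ... train your model on train_df here ...

    # Write the model artifact where Vertex AI expects to find it.
    model_dir = os.environ["AIP_MODEL_DIR"]
    # e.g. model.save(model_dir) for a TensorFlow model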
**Running Training**
.. code-block:: Python
job = aiplatform.CustomTrainingJob(
display_name="my-training-job",
script_path="training_script.py",
container_uri="us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-2:latest",
requirements=["gcsfs==0.7.1"],
model_serving_container_image_uri="us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-2:latest",
)
model = job.run(my_dataset,
replica_count=1,
machine_type="n1-standard-4",
accelerator_type='NVIDIA_TESLA_K80',
accelerator_count=1)
In the code block above, `my_dataset` is the managed dataset created in the `Datasets` section above. The `model` variable is a managed Vertex AI model that can be deployed or exported.
AutoMLs
-------
The Vertex AI SDK for Python supports AutoML tabular, image, text, video, and forecasting.
To train an AutoML tabular model:
.. code-block:: Python
dataset = aiplatform.TabularDataset('projects/my-project/locations/us-central1/datasets/{DATASET_ID}')
job = aiplatform.AutoMLTabularTrainingJob(
display_name="train-automl",
optimization_prediction_type="regression",
optimization_objective="minimize-rmse",
)
model = job.run(
dataset=dataset,
target_column="target_column_name",
training_fraction_split=0.6,
validation_fraction_split=0.2,
test_fraction_split=0.2,
budget_milli_node_hours=1000,
model_display_name="my-automl-model",
disable_early_stopping=False,
)
Models
------
To get a model:
.. code-block:: Python
model = aiplatform.Model('projects/my-project/locations/us-central1/models/{MODEL_ID}')
To upload a model:
.. code-block:: Python
model = aiplatform.Model.upload(
display_name='my-model',
artifact_uri="gs://path/to/my/model/dir",
serving_container_image_uri="us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-2:latest",
)
To deploy a model:
.. code-block:: Python
endpoint = model.deploy(
    machine_type="n1-standard-4",
    min_replica_count=1,
    max_replica_count=5,
    accelerator_type='NVIDIA_TESLA_K80',
    accelerator_count=1,
)
Please visit `Importing models to Vertex AI`_ for a detailed overview:
.. _Importing models to Vertex AI: https://cloud.google.com/vertex-ai/docs/general/import-model
Model Evaluation
----------------
The Vertex AI SDK for Python currently supports getting model evaluation metrics for all AutoML models.
To list all model evaluations for a model:
.. code-block:: Python
model = aiplatform.Model('projects/my-project/locations/us-central1/models/{MODEL_ID}')
evaluations = model.list_model_evaluations()
To get the model evaluation resource for a given model:
.. code-block:: Python
model = aiplatform.Model('projects/my-project/locations/us-central1/models/{MODEL_ID}')
# returns the first evaluation if no arguments are passed; you can also pass the evaluation ID
evaluation = model.get_model_evaluation()
eval_metrics = evaluation.metrics
You can also create a reference to your model evaluation directly by passing in the resource name of the model evaluation:
.. code-block:: Python
evaluation = aiplatform.ModelEvaluation(
evaluation_name='projects/my-project/locations/us-central1/models/{MODEL_ID}/evaluations/{EVALUATION_ID}')
Alternatively, you can create a reference to your evaluation by passing in the model and evaluation IDs:
.. code-block:: Python
evaluation = aiplatform.ModelEvaluation(
    evaluation_name=EVALUATION_ID,
    model_id=MODEL_ID)
Batch Prediction
----------------
To create a batch prediction job:
.. code-block:: Python
model = aiplatform.Model('projects/my-project/locations/us-central1/models/{MODEL_ID}')
batch_prediction_job = model.batch_predict(
job_display_name='my-batch-prediction-job',
instances_format='csv',
machine_type='n1-standard-4',
gcs_source=['gs://path/to/my/file.csv'],
gcs_destination_prefix='gs://path/to/my/batch_prediction/results/',
service_account='my-service-account@my-project.iam.gserviceaccount.com'
)
You can also create a batch prediction job asynchronously by including the `sync=False` argument:
.. code-block:: Python
batch_prediction_job = model.batch_predict(..., sync=False)
# wait for resource to be created
batch_prediction_job.wait_for_resource_creation()
# get the state
batch_prediction_job.state
# block until job is complete
batch_prediction_job.wait()
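For jobs with a Cloud Storage destination, the completed job's output files can be listed with :code:`iter_outputs()` (a minimal sketch; the file names will vary by job):

.. code-block:: Python

    # Iterate over the prediction result files written to GCS.
    for output_blob in batch_prediction_job.iter_outputs():
        print(output_blob.name)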
Endpoints
---------
To create an endpoint:
.. code-block:: Python
endpoint = aiplatform.Endpoint.create(display_name='my-endpoint')
To deploy a model to a created endpoint:
.. code-block:: Python
model = aiplatform.Model('projects/my-project/locations/us-central1/models/{MODEL_ID}')
endpoint.deploy(model,
min_replica_count=1,
max_replica_count=5,
machine_type='n1-standard-4',
accelerator_type='NVIDIA_TESLA_K80',
accelerator_count=1)
To get predictions from endpoints:
.. code-block:: Python
endpoint.predict(instances=[[6.7, 3.1, 4.7, 1.5], [4.6, 3.1, 1.5, 0.2]])
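The call returns a :code:`Prediction` object; the model outputs are available on its :code:`predictions` field:

.. code-block:: Python

    prediction = endpoint.predict(instances=[[6.7, 3.1, 4.7, 1.5], [4.6, 3.1, 1.5, 0.2]])
    print(prediction.predictions)
    print(prediction.deployed_model_id)  # which deployed model served the request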
To undeploy models from an endpoint:
.. code-block:: Python
endpoint.undeploy_all()
To delete an endpoint:
.. code-block:: Python
endpoint.delete()
Pipelines
---------
To create a Vertex AI Pipeline run and monitor until completion:
.. code-block:: Python
# Instantiate PipelineJob object
pl = aiplatform.PipelineJob(
display_name="My first pipeline",
# Whether or not to enable caching
# True = always cache pipeline step result
# False = never cache pipeline step result
# None = defer to cache option for each pipeline component in the pipeline definition
enable_caching=False,
# Local or GCS path to a compiled pipeline definition
template_path="pipeline.json",
# Dictionary containing input parameters for your pipeline
parameter_values=parameter_values,
# GCS path to act as the pipeline root
pipeline_root=pipeline_root,
)
# Execute pipeline in Vertex AI and monitor until completion
pl.run(
# Email address of service account to use for the pipeline run
# You must have iam.serviceAccounts.actAs permission on the service account to use it
service_account=service_account,
# Whether this function call should be synchronous (wait for pipeline run to finish before terminating)
# or asynchronous (return immediately)
sync=True
)
To create a Vertex AI Pipeline without monitoring until completion, use `submit` instead of `run`:
.. code-block:: Python
# Instantiate PipelineJob object
pl = aiplatform.PipelineJob(
display_name="My first pipeline",
# Whether or not to enable caching
# True = always cache pipeline step result
# False = never cache pipeline step result
# None = defer to cache option for each pipeline component in the pipeline definition
enable_caching=False,
# Local or GCS path to a compiled pipeline definition
template_path="pipeline.json",
# Dictionary containing input parameters for your pipeline
parameter_values=parameter_values,
# GCS path to act as the pipeline root
pipeline_root=pipeline_root,
)
# Submit the Pipeline to Vertex AI
pl.submit(
# Email address of service account to use for the pipeline run
# You must have iam.serviceAccounts.actAs permission on the service account to use it
service_account=service_account,
)
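Because :code:`submit` returns without waiting, you can check on or block for the run later using the job's standard helpers:

.. code-block:: Python

    # Inspect the current state of the submitted run
    print(pl.state)

    # Block until the pipeline run finishes
    pl.wait()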
Explainable AI: Get Metadata
----------------------------
To get metadata in dictionary format from TensorFlow 1 models:
.. code-block:: Python
import tensorflow as tf

from google.cloud.aiplatform.explain.metadata.tf.v1 import saved_model_metadata_builder
builder = saved_model_metadata_builder.SavedModelMetadataBuilder(
'gs://path/to/my/model/dir', tags=[tf.saved_model.tag_constants.SERVING]
)
generated_md = builder.get_metadata()
To get metadata in dictionary format from TensorFlow 2 models:
.. code-block:: Python
from google.cloud.aiplatform.explain.metadata.tf.v2 import saved_model_metadata_builder
builder = saved_model_metadata_builder.SavedModelMetadataBuilder('gs://path/to/my/model/dir')
generated_md = builder.get_metadata()
To use Explanation Metadata in endpoint deployment and model upload:
.. code-block:: Python
explanation_metadata = builder.get_metadata_protobuf()
# To deploy a model to an endpoint with explanation
model.deploy(..., explanation_metadata=explanation_metadata)
# To deploy a model to a created endpoint with explanation
endpoint.deploy(..., explanation_metadata=explanation_metadata)
# To upload a model with explanation
aiplatform.Model.upload(..., explanation_metadata=explanation_metadata)
Cloud Profiler
----------------------------
Cloud Profiler allows you to profile your remote Vertex AI Training jobs on demand and visualize the results in Vertex AI TensorBoard.
To start using the profiler with TensorFlow, update your training script to include the following:
.. code-block:: Python
from google.cloud.aiplatform.training_utils import cloud_profiler
...
cloud_profiler.init()
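For example, in a TensorFlow Keras training script the init call goes once, before training begins (a minimal sketch; the toy model and data are placeholders for your own training code):

.. code-block:: Python

    import tensorflow as tf

    from google.cloud.aiplatform.training_utils import cloud_profiler

    # Initialize the profiler before training starts.
    cloud_profiler.init()

    model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
    model.compile(optimizer="adam", loss="mse")

    # Profiling statistics are captured while training runs.
    model.fit(x=[[1.0], [2.0], [3.0]], y=[[2.0], [4.0], [6.0]], epochs=5)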
Next, run the job with a Vertex AI TensorBoard instance. For full details on how to do this, visit https://cloud.google.com/vertex-ai/docs/experiments/tensorboard-overview
Finally, visit your TensorBoard in your Google Cloud Console, navigate to the "Profile" tab, and click the `Capture Profile` button. This allows you to capture profiling statistics for your running jobs.
Next Steps
----------

- Read the `Client Library Documentation`_ for the Vertex AI API to see other available methods on the client.
- Read the `Vertex AI API Product documentation`_ to learn more about the product and see How-to Guides.
- View this `README`_ to see the full list of Cloud APIs that we cover.

.. _Vertex AI API Product documentation: https://cloud.google.com/vertex-ai/docs
.. _README: https://github.com/googleapis/google-cloud-python/blob/main/README.rst
Alternative AI tools for python-aiplatform
Similar Open Source Tools
wllama
Wllama is a WebAssembly binding for llama.cpp, a high-performance and lightweight language model library. It enables you to run inference directly on the browser without the need for a backend or GPU. Wllama provides both high-level and low-level APIs, allowing you to perform various tasks such as completions, embeddings, tokenization, and more. It also supports model splitting, enabling you to load large models in parallel for faster download. With its Typescript support and pre-built npm package, Wllama is easy to integrate into your React Typescript projects.
langserve
LangServe helps developers deploy `LangChain` runnables and chains as a REST API. This library is integrated with FastAPI and uses pydantic for data validation. In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.
chromem-go
chromem-go is an embeddable vector database for Go with a Chroma-like interface and zero third-party dependencies. It enables retrieval augmented generation (RAG) and similar embeddings-based features in Go apps without the need for a separate database. The focus is on simplicity and performance for common use cases, allowing querying of documents with minimal memory allocations. The project is in beta and may introduce breaking changes before v1.0.0.
hydraai
Generate React components on-the-fly at runtime using AI. Register your components, and let Hydra choose when to show them in your App. Hydra development is still early, and patterns for different types of components and apps are still being developed. Join the discord to chat with the developers. Expects to be used in a NextJS project. Components that have function props do not work.
VMind
VMind is an open-source solution for intelligent visualization, providing an intelligent chart component based on LLM by VisActor. It allows users to create chart narrative works with natural language interaction, edit charts through dialogue, and export narratives as videos or GIFs. The tool is easy to use, scalable, supports various chart types, and offers one-click export functionality. Users can customize chart styles, specify themes, and aggregate data using LLM models. VMind aims to enhance efficiency in creating data visualization works through dialogue-based editing and natural language interaction.
otto-m8
otto-m8 is a flowchart based automation platform designed to run deep learning workloads with minimal to no code. It provides a user-friendly interface to spin up a wide range of AI models, including traditional deep learning models and large language models. The tool deploys Docker containers of workflows as APIs for integration with existing workflows, building AI chatbots, or standalone applications. Otto-m8 operates on an Input, Process, Output paradigm, simplifying the process of running AI models into a flowchart-like UI.
continuous-eval
Open-Source Evaluation for LLM Applications. `continuous-eval` is an open-source package created for granular and holistic evaluation of GenAI application pipelines. It offers modularized evaluation, a comprehensive metric library covering various LLM use cases, the ability to leverage user feedback in evaluation, and synthetic dataset generation for testing pipelines. Users can define their own metrics by extending the Metric class. The tool allows running evaluation on a pipeline defined with modules and corresponding metrics. Additionally, it provides synthetic data generation capabilities to create user interaction data for evaluation or training purposes.
swe-rl
SWE-RL is the official codebase for the paper 'SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution'. It is the first approach to scale reinforcement learning based LLM reasoning for real-world software engineering, leveraging open-source software evolution data and rule-based rewards. The code provides prompt templates and the implementation of the reward function based on sequence similarity. Agentless Mini, a part of SWE-RL, builds on top of Agentless with improvements like fast async inference, code refactoring for scalability, and support for using multiple reproduction tests for reranking. The tool can be used for localization, repair, and reproduction test generation in software engineering tasks.
instructor-js
Instructor is a Typescript library for structured extraction in Typescript, powered by llms, designed for simplicity, transparency, and control. It stands out for its simplicity, transparency, and user-centric design. Whether you're a seasoned developer or just starting out, you'll find Instructor's approach intuitive and steerable.
suno-api
Suno AI API is an open-source project that allows developers to integrate the music generation capabilities of Suno.ai into their own applications. The API provides a simple and convenient way to generate music, lyrics, and other audio content using Suno.ai's powerful AI models. With Suno AI API, developers can easily add music generation functionality to their apps, websites, and other projects.
exospherehost
Exosphere is an open source infrastructure designed to run AI agents at scale for large data and long running flows. It allows developers to define plug and playable nodes that can be run on a reliable backbone in the form of a workflow, with features like dynamic state creation at runtime, infinite parallel agents, persistent state management, and failure handling. This enables the deployment of production agents that can scale beautifully to build robust autonomous AI workflows.
aiconfig
AIConfig is a framework that makes it easy to build generative AI applications for production. It manages generative AI prompts, models and model parameters as JSON-serializable configs that can be version controlled, evaluated, monitored and opened in a local editor for rapid prototyping. It allows you to store and iterate on generative AI behavior separately from your application code, offering a streamlined AI development workflow.
GraphRAG-SDK
Build fast and accurate GenAI applications with GraphRAG SDK, a specialized toolkit for building Graph Retrieval-Augmented Generation (GraphRAG) systems. It integrates knowledge graphs, ontology management, and state-of-the-art LLMs to deliver accurate, efficient, and customizable RAG workflows. The SDK simplifies the development process by automating ontology creation, knowledge graph agent creation, and query handling, enabling users to interact and query their knowledge graphs effectively. It supports multi-agent systems and orchestrates agents specialized in different domains. The SDK is optimized for FalkorDB, ensuring high performance and scalability for large-scale applications. By leveraging knowledge graphs, it enables semantic relationships and ontology-driven queries that go beyond standard vector similarity, enhancing retrieval-augmented generation capabilities.
magika
Magika is a novel AI-powered file type detection tool that relies on deep learning to provide accurate detection. It employs a custom, highly optimized model to enable precise file identification within milliseconds. Trained on a dataset of ~100M samples across 200+ content types, achieving an average ~99% accuracy. Used at scale by Google to improve user safety by routing files to security scanners. Available as a command line tool in Rust, Python API, and bindings for Rust, JavaScript/TypeScript, and GoLang.
clarifai-python-grpc
This is the official Clarifai gRPC Python client for interacting with their recognition API. Clarifai offers a platform for data scientists, developers, researchers, and enterprises to utilize artificial intelligence for image, video, and text analysis through computer vision and natural language processing. The client allows users to authenticate, predict concepts in images, and access various functionalities provided by the Clarifai API. It follows a versioning scheme that aligns with the backend API updates and includes specific instructions for installation and troubleshooting. Users can explore the Clarifai demo, sign up for an account, and refer to the documentation for detailed information.
For similar tasks
ai-on-gke
This repository contains assets related to AI/ML workloads on Google Kubernetes Engine (GKE). Run optimized AI/ML workloads with Google Kubernetes Engine (GKE) platform orchestration capabilities. A robust AI/ML platform considers the following layers: Infrastructure orchestration that support GPUs and TPUs for training and serving workloads at scale Flexible integration with distributed computing and data processing frameworks Support for multiple teams on the same infrastructure to maximize utilization of resources
ray
Ray is a unified framework for scaling AI and Python applications. It consists of a core distributed runtime and a set of AI libraries for simplifying ML compute, including Data, Train, Tune, RLlib, and Serve. Ray runs on any machine, cluster, cloud provider, and Kubernetes, and features a growing ecosystem of community integrations. With Ray, you can seamlessly scale the same code from a laptop to a cluster, making it easy to meet the compute-intensive demands of modern ML workloads.
labelbox-python
Labelbox is a data-centric AI platform for enterprises to develop, optimize, and use AI to solve problems and power new products and services. Enterprises use Labelbox to curate data, generate high-quality human feedback data for computer vision and LLMs, evaluate model performance, and automate tasks by combining AI and human-centric workflows. The academic & research community uses Labelbox for cutting-edge AI research.
djl
Deep Java Library (DJL) is an open-source, high-level, engine-agnostic Java framework for deep learning. It is designed to be easy to get started with and simple to use for Java developers. DJL provides a native Java development experience and allows users to integrate machine learning and deep learning models with their Java applications. The framework is deep learning engine agnostic, enabling users to switch engines at any point for optimal performance. DJL's ergonomic API interface guides users with best practices to accomplish deep learning tasks, such as running inference and training neural networks.
mlflow
MLflow is a platform to streamline machine learning development, including tracking experiments, packaging code into reproducible runs, and sharing and deploying models. MLflow offers a set of lightweight APIs that can be used with any existing machine learning application or library (TensorFlow, PyTorch, XGBoost, etc), wherever you currently run ML code (e.g. in notebooks, standalone applications or the cloud). MLflow's current components are:
* `MLflow Tracking
tt-metal
TT-NN is a python & C++ Neural Network OP library. It provides a low-level programming model, TT-Metalium, enabling kernel development for Tenstorrent hardware.
burn
Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals.
awsome-distributed-training
This repository contains reference architectures and test cases for distributed model training with Amazon SageMaker Hyperpod, AWS ParallelCluster, AWS Batch, and Amazon EKS. The test cases cover different types and sizes of models as well as different frameworks and parallel optimizations (Pytorch DDP/FSDP, MegatronLM, NemoMegatron...).
For similar jobs
lollms-webui
LoLLMs WebUI (Lord of Large Language Multimodal Systems: One tool to rule them all) is a user-friendly interface to access and utilize various LLM (Large Language Models) and other AI models for a wide range of tasks. With over 500 AI expert conditionings across diverse domains and more than 2500 fine tuned models over multiple domains, LoLLMs WebUI provides an immediate resource for any problem, from car repair to coding assistance, legal matters, medical diagnosis, entertainment, and more. The easy-to-use UI with light and dark mode options, integration with GitHub repository, support for different personalities, and features like thumb up/down rating, copy, edit, and remove messages, local database storage, search, export, and delete multiple discussions, make LoLLMs WebUI a powerful and versatile tool.
Azure-Analytics-and-AI-Engagement
The Azure-Analytics-and-AI-Engagement repository provides packaged Industry Scenario DREAM Demos with ARM templates (Containing a demo web application, Power BI reports, Synapse resources, AML Notebooks etc.) that can be deployed in a customer’s subscription using the CAPE tool within a matter of few hours. Partners can also deploy DREAM Demos in their own subscriptions using DPoC.
minio
MinIO is a High Performance Object Storage released under GNU Affero General Public License v3.0. It is API compatible with Amazon S3 cloud storage service. Use MinIO to build high performance infrastructure for machine learning, analytics and application data workloads.
mage-ai
Mage is an open-source data pipeline tool for transforming and integrating data. It offers an easy developer experience, engineering best practices built-in, and data as a first-class citizen. Mage makes it easy to build, preview, and launch data pipelines, and provides observability and scaling capabilities. It supports data integrations, streaming pipelines, and dbt integration.
AiTreasureBox
AiTreasureBox is a versatile AI tool that provides a collection of pre-trained models and algorithms for various machine learning tasks. It simplifies the process of implementing AI solutions by offering ready-to-use components that can be easily integrated into projects. With AiTreasureBox, users can quickly prototype and deploy AI applications without the need for extensive knowledge in machine learning or deep learning. The tool covers a wide range of tasks such as image classification, text generation, sentiment analysis, object detection, and more. It is designed to be user-friendly and accessible to both beginners and experienced developers, making AI development more efficient and accessible to a wider audience.
tidb
TiDB is an open-source distributed SQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads. It is MySQL compatible and features horizontal scalability, strong consistency, and high availability.
airbyte
Airbyte is an open-source data integration platform that makes it easy to move data from any source to any destination. With Airbyte, you can build and manage data pipelines without writing any code. Airbyte provides a library of pre-built connectors that make it easy to connect to popular data sources and destinations. You can also create your own connectors using Airbyte's no-code Connector Builder or low-code CDK. Airbyte is used by data engineers and analysts at companies of all sizes to build and manage their data pipelines.
labelbox-python
Labelbox is a data-centric AI platform for enterprises to develop, optimize, and use AI to solve problems and power new products and services. Enterprises use Labelbox to curate data, generate high-quality human feedback data for computer vision and LLMs, evaluate model performance, and automate tasks by combining AI and human-centric workflows. The academic & research community uses Labelbox for cutting-edge AI research.