
vertex-ai-mlops
Google Cloud Platform Vertex AI end-to-end workflows for machine learning operations
Stars: 526

Vertex AI is a platform for end-to-end model development. It consists of core components that make MLOps processes possible for design patterns of all types.
2024 UPDATE: This repository is evolving from end-to-end workflows for various frameworks into an MLOps focused approach for development of predictive and generative AI operations. The new approach is being developed in the MLOps folder. Once it nears completion, the content in this repository will be rearranged into the following structure:
- MLOps
- Pipelines
- Experiments
- Feature Store
- Model Monitoring
- ...
- Applied Examples
- Forecasting
- GenAI
- ...
- Framework Workflows
- BigQuery ML
- TensorFlow
- scikit-learn
- ...
- ...
This is the original readme from before the shift in this repository. After the content rearrangement is complete and this information is incorporated above, it will be removed.
I want to share and enable Vertex AI from Google Cloud with you. The goal here is to share a comprehensive set of end-to-end workflows for machine learning that each cover the range of data to model to serving and managing - even automating the flow. Regardless of your data type, skill level or framework preferences you will find something helpful here. You can even ask for what you need and I might be able to work it into updates!

Click to watch on YouTube
Click here to see current playlist for this repository
To better understand which content is most helpful to users, this repository uses tracking pixels in each markdown (`.md`) and notebook (`.ipynb`) file. No user or location data is collected. The only information captured is that the content was rendered/viewed, which gives us a daily count of usage. Please share any concerns you have with this in the repository's discussion board; I am happy to also provide a branch without the tracking.
A script is provided to remove this tracking from your local copy of this repository in the file `pixel_remove.py` in the folder `pixel`. This readme also has the complete code for creating the tracking in case you want to replicate it or just understand it in greater detail.
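To make the idea of removing the tracking concrete, here is a minimal sketch. It is not the contents of `pixel_remove.py`. It assumes the tracker is embedded as a Markdown image whose URL contains a recognizable marker; the marker string and the function name are hypothetical.

```python
import re

# Assumption: the tracker is a Markdown image, ![alt](url), whose URL
# contains a recognizable marker. This marker value is hypothetical.
HYPOTHETICAL_MARKER = "example-tracker.invalid/collect"

# Matches any Markdown image and captures its URL.
IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\(([^)]*)\)")


def strip_tracking_images(text: str, marker: str = HYPOTHETICAL_MARKER) -> str:
    """Remove Markdown images whose URL contains `marker`; keep all others."""

    def drop_if_tracker(match: re.Match) -> str:
        return "" if marker in match.group(1) else match.group(0)

    return IMAGE_PATTERN.sub(drop_if_tracker, text)


sample = (
    "Intro ![t](https://example-tracker.invalid/collect?x=1) text "
    "![fig](img/a.png)"
)
print(strip_tracking_images(sample))
```

Ordinary figures are left in place because only URLs containing the marker are dropped. Notebook files store Markdown inside JSON cells, so a real script would need to handle that format as well.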
This repository is presented as workflows using, primarily, interactive Python notebooks (`.ipynb`). Why? These are easy to review, share, and move. They contain elements for both code and narrative. The narrative can be written with plain text, Markdown and/or HTML, which makes providing visual explanations easy. This reinforces the goal of this repository: information that is easily accessible, portable, and great for starting points in your own work.
In notebooks, execution is driven from the locally attached compute. In this repository that means the Python code runs on the notebook's compute. The code in this repository leans heavily on orchestrating services in GCP rather than doing data compute in the notebook's local environment. That means these notebooks are designed to run on minimal machine sizes, even `n1-standard-2`. The heavy work of training and serving is done on Vertex AI, BigQuery, and other Google Cloud services. You will even find notebooks that author code and then deploy it to services like Vertex AI Custom Training and Vertex AI Pipelines.
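The "author code in the notebook, then run it on a managed service" pattern can be sketched roughly as follows. This is an illustration, not a workflow taken from the notebooks: the script contents, the display name, and every argument to `submit_custom_job` are placeholders you would supply, and the submit function is only defined here, never called.

```python
import tempfile
from pathlib import Path

# A tiny training script authored from the notebook (contents are illustrative).
TRAINING_SCRIPT = '''\
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--epochs", type=int, default=1)
args = parser.parse_args()
print(f"training for {args.epochs} epoch(s)")
'''


def write_training_script(directory: str) -> Path:
    """Write the training script to <directory>/train.py and return its path."""
    path = Path(directory) / "train.py"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(TRAINING_SCRIPT)
    return path


def submit_custom_job(script_path, project, region, staging_bucket, container_uri):
    """Submit the script as a Vertex AI custom job (needs GCP credentials).

    All arguments are placeholders supplied by the caller.
    """
    # Imported lazily so the file-writing step works without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=region, staging_bucket=staging_bucket)
    job = aiplatform.CustomJob.from_local_script(
        display_name="hypothetical-training-job",  # illustrative name
        script_path=str(script_path),
        container_uri=container_uri,
        args=["--epochs", "2"],
    )
    job.run()  # the training itself runs on Vertex AI, not in the notebook
    return job


# Local part only: author the file and show where it landed.
with tempfile.TemporaryDirectory() as tmp:
    script = write_training_script(tmp)
    print(script.name)
```

The notebook only writes a small file and makes an API call, which is why a minimal machine size is enough.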
There are sections that use other languages, like R, as well as creating files that are external to the notebooks: `Dockerfile`, `.py` scripts and modules, etc.
The code in this repository is opinionated. It is not completely production ready, but it is also not simply ad-hoc exploration. It aims toward the right of the continuum from exploration to deployment: 'hello-world' to CI/CD/CT. In our daily data science work we might think of the process as:
In explore, everything is code as you go. At some point in this exploration ideas find value and need to be developed.
In develop, the approach is usually something like:
- make it work
  - get a working end to end flow
- clean it up
  - revisit the code and remove parts that are no longer needed and reorder based on what is learned
- generalize it
  - parameterize
  - use functions
  - control flow: start using logic to check for out of bound conditions
- optimize it
  - better use of data structures to handle data usage during execution
  - consider execution timing and optimize for the simultaneous goals of readability (= maintainability) and compute time
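As a small illustration of the "generalize it" step above, the sketch below turns a one-off calculation into a parameterized function with a simple out-of-bounds check. The function name, column name, and data are invented for the example.

```python
from statistics import mean


def column_mean(rows, column, lower=None, upper=None):
    """Average one column, skipping missing and out-of-bound values.

    rows   : list of dicts, one per record
    column : key to average
    lower, upper : optional inclusive bounds; values outside are ignored
    """
    values = []
    for row in rows:
        value = row.get(column)
        if value is None:
            continue  # missing value
        if lower is not None and value < lower:
            continue  # below the allowed range
        if upper is not None and value > upper:
            continue  # above the allowed range
        values.append(value)
    if not values:
        raise ValueError(f"no usable values found for column {column!r}")
    return mean(values)


rows = [{"amount": 10.0}, {"amount": -5.0}, {"amount": None}, {"amount": 30.0}]
print(column_mean(rows, "amount", lower=0))
```

The column and bounds are arguments rather than hard-coded values, so the same function can be reused across datasets and is easy to test.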
In many cases, getting from development to deployment is simple:
- schedule a notebook - a lot like skipping the develop stage
- deploy a pipeline
- create a cloud function
But, inevitably, as a workflow proves value it requires more effort before you deploy:
- error handling
- unit testing
- move from specialized code to generalized code:
- use classes
- control environment handling
So where does the code in the repository fall? In the late develop phase, with strong readability and adaptability.
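The pre-deployment items listed above (error handling, unit testing, classes) might look something like this in miniature. The `JobConfig` class and its fields are hypothetical and only illustrate the shape of the work.

```python
import unittest
from dataclasses import dataclass


@dataclass(frozen=True)
class JobConfig:
    """Hypothetical settings object shared by the steps of a workflow."""

    project: str
    region: str = "us-central1"

    def __post_init__(self):
        # Error handling: fail early with a clear message.
        if not self.project:
            raise ValueError("project must be a non-empty string")

    @property
    def parent(self) -> str:
        """Resource path prefix in the projects/<id>/locations/<region> form."""
        return f"projects/{self.project}/locations/{self.region}"


class JobConfigTest(unittest.TestCase):
    def test_parent_path(self):
        config = JobConfig(project="my-project")
        self.assertEqual(config.parent, "projects/my-project/locations/us-central1")

    def test_empty_project_rejected(self):
        with self.assertRaises(ValueError):
            JobConfig(project="")


# Run the tests from a terminal with: python -m unittest <module name>
```

Grouping settings in one validated object keeps environment handling in a single place instead of spreading it across notebook cells.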
- Tables: Tabular, structured data in rows and columns
- Language: Text for translation and/or understanding
- Vision: Images
- Video
- Use Pre-Trained APIs
- Automate building Custom Models
- End-to-end Custom ML with core tools in the framework of your choice
This is a series of workflow demonstrations that use the same data source to build and deploy the same machine learning model with different frameworks and automation. These are meant to help get started in understanding and learning Vertex AI and provide starting points for new projects.
The demonstrations focus on workflows and don't delve into the specifics of ML frameworks other than how to integrate and automate with Vertex AI. Let me know if you have ideas for more workflows or details to include!
To understand the contents of this repository, the following charts uncover the groupings of the content.
[Figure: Direction]

Pre-Trained Models

Data Type | Pre-Trained Model | Prediction Types | Related Solutions
---|---|---|---
Text | Cloud Translation API | Detect, Translate | Cloud Text-to-Speech, AutoML Translation
Text | Cloud Natural Language API | Entities (Identify and label), Sentiment, Entity Sentiment, Syntax, Content Classification | Healthcare Natural Language API, AutoML Text
Image | Cloud Vision API | Crop Hint, OCR, Face Detect, Image Properties, Label Detect, Landmark Detect, Logo Detect, Object Localization, Safe Search, Web Detect | AutoML Image
Audio | Cloud Media Translation API | Real-time speech translation | Cloud Speech-to-Text
Video | Cloud Video Intelligence API | Label Detect*, Shot Detect*, Explicit Content Detect*, Speech Transcription, Object Tracking*, Text Detect, Logo Detect, Face Detect, Person Detect, Celebrity Recognition | Vertex AI Vision, AutoML Video

AutoML

Data Type | AutoML | Prediction Types
---|---|---
Table | AutoML Tables |
Image | AutoML Image |
Video | AutoML Video |
Text | AutoML Text |
Text | AutoML Translation | Translation

This work focuses on cases where you have training data:
[Figure: Overview]

[Figures: AutoML | BigQuery ML | Vertex AI | Forecasting with AutoML, BigQuery ML, OSS Prophet]
Vertex AI is a platform for end-to-end model development. It consists of core components that make MLOps processes possible for design patterns of all types.
Many Vertex AI resources can be viewed and monitored directly in the GCP Console. Vertex AI resources are primarily created and modified with the Vertex AI API.
The API is accessible from:
- the command line with `gcloud ai`,
- REST,
- gRPC,
- or the client libraries (built on top of gRPC).
The notebooks in this repository primarily use the Python client `aiplatform`. There is occasional use of `aiplatform.gapic`, `aiplatform_v1` and `aiplatform_v1beta1`.
For the full details on the API versions and layers and how/when to use each, see this helpful note.
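To give a feel for the lower-level layer, here is a hedged sketch of listing models with `aiplatform_v1` instead of the high-level `aiplatform` client. It assumes the `google-cloud-aiplatform` package is installed and credentials are configured; the helper names are made up for this example, and the function that contacts the service is defined but not called.

```python
def regional_endpoint(region: str) -> str:
    """Host name of the regional Vertex AI API endpoint."""
    return f"{region}-aiplatform.googleapis.com"


def location_path(project: str, region: str) -> str:
    """Parent resource path expected by list-style requests."""
    return f"projects/{project}/locations/{region}"


def list_model_names_v1(project: str, region: str) -> list:
    """List model display names with the lower-level v1 client.

    Needs GCP credentials; project and region are supplied by the caller.
    """
    # Imported lazily so the two helpers above work without the SDK installed.
    from google.cloud import aiplatform_v1

    client = aiplatform_v1.ModelServiceClient(
        client_options={"api_endpoint": regional_endpoint(region)}
    )
    response = client.list_models(parent=location_path(project, region))
    return [model.display_name for model in response]


print(regional_endpoint("us-central1"))
print(location_path("my-project", "us-central1"))  # "my-project" is a placeholder
```

With the lower-level client you set the endpoint and parent path yourself, whereas the high-level client derives them from `aiplatform.init(...)`.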
Install the Vertex AI Python Client

```
pip install google-cloud-aiplatform
```

Example Usage: Listing all Models in Vertex AI Model Registry

```python
PROJECT = 'statmike-mlops-349915'
REGION = 'us-central1'

# List all models for project in region with: aiplatform
from google.cloud import aiplatform
aiplatform.init(project = PROJECT, location = REGION)
model_list = aiplatform.Model.list()
```
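
As a possible next step, the returned list can be summarized by reading each model's `display_name` and `resource_name`. The helper below is a sketch that works on any objects with those two attributes; the stand-in objects and their values are invented so it can be tried without a GCP project.

```python
from types import SimpleNamespace


def summarize_models(models):
    """Return (display_name, resource_name) pairs sorted by display name."""
    return sorted((m.display_name, m.resource_name) for m in models)


# Stand-in objects with made-up values. With a real project you would pass
# the list returned by aiplatform.Model.list() instead.
fake_models = [
    SimpleNamespace(
        display_name="b-model",
        resource_name="projects/0/locations/us-central1/models/2",
    ),
    SimpleNamespace(
        display_name="a-model",
        resource_name="projects/0/locations/us-central1/models/1",
    ),
]

for name, resource in summarize_models(fake_models):
    print(f"{name}: {resource}")
```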
The demonstrations are presented in a series of notebooks that are best run in JupyterLab. These can be reviewed directly in this repository on GitHub or cloned to your Jupyter instance on Vertex AI Workbench Instances.
Select the files and review them directly in the browser or IDE of your choice. This can be helpful for general understanding and selecting sections to copy/paste to your project. Some options to get a local copy of this repository's content:
- use git: `git clone https://github.com/statmike/vertex-ai-mlops`
- use `wget` to copy individual files directly from GitHub:
  - Go to the notebook on GitHub.com and right-click the download link. Then select copy link address.
  - Alternatively, click the Raw button on GitHub and then copy the URL that loads.
  - Run the following from a notebook cell or directly from a terminal (without the `!`). Note the slightly different address that points directly to raw content on GitHub.
    - `!wget "https://raw.githubusercontent.com/statmike/vertex-ai-mlops/main/<path and filename>.ipynb"`
- Use Colab (and soon Vertex AI Enterprise Colab) to open the notebooks. Many of the notebooks have a section at the top with buttons for opening directly in Colab. Some notebooks don't yet have this feature and some use local Docker, which is not available on Colab.
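
The "slightly different address" for raw content mentioned in the `wget` option follows a simple pattern, sketched below. The helper name is made up, the file name in the usage line is only an example, and the function assumes a branch name without slashes.

```python
from urllib.parse import urlsplit


def github_blob_to_raw(url: str) -> str:
    """Convert a github.com file-page URL to its raw.githubusercontent.com form.

    Expects https://github.com/<owner>/<repo>/blob/<branch>/<path>.
    Assumption: the branch name contains no "/" characters.
    """
    parts = urlsplit(url)
    if parts.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {url}")
    segments = parts.path.strip("/").split("/")
    if len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a file page URL: {url}")
    owner, repo, _, branch, *path = segments
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, branch, *path])


print(github_blob_to_raw(
    "https://github.com/statmike/vertex-ai-mlops/blob/main/readme.md"
))
```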
TL;DR
In Google Cloud Console, select or create a project, then go to Vertex AI > Workbench > Instances
- Create a new notebook and open JupyterLab
- Clone this repository using the Git menu, then open and run `00 - Environment Setup.ipynb`

Detailed steps:
- Create a Project
  - Go to: Console > IAM & Admin > Manage Resources
  - Click "+ Create Project"
  - Provide: name, billing account, organization, location
  - Click "Create"
- Enable the APIs: Vertex AI API and Notebooks API
  - Go to: Console > Vertex AI, then enable API
  - Then Console > Vertex AI > Workbench, then enable API
- Create a Notebook with Vertex AI Workbench Instances:
  - Go to: Console > Vertex AI > Workbench > Instances
  - Create a new instance
  - Once it is started, click the `Open JupyterLab` link.
- Clone this repository to the JupyterLab instance:
  - Either:
    - Go to the `Git` menu and choose `Clone a Repository`
    - Choose the Git icon on the left toolbar and click `Clone a Repository`
  - Provide the Clone URI of this repository: https://github.com/statmike/vertex-ai-mlops.git
  - In the File Browser you will now have the folder "vertex-ai-mlops" that contains the files from this repository
- Set up the Notebook Environment for these workflows
  - Open the notebook `vertex-ai-mlops/00 - Environment Setup`
  - Follow the instructions and run the cells
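
For readers who prefer a terminal to the console, two of the steps above (enabling the APIs and cloning the repository) have command-line equivalents. The sketch below only builds and prints the commands; it is an alternative to the console steps, not part of the original instructions, and the project ID is a placeholder.

```python
import shlex

REPO_URL = "https://github.com/statmike/vertex-ai-mlops.git"


def setup_commands(project_id: str) -> list:
    """Return the shell commands (as argument lists) for two setup steps."""
    return [
        # Enable the Vertex AI API and the Notebooks API for the project.
        [
            "gcloud", "services", "enable",
            "aiplatform.googleapis.com",
            "notebooks.googleapis.com",
            f"--project={project_id}",
        ],
        # Clone this repository into the current directory.
        ["git", "clone", REPO_URL],
    ]


# "my-project-id" is a placeholder; replace it with your own project ID.
for command in setup_commands("my-project-id"):
    print(shlex.join(command))
```

To execute the commands instead of printing them, each list could be passed to `subprocess.run(command, check=True)` in an environment where `gcloud` and `git` are installed and authenticated.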
Resources on these items:
- Google Cloud Projects
- Vertex AI environment
- Introduction to Vertex AI Workbench
- Create a Vertex AI Workbench Instance

- Learning Machine Learning
  - I often get asked "How do I learn about ML?". There are lots of good answers. ....
- Explorations
  - This is a series of projects for exploring new, new-to-me, and emerging tools in the ML world!
- Tips
  - Tips for using the repository and notebooks with examples of core skills like building containers, parameterizing jobs and interacting with other GCP services. These tips help with scaling jobs and developing them with a focus on CI/CD.

This is my personal repository of demonstrations I use for learning and sharing Vertex AI. There are many more resources available. Within each notebook I have included a resources section and a related training section.