
earth2studio
Open-source deep-learning framework for exploring, building and deploying AI weather/climate workflows.
Stars: 254

Earth2Studio is a Python-based package designed to enable users to quickly get started with AI weather and climate models. It provides access to pre-trained models, diagnostic tools, data sources, IO utilities, perturbation methods, and sample workflows for building custom weather prediction workflows. The package aims to empower users to explore AI-driven meteorology through modular components and seamless integration with other Nvidia packages like Modulus.
README:
Earth2Studio is a Python-based package designed to get users up and running with AI Earth system models fast. Our mission is to enable everyone to build, research and explore AI driven weather and climate science.
- Earth2Studio Documentation -
Install | User-Guide | Examples | API
Install Earth2Studio:
pip install earth2studio[dlwp]
Run a deterministic AI weather prediction in just a few lines of code:
from earth2studio.models.px import DLWP
from earth2studio.data import GFS
from earth2studio.io import NetCDF4Backend
from earth2studio.run import deterministic as run
model = DLWP.load_model(DLWP.load_default_package())
ds = GFS()
io = NetCDF4Backend("output.nc")
run(["2024-01-01"], 10, model, ds, io)
Swap out for a different AI model by just installing
and replacing DLWP
references with another forecast model.
- The latest Climate in a Bottle generative AI model from NVIDIA research has been added via several APIs including a data source, infilling and super-resolution APIs. See the cBottle examples for more.
- The long awaited GraphCast 1 degree prognostic model and GraphCast Operational prognostic model are now added.
- Advanced Subseasonal-to-Seasonal (S2S) forecasting recipe added demonstrating new inference pipelines for subseasonal weather forecasts (from 2 weeks to 3 months).
For a complete list of latest features and improvements see the changelog.
Earth2Studio is an AI inference pipeline toolkit focused on weather and climate applications that is designed to ride on top of different AI frameworks, model architectures, data sources and SciML tooling while providing a unified API.
The composability of the different core components in Earth2Studio easily allows the development and deployment of increasingly complex pipelines that may chain multiple data sources, AI models and other modules together.
The unified ecosystem of Earth2Studio provides users the opportunity to rapidly swap out components for alternatives. In addition to the largest model zoo of weather/climate AI models, Earth2Studio is packed with useful functionality such as optimized data access to cloud data stores, statistical operations and more to accelerate your pipelines.
Earth2Studio can be used for seamless deployment of Earth-2 models trained in PhysicsNeMo.
Earth2Studio package focuses on supplying users the tools to build their own workflows, pipelines, APIs, packages, etc. via modular components including:
Prognostic Models
Prognostic models in Earth2Studio perform time integration, taking atmospheric fields at a specific time and auto-regressively predicting the same fields into the future (typically 6 hours per step), enabling both single time-step predictions and extended time-series forecasting.
Earth2Studio maintains the largest collection of pre-trained state-of-the-art AI weather/climate models ranging from global forecast models to regional specialized models, covering various resolutions, architectures, and forecasting capabilities to suit different computational and accuracy requirements.
Available models include but are not limited to:
Model | Resolution | Architecture | Time Step | Coverage |
---|---|---|---|---|
GraphCast Small | 1.0° | Graph Neural Network | 6h | Global |
GraphCast Operational | 0.25° | Graph Neural Network | 6h | Global |
Pangu 3hr | 0.25° | Transformer | 3h | Global |
Pangu 6hr | 0.25° | Transformer | 6h | Global |
Pangu 24hr | 0.25° | Transformer | 24h | Global |
Aurora | 0.25° | Transformer | 6h | Global |
FuXi | 0.25° | Transformer | 6h | Global |
AIFS | 0.25° | Transformer | 6h | Global |
StormCast | 3km | Diffusion + Regression | 1h | Regional (US) |
SFNO | 0.25° | Neural Operator | 6h | Global |
DLESyM | 0.25° | Convolutional | 6h | Global |
For a complete list, see the prognostic model API docs.
Diagnostic Models
Diagnostic models in Earth2Studio perform time-independent transformations, typically taking geospatial fields at a specific time and predicting new derived quantities without performing time integration enabling users to build pipelines to predict specific quantities of interest that may not be provided by forecasting models.
Earth2Studio contains a growing collection of specialized diagnostic models for various phenomena including precipitation prediction, tropical cyclone tracking, solar radiation estimation, wind gust forecasting, and more.
Available diagnostics include but are not limited to:
Model | Resolution | Architecture | Coverage | Output |
---|---|---|---|---|
PrecipitationAFNO | 0.25° | Neural Operator | Global | Total precipitation |
SolarRadiationAFNO1H | 0.25° | Neural Operator | Global | Surface solar radiation |
WindgustAFNO | 0.25° | AFNO | Global | Maximum wind gust |
TCTrackerVitart | 0.25° | Algorithmic | Global | TC tracks & properties |
CBottleInfill | 100km | Diffusion | Global | Global climate sample |
CBottleSR | 5km | Diffusion | Regional / Global | High-res climate |
CorrDiff | Variable | Diffusion | Regional | Fine-scale weather |
CorrDiffTaiwan | 2km | Diffusion | Regional (Taiwan) | Taiwan fine-scale weather |
For a complete list, see the diagnostic model API docs.
Datasources
Data sources in Earth2Studio provide a standardized API for accessing weather and climate datasets from various providers (numerical models, data assimilation results, and AI-generated data), enabling seamless integration of initial conditions for model inference and validation data for scoring across different data formats and storage systems.
Earth2Studio includes data sources ranging from operational weather models (GFS, HRRR, IFS) and reanalysis datasets (ERA5 via ARCO, CDS) to AI-generated climate data (cBottle) and local file systems. Fetching data is just plain easy, Earth2Studio handles the complicated parts giving the users an easy to use Xarray data array of requested data under a shared package wide vocabulary and coordinate system.
Available data sources include but are not limited to:
Data Source | Type | Resolution | Coverage | Data Format |
---|---|---|---|---|
GFS | Operational | 0.25° | Global | GRIB2 |
GFS_FX | Forecast | 0.25° | Global | GRIB2 |
HRRR | Operational | 3km | Regional (US) | GRIB2 |
HRRR_FX | Forecast | 3km | Regional (US) | GRIB2 |
ARCO ERA5 | Reanalysis | 0.25° | Global | Zarr |
CDS | Reanalysis | 0.25° | Global | NetCDF |
IFS | Operational | 0.25° | Global | GRIB2 |
NCAR_ERA5 | Reanalysis | 0.25° | Global | NetCDF |
WeatherBench2 | Reanalysis | 0.25° | Global | Zarr |
GEFS_FX | Ensemble Forecast | 0.25° | Global | GRIB2 |
IMERG | Precipitation | 0.1° | Global | NetCDF |
CBottle3D | AI Generated | 100km | Global | HEALPix |
For a complete list, see the data source API docs.
IO Backends
IO backends in Earth2Studio provides a standardized interface for writing and storing pipeline outputs across different file formats and storage systems enabling users to store inference outputs for later processing.
Earth2Studio includes IO backends ranging from traditional scientific formats (NetCDF) and modern cloud-optimized formats (Zarr) to in-memory storage backends.
Available IO backends include:
IO Backend | Format | Features | Location |
---|---|---|---|
ZarrBackend | Zarr | Compression, Chunking | In-Memory/Local |
AsyncZarrBackend | Zarr | Async writes, Parallel I/O | In-Memory/Local/Remote |
NetCDF4Backend | NetCDF4 | CF-compliant, Metadata | In-Memory/Local |
XarrayBackend | Xarray Dataset | Rich metadata, Analysis-ready | In-Memory |
KVBackend | Key-Value | Fast Temporary Access | In-Memory |
For a complete list, see the IO API docs.
Perturbation Methods
Perturbation methods in Earth2Studio provide a standardized interface for adding noise to data arrays, typically enabling the creation of ensembling forecast pipelines that capture uncertainty in weather and climate predictions.
Available perturbations include but are not limited to:
Perturbation Method | Type | Spatial Correlation | Temporal Correlation |
---|---|---|---|
Gaussian | Noise | None | None |
Correlated SphericalGaussian | Noise | Spherical | AR(1) process |
Spherical Gaussian | Noise | Spherical (Matern) | None |
Brown | Noise | 2D Fourier | None |
Bred Vector | Dynamical | Model-dependent | Model-dependent |
Hemispheric Centred Bred Vector | Dynamical | Hemispheric | Model-dependent |
For a complete list, see the perturbations API docs.
Statistics / Metrics
Statistics and metrics in Earth2Studio provide operations typically useful for in-pipeline evaluation of forecast performance across different dimensions (spatial, temporal, ensemble) through various statistical measures including error metrics, correlation coefficients, and ensemble verification statistics.
Available operations include but are not limited to:
Statistic | Type | Application |
---|---|---|
RMSE | Error Metric | Forecast accuracy |
ACC | Correlation | Pattern correlation |
CRPS | Ensemble Metric | Probabilistic skill |
Rank Histogram | Ensemble Metric | Ensemble reliability |
Standard Deviation | Moment | Spread measure |
Spread-Skill Ratio | Ensemble Metric | Ensemble calibration |
For a complete list, see the statistics API docs.
For a more complete list of features, be sure to view the documentation. Don't see what you need? Great news, extension and customization are at the heart of our design.
Check out the contributing document for details about the technical requirements and the userguide for higher level philosophy, structure, and design.
Earth2Studio is provided under the Apache License 2.0, please see the LICENSE file for full license text.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for earth2studio
Similar Open Source Tools

earth2studio
Earth2Studio is a Python-based package designed to enable users to quickly get started with AI weather and climate models. It provides access to pre-trained models, diagnostic tools, data sources, IO utilities, perturbation methods, and sample workflows for building custom weather prediction workflows. The package aims to empower users to explore AI-driven meteorology through modular components and seamless integration with other Nvidia packages like Modulus.

lemonai
LemonAI is a versatile machine learning library designed to simplify the process of building and deploying AI models. It provides a wide range of tools and algorithms for data preprocessing, model training, and evaluation. With LemonAI, users can easily experiment with different machine learning techniques and optimize their models for various tasks. The library is well-documented and beginner-friendly, making it suitable for both novice and experienced data scientists. LemonAI aims to streamline the development of AI applications and empower users to create innovative solutions using state-of-the-art machine learning methods.

koog
Koog is a Kotlin-based framework for building and running AI agents entirely in idiomatic Kotlin. It allows users to create agents that interact with tools, handle complex workflows, and communicate with users. Key features include pure Kotlin implementation, MCP integration, embedding capabilities, custom tool creation, ready-to-use components, intelligent history compression, powerful streaming API, persistent agent memory, comprehensive tracing, flexible graph workflows, modular feature system, scalable architecture, and multiplatform support.

ml-retreat
ML-Retreat is a comprehensive machine learning library designed to simplify and streamline the process of building and deploying machine learning models. It provides a wide range of tools and utilities for data preprocessing, model training, evaluation, and deployment. With ML-Retreat, users can easily experiment with different algorithms, hyperparameters, and feature engineering techniques to optimize their models. The library is built with a focus on scalability, performance, and ease of use, making it suitable for both beginners and experienced machine learning practitioners.

LightLLM
LightLLM is a lightweight library for linear and logistic regression models. It provides a simple and efficient way to train and deploy machine learning models for regression tasks. The library is designed to be easy to use and integrate into existing projects, making it suitable for both beginners and experienced data scientists. With LightLLM, users can quickly build and evaluate regression models using a variety of algorithms and hyperparameters. The library also supports feature engineering and model interpretation, allowing users to gain insights from their data and make informed decisions based on the model predictions.

deepflow
DeepFlow is an open-source project that provides deep observability for complex cloud-native and AI applications. It offers Zero Code data collection with eBPF for metrics, distributed tracing, request logs, and function profiling. DeepFlow is integrated with SmartEncoding to achieve Full Stack correlation and efficient access to all observability data. With DeepFlow, cloud-native and AI applications automatically gain deep observability, removing the burden of developers continually instrumenting code and providing monitoring and diagnostic capabilities covering everything from code to infrastructure for DevOps/SRE teams.

ai-edge-quantizer
AI Edge Quantizer is a tool designed for advanced developers to quantize converted LiteRT models. It aims to optimize performance on resource-demanding models by providing quantization recipes for edge device deployment. The tool supports dynamic quantization, weight-only quantization, and static quantization methods, allowing users to customize the quantization process for different hardware deployments. Users can specify quantization recipes to apply to source models, resulting in quantized LiteRT models ready for deployment. The tool also includes advanced features such as selective quantization and mixed precision schemes for fine-tuning quantization recipes.

jadx-mcp-server
JADX-MCP-SERVER is a standalone Python server that interacts with JADX-AI-MCP Plugin to analyze Android APKs using LLMs like Claude. It enables live communication with decompiled Android app context, uncovering vulnerabilities, parsing manifests, and facilitating reverse engineering effortlessly. The tool combines JADX-AI-MCP and JADX MCP SERVER to provide real-time reverse engineering support with LLMs, offering features like quick analysis, vulnerability detection, AI code modification, static analysis, and reverse engineering helpers. It supports various MCP tools for fetching class information, text, methods, fields, smali code, AndroidManifest.xml content, strings.xml file, resource files, and more. Tested on Claude Desktop, it aims to support other LLMs in the future, enhancing Android reverse engineering and APK modification tools connectivity for easier reverse engineering purely from vibes.

open-ai
Open AI is a powerful tool for artificial intelligence research and development. It provides a wide range of machine learning models and algorithms, making it easier for developers to create innovative AI applications. With Open AI, users can explore cutting-edge technologies such as natural language processing, computer vision, and reinforcement learning. The platform offers a user-friendly interface and comprehensive documentation to support users in building and deploying AI solutions. Whether you are a beginner or an experienced AI practitioner, Open AI offers the tools and resources you need to accelerate your AI projects and stay ahead in the rapidly evolving field of artificial intelligence.

AI_Spectrum
AI_Spectrum is a versatile machine learning library that provides a wide range of tools and algorithms for building and deploying AI models. It offers a user-friendly interface for data preprocessing, model training, and evaluation. With AI_Spectrum, users can easily experiment with different machine learning techniques and optimize their models for various tasks. The library is designed to be flexible and scalable, making it suitable for both beginners and experienced data scientists.

pdr_ai_v2
pdr_ai_v2 is a Python library for implementing machine learning algorithms and models. It provides a wide range of tools and functionalities for data preprocessing, model training, evaluation, and deployment. The library is designed to be user-friendly and efficient, making it suitable for both beginners and experienced data scientists. With pdr_ai_v2, users can easily build and deploy machine learning models for various applications, such as classification, regression, clustering, and more.

GEN-AI
GEN-AI is a versatile Python library for implementing various artificial intelligence algorithms and models. It provides a wide range of tools and functionalities to support machine learning, deep learning, natural language processing, computer vision, and reinforcement learning tasks. With GEN-AI, users can easily build, train, and deploy AI models for diverse applications such as image recognition, text classification, sentiment analysis, object detection, and game playing. The library is designed to be user-friendly, efficient, and scalable, making it suitable for both beginners and experienced AI practitioners.

azure-ai-docs
Azure AI Docs is a repository that provides detailed documentation and resources for developers looking to leverage Microsoft's AI services on the Azure platform. The repository covers a wide range of topics including machine learning, natural language processing, computer vision, and more. Developers can find tutorials, code samples, best practices, and guidelines to help them integrate AI capabilities into their applications seamlessly.

redb-open
reDB Node is a distributed, policy-driven data mesh platform that enables True Data Portability across various databases, warehouses, clouds, and environments. It unifies data access, data mobility, and schema transformation into one open platform. Built for developers, architects, and AI systems, reDB addresses the challenges of fragmented data ecosystems in modern enterprises by providing multi-database interoperability, automated schema versioning, zero-downtime migration, real-time developer data environments with obfuscation, quantum-resistant encryption, and policy-based access control. The project aims to build a foundation for future-proof data infrastructure.

lmnr
Laminar is an all-in-one open-source platform designed for engineering AI products. It allows users to trace, evaluate, label, and analyze LLM data efficiently. The platform offers features such as automatic tracing of common AI frameworks and SDKs, local and online evaluations, simple UI for data labeling, dataset management, and scalability with gRPC communication. Laminar is built with a modern open-source stack including RabbitMQ, Postgres, Clickhouse, and Qdrant for semantic similarity search. It provides fast and beautiful dashboards for traces, evaluations, and labels, making it a comprehensive tool for AI product development.

jadx-ai-mcp
JADX-AI-MCP is a plugin for the JADX decompiler that integrates with Model Context Protocol (MCP) to provide live reverse engineering support with LLMs like Claude. It allows for quick analysis, vulnerability detection, and AI code modification, all in real time. The tool combines JADX-AI-MCP and JADX MCP SERVER to analyze Android APKs effortlessly. It offers various prompts for code understanding, vulnerability detection, reverse engineering helpers, static analysis, AI code modification, and documentation. The tool is part of the Zin MCP Suite and aims to connect all android reverse engineering and APK modification tools with a single MCP server for easy reverse engineering of APK files.
For similar tasks

Awesome-LWMs
Awesome Large Weather Models (LWMs) is a curated collection of articles and resources related to large weather models used in AI for Earth and AI for Science. It includes information on various cutting-edge weather forecasting models, benchmark datasets, and research papers. The repository serves as a hub for researchers and enthusiasts to explore the latest advancements in weather modeling and forecasting.

earth2studio
Earth2Studio is a Python-based package designed to enable users to quickly get started with AI weather and climate models. It provides access to pre-trained models, diagnostic tools, data sources, IO utilities, perturbation methods, and sample workflows for building custom weather prediction workflows. The package aims to empower users to explore AI-driven meteorology through modular components and seamless integration with other Nvidia packages like Modulus.

Pathway-AI-Bootcamp
Welcome to the μLearn x Pathway Initiative, an exciting adventure into the world of Artificial Intelligence (AI)! This comprehensive course, developed in collaboration with Pathway, will empower you with the knowledge and skills needed to navigate the fascinating world of AI, with a special focus on Large Language Models (LLMs).

LLM-Agent-Survey
Autonomous agents are designed to achieve specific objectives through self-guided instructions. With the emergence and growth of large language models (LLMs), there is a growing trend in utilizing LLMs as fundamental controllers for these autonomous agents. This repository conducts a comprehensive survey study on the construction, application, and evaluation of LLM-based autonomous agents. It explores essential components of AI agents, application domains in natural sciences, social sciences, and engineering, and evaluation strategies. The survey aims to be a resource for researchers and practitioners in this rapidly evolving field.

genkit
Firebase Genkit (beta) is a framework with powerful tooling to help app developers build, test, deploy, and monitor AI-powered features with confidence. Genkit is cloud optimized and code-centric, integrating with many services that have free tiers to get started. It provides unified API for generation, context-aware AI features, evaluation of AI workflow, extensibility with plugins, easy deployment to Firebase or Google Cloud, observability and monitoring with OpenTelemetry, and a developer UI for prototyping and testing AI features locally. Genkit works seamlessly with Firebase or Google Cloud projects through official plugins and templates.

vector-cookbook
The Vector Cookbook is a collection of recipes and sample application starter kits for building AI applications with LLMs using PostgreSQL and Timescale Vector. Timescale Vector enhances PostgreSQL for AI applications by enabling the storage of vector, relational, and time-series data with faster search, higher recall, and more efficient time-based filtering. The repository includes resources, sample applications like TSV Time Machine, and guides for creating, storing, and querying OpenAI embeddings with PostgreSQL and pgvector. Users can learn about Timescale Vector, explore performance benchmarks, and access Python client libraries and tutorials.

cogai
The W3C Cognitive AI Community Group focuses on advancing Cognitive AI through collaboration on defining use cases, open source implementations, and application areas. The group aims to demonstrate the potential of Cognitive AI in various domains such as customer services, healthcare, cybersecurity, online learning, autonomous vehicles, manufacturing, and web search. They work on formal specifications for chunk data and rules, plausible knowledge notation, and neural networks for human-like AI. The group positions Cognitive AI as a combination of symbolic and statistical approaches inspired by human thought processes. They address research challenges including mimicry, emotional intelligence, natural language processing, and common sense reasoning. The long-term goal is to develop cognitive agents that are knowledgeable, creative, collaborative, empathic, and multilingual, capable of continual learning and self-awareness.

ai-hub
The Enterprise Azure OpenAI Hub is a comprehensive repository designed to guide users through the world of Generative AI on the Azure platform. It offers a structured learning experience to accelerate the transition from concept to production in an Enterprise context. The hub empowers users to explore various use cases with Azure services, ensuring security and compliance. It provides real-world examples and playbooks for practical insights into solving complex problems and developing cutting-edge AI solutions. The repository also serves as a library of proven patterns, aligning with industry standards and promoting best practices for secure and compliant AI development.
For similar jobs

sweep
Sweep is an AI junior developer that turns bugs and feature requests into code changes. It automatically handles developer experience improvements like adding type hints and improving test coverage.

teams-ai
The Teams AI Library is a software development kit (SDK) that helps developers create bots that can interact with Teams and Microsoft 365 applications. It is built on top of the Bot Framework SDK and simplifies the process of developing bots that interact with Teams' artificial intelligence capabilities. The SDK is available for JavaScript/TypeScript, .NET, and Python.

ai-guide
This guide is dedicated to Large Language Models (LLMs) that you can run on your home computer. It assumes your PC is a lower-end, non-gaming setup.

classifai
Supercharge WordPress Content Workflows and Engagement with Artificial Intelligence. Tap into leading cloud-based services like OpenAI, Microsoft Azure AI, Google Gemini and IBM Watson to augment your WordPress-powered websites. Publish content faster while improving SEO performance and increasing audience engagement. ClassifAI integrates Artificial Intelligence and Machine Learning technologies to lighten your workload and eliminate tedious tasks, giving you more time to create original content that matters.

chatbot-ui
Chatbot UI is an open-source AI chat app that allows users to create and deploy their own AI chatbots. It is easy to use and can be customized to fit any need. Chatbot UI is perfect for businesses, developers, and anyone who wants to create a chatbot.

BricksLLM
BricksLLM is a cloud native AI gateway written in Go. Currently, it provides native support for OpenAI, Anthropic, Azure OpenAI and vLLM. BricksLLM aims to provide enterprise level infrastructure that can power any LLM production use cases. Here are some use cases for BricksLLM: * Set LLM usage limits for users on different pricing tiers * Track LLM usage on a per user and per organization basis * Block or redact requests containing PIIs * Improve LLM reliability with failovers, retries and caching * Distribute API keys with rate limits and cost limits for internal development/production use cases * Distribute API keys with rate limits and cost limits for students

uAgents
uAgents is a Python library developed by Fetch.ai that allows for the creation of autonomous AI agents. These agents can perform various tasks on a schedule or take action on various events. uAgents are easy to create and manage, and they are connected to a fast-growing network of other uAgents. They are also secure, with cryptographically secured messages and wallets.

griptape
Griptape is a modular Python framework for building AI-powered applications that securely connect to your enterprise data and APIs. It offers developers the ability to maintain control and flexibility at every step. Griptape's core components include Structures (Agents, Pipelines, and Workflows), Tasks, Tools, Memory (Conversation Memory, Task Memory, and Meta Memory), Drivers (Prompt and Embedding Drivers, Vector Store Drivers, Image Generation Drivers, Image Query Drivers, SQL Drivers, Web Scraper Drivers, and Conversation Memory Drivers), Engines (Query Engines, Extraction Engines, Summary Engines, Image Generation Engines, and Image Query Engines), and additional components (Rulesets, Loaders, Artifacts, Chunkers, and Tokenizers). Griptape enables developers to create AI-powered applications with ease and efficiency.