awesome-air-quality

An awesome list of air quality resources.

Stars: 56

The 'awesome-air-quality' repository is a curated list of software libraries, tools, and resources related to air quality data acquisition, analysis, and visualization. It includes libraries in various programming languages such as Python, Java, R, and C#, as well as hardware drivers and software for gas sensors and particulate matter sensors. The repository aims to provide a comprehensive collection of tools for working with air quality data from different sources and for different purposes.

README:

I want to contribute

Please read our contributing guidelines and open a pull request.

Software Libraries

C#
- openair - National air quality data acquisition library (Chinese)
Java
- NAPSDataAnalysis - Canadian National Air Pollution Surveillance Program (NAPS) data downloader, importer, extractor, analysis, and visualization toolbox.
NodeJS
- openaq - A JS client for the OpenAQ API
Python
- airbase - An easy downloader for the AirBase air quality data.
- atmospy - Visualization and analysis tools for air quality data in Python
- py-openaq - Python wrapper for the OpenAQ API
- py-quantaq - A Python wrapper for the QuantAQ RESTful API
- py-opcsim - Python library to simulate OPCs and nephelometers under different conditions
- py-smps - Python library for the analysis and visualization of data from a Scanning Mobility Particle Sizer (SMPS) and other similar instruments (SEMS, OPCs).
- python-aqi - A library to convert between AQI value and pollutant concentration (µg/m³ or ppm)
- The QuantAQ CLI - QuantAQ command line interface
- quantpy - Provides tools for visually evaluating low-cost air quality sensors
- sensortoolkit - Air Sensor Data Analysis Library
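
The AQI conversion that python-aqi is listed for above can be illustrated with a minimal, self-contained sketch. It does not use python-aqi's API. It assumes the US EPA piecewise-linear formula and the PM2.5 24-hour breakpoint table that was in use before the 2024 revision; the function name `pm25_to_aqi` and the table layout are hypothetical, and the breakpoints should be checked against the current EPA values before any real use.

```python
import math
from decimal import Decimal

# Assumed breakpoints: US EPA PM2.5 (24-hour), pre-2024 table.
# Concentrations are stored in tenths of a µg/m³ to avoid float comparisons.
# Each row: (conc_low, conc_high, aqi_low, aqi_high)
PM25_BREAKPOINTS = [
    (0, 120, 0, 50),
    (121, 354, 51, 100),
    (355, 554, 101, 150),
    (555, 1504, 151, 200),
    (1505, 2504, 201, 300),
    (2505, 3504, 301, 400),
    (3505, 5004, 401, 500),
]


def pm25_to_aqi(concentration):
    """Convert a PM2.5 concentration in µg/m³ to an AQI value (illustrative)."""
    # Truncate to one decimal place, expressed as an integer number of tenths.
    tenths = int(Decimal(str(concentration)) * 10)
    for c_lo, c_hi, i_lo, i_hi in PM25_BREAKPOINTS:
        if c_lo <= tenths <= c_hi:
            # Linear interpolation inside the matching breakpoint bracket.
            value = (i_hi - i_lo) * (tenths - c_lo) / (c_hi - c_lo) + i_lo
            # Round half up to the nearest integer.
            return math.floor(value + 0.5)
    raise ValueError(f"concentration out of range: {concentration}")


if __name__ == "__main__":
    for c in (0.0, 12.0, 25.0, 35.5):
        print(c, pm25_to_aqi(c))
```

Storing the breakpoints as integer tenths keeps the bracket lookup exact; values outside the table raise a `ValueError` rather than being extrapolated.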
R
- AirBeamR - An interactive data tool to visualize and work with AirBeam, OpenAQ, and PurpleAir data
- AirMonitor - Utilities for working with air quality monitoring data (CRAN)
- AirSensor - Utilities for working with data from PurpleAir sensors (CRAN)
- AMET - Code base for the U.S. EPA’s Atmospheric Model Evaluation Tool (AMET).
- beethoven - BEETHOVEN is: Building an Extensible, rEproducible, Test-driven, Harmonized, Open-source, Versioned, ENsemble model for air quality.
- CMAQ - Code for U.S. EPA’s Community Multiscale Air Quality Model (CMAQ) which helps in conducting air quality model simulations.
- openair - Tools to analyse, interpret and understand air pollution data. Data are typically hourly time series and both monitoring data and dispersion model output can be analysed. Many functions can also be applied to other data, including meteorological and traffic data. (CRAN)
- openairmaps - Mapping functions to support openair (CRAN)
- Purple Air Data Merger - Merges and corrects Purple Air SD Card Data
- qualR - The qualR package helps you bring São Paulo and Rio de Janeiro air quality data into your R session 🇧🇷.
- quantr - Provides tools for visually evaluating low-cost air quality sensors
- RAQSAPI - An R extension to retrieve EPA Air Quality System data via the AQS Data Mart API.
- rmweather - Tools to Conduct Meteorological Normalisation on Air Quality Data.
- rPollution - R functions to work with air pollution data
- r-quantaq - The official R wrapper for the QuantAQ API
- saqgetr - Import Air Quality Monitoring Data in a Fast and Easy Way
- sensortoolkit - A collection of R scripts for managing an air quality sensor network
- biteSizedAQ - A collection of bite sized projects aimed at democratizing access to air quality data, pipelines and insights in a manner that is free, open, accessible and easy to understand. Air pollution can feel like a giant overwhelming issue and it is, but by consistently taking bite-sized smart steps, we can collectively make significant progress in tackling it!
Rust
- openaq-client - Unofficial Open Air Quality API client written in Rust (crate)

Hardware drivers / software

C
C++
- Nova Fitness SDS dust sensors arduino library
- PMS - Arduino library for Plantower PMS x003 family sensors.
- Sensirion SPS30 driver for ESP32, SODAQ, MEGA2560, UNO, ESP8266, and Particle Photon over UART or I2C communication
- Arduino library for Sensirion SCD4x sensors
- Embedded UART Driver for Sensirion Particulate Matter Sensors
Python
- bme680-python - Python library for the BME680 gas, temperature, humidity and pressure sensor.
- py-licor - Python logging software for the Licor 840 CO2/H2O analyzer
- Software to read out Sensirion SCD30 CO₂ Sensor values over I2C on Raspberry Pi
- Sentinair - A flexible tool for data acquisition from heterogeneous low-cost gas sensors and other devices
Rust
- Rust library with common functionality for I2C sensors
