aitviewer
A set of tools to visualize and interact with sequences of 3D data.
Stars: 542
A set of tools to visualize and interact with sequences of 3D data with cross-platform support on Windows, Linux, and macOS. It provides a native Python interface for loading and displaying SMPL[-H/-X], MANO, FLAME, STAR, and SUPR sequences in an interactive viewer. Users can render 3D data on top of images, edit SMPL sequences and poses, export screenshots and videos, and utilize a high-performance ModernGL-based rendering pipeline. The tool is designed for easy use and hacking, with features like headless mode, remote mode, animatable camera paths, and a built-in extensible GUI.
README:
A set of tools to visualize and interact with sequences of 3D data with cross-platform support on Windows, Linux, and macOS. See the official page at https://eth-ait.github.io/aitviewer for all the details.
Basic Installation:
pip install aitviewer
Note that this does not automatically install the GPU version of PyTorch. If your environment already contains it, you are good to go; otherwise, install it manually.
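To check whether the current environment already provides a GPU-enabled PyTorch, a minimal sketch along the following lines may help. The helper name `describe_torch` is hypothetical and not part of aitviewer; it relies only on the standard library and, if PyTorch is present, on `torch.cuda.is_available()`. Here "gpu" means a CUDA device is visible, which is an assumption about what "GPU version" refers to.

```python
import importlib.util


def describe_torch() -> str:
    """Report whether PyTorch is installed and whether it can see a CUDA GPU.

    Hypothetical helper for illustration only; not part of aitviewer.
    Returns one of "missing", "gpu", or "cpu-only".
    """
    # find_spec returns None when the package cannot be found.
    if importlib.util.find_spec("torch") is None:
        return "missing"

    import torch  # imported lazily so the check also works without PyTorch

    return "gpu" if torch.cuda.is_available() else "cpu-only"


if __name__ == "__main__":
    print(f"PyTorch status: {describe_torch()}")
```

If the result is "missing" or "cpu-only", install a GPU build of PyTorch manually, following the PyTorch installation instructions for your platform.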
Or install locally (if you need to extend or modify the code):
git clone git@github.com:eth-ait/aitviewer.git
cd aitviewer
pip install -e .
On macOS with Apple Silicon it is recommended to use PyQt6. Please check this issue for installation instructions.
For more advanced installation and for installing SMPL body models, please refer to the documentation.
- Native Python interface, easy to use and hack.
- Load SMPL[-H/-X] / MANO / FLAME / STAR / SUPR sequences and display them in an interactive viewer.
- Headless mode for server rendering of videos/images.
- Remote mode for non-blocking integration of visualization code.
- Render 3D data on top of images via weak-perspective or OpenCV camera models.
- Animatable camera paths.
- Edit SMPL sequences and poses manually.
- Prebuilt renderable primitives (cylinders, spheres, point clouds, etc.).
- Built-in extensible GUI (based on Dear ImGui).
- Export screenshots, videos, and turntable views (as mp4/gif).
- High-performance ModernGL-based rendering pipeline (running at 100+ fps on most laptops).
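The two camera models named in the feature list differ in how they treat depth. The sketch below illustrates that difference in plain Python; it is not aitviewer's implementation. The function names, the parameter names (`scale`, `tx`, `ty`, `fx`, `fy`, `cx`, `cy`), and the weak-perspective convention `scale * x + tx` are assumptions made for illustration, and lens distortion is left out of the pinhole model.

```python
from typing import Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def project_weak_perspective(p: Point3D, scale: float, tx: float, ty: float) -> Point2D:
    """Weak-perspective projection: drop depth, then scale and translate."""
    x, y, _ = p  # the point's own depth is ignored
    return (scale * x + tx, scale * y + ty)


def project_pinhole(p: Point3D, fx: float, fy: float, cx: float, cy: float) -> Point2D:
    """OpenCV-style pinhole projection without lens distortion."""
    x, y, z = p
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    # Dividing by depth makes distant points move toward the principal point.
    return (fx * x / z + cx, fy * y / z + cy)


if __name__ == "__main__":
    near, far = (0.5, 0.25, 2.0), (0.5, 0.25, 4.0)
    # Weak perspective maps both points to the same pixel.
    print(project_weak_perspective(near, 100.0, 320.0, 240.0))  # (370.0, 265.0)
    print(project_weak_perspective(far, 100.0, 320.0, 240.0))   # (370.0, 265.0)
    # The pinhole model places the farther point closer to (cx, cy).
    print(project_pinhole(near, 500.0, 500.0, 320.0, 240.0))    # (445.0, 302.5)
    print(project_pinhole(far, 500.0, 500.0, 320.0, 240.0))     # (382.5, 271.25)
```

Weak perspective is a reasonable approximation when the subject's depth extent is small relative to its distance from the camera; the pinhole model is the better fit when calibrated intrinsics are available.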
Display an SMPL T-pose (Requires SMPL models):
from aitviewer.renderables.smpl import SMPLSequence
from aitviewer.viewer import Viewer
if __name__ == '__main__':
    v = Viewer()
    v.scene.add(SMPLSequence.t_pose())
    v.run()
A sampling of projects using the aitviewer. Let us know if you want to be added to this list!
- Fan et al., HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video, CVPR 2024
- Braun et al., Physically Plausible Full-Body Hand-Object Interaction Synthesis, 3DV 2024
- Zhang and Christen et al., ArtiGrasp: Physically Plausible Synthesis of Bi-Manual Dexterous Grasping and Articulation, 3DV 2024
- Kaufmann et al., EMDB: The Electromagnetic Database of Global 3D Human Pose and Shape in the Wild, ICCV 2023
- Shen and Guo et al., X-Avatar: Expressive Human Avatars, CVPR 2023
- Sun et al., TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments, CVPR 2023
- Fan et al., ARCTIC: A Dataset for Dexterous Bimanual Hand-Object Manipulation, CVPR 2023
- Dong and Guo et al., PINA: Learning a Personalized Implicit Neural Avatar from a Single RGB-D Video Sequence, CVPR 2022
- Dong et al., Shape-aware Multi-Person Pose Estimation from Multi-view Images, ICCV 2021
- Kaufmann et al., EM-POSE: 3D Human Pose Estimation from Sparse Electromagnetic Trackers, ICCV 2021
- Vechev et al., Computational Design of Kinesthetic Garments, Eurographics 2021
- Guo et al., Human Performance Capture from Monocular Video in the Wild, 3DV 2021
If you use this software, please cite it as below.
@software{Kaufmann_Vechev_aitviewer_2022,
author = {Kaufmann, Manuel and Vechev, Velko and Mylonopoulos, Dario},
doi = {10.5281/zenodo.10013305},
month = {7},
title = {{aitviewer}},
url = {https://github.com/eth-ait/aitviewer},
year = {2022}
}
This software was developed by Manuel Kaufmann, Velko Vechev and Dario Mylonopoulos. For questions please create an issue. We welcome and encourage module and feature contributions from the community.
Alternative AI tools for aitviewer
Similar Open Source Tools
only_train_once
Only Train Once (OTO) is an automatic, architecture-agnostic DNN training and compression framework that allows users to train a general DNN from scratch or a pretrained checkpoint to achieve high performance and slimmer architecture simultaneously in a one-shot manner without fine-tuning. The framework includes features for automatic structured pruning and erasing operators, as well as hybrid structured sparse optimizers for efficient model compression. OTO provides tools for pruning zero-invariant group partitioning, constructing pruned models, and visualizing pruning and erasing dependency graphs. It supports the HESSO optimizer and offers a sanity check for compliance testing on various DNNs. The repository also includes publications, installation instructions, quick start guides, and a roadmap for future enhancements and collaborations.
Genesis
Genesis is a physics platform designed for general purpose Robotics/Embodied AI/Physical AI applications. It includes a universal physics engine, a lightweight, ultra-fast, pythonic, and user-friendly robotics simulation platform, a powerful and fast photo-realistic rendering system, and a generative data engine that transforms user-prompted natural language description into various modalities of data. It aims to lower the barrier to using physics simulations, unify state-of-the-art physics solvers, and minimize human effort in collecting and generating data for robotics and other domains.
trafilatura
Trafilatura is a Python package and command-line tool for gathering text on the Web and simplifying the process of turning raw HTML into structured, meaningful data. It includes components for web crawling, downloads, scraping, and extraction of main texts, metadata, and comments. The tool aims to focus on actual content, avoid noise, and make sense of data and metadata. It is robust, fast, and widely used by companies and institutions. Trafilatura outperforms other libraries in text extraction benchmarks and offers various features like support for sitemaps, parallel processing, configurable extraction of key elements, multiple output formats, and optional add-ons. The tool is actively maintained with regular updates and comprehensive documentation.
Macaw-LLM
Macaw-LLM is a pioneering multi-modal language modeling tool that seamlessly integrates image, audio, video, and text data. It builds upon CLIP, Whisper, and LLaMA models to process and analyze multi-modal information effectively. The tool boasts features like simple and fast alignment, one-stage instruction fine-tuning, and a new multi-modal instruction dataset. It enables users to align multi-modal features efficiently, encode instructions, and generate responses across different data types.
slideflow
Slideflow is a deep learning library for digital pathology, offering a user-friendly interface for model development. It is designed for medical researchers and AI enthusiasts, providing an accessible platform for developing state-of-the-art pathology models. Slideflow offers customizable training pipelines, robust slide processing and stain normalization toolkit, support for weakly-supervised or strongly-supervised labels, built-in foundation models, multiple-instance learning, self-supervised learning, generative adversarial networks, explainability tools, layer activation analysis tools, uncertainty quantification, interactive user interface for model deployment, and more. It supports both PyTorch and Tensorflow, with optional support for Libvips for slide reading. Slideflow can be installed via pip, Docker container, or from source, and includes non-commercial add-ons for additional tools and pretrained models. It allows users to create projects, extract tiles from slides, train models, and provides evaluation tools like heatmaps and mosaic maps.
Geoweaver
Geoweaver is an in-browser software that enables users to easily compose and execute full-stack data processing workflows using online spatial data facilities, high-performance computation platforms, and open-source deep learning libraries. It provides server management, code repository, workflow orchestration software, and history recording capabilities. Users can run it from both local and remote machines. Geoweaver aims to make data processing workflows manageable for non-coder scientists and preserve model run history. It offers features like progress storage, organization, SSH connection to external servers, and a web UI with Python support.
Quantus
Quantus is a toolkit designed for the evaluation of neural network explanations. It offers more than 30 metrics in 6 categories for eXplainable Artificial Intelligence (XAI) evaluation. The toolkit supports different data types (image, time-series, tabular, NLP) and models (PyTorch, TensorFlow). It provides built-in support for explanation methods like captum, tf-explain, and zennit. Quantus is under active development and aims to provide a comprehensive set of quantitative evaluation metrics for XAI methods.
habitat-lab
Habitat-Lab is a modular high-level library for end-to-end development in embodied AI. It is designed to train agents to perform a wide variety of embodied AI tasks in indoor environments, as well as develop agents that can interact with humans in performing these tasks.
pytorch-forecasting
PyTorch Forecasting is a PyTorch-based package for time series forecasting with state-of-the-art network architectures. It offers a high-level API for training networks on pandas data frames and utilizes PyTorch Lightning for scalable training on GPUs and CPUs. The package aims to simplify time series forecasting with neural networks by providing a flexible API for professionals and default settings for beginners. It includes a timeseries dataset class, base model class, multiple neural network architectures, multi-horizon timeseries metrics, and hyperparameter tuning with optuna. PyTorch Forecasting is built on pytorch-lightning for easy training on various hardware configurations.
mlflow
MLflow is a platform to streamline machine learning development, including tracking experiments, packaging code into reproducible runs, and sharing and deploying models. MLflow offers a set of lightweight APIs that can be used with any existing machine learning application or library (TensorFlow, PyTorch, XGBoost, etc.), wherever you currently run ML code (e.g. in notebooks, standalone applications, or the cloud).
nous
Nous is an open-source TypeScript platform for autonomous AI agents and LLM based workflows. It aims to automate processes, support requests, review code, assist with refactorings, and more. The platform supports various integrations, multiple LLMs/services, CLI and web interface, human-in-the-loop interactions, flexible deployment options, observability with OpenTelemetry tracing, and specific agents for code editing, software engineering, and code review. It offers advanced features like reasoning/planning, memory and function call history, hierarchical task decomposition, and control-loop function calling options. Nous is designed to be a flexible platform for the TypeScript community to expand and support different use cases and integrations.
unitxt
Unitxt is a customizable library for textual data preparation and evaluation tailored to generative language models. It natively integrates with common libraries like HuggingFace and LM-eval-harness and deconstructs processing flows into modular components, enabling easy customization and sharing between practitioners. These components encompass model-specific formats, task prompts, and many other comprehensive dataset processing definitions. The Unitxt-Catalog centralizes these components, fostering collaboration and exploration in modern textual data workflows. Beyond being a tool, Unitxt is a community-driven platform, empowering users to build, share, and advance their pipelines collaboratively.
InstructGraph
InstructGraph is a framework designed to enhance large language models (LLMs) for graph-centric tasks by utilizing graph instruction tuning and preference alignment. The tool collects and decomposes 29 standard graph datasets into four groups, enabling LLMs to better understand and generate graph data. It introduces a structured format verbalizer to transform graph data into a code-like format, facilitating code understanding and generation. Additionally, it addresses hallucination problems in graph reasoning and generation through direct preference optimization (DPO). The tool aims to bridge the gap between textual LLMs and graph data, offering a comprehensive solution for graph-related tasks.
InsPLAD
InsPLAD is a dataset and benchmark for power line asset inspection in UAV images. It contains 10,607 high-resolution UAV color images of seventeen unique power line assets with six defects. The dataset is used for object detection, defect classification, and anomaly detection tasks in computer vision. InsPLAD offers challenges like multi-scale objects, intra-class variation, cluttered background, and varied lighting conditions, aiming to improve state-of-the-art methods in the field.
Vision-LLM-Alignment
Vision-LLM-Alignment is a repository focused on implementing alignment training for visual large language models (LLMs), including SFT training, reward model training, and PPO/DPO training. It supports various model architectures and provides datasets for training. The repository also offers benchmark results and installation instructions for users.
For similar tasks
cog-comfyui
Cog-comfyui allows users to run ComfyUI workflows on Replicate. ComfyUI is a visual programming tool for creating and sharing generative art workflows. With cog-comfyui, users can access a variety of pre-trained models and custom nodes to create their own unique artworks. The tool is easy to use and does not require any coding experience. Users simply need to upload their API JSON file and any necessary input files, and then click the "Run" button. Cog-comfyui will then generate the output image or video file.
deforum-comfy-nodes
Deforum for ComfyUI is an integration tool designed to enhance the user experience of using ComfyUI. It provides custom nodes that can be added to ComfyUI to improve functionality and workflow. Users can easily install Deforum for ComfyUI by cloning the repository and following the provided instructions. The tool is compatible with Python v3.10 and is recommended to be used within a virtual environment. Contributions to the tool are welcome, and users can join the Discord community for support and discussions.
Anim
Anim v0.1.0 is an animation tool that allows users to convert videos to animations using mixamorig characters. It features FK animation editing, object selection, embedded Python support (only on Windows), and the ability to export to glTF and FBX formats. Users can also utilize Mediapipe to create animations. The tool is designed to assist users in creating animations with ease and flexibility.
next-money
Next Money Stripe Starter is a SaaS Starter project that empowers your next project with a stack of Next.js, Prisma, Supabase, Clerk Auth, Resend, React Email, Shadcn/ui, and Stripe. It seamlessly integrates these technologies to accelerate your development and SaaS journey. The project includes frameworks, platforms, UI components, hooks and utilities, code quality tools, and miscellaneous features to enhance the development experience. Created by @koyaguo in 2023 and released under the MIT license.
For similar jobs
spear
SPEAR (Simulator for Photorealistic Embodied AI Research) is a powerful tool for training embodied agents. It features 300 unique virtual indoor environments with 2,566 unique rooms and 17,234 unique objects that can be manipulated individually. Each environment is designed by a professional artist and features detailed geometry, photorealistic materials, and a unique floor plan and object layout. SPEAR is implemented as Unreal Engine assets and provides an OpenAI Gym interface for interacting with the environments via Python.
openvino
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. It provides a common API to deliver inference solutions on various platforms, including CPU, GPU, NPU, and heterogeneous devices. OpenVINO™ supports pre-trained models from Open Model Zoo and popular frameworks like TensorFlow, PyTorch, and ONNX. Key components of OpenVINO™ include the OpenVINO™ Runtime, plugins for different hardware devices, frontends for reading models from native framework formats, and the OpenVINO Model Converter (OVC) for adjusting models for optimal execution on target devices.
peft
PEFT (Parameter-Efficient Fine-Tuning) is a collection of state-of-the-art methods that enable efficient adaptation of large pretrained models to various downstream applications. By only fine-tuning a small number of extra model parameters instead of all the model's parameters, PEFT significantly decreases the computational and storage costs while achieving performance comparable to fully fine-tuned models.
jetson-generative-ai-playground
This repo hosts tutorial documentation for running generative AI models on NVIDIA Jetson devices. The documentation is auto-generated and hosted on GitHub Pages using their CI/CD feature to automatically generate/update the HTML documentation site upon new commits.
emgucv
Emgu CV is a cross-platform .NET wrapper for the OpenCV image-processing library. It allows OpenCV functions to be called from .NET-compatible languages. The wrapper can be compiled with Visual Studio, Unity, or the "dotnet" command, and it runs on Windows, macOS, Linux, iOS, and Android.
MMStar
MMStar is an elite vision-indispensable multi-modal benchmark comprising 1,500 challenge samples meticulously selected by humans. It addresses two key issues in current LLM evaluation: the unnecessary use of visual content in many samples and the existence of unintentional data leakage in LLM and LVLM training. MMStar evaluates 6 core capabilities across 18 detailed axes, ensuring a balanced distribution of samples across all dimensions.
VLMEvalKit
VLMEvalKit is an open-source evaluation toolkit of large vision-language models (LVLMs). It enables one-command evaluation of LVLMs on various benchmarks, without the heavy workload of data preparation under multiple repositories. In VLMEvalKit, we adopt generation-based evaluation for all LVLMs, and provide the evaluation results obtained with both exact matching and LLM-based answer extraction.