Best AI tools for< 3d Computer Vision Engineer >
Infographic
20 - AI tool Sites
Orbbec
Orbbec is a leading provider of 3D vision technology, offering a wide range of 3D cameras and sensors for various applications. With a focus on AI, optics, and advanced algorithms, Orbbec empowers developers and enterprises to create immersive experiences, precise measurements, and advanced visualizations. Their products include stereo vision cameras, ToF cameras, structured light cameras, camera computers, and lidar sensors, catering to industries such as manufacturing, healthcare, robotics, fitness, logistics, and retail.
Tangram Vision
Tangram Vision is a company that provides sensor calibration tools and infrastructure for robotics and autonomous vehicles. Their products include MetriCal, a high-speed bundle adjustment software for precise sensor calibration, and AutoCal, an on-device, real-time calibration health check and adjustment tool. Tangram Vision also offers a high-resolution depth sensor called HiFi, which combines high-resolution depth data with high-powered AI capabilities. The company's mission is to accelerate the development and deployment of autonomous systems by providing the tools and infrastructure needed to ensure the accuracy and reliability of sensors.
Lexset
Lexset is an AI tool that provides synthetic data generation services for computer vision model training. It offers a no-code interface to create unlimited data with advanced camera controls and lighting options. Users can simulate AI-scale environments, composite objects into images, and create custom 3D scenarios. Lexset also provides access to GPU nodes, dedicated support, and feature development assistance. The tool aims to improve object detection accuracy and optimize generalization on high-quality synthetic data.
Visual Computing & Artificial Intelligence Lab at TUM
The Visual Computing & Artificial Intelligence Lab at TUM is a group of research enthusiasts advancing cutting-edge research at the intersection of computer vision, computer graphics, and artificial intelligence. Our research mission is to obtain highly-realistic digital replica of the real world, which include representations of detailed 3D geometries, surface textures, and material definitions of both static and dynamic scene environments. In our research, we heavily build on advances in modern machine learning, and develop novel methods that enable us to learn strong priors to fuel 3D reconstruction techniques. Ultimately, we aim to obtain holographic representations that are visually indistinguishable from the real world, ideally captured from a simple webcam or mobile phone. We believe this is a critical component in facilitating immersive augmented and virtual reality applications, and will have a substantial positive impact in modern digital societies.
OpenCV.ai
OpenCV.ai is a leading provider of computer vision software and services. The company's team of experts has extensive experience in developing optimized large-scale computer vision solutions. OpenCV.ai's expertise is helping businesses grow in a variety of industries, including medicine, manufacturing, and retail. The company's solutions are used by startups and Fortune 500 companies alike.
Synthesis AI
Synthesis AI is a synthetic data platform that enables more capable and ethical computer vision AI. It provides on-demand labeled images and videos, photorealistic images, and 3D generative AI to help developers build better models faster. Synthesis AI's products include Synthesis Humans, which allows users to create detailed images and videos of digital humans with rich annotations; Synthesis Scenarios, which enables users to craft complex multi-human simulations across a variety of environments; and a range of applications for industries such as ID verification, automotive, avatar creation, virtual fashion, AI fitness, teleconferencing, visual effects, and security.
Enric Corona
Enric Corona is a Research Scientist at Google Research, working on 3D Humans and Generative AI. His research is in areas of computer vision and machine learning, including modelling and reconstruction of 3D human bodies and hands.
Datagen
Datagen is a platform that provides synthetic data for computer vision. Synthetic data is artificially generated data that can be used to train machine learning models. Datagen's data is generated using a variety of techniques, including 3D modeling, computer graphics, and machine learning. The company's data is used by a variety of industries, including automotive, security, smart office, fitness, cosmetics, and facial applications.
Bifrost AI
Bifrost AI is a data generation engine designed for AI and robotics applications. It enables users to train and validate AI models faster by generating physically accurate synthetic datasets in 3D simulations, eliminating the need for real-world data. The platform offers pixel-perfect labels, scenario metadata, and a simulated 3D world to enhance AI understanding. Bifrost AI empowers users to create new scenarios and datasets rapidly, stress test AI perception, and improve model performance. It is built for teams at every stage of AI development, offering features like automated labeling, class imbalance correction, and performance enhancement.
Rerun
Rerun is an SDK, time-series database, and visualizer for temporal and multimodal data. It is used in fields like robotics, spatial computing, 2D/3D simulation, and finance to verify, debug, and explain data. Rerun allows users to log data like tensors, point clouds, and text to create streams, visualize and interact with live and recorded streams, build layouts, customize visualizations, and extend data and UI functionalities. The application provides a composable data model, dynamic schemas, and custom views for enhanced data visualization and analysis.
Brick Generator
Brick Generator is a proof of concept tool that allows users to generate 3D brick models from text descriptions. It is powered by artificial intelligence and machine learning algorithms, which enable it to understand the user's intent and create realistic and detailed models. The tool is still in its early stages of development, but it has the potential to revolutionize the way that architects, engineers, and designers create and visualize their projects.
DraftAid
DraftAid is an AI-powered drawing automation tool that streamlines the fabrication drawing process, reducing the time from weeks to minutes. It integrates seamlessly with existing CAD software and offers extensive customization options to align with specific project requirements, delivering consistently accurate and high-quality drawings.
Meshy
Meshy is a free 3D AI model generator that empowers artists, game developers, and creators to bring their visions to life with a toolkit for creating 3D models in minutes. It offers powerful AI generation tools, lightning speed modeling, PBR maps, versatile art styles, and user-friendly interface. Meshy allows users to convert text to 3D, images to 3D models, and upload existing 3D models to transform words into textures. With multilingual support, API integration, and various export options, Meshy provides a seamless 3D workflow for users to unleash their creativity like never before.
Make your image 3D
This website provides a tool that allows users to convert 2D images into 3D images. The tool uses artificial intelligence to extract depth information from the image, which is then used to create a 3D model. The resulting 3D model can be embedded into a website or shared via a link.
Move AI
Move AI is a markerless motion capture technology that uses advanced AI, computer vision, biomechanics, and physics to capture high-quality 3D motion data from 2D video. It is designed to make it easy for 3D animators and studios to create realistic human motion for animated characters, without the need for expensive suits or markers. Move AI offers both single-camera and multi-camera products, and can be used to capture motion in a variety of environments, including large volume spaces. It also supports hand and finger tracking, and allows users to export their motion data to their preferred 3D animation software.
Graswald.ai
Graswald.ai is an AI-powered platform that enables users to create 3D product visualizations in minutes, without the need for 3D modeling expertise. The platform uses AI to convert a video of a product into a 3D model, which can then be used to create high-quality product images, videos, and AR experiences. Graswald.ai is designed to help businesses save time and money on product visualization, while also improving conversion rates and reducing return rates.
CharacterGen
CharacterGen is an advanced AI tool for efficient 3D character generation from single images. It utilizes cutting-edge multi-view pose calibration technology and deep learning algorithms to create detailed and realistic 3D models in seconds. The platform offers real-time processing, customizable outputs, and seamless integration capabilities, making it a valuable tool for professionals and beginners in gaming, animation, and virtual reality industries.
VRee Labs
VRee Labs is a pioneering research company, focusing on developing cutting-edge AI-powered 3D modeling solutions for various products and items spanning various materials and shapes. Our innovative solution leverages the latest advancements in artificial intelligence and machine learning to cut costs and boost sales for businesses across various industries. Our mission is to empower companies with the tools and insights they need to succeed in an ever-changing market. By utilizing VREE Labs, businesses can streamline their operations, increase efficiency, and make informed decisions that drive growth and profitability. Whether you are a small business or a large corporation, VRee solutions can help your business reach better results and reach its full potential.
IDjpg
IDjpg is an AI-powered photo editing tool that allows users to transform their photos into a variety of artistic styles, including 3D, caricature, line art, anime, and movies. The tool is easy to use and can be used to create unique and eye-catching images.
Metail
Metail is an AI tool specializing in apparel AI solutions. The platform offers services such as EcoShot for creating on-model images from 3D designs, AI-generated models, and Machine-Learning Advanced Motion Analytics (MAMA) for factory efficiency. Metail also provides the Metail-Avatar-System (MAS) for accurate 3D body scans and SizeStream mobile apps for personalized experiences. The company focuses on leveraging AI technologies to enhance the capabilities of apparel companies and improve user experiences.
20 - Open Source Tools
DeepLearing-Interview-Awesome-2024
DeepLearning-Interview-Awesome-2024 is a repository that covers various topics related to deep learning, computer vision, big models (LLMs), autonomous driving, smart healthcare, and more. It provides a collection of interview questions with detailed explanations sourced from recent academic papers and industry developments. The repository is aimed at assisting individuals in academic research, work innovation, and job interviews. It includes six major modules covering topics such as large language models (LLMs), computer vision models, common problems in computer vision and perception algorithms, deep learning basics and frameworks, as well as specific tasks like 3D object detection, medical image segmentation, and more.
Awesome-Colorful-LLM
Awesome-Colorful-LLM is a meticulously assembled anthology of vibrant multimodal research focusing on advancements propelled by large language models (LLMs) in domains such as Vision, Audio, Agent, Robotics, and Fundamental Sciences like Mathematics. The repository contains curated collections of works, datasets, benchmarks, projects, and tools related to LLMs and multimodal learning. It serves as a comprehensive resource for researchers and practitioners interested in exploring the intersection of language models and various modalities for tasks like image understanding, video pretraining, 3D modeling, document understanding, audio analysis, agent learning, robotic applications, and mathematical research.
ai_all_resources
This repository is a compilation of excellent ML and DL tutorials created by various individuals and organizations. It covers a wide range of topics, including machine learning fundamentals, deep learning, computer vision, natural language processing, reinforcement learning, and more. The resources are organized into categories, making it easy to find the information you need. Whether you're a beginner or an experienced practitioner, you're sure to find something valuable in this repository.
CVPR2024-Papers-with-Code-Demo
This repository contains a collection of papers and code for the CVPR 2024 conference. The papers cover a wide range of topics in computer vision, including object detection, image segmentation, image generation, and video analysis. The code provides implementations of the algorithms described in the papers, making it easy for researchers and practitioners to reproduce the results and build upon the work of others. The repository is maintained by a team of researchers at the University of California, Berkeley.
kornia
Kornia is a differentiable computer vision library for PyTorch. It consists of a set of routines and differentiable modules to solve generic computer vision problems. At its core, the package uses PyTorch as its main backend both for efficiency and to take advantage of the reverse-mode auto-differentiation to define and compute the gradient of complex functions.
learnopencv
LearnOpenCV is a repository containing code for Computer Vision, Deep learning, and AI research articles shared on the blog LearnOpenCV.com. It serves as a resource for individuals looking to enhance their expertise in AI through various courses offered by OpenCV. The repository includes a wide range of topics such as image inpainting, instance segmentation, robotics, deep learning models, and more, providing practical implementations and code examples for readers to explore and learn from.
pytorch-grad-cam
This repository provides advanced AI explainability for PyTorch, offering state-of-the-art methods for Explainable AI in computer vision. It includes a comprehensive collection of Pixel Attribution methods for various tasks like Classification, Object Detection, Semantic Segmentation, and more. The package supports high performance with full batch image support and includes metrics for evaluating and tuning explanations. Users can visualize and interpret model predictions, making it suitable for both production and model development scenarios.
EmbodiedScan
EmbodiedScan is a holistic multi-modal 3D perception suite designed for embodied AI. It introduces a multi-modal, ego-centric 3D perception dataset and benchmark for holistic 3D scene understanding. The dataset includes over 5k scans with 1M ego-centric RGB-D views, 1M language prompts, 160k 3D-oriented boxes spanning 760 categories, and dense semantic occupancy with 80 common categories. The suite includes a baseline framework named Embodied Perceptron, capable of processing multi-modal inputs for 3D perception tasks and language-grounded tasks.
ScreenAgent
ScreenAgent is a project focused on creating an environment for Visual Language Model agents (VLM Agent) to interact with real computer screens. The project includes designing an automatic control process for agents to interact with the environment and complete multi-step tasks. It also involves building the ScreenAgent dataset, which collects screenshots and action sequences for various daily computer tasks. The project provides a controller client code, configuration files, and model training code to enable users to control a desktop with a large model.
habitat-sim
Habitat-Sim is a high-performance physics-enabled 3D simulator with support for 3D scans of indoor/outdoor spaces, CAD models of spaces and piecewise-rigid objects, configurable sensors, robots described via URDF, and rigid-body mechanics. It prioritizes simulation speed over the breadth of simulation capabilities, achieving several thousand frames per second (FPS) running single-threaded and over 10,000 FPS multi-process on a single GPU when rendering a scene from the Matterport3D dataset. Habitat-Sim simulates a Fetch robot interacting in ReplicaCAD scenes at over 8,000 steps per second (SPS), where each ‘step’ involves rendering 1 RGBD observation (128×128 pixels) and rigid-body dynamics for 1/30sec.
Awesome-LLM-3D
This repository is a curated list of papers related to 3D tasks empowered by Large Language Models (LLMs). It covers tasks such as 3D understanding, reasoning, generation, and embodied agents. The repository also includes other Foundation Models like CLIP and SAM to provide a comprehensive view of the area. It is actively maintained and updated to showcase the latest advances in the field. Users can find a variety of research papers and projects related to 3D tasks and LLMs in this repository.
Awesome-AIGC-3D
Awesome-AIGC-3D is a curated list of awesome AIGC 3D papers, inspired by awesome-NeRF. It aims to provide a comprehensive overview of the state-of-the-art in AIGC 3D, including papers on text-to-3D generation, 3D scene generation, human avatar generation, and dynamic 3D generation. The repository also includes a list of benchmarks and datasets, talks, companies, and implementations related to AIGC 3D. The description is less than 400 words and provides a concise overview of the repository's content and purpose.
awesome-mobile-robotics
The 'awesome-mobile-robotics' repository is a curated list of important content related to Mobile Robotics and AI. It includes resources such as courses, books, datasets, software and libraries, podcasts, conferences, journals, companies and jobs, laboratories and research groups, and miscellaneous resources. The repository covers a wide range of topics in the field of Mobile Robotics and AI, providing valuable information for enthusiasts, researchers, and professionals in the domain.
2025-AI-College-Jobs
2025-AI-College-Jobs is a repository containing a comprehensive list of AI/ML & Data Science jobs suitable for college students seeking internships or new graduate positions. The repository is regularly updated with positions posted within the last 120 days, featuring opportunities from various companies in the USA and internationally. The list includes positions in areas such as research scientist internships, quantitative research analyst roles, and other data science-related positions. The repository aims to provide a valuable resource for students looking to kickstart their careers in the field of artificial intelligence and machine learning.
MiKaPo
MiKaPo is a web-based tool that allows users to pose MMD models in real-time using video input. It utilizes technologies such as Mediapipe for 3D key points detection, Babylon.js for 3D scene rendering, babylon-mmd for MMD model viewing, and Vite+React for the web framework. Users can upload videos and images, select different environments, and choose models for posing. MiKaPo also supports camera input and Ollama (electron version). The tool is open to feature requests and pull requests, with ongoing development to add VMD export functionality.
awesome-openvino
Awesome OpenVINO is a curated list of AI projects based on the OpenVINO toolkit, offering a rich assortment of projects, libraries, and tutorials covering various topics like model optimization, deployment, and real-world applications across industries. It serves as a valuable resource continuously updated to maximize the potential of OpenVINO in projects, featuring projects like Stable Diffusion web UI, Visioncom, FastSD CPU, OpenVINO AI Plugins for GIMP, and more.
Awesome-Quantization-Papers
This repo contains a comprehensive paper list of **Model Quantization** for efficient deep learning on AI conferences/journals/arXiv. As a highlight, we categorize the papers in terms of model structures and application scenarios, and label the quantization methods with keywords.
ChatSim
ChatSim is a tool designed for editable scene simulation for autonomous driving via LLM-Agent collaboration. It provides functionalities for setting up the environment, installing necessary dependencies like McNeRF and Inpainting tools, and preparing data for simulation. Users can train models, simulate scenes, and track trajectories for smoother and more realistic results. The tool integrates with Blender software and offers options for training McNeRF models and McLight's skydome estimation network. It also includes a trajectory tracking module for improved trajectory tracking. ChatSim aims to facilitate the simulation of autonomous driving scenarios with collaborative LLM-Agents.
habitat-lab
Habitat-Lab is a modular high-level library for end-to-end development in embodied AI. It is designed to train agents to perform a wide variety of embodied AI tasks in indoor environments, as well as develop agents that can interact with humans in performing these tasks.
20 - OpenAI Gpts
Pixie: Computer Vision Engineer
Expert in computer vision, deep learning, ready to assist you with 3d and geometric computer vision. https://github.com/kornia/pixie
Gandi IDE Shader Helper
Helps you code a shader for Gandi IDE project in GLSL. https://getgandi.com/extensions/glsl-in-gandi-ide
OpenGL 3.3 Graphics Programming Helper
Helps beginners understand OpenGL 3.3 concepts and terminology
3D Print Diagnostics Expert
Expert in 3D printing diagnostics and problem resolution, mindful of confidentiality and careful with brand usage.
3D Design Visualizer
Guides users in designing 3D printed products with practical and creative advice.
3Dスキャンできる場所は知らんけど、ニッチな旅行場所をおすすめするで!
Japanese travel guide with a focus on hidden gems and port towns