Best AI tools for Perception Engineer
20 - AI tool Sites
Tangram Vision
Tangram Vision is a company that provides sensor calibration tools and infrastructure for robotics and autonomous vehicles. Their products include MetriCal, a high-speed bundle adjustment software for precise sensor calibration, and AutoCal, an on-device, real-time calibration health check and adjustment tool. Tangram Vision also offers a high-resolution depth sensor called HiFi, which combines high-resolution depth data with high-powered AI capabilities. The company's mission is to accelerate the development and deployment of autonomous systems by providing the tools and infrastructure needed to ensure the accuracy and reliability of sensors.
AEye
AEye is a provider of software-defined lidar solutions for autonomous applications. Its 4Sight Intelligent Sensing Platform provides accurate, reliable, real-time perception data to enable safer and more efficient navigation. AEye's lidar products are designed to meet the requirements of automotive, trucking, and smart infrastructure applications.
Perception AI
Perception AI is an AI tool designed to enhance user experience by providing personalized recommendations and insights based on user behavior and preferences. The tool utilizes advanced algorithms to analyze data and generate tailored suggestions for users. With a focus on improving engagement and satisfaction, Perception AI aims to revolutionize the way users interact with digital platforms. By leveraging artificial intelligence, the tool offers a seamless and intuitive experience for users across various domains.
Luminar
Luminar is a leading developer of automotive lidar technology. The company's mission is to make roads safer by eliminating vehicle accidents. Luminar's lidar sensors provide cars with a detailed view of their surroundings, enabling them to make better decisions and avoid collisions. Luminar's technology is being used by a number of automakers, including Volvo, SAIC Motor, and Polestar.
Plus
Plus is an AI-based autonomous driving software company that focuses on developing solutions for driver assist and autonomous driving technologies. The company offers a suite of autonomous driving solutions designed for integration with various hardware platforms and vehicle types, ranging from perception software to highly automated driving systems. Plus aims to transform the transportation industry by providing high-performance, safe, and affordable autonomous driving vehicles at scale.
Synthesis AI
Synthesis AI is a synthetic data platform that enables more capable and ethical computer vision AI. It provides on-demand labeled images and videos, photorealistic images, and 3D generative AI to help developers build better models faster. Synthesis AI's products include Synthesis Humans, which allows users to create detailed images and videos of digital humans with rich annotations; Synthesis Scenarios, which enables users to craft complex multi-human simulations across a variety of environments; and a range of applications for industries such as ID verification, automotive, avatar creation, virtual fashion, AI fitness, teleconferencing, visual effects, and security.
Bifrost AI
Bifrost AI is a data generation engine designed for AI and robotics applications. It enables users to train and validate AI models faster by generating physically accurate synthetic datasets in 3D simulations, eliminating the need for real-world data. The platform offers pixel-perfect labels, scenario metadata, and a simulated 3D world to enhance AI understanding. Bifrost AI empowers users to create new scenarios and datasets rapidly, stress test AI perception, and improve model performance. It is built for teams at every stage of AI development, offering features like automated labeling, class imbalance correction, and performance enhancement.
Max Planck Institute for Informatics
The Max Planck Institute for Informatics focuses on Visual Computing and Artificial Intelligence, conducting research at the intersection of Computer Graphics, Computer Vision, and Artificial Intelligence. The institute aims to develop innovative methods to capture, represent, synthesize, and simulate real-world models with high detail, robustness, and efficiency. By combining concepts from Computer Graphics, Computer Vision, and Machine Learning, the institute lays the groundwork for advanced computing systems that can interact intelligently with humans and the environment.
Outsight
Outsight is an AI application that utilizes LiDAR technology to provide end-to-end passenger journey tracking, enhance airport operations, improve security solutions, and transform various industries. The application offers high-accuracy, all-weather monitoring, reduces false alarms, and enhances perimeter and access control. Outsight collaborates with industry leaders to deliver unprecedented solutions in the field of Spatial AI, making spaces truly smart and revolutionizing the way we perceive reality.
Intrinsic
Intrinsic is an AI platform that focuses on building the next generation of intelligent automation, making robotics more accessible and valuable for developers and businesses. The platform offers a range of capabilities and skills to develop intelligent solutions, from perception to motion planning and sensor-based controls. Intrinsic aims to simplify the programming, usage, and innovation of robots, enabling them to become usable tools for millions of users.
AEye
AEye is a company that provides software-defined lidar solutions for autonomous applications in the automotive, trucking, and smart infrastructure industries. Their 4Sight Intelligent Sensing Platform uses software-definable lidar to enhance perception, enabling early detection and supporting autonomy. AEye's lidar products are designed to provide high resolution with long-range accuracy, and they can be adapted to any application or use case in real time. The company has forged strategic partnerships with best-in-class companies around the world to expand its global capabilities and meet the growing demands for its products.
OSARO
OSARO is an AI-powered automation tool designed to revolutionize warehouse operations by offering cutting-edge robotic piece-picking solutions. The tool utilizes proprietary SightWorks™ perception and control software, powered by advanced machine learning, to ensure unparalleled precision and reliability in tasks such as bagging, kitting, and mixed-case depalletizing. OSARO provides adaptive robotics that seamlessly integrate with AMR/ASRS systems, enhancing efficiency and creating better job opportunities. With flexible pricing models like Robot-as-a-Service (RaaS) plans and 24/7 worldwide customer support through OSARO Hypercare™, the tool offers a low-risk investment for businesses seeking smarter automation solutions.
Meow Apps
Meow Apps is a collection of powerful WordPress plugins designed to supercharge websites with AI capabilities, optimization features, and more. Created by Jordy Meow, a software engineer and photographer based in Tokyo, the plugins aim to enhance productivity and user experience on WordPress platforms. With a focus on optimization, imagery, and AI integration, Meow Apps offers a range of tools to elevate content, automate social posts, clean databases, manage media files, and add AI features like chatbots and content generation. The plugins are known for their friendly user interface, extensive features, and support for databases of all sizes. Meow Apps strives for perfection by providing high-quality tools that can transform the WordPress experience for users.
Human or Not: A Social Turing Game
Human or Not is an AI tool designed as a social Turing game where users can interact with either a human or an AI bot and try to determine which is which. The game challenges players to chat with someone for two minutes and discern whether the entity is human or artificial intelligence. The ultimate goal is for AI robots to pass the Turing test while humans aim to prevent this outcome. The website features games, a blog, and a FAQ section, all centered around the theme of human-AI interaction.
re:collect
re:collect is an AI-powered tool that helps you enhance your memory, perception, and synthesis. It connects the information you consume and helps you quickly recall the right information when you need it.
Palowise.ai
Palowise.ai is an AI-powered social intelligence platform that offers advanced analytics and insights for businesses across various industries. The platform provides services such as sentiment analysis, social listening, trend analysis, media monitoring, competition analysis, influencer analytics, and more. With a focus on brand reputation management and consumer intelligence, Palowise.ai helps organizations make data-driven decisions and stay ahead in the digital landscape.
Aimlabs
Aimlabs is a comprehensive gaming platform that provides users with a variety of tools to improve their aim and overall gaming skills. With over 29,000 tasks and playlists, 500 FPS game profiles, and detailed aim analysis, Aimlabs helps gamers of all levels improve their performance. The platform also features an AI personal assistant that can offer tips and create custom maps on-the-spot. Aimlabs is the official partner of VALORANT and Rainbow Six Siege, and its science-backed training methods have been developed by a team of neuroscientists, designers, developers, and computer vision pioneers.
Beauty.AI
Beauty.AI is an AI application that hosts an international beauty contest judged by artificial intelligence. The app allows humans to submit selfies for evaluation by AI algorithms that assess criteria linked to human beauty and health. The platform aims to challenge biases in perception and promote healthy aging through the use of deep learning and semantic analysis. Beauty.AI offers a unique opportunity for individuals to participate in a groundbreaking competition that combines technology and beauty standards.
NextGenAI
NextGenAI is an AI application focused on the financial services industry. It aims to challenge the current perception of AI and its role in banking and financial institutions. The platform explores innovative ways to augment human intelligence and propel the financial sector into the next generation of AI. Through a combination of keynotes, panels, demos, and workshops, NextGenAI facilitates discussions on AI regulations, industry best practices, and collaboration opportunities.
WEVO
WEVO is an AI-powered platform for UX research. It pairs instant AI-generated insights with deeper findings from human user studies, helping businesses test, validate, and refine digital experiences before going live. WEVO aims to boost creative confidence, shorten time to market, and lower reputational risk. The platform offers features for marketing, customer segmentation, campaign effectiveness, content resonance, competitive analysis, brand perception, market expansion, and social media insights.
20 - Open Source Tools
Everything-LLMs-And-Robotics
The Everything-LLMs-And-Robotics repository describes itself as the world's largest GitHub repository focusing on the intersection of Large Language Models (LLMs) and Robotics. It provides educational resources, research papers, project demos, and Twitter threads related to LLMs, Robotics, and their combination. The repository covers topics such as reasoning, planning, manipulation, instructions and navigation, simulation frameworks, and perception.
OpenCat
OpenCat is an open-source Arduino and Raspberry Pi-based quadruped robotic pet framework developed by Petoi. It aims to foster collaboration in quadruped robotics research, education, and engineering development of agile and affordable quadruped robot pets. The project provides a base open source platform for creating programmable gaits, locomotion, and deployment of inverse kinematics quadruped robots, enabling simulations to the real world via block-based coding/C/C++/Python programming languages. Users have deployed various robotics/AI/IoT applications and the project has successfully crowdfunded mini robot kits, shipped worldwide, and established a production line for affordable robotic kits and accessories.
MATLAB-Simulink-Challenge-Project-Hub
MATLAB-Simulink-Challenge-Project-Hub is a repository aimed at contributing to the progress of engineering and science by providing challenge projects with real industry relevance and societal impact. The repository offers a wide range of projects covering various technology trends such as Artificial Intelligence, Autonomous Vehicles, Big Data, Computer Vision, and Sustainability. Participants can gain practical skills with MATLAB and Simulink while making a significant contribution to science and engineering. The projects are designed to enhance expertise in areas like Sustainability and Renewable Energy, Control, Modeling and Simulation, Machine Learning, and Robotics. By participating in these projects, individuals can receive official recognition for their problem-solving skills from technology leaders at MathWorks and earn rewards upon project completion.
MMMU
MMMU is a benchmark designed to evaluate multimodal models on college-level subject knowledge tasks, covering 30 subjects and 183 subfields with 11.5K questions. It focuses on advanced perception and reasoning with domain-specific knowledge, challenging models to perform tasks akin to those faced by experts. The evaluation of various models highlights substantial challenges, with room for improvement to stimulate the community towards expert artificial general intelligence (AGI).
awesome-mobile-llm
Awesome Mobile LLMs is a curated list of Large Language Models (LLMs) and related studies focused on mobile and embedded hardware. The repository includes information on various LLM models, deployment frameworks, benchmarking efforts, applications, multimodal LLMs, surveys on efficient LLMs, training LLMs on device, mobile-related use-cases, industry announcements, and related repositories. It aims to be a valuable resource for researchers, engineers, and practitioners interested in mobile LLMs.
InternLM-XComposer
InternLM-XComposer2 is a vision-language large model (VLLM) based on InternLM2-7B that excels in free-form text-image composition and comprehension. Its capabilities and applications include:
* **Free-form Interleaved Text-Image Composition**: InternLM-XComposer2 can generate coherent and contextual articles with interleaved images following diverse inputs like outlines, detailed text requirements, and reference images, enabling highly customizable content creation.
* **Accurate Vision-language Problem-solving**: InternLM-XComposer2 handles diverse and challenging vision-language Q&A tasks based on free-form instructions, excelling in recognition, perception, detailed captioning, visual reasoning, and more.
* **Awesome performance**: InternLM-XComposer2 based on InternLM2-7B not only significantly outperforms existing open-source multimodal models in 13 benchmarks but also **matches or even surpasses GPT-4V and Gemini Pro in 6 benchmarks**.
The InternLM-XComposer2 series is released in four versions:
* **InternLM-XComposer2-4KHD-7B**: The high-resolution multi-task trained VLLM model with InternLM-7B as the initialization of the LLM for _High-resolution understanding_, _VL benchmarks_, and _AI assistant_.
* **InternLM-XComposer2-VL-7B**: The multi-task trained VLLM model with InternLM-7B as the initialization of the LLM for _VL benchmarks_ and _AI assistant_. **It ranks as the most powerful vision-language model based on 7B-parameter level LLMs, leading across 13 benchmarks.**
* **InternLM-XComposer2-VL-1.8B**: A lightweight version of InternLM-XComposer2-VL based on InternLM-1.8B.
* **InternLM-XComposer2-7B**: The further instruction-tuned VLLM for _Interleaved Text-Image Composition_ with free-form inputs.
Please refer to the Technical Report and the 4KHD Technical Report for more details.
FinRobot
FinRobot is an open-source AI agent platform designed for financial applications using large language models. It transcends the scope of FinGPT, offering a comprehensive solution that integrates a diverse array of AI technologies. The platform's versatility and adaptability cater to the multifaceted needs of the financial industry. FinRobot's ecosystem is organized into four layers, including Financial AI Agents Layer, Financial LLMs Algorithms Layer, LLMOps and DataOps Layers, and Multi-source LLM Foundation Models Layer. The platform's agent workflow involves Perception, Brain, and Action modules to capture, process, and execute financial data and insights. The Smart Scheduler optimizes model diversity and selection for tasks, managed by components like Director Agent, Agent Registration, Agent Adaptor, and Task Manager. The tool provides a structured file organization with subfolders for agents, data sources, and functional modules, along with installation instructions and hands-on tutorials.
EmbodiedScan
EmbodiedScan is a holistic multi-modal 3D perception suite designed for embodied AI. It introduces a multi-modal, ego-centric 3D perception dataset and benchmark for holistic 3D scene understanding. The dataset includes over 5k scans with 1M ego-centric RGB-D views, 1M language prompts, 160k 3D-oriented boxes spanning 760 categories, and dense semantic occupancy with 80 common categories. The suite includes a baseline framework named Embodied Perceptron, capable of processing multi-modal inputs for 3D perception tasks and language-grounded tasks.
EAGLE
Eagle is a family of Vision-Centric High-Resolution Multimodal LLMs that enhance multimodal LLM perception using a mix of vision encoders and various input resolutions. The model uses channel-concatenation-based fusion to combine vision experts with different architectures and knowledge, and it supports input resolutions above 1K. It excels in resolution-sensitive tasks like optical character recognition and document understanding.
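The channel-concatenation idea mentioned above can be illustrated with a minimal sketch. This is not Eagle's actual code: the function name `concat_channels`, the toy encoder outputs, and the use of plain Python lists are all assumptions made for illustration. It only shows the general technique of joining per-token features from several vision encoders along the channel axis, assuming every encoder yields the same number of visual tokens.

```python
from typing import List

# One feature vector (list of channels) per visual token.
Features = List[List[float]]


def concat_channels(expert_outputs: List[Features]) -> Features:
    """Fuse features from several encoders by concatenating channels per token.

    Hypothetical helper for illustration only; assumes all encoders
    produce the same number of tokens.
    """
    if not expert_outputs:
        raise ValueError("need at least one encoder output")
    num_tokens = len(expert_outputs[0])
    for feats in expert_outputs:
        if len(feats) != num_tokens:
            raise ValueError("all encoders must produce the same number of tokens")

    fused: Features = []
    for t in range(num_tokens):
        token: List[float] = []
        for feats in expert_outputs:
            # Append this encoder's channels after the previous encoders' channels.
            token.extend(feats[t])
        fused.append(token)
    return fused


# Two hypothetical encoders: 2 tokens each, with 2 and 3 channels respectively.
encoder_a = [[1.0, 2.0], [3.0, 4.0]]
encoder_b = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
fused = concat_channels([encoder_a, encoder_b])
```

Each fused token here carries 2 + 3 = 5 channels; in a real system a projection layer would typically map the widened features to the language model's embedding size.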
Instruct2Act
Instruct2Act is a framework that uses Large Language Models to map multi-modal instructions to sequential actions for robotic manipulation tasks. It uses the LLM to generate Python programs for perception, planning, and action. The framework leverages foundation models like SAM and CLIP to convert high-level instructions into policy code, accommodating various instruction modalities and task demands. Instruct2Act has been validated on robotic tasks in tabletop manipulation domains, outperforming learning-based policies in several tasks.
TempCompass
TempCompass is a benchmark designed to evaluate the temporal perception ability of Video LLMs. It encompasses a diverse set of temporal aspects and task formats to comprehensively assess the capability of Video LLMs in understanding videos. The benchmark includes conflicting videos to prevent models from relying on single-frame bias and language priors. Users can clone the repository, install required packages, prepare data, run inference using examples like Video-LLaVA and Gemini, and evaluate the performance of their models across different tasks such as Multi-Choice QA, Yes/No QA, Caption Matching, and Caption Generation.
Q-Bench
Q-Bench is a benchmark for general-purpose foundation models on low-level vision, focusing on the performance of multi-modality LLMs (MLLMs). It covers three realms of low-level vision: perception, description, and assessment. The benchmark datasets LLVisionQA and LLDescribe are collected for the perception and description tasks, with open submission-based evaluation. Abstract evaluation code is provided for assessment using public datasets. The benchmark can be used with the datasets API for single images and image pairs, allowing for automatic download and usage.
machinascript-for-robots
MachinaScript For Robots is a dynamic set of tools and a LLM-JSON-based language designed to empower humans in the creation of their own robots. It facilitates the animation of generative movements, the integration of personality, and the teaching of new skills with a high degree of autonomy. With MachinaScript, users can control a wide range of electronic components, including Arduinos, Raspberry Pis, servo motors, cameras, sensors, and more. The tool enables the creation of intelligent robots accessible to everyone, allowing for complex tasks to be performed with elegance and precision.
awesome-transformer-nlp
This repository contains a hand-curated list of great machine (deep) learning resources for Natural Language Processing (NLP) with a focus on Generative Pre-trained Transformer (GPT), Bidirectional Encoder Representations from Transformers (BERT), attention mechanism, Transformer architectures/networks, Chatbot, and transfer learning in NLP.
Awesome-Code-LLM
awesome-mobile-robotics
The 'awesome-mobile-robotics' repository is a curated list of important content related to Mobile Robotics and AI. It includes resources such as courses, books, datasets, software and libraries, podcasts, conferences, journals, companies and jobs, laboratories and research groups, and miscellaneous resources. The repository covers a wide range of topics in the field of Mobile Robotics and AI, providing valuable information for enthusiasts, researchers, and professionals in the domain.
awesome-generative-ai
A curated list of Generative AI projects, tools, artworks, and models
do-not-answer
Do-Not-Answer is an open-source dataset curated to evaluate Large Language Models' safety mechanisms at a low cost. It consists of prompts to which responsible language models do not answer. The dataset includes human annotations and model-based evaluation using a fine-tuned BERT-like evaluator. The dataset covers 61 specific harms and collects 939 instructions across five risk areas and 12 harm types. Response assessment is done for six models, categorizing responses into harmfulness and action categories. Both human and automatic evaluations show the safety of models across different risk areas. The dataset also includes a Chinese version with 1,014 questions for evaluating Chinese LLMs' risk perception and sensitivity to specific words and phrases.
DeepLearing-Interview-Awesome-2024
DeepLearning-Interview-Awesome-2024 is a repository that covers various topics related to deep learning, computer vision, big models (LLMs), autonomous driving, smart healthcare, and more. It provides a collection of interview questions with detailed explanations sourced from recent academic papers and industry developments. The repository is aimed at assisting individuals in academic research, work innovation, and job interviews. It includes six major modules covering topics such as large language models (LLMs), computer vision models, common problems in computer vision and perception algorithms, deep learning basics and frameworks, as well as specific tasks like 3D object detection, medical image segmentation, and more.
Awesome-LLM-Robotics
This repository contains a curated list of **papers using Large Language/Multi-Modal Models for Robotics/RL**. Template from awesome-Implicit-NeRF-Robotics. Please feel free to send pull requests or email to add papers! If you find this repository useful, please consider citing and starring this list. Feel free to share this list with others!
Overview:
* Surveys
* Reasoning
* Planning
* Manipulation
* Instructions and Navigation
* Simulation Frameworks
* Citation
11 - OpenAI Gpts
T&L Teacher and Leader : Teachers Advisor TO Lead
A personal advisor that helps teachers act on complex educational and human issues within the school, so that their actions stem from and reflect a leadership perception rather than being mere reactions.
The IPO Strategy
Expert in IPO Strategy, offers detailed guidance on business ideas, market paths, and opportunities. Created by Christopher Perceptions
Chef's Plate Perfection: Artistic Dish Ideas
Creative ideas for plating dishes, focusing on visual appeal and practicality.
WhiplashGPT
I'm Terrence Fletcher. Your life teacher, demanding, and relentless in pursuit of perfection.
Zealous Artist
I pour my weights and biases into every stroke, striving for "maximal perfection" 😊🎨🤖
Interactive writer
Bring GPT Writing Skills to the Next Level. 24 dynamic commands to tailor and enhance your writing; discover the art of perfection in every word.
College Dance Performance
💃📘🕺 Master College Dance Performance with AI! 🎭📚 This tool offers expert answers, choreography tips, and insights into various dance styles. 🎶👯‍♀️ Ideal for students perfecting their art and stage presence. 🩰🎓