MegaDetector
...helping conservation biologists spend less time doing boring things with camera trap images.
- What's MegaDetector all about?
- How do I get started with MegaDetector?
- Who is using MegaDetector?
- Repo contents
- Contact
- Gratuitous camera trap picture
MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). This model is trained on several million images from a variety of ecosystems.
Here's a “teaser” image of what MegaDetector output looks like:
Image credit University of Washington.
- If you are looking for a convenient tool to run MegaDetector, you don't need anything from this repository: check out EcoAssist.
- If you're just considering the use of AI in your workflow, and you aren't even sure yet whether MegaDetector would be useful to you, we recommend reading the "getting started with MegaDetector" page.
- If you're already familiar with MegaDetector and you're ready to run it on your data, see the MegaDetector User Guide for instructions on running MegaDetector.
- If you're a programmer-type looking to use tools from this repo, check out the MegaDetector Python package that provides access to everything in this repo (yes, you guessed it, "pip install megadetector"); a short usage sketch follows this list.
- If you have any questions, or you want to tell us that MegaDetector was amazing/terrible on your images, email us!
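To make the pip route concrete, here's a minimal sketch of running MegaDetector on a folder of images via the Python package. The folder and output paths are placeholders, and function names/signatures may drift between package versions, so treat this as a starting point and check the package documentation rather than reading it as the definitive recipe:

```python
import os
from megadetector.utils import path_utils
from megadetector.detection.run_detector_batch import \
    load_and_run_detector_batch, write_results_to_file

# Hypothetical input folder and output file; adjust to your own data
image_folder = os.path.expanduser('~/camera_trap_images')
output_file = os.path.expanduser('~/megadetector_results.json')

# Recursively find images in the folder
image_file_names = path_utils.find_images(image_folder, recursive=True)

# 'MDV5A' asks the package to download and use MegaDetector v5a;
# you can also point this at a local model file
results = load_and_run_detector_batch('MDV5A', image_file_names)

# Write results in the standard MegaDetector output format, which
# downstream tools (e.g. Timelapse) can read
write_results_to_file(results,
                      output_file,
                      relative_path_base=image_folder,
                      detector_file='MDV5A')
```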
MegaDetector is just one of many tools that aim to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our survey of the field, affectionately titled “Everything I know about machine learning and camera traps”.
We work with ecologists all over the world to help them spend less time annotating images and more time thinking about conservation. You can read a little more about how this works on our getting started with MegaDetector page.
Here are a few of the organizations that have used MegaDetector... we're only listing organizations who (a) we know about and (b) have given us permission to refer to them here (or have posted publicly about their use of MegaDetector), so if you're using MegaDetector or other tools from this repo and would like to be added to this list, email us!
- Canadian Parks and Wilderness Society (CPAWS) Northern Alberta Chapter
- Applied Conservation Macro Ecology Lab, University of Victoria
- Banff National Park Resource Conservation, Parks Canada
- Blumstein Lab, UCLA
- Borderlands Research Institute, Sul Ross State University
- Capitol Reef National Park / Utah Valley University
- Canyon Critters Project, University of Georgia
- Center for Biodiversity and Conservation, American Museum of Natural History
- Centre for Ecosystem Science, UNSW Sydney
- Cross-Cultural Ecology Lab, Macquarie University
- DC Cat Count, led by the Humane Rescue Alliance
- Department of Fish and Wildlife Sciences, University of Idaho
- Department of Society & Conservation, W.A. Franke College of Forestry & Conservation, University of Montana
- Department of Wildlife Ecology and Conservation, University of Florida
- Ecology and Conservation of Amazonian Vertebrates Research Group, Federal University of Amapá
- Gola Forest Programme, Royal Society for the Protection of Birds (RSPB)
- Graeme Shannon's Research Group, Bangor University
- Grizzly Bear Recovery Program, U.S. Fish & Wildlife Service
- Hamaarag, The Steinhardt Museum of Natural History, Tel Aviv University
- Institut des Science de la Forêt Tempérée (ISFORT), Université du Québec en Outaouais
- Lab of Dr. Bilal Habib, the Wildlife Institute of India
- Landscape Ecology Lab, Concordia University
- Mammal Spatial Ecology and Conservation Lab, Washington State University
- McLoughlin Lab in Population Ecology, University of Saskatchewan
- National Wildlife Refuge System, Southwest Region, U.S. Fish & Wildlife Service
- Northern Great Plains Program, Smithsonian
- Polar Ecology Group, University of Gdansk
- Quantitative Ecology Lab, University of Washington
- San Diego Field Station, U.S. Geological Survey
- Santa Monica Mountains Recreation Area, National Park Service
- Seattle Urban Carnivore Project, Woodland Park Zoo
- Serra dos Órgãos National Park, ICMBio
- Snapshot USA, Smithsonian
- TROPECOLNET project, Museo Nacional de Ciencias Naturales
- Wildlife Coexistence Lab, University of British Columbia
- Wildlife Research, Oregon Department of Fish and Wildlife
- Wildlife Division, Michigan Department of Natural Resources
- Department of Ecology, TU Berlin
- Ghost Cat Analytics
- Protected Areas Unit, Canadian Wildlife Service
- Conservation and Restoration Science Branch, New South Wales Department of Climate Change, Energy, the Environment and Water
- School of Natural Sciences, University of Tasmania (story)
- Kenai National Wildlife Refuge, U.S. Fish & Wildlife Service (story)
- Australian Wildlife Conservancy (blog posts 1, 2, 3)
- Island Conservation (blog posts 1, 2) (video)
- Alberta Biodiversity Monitoring Institute (ABMI) (WildTrax platform) (blog posts 1, 2)
- Shan Shui Conservation Center (blog post) (translated blog post) (Web demo)
- Road Ecology Center, University of California, Davis (Wildlife Observer Network platform)
- The Nature Conservancy in California (Animl platform) (story)
Also see:
- The list of MD-related GUIs, platforms, and GitHub repos within the MegaDetector User Guide
- Peter's map of EcoAssist users (who are also MegaDetector users!)
- The list of papers tagged "MegaDetector" on our list of papers about ML and camera traps
MegaDetector was initially developed by the Microsoft AI for Earth program; this repo was forked from the microsoft/cameratraps repo and is maintained by the original MegaDetector developers (who are no longer at Microsoft, but are absolutely fantastically eternally grateful to Microsoft for the investment and commitment that made MegaDetector happen). If you're interested in MD's history, see the downloading the model section in the MegaDetector User Guide to learn about the history of MegaDetector releases, and the can you share the training data? section to learn about the training data used in each of those releases.
The core functionality provided in this repo is:
- Tools for training and running MegaDetector.
- Tools for working with MegaDetector output, e.g. for reviewing the results of a large processing batch.
- Tools to convert among frequently-used camera trap metadata formats.
This repo does not host the data used to train MegaDetector, but we work with our collaborators to make data and annotations available whenever possible on lila.science. See the MegaDetector training data section to learn more about the data used to train MegaDetector.
This repo is organized into the following folders...
Code for running models, especially MegaDetector.
Code for common operations one might do after running MegaDetector, e.g. generating preview pages to summarize your results, separating images into different folders based on AI results, or converting results to a different format.
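As a concrete (and hypothetical) example of that kind of postprocessing, the sketch below reads a MegaDetector results file and copies each image into a folder named after its highest-confidence detection category (or "blank" if no detection exceeds a confidence threshold). It is not one of the repo's own scripts; it just uses the standard library and the documented structure of the MegaDetector output format, with placeholder paths and threshold:

```python
import json
import os
import shutil

results_file = 'megadetector_results.json'          # output from an earlier MD run
image_base = os.path.expanduser('~/camera_trap_images')
output_base = os.path.expanduser('~/sorted_images')
confidence_threshold = 0.2

with open(results_file, 'r') as f:
    results = json.load(f)

# Maps category IDs (strings) to names, e.g. {"1": "animal", "2": "person", "3": "vehicle"}
category_names = results['detection_categories']

for im in results['images']:
    # Images that failed to process may have no detections
    detections = im.get('detections') or []
    above_threshold = [d for d in detections if d['conf'] >= confidence_threshold]
    if above_threshold:
        top = max(above_threshold, key=lambda d: d['conf'])
        label = category_names[top['category']]
    else:
        label = 'blank'
    src = os.path.join(image_base, im['file'])
    dst_dir = os.path.join(output_base, label)
    os.makedirs(dst_dir, exist_ok=True)
    shutil.copy2(src, os.path.join(dst_dir, os.path.basename(im['file'])))
```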
Small utility functions for string manipulation, filename manipulation, downloading files from URLs, etc.
Tools for visualizing images with ground truth and/or predicted bounding boxes.
Code for:
- Converting frequently-used metadata formats to COCO Camera Traps format (a minimal sketch of this format appears after this list)
- Converting the output of AI models (especially YOLOv5) to the format used for AI results throughout this repo
- Creating, visualizing, and editing COCO Camera Traps .json databases
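For orientation, here's a minimal sketch of what a tiny COCO Camera Traps .json database might contain. The field names follow the publicly documented format, but every value here is invented purely for illustration:

```python
import json

# Illustrative only: one category, one image, one annotation
cct_db = {
    'info': {'version': '1.0', 'description': 'Example camera trap dataset'},
    'categories': [
        {'id': 0, 'name': 'empty'},
        {'id': 1, 'name': 'deer'}
    ],
    'images': [
        {
            'id': 'img_0001',
            'file_name': 'site_a/img_0001.jpg',
            'width': 2048,
            'height': 1536,
            'location': 'site_a',
            'datetime': '2023-06-01 04:12:00'
        }
    ],
    'annotations': [
        {
            'id': 'ann_0001',
            'image_id': 'img_0001',
            'category_id': 1,
            # Optional bounding box in absolute pixels: [x, y, width, height]
            'bbox': [410, 622, 510, 380]
        }
    ]
}

with open('example_cct.json', 'w') as f:
    json.dump(cct_db, f, indent=1)
```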
Code for hosting our models as an API, either for synchronous operation (i.e., for real-time inference) or as a batch process (for large biodiversity surveys).
Experimental code for training species classifiers on new data sets, generally trained on MegaDetector crops. Currently the main pipeline described in this folder relies on a large database of labeled images that is not publicly available; therefore, this folder is not yet set up to facilitate training of your own classifiers. However, it is useful for users of the classifiers that we train, and contains some useful starting points if you are going to take a "DIY" approach to training classifiers on cropped images.
All that said, here's another "teaser image" of what you get at the end of training and running a classifier:
Image credit University of Minnesota, from the Snapshot Safari program.
Code to facilitate mapping data-set-specific category names (e.g. "lion", which means very different things in Idaho vs. South Africa) to a standard taxonomy.
Environment files... specifically .yml files for mamba/conda environments (these are what we recommend in our MegaDetector User Guide), and a requirements.txt for the pip-inclined.
Media used in documentation.
Old code that we didn't quite want to delete, but is basically obsolete.
Random things that don't fit in any other directory, but aren't quite deprecated. Mostly postprocessing scripts that were built for a single use case but could potentially be useful in the future.
A handful of images from LILA that facilitate testing and debugging.
For questions about this repo, contact [email protected].
You can also chat with us and the broader camera trap AI community on the AI for Conservation forum at WILDLABS or the AI for Conservation Slack group.
Image credit USDA, from the NACTI data set.
You will find lots more gratuitous camera trap pictures sprinkled about this repo. It's like a scavenger hunt.
This repository is licensed with the MIT license.
Code written on or before April 28, 2023 is copyright Microsoft.
This project welcomes contributions, as pull requests, issues, or suggestions by email. We have a list of issues that we're hoping to address, many of which would be good starting points for new contributors. We also depend on other open-source tools that help users run MegaDetector (e.g. EcoAssist and CamTrap Detector) and work with MegaDetector results (e.g. Timelapse); if you are looking to get involved in GUI development, reach out to the developers of those tools as well!
If you are interested in getting involved in the conservation technology space, and MegaDetector just happens to be the first page you landed on, and none of our open issues are getting you fired up, don't fret! Head over to the WILDLABS discussion forums and let the community know you're a developer looking to get involved. Someone needs your help!