MegaDetector
MegaDetector is an AI model that helps conservation folks spend less time doing boring things with camera trap images.
README:
...helping conservation biologists spend less time doing boring things with camera trap images.
- What's MegaDetector all about?
- How do I get started with MegaDetector?
- Who is using MegaDetector?
- Repo contents
- Contact
- Gratuitous camera trap picture
MegaDetector is an AI model that identifies animals, people, and vehicles in camera trap images (which also makes it useful for eliminating blank images). This model is trained on several million images from a variety of ecosystems.
Here's a “teaser” image of what MegaDetector output looks like:
Image credit University of Washington.
- If you are looking for a convenient tool to run MegaDetector, you don't need anything from this repository: check out EcoAssist.
- If you're just considering the use of AI in your workflow, and you aren't even sure yet whether MegaDetector would be useful to you, we recommend reading the "getting started with MegaDetector" page.
- If you're already familiar with MegaDetector and you're ready to run it on your data, see the MegaDetector User Guide for instructions on running MegaDetector.
- If you're a programmer-type looking to use tools from this repo, check out the MegaDetector Python package that provides access to everything in this repo (yes, you guessed it, "pip install megadetector").
- If you have any questions, or you want to tell us that MegaDetector was amazing/terrible on your images, email us!
MegaDetector is just one of many tools that aim to make conservation biologists more efficient with AI. If you want to learn about other ways to use AI to accelerate camera trap workflows, check out our overview of the field, affectionately titled “Everything I know about machine learning and camera traps”.
We work with ecologists all over the world to help them spend less time annotating images and more time thinking about conservation. You can read a little more about how this works on our "getting started with MegaDetector" page.
Here are a few of the organizations that have used MegaDetector... we're only listing organizations who (a) we know about and (b) have given us permission to refer to them here (or have posted publicly about their use of MegaDetector), so if you're using MegaDetector or other tools from this repo and would like to be added to this list, email us!
- Canadian Parks and Wilderness Society (CPAWS) Northern Alberta Chapter
- Applied Conservation Macro Ecology Lab, University of Victoria
- Banff National Park Resource Conservation, Parks Canada
- Blumstein Lab, UCLA
- Borderlands Research Institute, Sul Ross State University
- Capitol Reef National Park / Utah Valley University
- Canyon Critters Project, University of Georgia
- Center for Biodiversity and Conservation, American Museum of Natural History
- Centre for Ecosystem Science, UNSW Sydney
- Cross-Cultural Ecology Lab, Macquarie University
- DC Cat Count, led by the Humane Rescue Alliance
- Department of Fish and Wildlife Sciences, University of Idaho
- Department of Society & Conservation, W.A. Franke College of Forestry & Conservation, University of Montana
- Department of Wildlife Ecology and Conservation, University of Florida
- Ecology and Conservation of Amazonian Vertebrates Research Group, Federal University of Amapá
- Gola Forest Programme, Royal Society for the Protection of Birds (RSPB)
- Graeme Shannon's Research Group, Bangor University
- Grizzly Bear Recovery Program, U.S. Fish & Wildlife Service
- Hamaarag, The Steinhardt Museum of Natural History, Tel Aviv University
- Institut des Science de la Forêt Tempérée (ISFORT), Université du Québec en Outaouais
- Lab of Dr. Bilal Habib, the Wildlife Institute of India
- Landscape Ecology Lab, Concordia University
- Mammal Spatial Ecology and Conservation Lab, Washington State University
- McLoughlin Lab in Population Ecology, University of Saskatchewan
- National Wildlife Refuge System, Southwest Region, U.S. Fish & Wildlife Service
- Northern Great Plains Program, Smithsonian
- Polar Ecology Group, University of Gdansk
- Quantitative Ecology Lab, University of Washington
- San Diego Field Station, U.S. Geological Survey
- Santa Monica Mountains Recreation Area, National Park Service
- Seattle Urban Carnivore Project, Woodland Park Zoo
- Serra dos Órgãos National Park, ICMBio
- Snapshot USA, Smithsonian
- TROPECOLNET project, Museo Nacional de Ciencias Naturales
- Wildlife Coexistence Lab, University of British Columbia
- Wildlife Research, Oregon Department of Fish and Wildlife
- Wildlife Division, Michigan Department of Natural Resources
- Department of Ecology, TU Berlin
- Ghost Cat Analytics
- Protected Areas Unit, Canadian Wildlife Service
- Conservation and Restoration Science Branch, New South Wales Department of Climate Change, Energy, the Environment and Water
- School of Natural Sciences, University of Tasmania (story)
- Kenai National Wildlife Refuge, U.S. Fish & Wildlife Service (story)
- Australian Wildlife Conservancy (blog posts 1, 2, 3)
- Island Conservation (blog posts 1, 2) (video)
- Alberta Biodiversity Monitoring Institute (ABMI) (WildTrax platform) (blog posts 1, 2)
- Shan Shui Conservation Center (blog post) (translated blog post) (Web demo)
- Road Ecology Center, University of California, Davis (Wildlife Observer Network platform)
- The Nature Conservancy in California (Animl platform) (story)
Also see:
- The list of MD-related GUIs, platforms, and GitHub repos within the MegaDetector User Guide
- Peter's map of EcoAssist users (who are also MegaDetector users!)
- The list of papers tagged "MegaDetector" on our list of papers about ML and camera traps
MegaDetector was initially developed by the Microsoft AI for Earth program; this repo was forked from the microsoft/cameratraps repo and is maintained by the original MegaDetector developers (who are no longer at Microsoft, but are absolutely fantastically eternally grateful to Microsoft for the investment and commitment that made MegaDetector happen). If you're interested in MD's history, see the "downloading the model" section in the MegaDetector User Guide to learn about the history of MegaDetector releases, and the "can you share the training data?" section to learn about the training data used in each of those releases.
The core functionality provided in this repo is:
- Tools for training and running MegaDetector.
- Tools for working with MegaDetector output, e.g. for reviewing the results of a large processing batch.
- Tools to convert among frequently-used camera trap metadata formats.
This repo does not host the data used to train MegaDetector, but we work with our collaborators to make data and annotations available whenever possible on lila.science. See the MegaDetector training data section to learn more about the data used to train MegaDetector.
This repo is organized into the following folders...
Code for running models, especially MegaDetector.
Code for common operations one might do after running MegaDetector, e.g. generating preview pages to summarize your results, separating images into different folders based on AI results, or converting results to a different format.
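To make the "separating images into different folders" idea a bit more concrete, here is a minimal sketch that uses only the Python standard library, not the actual scripts in this repo. It assumes a results dictionary with `images`, `detections`, `category`, `conf`, and `detection_categories` fields; those field names, the sample data, and the `sort_images` helper are assumptions made for illustration, so check the MegaDetector User Guide for the real output format.

```python
from collections import defaultdict

# Hypothetical results, shaped the way we ASSUME a MegaDetector output file
# looks (the field names are an assumption, not taken from this README).
results = {
    "detection_categories": {"1": "animal", "2": "person", "3": "vehicle"},
    "images": [
        {"file": "a/img001.jpg", "detections": [
            {"category": "1", "conf": 0.92, "bbox": [0.1, 0.2, 0.3, 0.4]}]},
        {"file": "a/img002.jpg", "detections": [
            {"category": "1", "conf": 0.05, "bbox": [0.5, 0.5, 0.1, 0.1]}]},
        {"file": "b/img003.jpg", "detections": [
            {"category": "2", "conf": 0.81, "bbox": [0.0, 0.0, 0.5, 0.9]},
            {"category": "3", "conf": 0.40, "bbox": [0.4, 0.4, 0.2, 0.2]}]},
        {"file": "b/img004.jpg", "detections": []},
    ],
}


def sort_images(results, threshold=0.2):
    """Group image paths by the categories detected at or above `threshold`.

    Images with no detection at or above the threshold are grouped as "blank".
    """
    names = results["detection_categories"]
    folders = defaultdict(list)
    for image in results["images"]:
        # "detections" may be missing or None for images that failed to load.
        found = {
            names[d["category"]]
            for d in image.get("detections") or []
            if d["conf"] >= threshold
        }
        label = "_".join(sorted(found)) if found else "blank"
        folders[label].append(image["file"])
    return dict(folders)


folders = sort_images(results)
for label, files in sorted(folders.items()):
    print(label, files)
```

The threshold of 0.2 is an arbitrary placeholder; a sensible value depends on your data and on how costly a missed animal is compared to an extra blank image.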
Small utility functions for string manipulation, filename manipulation, downloading files from URLs, etc.
Tools for visualizing images with ground truth and/or predicted bounding boxes.
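Drawing a predicted box usually starts with turning box coordinates into pixels. The sketch below is not the repo's visualization code; it assumes boxes are stored as normalized `[x_min, y_min, width, height]` values between 0 and 1, and `to_pixel_box` is a hypothetical helper name.

```python
def to_pixel_box(bbox, image_width, image_height):
    """Convert a normalized [x_min, y_min, width, height] box to pixel corners.

    Returns (left, top, right, bottom) as integers, clamped to the image.
    The normalized box layout is an assumption made for this example.
    """
    x_min, y_min, box_w, box_h = bbox
    left = max(0, round(x_min * image_width))
    top = max(0, round(y_min * image_height))
    right = min(image_width, round((x_min + box_w) * image_width))
    bottom = min(image_height, round((y_min + box_h) * image_height))
    return left, top, right, bottom


box = to_pixel_box([0.25, 0.5, 0.5, 0.25], image_width=1920, image_height=1080)
print(box)  # (480, 540, 1440, 810)
```

The resulting corner tuple is in the form that most drawing libraries accept for rectangles.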
Code for:
- Converting frequently-used metadata formats to COCO Camera Traps format
- Converting the output of AI models (especially YOLOv5) to the format used for AI results throughout this repo
- Creating, visualizing, and editing COCO Camera Traps .json databases
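As a rough picture of what a metadata conversion involves, the sketch below turns a few per-image records into a minimal COCO-Camera-Traps-style dictionary with image-level labels. The input records, the ID scheme, and the `to_coco_camera_traps` function are hypothetical; the real converters in this folder handle far more fields and edge cases.

```python
import json

# Hypothetical per-image records, e.g. parsed from a project-specific CSV file.
records = [
    {"file": "site1/img001.jpg", "species": "deer", "location": "site1"},
    {"file": "site1/img002.jpg", "species": "empty", "location": "site1"},
    {"file": "site2/img003.jpg", "species": "deer", "location": "site2"},
]


def to_coco_camera_traps(records):
    """Build a minimal COCO-Camera-Traps-style dict with image-level labels."""
    category_ids = {}
    images, annotations = [], []
    for index, record in enumerate(records):
        # Assign category IDs in order of first appearance.
        category_id = category_ids.setdefault(record["species"], len(category_ids))
        image_id = f"image_{index:06d}"
        images.append({
            "id": image_id,
            "file_name": record["file"],
            "location": record["location"],
        })
        # No "bbox" field: these are image-level labels, not boxes.
        annotations.append({
            "id": f"annotation_{index:06d}",
            "image_id": image_id,
            "category_id": category_id,
        })
    categories = [{"id": i, "name": n} for n, i in category_ids.items()]
    return {
        "info": {"version": "1.0", "description": "hypothetical example"},
        "images": images,
        "annotations": annotations,
        "categories": categories,
    }


database = to_coco_camera_traps(records)
print(json.dumps(database["categories"]))
```

Writing the returned dictionary with `json.dump` would give a .json database of the kind mentioned above, under the assumptions stated here.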
Code for hosting our models as an API, either for synchronous operation (i.e., for real-time inference) or as a batch process (for large biodiversity surveys).
Experimental code for training species classifiers on new data sets, generally trained on MegaDetector crops. Currently the main pipeline described in this folder relies on a large database of labeled images that is not publicly available; therefore, this folder is not yet set up to facilitate training of your own classifiers. However, it is useful for users of the classifiers that we train, and contains some useful starting points if you are going to take a "DIY" approach to training classifiers on cropped images.
All that said, here's another "teaser image" of what you get at the end of training and running a classifier:
Image credit University of Minnesota, from the Snapshot Safari program.
Code to facilitate mapping data-set-specific category names (e.g. "lion", which means very different things in Idaho vs. South Africa) to a standard taxonomy.
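The "lion" problem can be sketched with a lookup keyed on both the data set and the local label. The table and the `map_label` helper below are hypothetical and are not how this folder actually stores its mappings; they only illustrate why the data set has to be part of the key.

```python
# Hypothetical mapping table: (data set, local label) -> scientific name.
TAXONOMY = {
    ("idaho", "lion"): "Puma concolor",
    ("south_africa", "lion"): "Panthera leo",
    ("idaho", "elk"): "Cervus canadensis",
}


def map_label(dataset, label, taxonomy=TAXONOMY):
    """Return the standard name for a data-set-specific label, or None if unmapped."""
    key = (dataset.strip().lower(), label.strip().lower())
    return taxonomy.get(key)


print(map_label("Idaho", "Lion"))         # Puma concolor
print(map_label("south_africa", "lion"))  # Panthera leo
```

Returning `None` for unmapped labels, rather than guessing, keeps unknown names visible so they can be reviewed by a person.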
Environment files... specifically .yml files for mamba/conda environments (these are what we recommend in our MegaDetector User Guide), and a requirements.txt for the pip-inclined.
Media used in documentation.
Old code that we didn't quite want to delete, but is basically obsolete.
Random things that don't fit in any other directory, but aren't quite deprecated. Mostly postprocessing scripts that were built for a single use case but could potentially be useful in the future.
A handful of images from LILA that facilitate testing and debugging.
For questions about this repo, contact [email protected].
You can also chat with us and the broader camera trap AI community on the AI for Conservation forum at WILDLABS or the AI for Conservation Slack group.
Image credit USDA, from the NACTI data set.
You will find lots more gratuitous camera trap pictures sprinkled about this repo. It's like a scavenger hunt.
This repository is licensed with the MIT license.
Code written on or before April 28, 2023 is copyright Microsoft.
This project welcomes contributions, as pull requests, issues, or suggestions by email. We have a list of issues that we're hoping to address, many of which would be good starting points for new contributors. We also depend on other open-source tools that help users run MegaDetector (e.g. EcoAssist and CamTrap Detector) and work with MegaDetector results (e.g. Timelapse); if you are looking to get involved in GUI development, reach out to the developers of those tools as well!
If you are interested in getting involved in the conservation technology space, and MegaDetector just happens to be the first page you landed on, and none of our open issues are getting you fired up, don't fret! Head over to the WILDLABS discussion forums and let the community know you're a developer looking to get involved. Someone needs your help!