LabelLLM

The Open-Source Data Annotation Platform

Stars: 634

LabelLLM is an open-source data annotation platform designed to optimize the data annotation process for LLM development. It offers flexible configuration, multimodal data support, comprehensive task management, and AI-assisted annotation. Users can access a suite of annotation tools, enjoy a user-friendly experience, and enhance efficiency. The platform allows real-time monitoring of annotation progress and quality control, ensuring data integrity and timeliness.

README:

LabelLLM: The Open-Source Data Annotation Platform

Product Introduction

LabelLLM is an open-source platform dedicated to optimizing the data annotation process that is integral to LLM development. It is designed as a powerful tool that helps independent developers and small to medium-sized research teams improve annotation efficiency. At its core, LabelLLM aims to make data annotation for model training simple and efficient by providing comprehensive task management and versatile multimodal data support.

Key Features

Flexible Configuration

LabelLLM has an adaptable framework with an array of task-specific tools that can be customized to meet the diverse needs of data annotation projects. This flexibility lets it fit a wide variety of task setups when preparing data for model training.

Multimodal Data Support

Recognizing the importance of diversity in data, LabelLLM extends its capabilities to encompass a wide range of data modalities, including audio, images, and video. This holistic approach ensures that users can undertake complex annotation projects involving multiple types of data, under a single unified platform.

Comprehensive Task Management

LabelLLM includes a task management system that offers real-time monitoring of annotation progress and quality control, which helps maintain the integrity and timeliness of the data preparation phase for all projects.

Artificial Intelligence Assisted Annotation

LabelLLM supports loading pre-annotations, which users can then refine and adjust according to their actual needs. This feature improves the efficiency and accuracy of annotation.

https://github.com/user-attachments/assets/1acb2096-38dc-4225-8aa5-bdb616862679

Product Characteristics

Versatility

With LabelLLM, users gain access to an extensive suite of data annotation tools, designed to cater to a wide array of tasks without compromising the efficacy or precision of annotations.

User-Friendly 

Beyond its robust capabilities, LabelLLM places a strong emphasis on user experience, offering intuitive configurations and workflow processes that streamline the setup and distribution of data annotation tasks.

Efficiency Enhanced

By incorporating AI-assisted annotations, LabelLLM dramatically increases annotation efficiency.

Getting Started

Video Tutorial

Click on the image below to watch the video:

Watch the video

Local Deployment

Deployment Tutorial Video: https://www.youtube.com/watch?v=KXofJzCOafk

  1. Clone the project locally or download the project code as a zip archive.

    Running on Linux is recommended. If you encounter problems during installation, refer to the FAQ.

  2. Install Docker: download the installer for your operating system, install it, and start the Docker service.

    Docker installation tutorial: https://docs.docker.com/get-docker/

  3. From the project directory, run the command:

    docker compose up
    

    Note: The initial installation may take some time, so please be patient and make sure you have a stable internet connection.
    If you are in China, you can speed up the download by adding the following registry mirror to /etc/docker/daemon.json and then restarting Docker:

    {
      "registry-mirrors": [
        "https://docker.m.daocloud.io"
      ]
    }

    Read more: https://github.com/DaoCloud/public-image-mirror?tab=readme-ov-file#%E6%9C%80%E4%BD%B3%E5%AE%9E%E8%B7%B5

  4. Open a browser, go to http://localhost:9001, and log in:

    username: user
    password: password

  5. Create a new access key and fill in the following fields:

  • Access Key: MekKrisWUnFFtsEk
  • Secret Key: XK4uxD1czzYFJCRTcM70jVrchccBdy6C

    You can find the built-in AK/SK environment variables in the ./backend/.env file. Alternatively, you can create a new access key and update the AK/SK in the .env file.

  6. Open your browser and visit the following addresses to access the platform:

    http://localhost:8086/supplier (labeling side)

    http://localhost:8086/operator (admin side)

    To share the deployment with other team members, replace localhost with the server's IP address. They can then use it directly without deploying it again.

    The first registered account is set as the administrator by default. Later accounts must be explicitly granted operator-side privileges, so do not forget the username and password of the first registered account!
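
As a rough illustration of the last step, the Python sketch below (not part of LabelLLM) builds the two addresses for a given host and checks whether the ports mentioned in the steps above accept TCP connections. The helper names team_urls and port_is_open are hypothetical and introduced here only for illustration; the ports 9001 and 8086 and the /supplier and /operator paths are taken from the steps above.

```python
import socket

# Ports taken from the deployment steps: 9001 (step 4) and 8086 (final step).
CONSOLE_PORT = 9001
APP_PORT = 8086


def team_urls(host: str) -> dict[str, str]:
    """Build the labeling and admin URLs for a host name or IP address."""
    base = f"http://{host}:{APP_PORT}"
    return {"labeling": f"{base}/supplier", "admin": f"{base}/operator"}


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    host = "localhost"  # replace with the server's IP address when sharing
    for name, port in (("console", CONSOLE_PORT), ("app", APP_PORT)):
        state = "reachable" if port_is_open(host, port) else "not reachable"
        print(f"{name} port {port}: {state}")
    for role, url in team_urls(host).items():
        print(f"{role}: {url}")
```

For example, team_urls("192.168.1.20")["labeling"] gives http://192.168.1.20:8086/supplier, where the IP address is a placeholder. A port reported as not reachable usually means the containers are still starting or a firewall is blocking access from other machines.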

Citation

@article{he2024opendatalab,
  title={Opendatalab: Empowering general artificial intelligence with open datasets},
  author={He, Conghui and Li, Wei and Jin, Zhenjiang and Xu, Chao and Wang, Bin and Lin, Dahua},
  journal={arXiv preprint arXiv:2407.13773},
  year={2024}
}

Technical Communication

Join the official Opendatalab Weibo group!

Links

  • LabelU (another multimodal labeling tool from Opendatalab)
  • MinerU (a one-stop, high-quality data extraction tool)

Configuration details

Backend Documentation Configuration File

Frontend Documentation Configuration File