csghub-server

csghub-server is the backend server for CSGHub. It helps users manage datasets and models, and run model inference, fine-tuning, and application spaces.

Stars: 787

CSGHub Server is a part of the open source and reliable large model assets management platform - CSGHub. It focuses on management of models, datasets, and other LLM assets through REST API. Key features include creation and management of users and organizations, auto-tagging of model and dataset labels, search functionality, online preview of dataset files, content moderation for text and image, download of individual files, tracking of model and dataset activity data. The tool is extensible and customizable, supporting different git servers, flexible LFS storage system configuration, and content moderation options. The roadmap includes support for more Git servers, Git LFS, dataset online viewer, model/dataset auto-tag, S3 protocol support, model format conversion, and model one-click deploy. The project is licensed under Apache 2.0 and welcomes contributions.

README:

CSGHub Server is part of CSGHub, an open source, reliable platform for managing large model assets. It focuses on managing models, datasets, and other LLM assets through a REST API.

Key Features:

  • Creation and management of users and organizations
  • Auto-tagging of model and dataset labels
  • Search for users, organizations, models, and data
  • Online preview of dataset files, such as .parquet files
  • Content moderation for both text and image
  • Download of individual files, including LFS files
  • Tracking of model and dataset activity data, such as download and like counts

Demo

To help users quickly understand the features and usage of CSGHub, we have recorded a demo video covering the main features and operating procedures.

  • The CSGHub demo video is shown below; you can also watch it on YouTube or Bilibili.

Please visit the OpenCSG website to experience the powerful management features.

Quick Start

System resource requirements: 4 CPU cores / 8 GB memory

Docker must be installed beforehand. This project has been tested on Ubuntu 22.

You can quickly deploy a local CSGHub Server instance with docker-compose:

# The API token should be at least 128 characters long, and HTTP requests to csghub-server require the API token to be sent as a Bearer token for authentication.
export STARHUB_SERVER_API_TOKEN=<API token>
mkdir -m 777 gitea minio_data
curl -L https://raw.githubusercontent.com/OpenCSGs/csghub-server/main/docker-compose.yml -o docker-compose.yml
docker-compose -f docker-compose.yml up -d
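
The token requirement in the comment above can be sketched in Go. This is a minimal illustration, not the project's own client code: the helper names `generateToken` and `newAuthorizedRequest` are hypothetical, the address `http://localhost:8080/` is an assumed placeholder for wherever your deployment listens, and no particular API route is implied. It generates a 128-character token when none is set and attaches it as a Bearer token.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
	"net/http"
	"os"
)

// minTokenLen mirrors the "at least 128 characters" note in the Quick Start.
const minTokenLen = 128

// generateToken (hypothetical helper) returns a random hex string of
// exactly minTokenLen characters: 64 random bytes encode to 128 hex digits.
func generateToken() (string, error) {
	buf := make([]byte, minTokenLen/2)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return hex.EncodeToString(buf), nil
}

// newAuthorizedRequest (hypothetical helper) builds a GET request that
// carries the token in the Authorization header as a Bearer token.
// len counts bytes, which equals characters for ASCII tokens.
func newAuthorizedRequest(url, token string) (*http.Request, error) {
	if len(token) < minTokenLen {
		return nil, errors.New("API token must be at least 128 characters long")
	}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

func main() {
	token := os.Getenv("STARHUB_SERVER_API_TOKEN")
	if token == "" {
		// No token exported: create one for demonstration purposes only.
		var err error
		if token, err = generateToken(); err != nil {
			fmt.Println("cannot generate token:", err)
			return
		}
	}
	// Assumed placeholder address; adjust to your deployment.
	req, err := newAuthorizedRequest("http://localhost:8080/", token)
	if err != nil {
		fmt.Println("cannot build request:", err)
		return
	}
	fmt.Println("Authorization header set:", req.Header.Get("Authorization") != "")
}
```

The request is only constructed, not sent, so the sketch runs without a server. To call a real endpoint, pass the request to `http.DefaultClient.Do`.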

Start CSGHub Server Services Locally

CSGHub supports TOML format for config files. When starting any service from the command line, you can specify the config file with the --config option:

go run cmd/csghub-server/main.go start server --config local.toml
go run cmd/csghub-server/main.go deploy runner --config local.toml
...

We provide an example config file that you can rename, modify as needed, and use. All available configuration options are defined in this Go file. The TOML configuration uses the snake_case naming convention, and key names map automatically to the corresponding struct field names.
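
The snake_case-to-field-name idea can be illustrated with a naive Go sketch. This is an assumption-laden illustration only: the function `snakeToField` and the sample keys are hypothetical, and the actual matching is performed by the configuration library the project uses, whose rules (for example around acronyms or struct tags) may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// snakeToField (hypothetical) converts a snake_case key into a CamelCase
// name by capitalizing the first letter of each underscore-separated part.
// It assumes ASCII keys and does not special-case acronyms.
func snakeToField(key string) string {
	var b strings.Builder
	for _, part := range strings.Split(key, "_") {
		if part == "" {
			continue // skip empty parts from leading or doubled underscores
		}
		b.WriteString(strings.ToUpper(part[:1]))
		b.WriteString(part[1:])
	}
	return b.String()
}

func main() {
	// Sample keys are invented for illustration; see the example config
	// file for the real option names.
	for _, key := range []string{"api_token", "enable_https", "port"} {
		fmt.Printf("%s -> %s\n", key, snakeToField(key))
	}
}
```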

Technical Architecture

csghub-server architecture

Extensible and customizable

  • Supports different git servers, such as Gitea, GitLab, etc.
  • Supports flexible configuration of the LFS storage system: you can use local storage or any third-party cloud storage service that is compatible with the S3 protocol.
  • Enable content moderation on demand, and choose any third-party content moderation service.

Roadmap

  • [x] Support more Git servers: Currently supports Gitea, with plans to support other mainstream Git servers in the future.
  • [x] Git LFS: Supports large files through Git LFS, with Git command operations and online download through the Web UI.
  • [x] Dataset online viewer: Dataset preview, with Top20/TopN loading preview for datasets stored in LFS format.
  • [x] Model/Dataset AutoTag: Supports custom metadata and automatic extraction of model/dataset tags.
  • [x] S3 protocol support: Supports the S3 (MinIO) storage protocol, providing higher reliability and storage cost-effectiveness.
  • [ ] Model format conversion: Conversion between mainstream model formats.
  • [x] Model one-click deploy: Supports integration with OpenCSG llm-inference to start model inference with one click.

License

We use the Apache 2.0 license, the content of which is detailed in the LICENSE file.

Contributing

If you wish to contribute, please follow the Contribution Guidelines. We are very excited about your contributions!

Acknowledgments

This project is based on open source projects such as Gin, DuckDB, MinIO, and Gitea. We would like to express our sincere gratitude to them for their open source contributions!

Contact Us

If you run into any problems while using the project, you can contact us in any of the following ways:

  1. Open an issue on GitHub
  2. Join our WeChat group by scanning the WeChat helper QR code
  3. Join our official Discord channel: OpenCSG Discord Channel
  4. Join our Slack workspace: OpenCSG Slack Channel
