chat-ui
Open source codebase powering the HuggingChat app
Stars: 7407
README:
Find the docs at hf.co/docs/chat-ui.
A chat interface using open source models, eg OpenAssistant or Llama. It is a SvelteKit app and it powers the HuggingChat app on hf.co/chat.
- Quickstart
- No Setup Deploy
- Setup
- Launch
- Web Search
- Text Embedding Models
- Extra parameters
- Common issues
- Deploying to a HF Space
- Building
You can quickly start a locally running chat-ui & LLM text-generation server thanks to chat-ui's llama.cpp server support.
Step 1 (Start llama.cpp server):
Install llama.cpp w/ brew (for Mac):
# install llama.cpp
brew install llama.cpp
or build directly from the source for your target device:
git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp && make
Next, start the server with the LLM of your choice:
# start llama.cpp server (using hf.co/microsoft/Phi-3-mini-4k-instruct-gguf as an example)
llama-server --hf-repo microsoft/Phi-3-mini-4k-instruct-gguf --hf-file Phi-3-mini-4k-instruct-q4.gguf -c 4096
A local LLaMA.cpp HTTP Server will start on http://localhost:8080. Read more here.
Step 2 (tell chat-ui to use local llama.cpp server):
Add the following to your .env.local:
MODELS=`[
{
"name": "Local microsoft/Phi-3-mini-4k-instruct-gguf",
"tokenizer": "microsoft/Phi-3-mini-4k-instruct-gguf",
"preprompt": "",
"parameters": {
"stop": ["<|end|>", "<|endoftext|>", "<|assistant|>"],
"temperature": 0.7,
"max_new_tokens": 1024,
"truncate": 3071
},
"endpoints": [{
"type" : "llamacpp",
"baseURL": "http://localhost:8080"
}],
},
]`
The tokenizer field will be used to find the appropriate chat template for the model. Make sure to fill in a valid model from the Hugging Face hub.
Read more here.
Step 3 (make sure you have MongoDB running locally):
docker run -d -p 27017:27017 --name mongo-chatui mongo:latest
Read more here.
Step 4 (start chat-ui):
git clone https://github.com/huggingface/chat-ui
cd chat-ui
npm install
npm run dev -- --open
Read more here.
If you don't want to configure, setup, and launch your own Chat UI yourself, you can use this option as a fast deploy alternative.
You can deploy your own customized Chat UI instance with any supported LLM of your choice on Hugging Face Spaces. To do so, use the chat-ui template available here.
Set HF_TOKEN in Space secrets to deploy a model with gated access or a model in a private repository. It's also compatible with Inference for PROs curated list of powerful models with higher rate limits. Make sure to create your personal token first in your User Access Tokens settings.
Read the full tutorial here.
The default config for Chat UI is stored in the .env file. You will need to override some values to get Chat UI to run locally. This is done in .env.local.
Start by creating a .env.local file in the root of the repository. The bare minimum config you need to get Chat UI to run locally is the following:
MONGODB_URL=<the URL to your MongoDB instance>
HF_TOKEN=<your access token>
The chat history is stored in a MongoDB instance, and having a DB instance available is needed for Chat UI to work.
You can use a local MongoDB instance. The easiest way is to spin one up using docker:
docker run -d -p 27017:27017 --name mongo-chatui mongo:latest
In which case the URL of your DB will be MONGODB_URL=mongodb://localhost:27017.
Alternatively, you can use a free MongoDB Atlas instance; Chat UI should fit comfortably within their free tier. You can then set the MONGODB_URL variable in .env.local to match your instance.
If you use a remote inference endpoint, you will need a Hugging Face access token to run Chat UI locally. You can get one from your Hugging Face profile.
After you're done with the .env.local file you can run Chat UI locally with:
npm install
npm run dev
Chat UI features a powerful Web Search feature. It works by:
- Generating an appropriate search query from the user prompt.
- Performing web search and extracting content from webpages.
- Creating embeddings from texts using a text embedding model.
- From these embeddings, find the ones that are closest to the user query using a vector similarity search (specifically, inner product distance), as shown in the sketch below.
- Get the corresponding texts for those closest embeddings and perform Retrieval-Augmented Generation (i.e. expand the user prompt by adding those texts so that an LLM can use this information).
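The similarity step can be sketched in TypeScript as follows. This is only a minimal illustration; the TextChunk type and rankChunksByInnerProduct function are hypothetical names, not chat-ui's actual internals:

// Rank text chunks by inner-product similarity to the query embedding.
interface TextChunk {
  text: string;
  embedding: number[]; // produced by the configured text embedding model
}

function innerProduct(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

function rankChunksByInnerProduct(queryEmbedding: number[], chunks: TextChunk[], topK = 5): TextChunk[] {
  // Higher inner product = closer to the query; keep the topK chunks.
  return [...chunks]
    .sort((a, b) => innerProduct(queryEmbedding, b.embedding) - innerProduct(queryEmbedding, a.embedding))
    .slice(0, topK);
}
// The top-ranked chunks are then appended to the user prompt (RAG) before the LLM is queried.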
By default (for backward compatibility), when the TEXT_EMBEDDING_MODELS environment variable is not defined, transformers.js embedding models will be used for embedding tasks, specifically the Xenova/gte-small model.
You can customize the embedding model by setting TEXT_EMBEDDING_MODELS in your .env.local file. For example:
TEXT_EMBEDDING_MODELS = `[
{
"name": "Xenova/gte-small",
"displayName": "Xenova/gte-small",
"description": "locally running embedding",
"chunkCharLength": 512,
"endpoints": [
{"type": "transformersjs"}
]
},
{
"name": "intfloat/e5-base-v2",
"displayName": "intfloat/e5-base-v2",
"description": "hosted embedding model",
"chunkCharLength": 768,
"preQuery": "query: ", # See https://huggingface.co/intfloat/e5-base-v2#faq
"prePassage": "passage: ", # See https://huggingface.co/intfloat/e5-base-v2#faq
"endpoints": [
{
"type": "tei",
"url": "http://127.0.0.1:8080/",
"authorization": "TOKEN_TYPE TOKEN" // optional authorization field. Example: "Basic VVNFUjpQQVNT"
}
]
}
]`
The required fields are name, chunkCharLength and endpoints.
Supported text embedding backends are: transformers.js, TEI and OpenAI. transformers.js models run locally as part of chat-ui, whereas TEI models run in a different environment and are accessed through an API endpoint. openai models are accessed through the OpenAI API.
When more than one embedding model is supplied in the .env.local file, the first will be used by default, and the others will only be used for LLMs whose embeddingModel is configured to the name of the model.
The login feature is disabled by default and users are attributed a unique ID based on their browser. But if you want to use OpenID to authenticate your users, you can add the following to your .env.local file:
OPENID_CONFIG=`{
PROVIDER_URL: "<your OIDC issuer>",
CLIENT_ID: "<your OIDC client ID>",
CLIENT_SECRET: "<your OIDC client secret>",
SCOPES: "openid profile",
TOLERANCE: // optional
RESOURCE: // optional
}`
These variables will enable the openID sign-in modal for users.
You can set the env variable TRUSTED_EMAIL_HEADER to point to the header that contains the user's email address. This will allow you to authenticate users from the header. This setup is usually combined with a proxy that sits in front of chat-ui, handles the auth and sets the header.
[!WARNING] Make sure to only allow requests to chat-ui through your proxy which handles authentication, otherwise users could authenticate as anyone by setting the header manually! Only set this up if you understand the implications and know how to do it correctly.
Here is a list of header names for common auth providers:
- Tailscale Serve: Tailscale-User-Login
- Cloudflare Access: Cf-Access-Authenticated-User-Email
- oauth2-proxy: X-Forwarded-Email
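For example, if your authenticating proxy is Cloudflare Access, the corresponding .env.local entry would be:
TRUSTED_EMAIL_HEADER=Cf-Access-Authenticated-User-Email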
You can use a few environment variables to customize the look and feel of chat-ui. These are by default:
PUBLIC_APP_NAME=ChatUI
PUBLIC_APP_ASSETS=chatui
PUBLIC_APP_COLOR=blue
PUBLIC_APP_DESCRIPTION="Making the community's best AI chat models available to everyone."
PUBLIC_APP_DATA_SHARING=
PUBLIC_APP_DISCLAIMER=
- PUBLIC_APP_NAME The name used as a title throughout the app.
- PUBLIC_APP_ASSETS Is used to find logos & favicons in static/$PUBLIC_APP_ASSETS, current options are chatui and huggingchat.
- PUBLIC_APP_COLOR Can be any of the tailwind colors.
- PUBLIC_APP_DATA_SHARING Can be set to 1 to add a toggle in the user settings that lets your users opt-in to data sharing with the model's creator.
- PUBLIC_APP_DISCLAIMER If set to 1, we show a disclaimer about generated outputs on login.
You can enable the web search through an API by adding YDC_API_KEY (docs.you.com) or SERPER_API_KEY (serper.dev) or SERPAPI_KEY (serpapi.com) or SERPSTACK_API_KEY (serpstack.com) or SEARCHAPI_KEY (searchapi.io) to your .env.local.
You can also simply enable the local Google web search by setting USE_LOCAL_WEBSEARCH=true in your .env.local, or specify a SearXNG instance by adding the query URL to SEARXNG_QUERY_URL.
You can enable JavaScript when parsing webpages to improve compatibility by setting WEBSEARCH_JAVASCRIPT=true, at the cost of increased CPU usage. You'll want at least 4 cores when enabling this.
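For example, a minimal .env.local sketch enabling web search through serper.dev (the key value is a placeholder), or locally without an external provider:
SERPER_API_KEY=<your serper.dev API key>
# or, with no external provider:
# USE_LOCAL_WEBSEARCH=true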
You can customize the parameters passed to the model or even use a new model by updating the MODELS variable in your .env.local. The default one can be found in .env and looks like this:
MODELS=`[
{
"name": "mistralai/Mistral-7B-Instruct-v0.2",
"displayName": "mistralai/Mistral-7B-Instruct-v0.2",
"description": "Mistral 7B is a new Apache 2.0 model, released by Mistral AI that outperforms Llama2 13B in benchmarks.",
"websiteUrl": "https://mistral.ai/news/announcing-mistral-7b/",
"preprompt": "",
"chatPromptTemplate" : "<s>{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}}{{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s>{{/ifAssistant}}{{/each}}",
"parameters": {
"temperature": 0.3,
"top_p": 0.95,
"repetition_penalty": 1.2,
"top_k": 50,
"truncate": 3072,
"max_new_tokens": 1024,
"stop": ["</s>"]
},
"promptExamples": [
{
"title": "Write an email from bullet list",
"prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
}, {
"title": "Code a snake game",
"prompt": "Code a basic snake game in python, give explanations for each step."
}, {
"title": "Assist in a task",
"prompt": "How do I make a delicious lemon cheesecake?"
}
]
}
]`
You can change things like the parameters, or customize the preprompt to better suit your needs. You can also add more models by adding more objects to the array, with different preprompts for example.
When querying the model for a chat response, the chatPromptTemplate template is used. messages is an array of chat messages in the format [{ content: string }, ...]. To identify whether a message is a user message or an assistant message, the ifUser and ifAssistant block helpers can be used.
The following is the default chatPromptTemplate, although newlines and indentation have been added for readability. You can find the prompts used in production for HuggingChat here.
{{preprompt}}
{{#each messages}}
{{#ifUser}}{{@root.userMessageToken}}{{content}}{{@root.userMessageEndToken}}{{/ifUser}}
{{#ifAssistant}}{{@root.assistantMessageToken}}{{content}}{{@root.assistantMessageEndToken}}{{/ifAssistant}}
{{/each}}
{{assistantMessageToken}}
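For illustration only, assuming hypothetical token values of "<|user|>" for userMessageToken, "<|assistant|>" for assistantMessageToken and "<|end|>" for both end tokens, a user message { content: "Hi" } followed by an assistant message { content: "Hello, how can I help?" } would render roughly as:
{preprompt}<|user|>Hi<|end|><|assistant|>Hello, how can I help?<|end|><|assistant|>
The trailing assistantMessageToken cues the model to produce the next assistant turn.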
We currently support IDEFICS (hosted on TGI), OpenAI and Claude 3 as multimodal models. You can enable multimodal support by setting multimodal: true in your MODELS configuration. For IDEFICS, you must have a PRO HF API token. For OpenAI, see the OpenAI section. For Anthropic, see the Anthropic section.
{
"name": "HuggingFaceM4/idefics-80b-instruct",
"multimodal" : true,
"description": "IDEFICS is the new multimodal model by Hugging Face.",
"preprompt": "",
"chatPromptTemplate" : "{{#each messages}}{{#ifUser}}User: {{content}}{{/ifUser}}<end_of_utterance>\nAssistant: {{#ifAssistant}}{{content}}\n{{/ifAssistant}}{{/each}}",
"parameters": {
"temperature": 0.1,
"top_p": 0.95,
"repetition_penalty": 1.2,
"top_k": 12,
"truncate": 1000,
"max_new_tokens": 1024,
"stop": ["<end_of_utterance>", "User:", "\nUser:"]
}
}
If you want to, instead of hitting models on the Hugging Face Inference API, you can run your own models locally.
A good option is to hit a text-generation-inference endpoint. This is what is done in the official Chat UI Spaces Docker template for instance: both this app and a text-generation-inference server run inside the same container.
To do this, you can add your own endpoints to the MODELS variable in .env.local, by adding an "endpoints" key for each model in MODELS.
{
// rest of the model config here
"endpoints": [{
"type" : "tgi",
"url": "https://HOST:PORT",
}]
}
If endpoints are left unspecified, ChatUI will look for the model on the hosted Hugging Face inference API using the model name.
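For instance, a minimal sketch like the following would make Chat UI query the hosted inference API for that model, since no endpoints are given (you may still want to set parameters and a chatPromptTemplate as in the examples above):
MODELS=`[
  {
    "name": "mistralai/Mistral-7B-Instruct-v0.2"
  }
]`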
Chat UI can be used with any API server that supports OpenAI API compatibility, for example text-generation-webui, LocalAI, FastChat, llama-cpp-python, ialacol, and vLLM.
The following example config makes Chat UI work with text-generation-webui. The endpoint.baseURL is the URL of the OpenAI-API-compatible server; this overrides the base URL used by the OpenAI instance. The endpoint.completion setting determines which endpoint is used: the default is chat_completions, which uses v1/chat/completions; change endpoint.completion to completions to use the v1/completions endpoint.
Parameters not supported by OpenAI (e.g., top_k, repetition_penalty, etc.) must be set in the extraBody of endpoints. Be aware that setting them in parameters will cause them to be omitted.
MODELS=`[
{
"name": "text-generation-webui",
"id": "text-generation-webui",
"parameters": {
"temperature": 0.9,
"top_p": 0.95,
"max_new_tokens": 1024,
"stop": []
},
"endpoints": [{
"type" : "openai",
"baseURL": "http://localhost:8000/v1",
"extraBody": {
"repetition_penalty": 1.2,
"top_k": 50,
"truncate": 1000
}
}]
}
]`
The openai type includes official OpenAI models. You can add, for example, GPT-4 or GPT-3.5 as an "openai" model:
OPENAI_API_KEY=#your openai api key here
MODELS=`[{
"name": "gpt-4",
"displayName": "GPT 4",
"endpoints" : [{
"type": "openai"
}]
},
{
"name": "gpt-3.5-turbo",
"displayName": "GPT 3.5 Turbo",
"endpoints" : [{
"type": "openai"
}]
}]`
You may also consume any model provider that offers a compatible OpenAI API endpoint. For example, you may self-host the Portkey gateway and experiment with Claude or GPT models offered by Azure OpenAI. Example for Claude from Anthropic:
MODELS=`[{
"name": "claude-2.1",
"displayName": "Claude 2.1",
"description": "Anthropic has been founded by former OpenAI researchers...",
"parameters": {
"temperature": 0.5,
"max_new_tokens": 4096,
},
"endpoints": [
{
"type": "openai",
"baseURL": "https://gateway.example.com/v1",
"defaultHeaders": {
"x-portkey-config": '{"provider":"anthropic","api_key":"sk-ant-abc...xyz"}'
}
}
]
}]`
Example for GPT 4 deployed on Azure OpenAI:
MODELS=`[{
"id": "gpt-4-1106-preview",
"name": "gpt-4-1106-preview",
"displayName": "gpt-4-1106-preview",
"parameters": {
"temperature": 0.5,
"max_new_tokens": 4096,
},
"endpoints": [
{
"type": "openai",
"baseURL": "https://{resource-name}.openai.azure.com/openai/deployments/{deployment-id}",
"defaultHeaders": {
"api-key": "{api-key}"
},
"defaultQuery": {
"api-version": "2023-05-15"
}
}
]
}]`
Or try Mistral from DeepInfra:
Note: apiKey can either be set per endpoint, or globally using the OPENAI_API_KEY variable.
MODELS=`[{
"name": "mistral-7b",
"displayName": "Mistral 7B",
"description": "A 7B dense Transformer, fast-deployed and easily customisable. Small, yet powerful for a variety of use cases. Supports English and code, and a 8k context window.",
"parameters": {
"temperature": 0.5,
"max_new_tokens": 4096,
},
"endpoints": [
{
"type": "openai",
"baseURL": "https://api.deepinfra.com/v1/openai",
"apiKey": "abc...xyz"
}
]
}]`
chat-ui also supports the llama.cpp API server directly without the need for an adapter. You can do this using the llamacpp endpoint type.
If you want to run Chat UI with llama.cpp, you can do the following, using microsoft/Phi-3-mini-4k-instruct-gguf as an example model:
# install llama.cpp
brew install llama.cpp
# start llama.cpp server
llama-server --hf-repo microsoft/Phi-3-mini-4k-instruct-gguf --hf-file Phi-3-mini-4k-instruct-q4.gguf -c 4096
MODELS=`[
{
"name": "Local Zephyr",
"chatPromptTemplate": "<|system|>\n{{preprompt}}</s>\n{{#each messages}}{{#ifUser}}<|user|>\n{{content}}</s>\n<|assistant|>\n{{/ifUser}}{{#ifAssistant}}{{content}}</s>\n{{/ifAssistant}}{{/each}}",
"parameters": {
"temperature": 0.1,
"top_p": 0.95,
"repetition_penalty": 1.2,
"top_k": 50,
"truncate": 1000,
"max_new_tokens": 2048,
"stop": ["</s>"]
},
"endpoints": [
{
"url": "http://127.0.0.1:8080",
"type": "llamacpp"
}
]
}
]`
Start chat-ui with npm run dev and you should be able to chat with Zephyr locally.
We also support the Ollama inference server. Spin up a model with
ollama run mistral
Then specify the endpoints like so:
MODELS=`[
{
"name": "Ollama Mistral",
"chatPromptTemplate": "<s>{{#each messages}}{{#ifUser}}[INST] {{#if @first}}{{#if @root.preprompt}}{{@root.preprompt}}\n{{/if}}{{/if}} {{content}} [/INST]{{/ifUser}}{{#ifAssistant}}{{content}}</s> {{/ifAssistant}}{{/each}}",
"parameters": {
"temperature": 0.1,
"top_p": 0.95,
"repetition_penalty": 1.2,
"top_k": 50,
"truncate": 3072,
"max_new_tokens": 1024,
"stop": ["</s>"]
},
"endpoints": [
{
"type": "ollama",
"url" : "http://127.0.0.1:11434",
"ollamaName" : "mistral"
}
]
}
]`
We also support Anthropic models (including multimodal ones via multimodal: true) through the official SDK. You may provide your API key via the ANTHROPIC_API_KEY env variable, or alternatively, through endpoints.apiKey as per the following example.
MODELS=`[
{
"name": "claude-3-haiku-20240307",
"displayName": "Claude 3 Haiku",
"description": "Fastest and most compact model for near-instant responsiveness",
"multimodal": true,
"parameters": {
"max_new_tokens": 4096,
},
"endpoints": [
{
"type": "anthropic",
// optionals
"apiKey": "sk-ant-...",
"baseURL": "https://api.anthropic.com",
"defaultHeaders": {},
"defaultQuery": {}
}
]
},
{
"name": "claude-3-sonnet-20240229",
"displayName": "Claude 3 Sonnet",
"description": "Ideal balance of intelligence and speed",
"multimodal": true,
"parameters": {
"max_new_tokens": 4096,
},
"endpoints": [
{
"type": "anthropic",
// optionals
"apiKey": "sk-ant-...",
"baseURL": "https://api.anthropic.com",
"defaultHeaders": {},
"defaultQuery": {}
}
]
},
{
"name": "claude-3-opus-20240229",
"displayName": "Claude 3 Opus",
"description": "Most powerful model for highly complex tasks",
"multimodal": true,
"parameters": {
"max_new_tokens": 4096
},
"endpoints": [
{
"type": "anthropic",
// optionals
"apiKey": "sk-ant-...",
"baseURL": "https://api.anthropic.com",
"defaultHeaders": {},
"defaultQuery": {}
}
]
}
]`
We also support using Anthropic models running on Vertex AI. Authentication is done using Google Application Default Credentials. The project ID can be provided through endpoints.projectId as per the following example:
MODELS=`[
{
"name": "claude-3-sonnet@20240229",
"displayName": "Claude 3 Sonnet",
"description": "Ideal balance of intelligence and speed",
"multimodal": true,
"parameters": {
"max_new_tokens": 4096,
},
"endpoints": [
{
"type": "anthropic-vertex",
"region": "us-central1",
"projectId": "gcp-project-id",
// optionals
"defaultHeaders": {},
"defaultQuery": {}
}
]
},
{
"name": "claude-3-haiku@20240307",
"displayName": "Claude 3 Haiku",
"description": "Fastest, most compact model for near-instant responsiveness",
"multimodal": true,
"parameters": {
"max_new_tokens": 4096
},
"endpoints": [
{
"type": "anthropic-vertex",
"region": "us-central1",
"projectId": "gcp-project-id",
// optionals
"defaultHeaders": {},
"defaultQuery": {}
}
]
}
]`
You can also specify your Amazon SageMaker instance as an endpoint for chat-ui. The config goes like this:
"endpoints": [
{
"type" : "aws",
"service" : "sagemaker"
"url": "",
"accessKey": "",
"secretKey" : "",
"sessionToken": "",
"region": "",
"weight": 1
}
]
You can also set "service" : "lambda" to use a lambda instance.
You can get the accessKey and secretKey from your AWS user, under programmatic access.
You can also use Cloudflare Workers AI to run your own models with serverless inference.
You will need to have a Cloudflare account, then get your account ID as well as your API token for Workers AI.
You can either specify them directly in your .env.local using the CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_API_TOKEN variables, or you can set them directly in the endpoint config.
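For example, in .env.local (placeholder values):
CLOUDFLARE_ACCOUNT_ID=<your Cloudflare account ID>
CLOUDFLARE_API_TOKEN=<your Workers AI API token>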
You can find the list of models available on Cloudflare here.
{
"name" : "nousresearch/hermes-2-pro-mistral-7b",
"tokenizer": "nousresearch/hermes-2-pro-mistral-7b",
"parameters": {
"stop": ["<|im_end|>"]
},
"endpoints" : [
{
"type" : "cloudflare"
<!-- optionally specify these
"accountId": "your-account-id",
"authToken": "your-api-token"
-->
}
]
}
You can also use Cohere to run their models directly from chat-ui. You will need to have a Cohere account, then get your API token. You can either specify it directly in your .env.local using the COHERE_API_TOKEN variable, or you can set it in the endpoint config.
Here is an example of a Cohere model config. You can set which model you want to use by setting the id field to the model name.
{
"name" : "CohereForAI/c4ai-command-r-v01",
"id": "command-r",
"description": "C4AI Command-R is a research release of a 35 billion parameter highly performant generative model",
"endpoints": [
{
"type": "cohere",
<!-- optionally specify these, or use COHERE_API_TOKEN
"apiKey": "your-api-token"
-->
}
]
}
Chat UI can connect to the Google Vertex API endpoints (list of supported models).
To enable:
- Select or create a Google Cloud project.
- Enable billing for your project.
- Enable the Vertex AI API.
- Set up authentication with a service account so you can access the API from your local workstation.
The service account credentials file can be imported as an environment variable:
GOOGLE_APPLICATION_CREDENTIALS = clientid.json
Make sure your Docker container has access to the file and that the variable is correctly set. Afterwards, Google Vertex endpoints can be configured as follows:
MODELS=`[
//...
{
"name": "gemini-1.5-pro",
"displayName": "Vertex Gemini Pro 1.5",
"multimodal": true,
"endpoints" : [{
"type": "vertex",
"project": "abc-xyz",
"location": "europe-west3",
"extraBody": {
"model_version": "gemini-1.5-pro-preview-0409",
},
// Optional
"safetyThreshold": "BLOCK_MEDIUM_AND_ABOVE",
"apiEndpoint": "", // alternative api endpoint url,
"tools": [{
"googleSearchRetrieval": {
"disableAttribution": true
}
}],
"multimodal": {
"image": {
"supportedMimeTypes": ["image/png", "image/jpeg", "image/webp"],
"preferredMimeType": "image/png",
"maxSizeInMB": 5,
"maxWidth": 2000,
"maxHeight": 1000,
}
}
}]
},
]`
LangChain applications that are deployed using LangServe can be called with the following config:
MODELS=`[
//...
{
"name": "summarization-chain", //model-name
"endpoints" : [{
"type": "langserve",
"url" : "http://127.0.0.1:8100",
}]
},
]`
Custom endpoints may require authorization, depending on how you configure them. Authentication will usually be set with either Basic or Bearer.
For Basic, we will need to generate a base64 encoding of the username and password.
echo -n "USER:PASS" | base64
VVNFUjpQQVNT
For Bearer you can use a token, which can be grabbed from here.
You can then add the generated information and the authorization parameter to your .env.local.
"endpoints": [
{
"url": "https://HOST:PORT",
"authorization": "Basic VVNFUjpQQVNT",
}
]
Please note that if HF_TOKEN is also set or not empty, it will take precedence.
If the model being hosted will be available on multiple servers/instances, add the weight parameter to your .env.local. The weight will be used to determine the probability of requesting a particular endpoint.
"endpoints": [
{
"url": "https://HOST:PORT",
"weight": 1
},
{
"url": "https://HOST:PORT",
"weight": 2
}
...
]
Custom endpoints may require client certificate authentication, depending on how you configure them. To enable mTLS between Chat UI and your custom endpoint, you will need to set USE_CLIENT_CERTIFICATE to true, and add the CERT_PATH and KEY_PATH parameters to your .env.local. These parameters should point to the location of the certificate and key files on your local machine. The certificate and key files should be in PEM format. The key file can be encrypted with a passphrase, in which case you will also need to add the CLIENT_KEY_PASSWORD parameter to your .env.local.
If you're using a certificate signed by a private CA, you will also need to add the CA_PATH parameter to your .env.local. This parameter should point to the location of the CA certificate file on your local machine.
If you're using a self-signed certificate, e.g. for testing or development purposes, you can set the REJECT_UNAUTHORIZED parameter to false in your .env.local. This will disable certificate validation, and allow Chat UI to connect to your custom endpoint.
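As a sketch, the relevant .env.local entries might look like the following (all paths and values are placeholders):
USE_CLIENT_CERTIFICATE=true
CERT_PATH=/path/to/client-cert.pem
KEY_PATH=/path/to/client-key.pem
# optional:
# CLIENT_KEY_PASSWORD=<passphrase>
# CA_PATH=/path/to/ca-cert.pem
# REJECT_UNAUTHORIZED=false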
A model can use any of the embedding models defined in .env.local (currently used when web searching). By default it will use the first embedding model, but this can be changed with the embeddingModel field:
TEXT_EMBEDDING_MODELS = `[
{
"name": "Xenova/gte-small",
"chunkCharLength": 512,
"endpoints": [
{"type": "transformersjs"}
]
},
{
"name": "intfloat/e5-base-v2",
"chunkCharLength": 768,
"endpoints": [
{"type": "tei", "url": "http://127.0.0.1:8080/", "authorization": "Basic VVNFUjpQQVNT"},
{"type": "tei", "url": "http://127.0.0.1:8081/"}
]
}
]`
MODELS=`[
{
"name": "Ollama Mistral",
"chatPromptTemplate": "...",
"embeddingModel": "intfloat/e5-base-v2"
"parameters": {
...
},
"endpoints": [
...
]
}
]`
Most likely you are running chat-ui over HTTP. The recommended option is to set up something like NGINX to handle HTTPS and proxy the requests to chat-ui. If you really need to run over HTTP you can add ALLOW_INSECURE_COOKIES=true to your .env.local.
Make sure to set your PUBLIC_ORIGIN in your .env.local to the correct URL as well.
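For example (a sketch; use your instance's actual public URL):
ALLOW_INSECURE_COOKIES=true
PUBLIC_ORIGIN=http://localhost:5173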
Create a DOTENV_LOCAL secret in your HF Space with the content of your .env.local, and it will be picked up automatically when you run.
To create a production version of your app:
npm run build
You can preview the production build with npm run preview.
To deploy your app, you may need to install an adapter for your target environment.
The config file for HuggingChat is stored in the chart/env/prod.yaml file. It is the source of truth for the environment variables used for our CI/CD pipeline. For HuggingChat, as we need to customize the app color, as well as the base path, we build a custom docker image. You can find the workflow here.
[!TIP] If you want to make changes to the model config used in production for HuggingChat, you should do so against chart/env/prod.yaml.
If you want to run an exact copy of HuggingChat locally, you will need to do the following first:
- Create an OAuth App on the hub with openid profile email permissions. Make sure to set the callback URL to something like http://localhost:5173/chat/login/callback which matches the right path for your local instance.
- Create an HF Token with your Hugging Face account. You will need a Pro account to be able to access some of the larger models available through HuggingChat.
- Create a free account with serper.dev (you will get 2500 free search queries).
- Run an instance of MongoDB, however you want (local or remote).
You can then create a new .env.SECRET_CONFIG file with the following content:
MONGODB_URL=<link to your mongo DB from step 4>
HF_TOKEN=<your HF token from step 2>
OPENID_CONFIG=`{
PROVIDER_URL: "https://huggingface.co",
CLIENT_ID: "<your client ID from step 1>",
CLIENT_SECRET: "<your client secret from step 1>",
}`
SERPER_API_KEY=<your serper API key from step 3>
MESSAGES_BEFORE_LOGIN=<can be any numerical value, or set to 0 to require login>
You can then run npm run updateLocalEnv in the root of chat-ui. This will create a .env.local file which combines the chart/env/prod.yaml and the .env.SECRET_CONFIG file. You can then run npm run dev to start your local instance of HuggingChat.
[!WARNING] The MONGODB_URL used for this script will be fetched from .env.local. Make sure it's correct! The command runs directly on the database.
You can populate the database with faker data using the populate script:
npm run populate <flags here>
At least one flag must be specified; the following flags are available:
- reset - resets the database
- all - populates all tables
- users - populates the users table
- settings - populates the settings table for existing users
- assistants - populates the assistants table for existing users
- conversations - populates the conversations table for existing users
For example, you could use it like so:
npm run populate reset
to clear out the database. Then log in to the app to create your user and run the following command:
npm run populate users settings assistants conversations
to populate the database with fake data, including fake conversations and assistants for your user.