tokencost
Easy token price estimates for 400+ LLMs. TokenOps.
Stars: 1613
Tokencost is a client-side tool for estimating the USD cost of calls to major Large Language Model (LLM) APIs from their prompts and completions. It tracks the latest price changes from major LLM providers, counts prompt tokens accurately before OpenAI requests are sent, and returns the cost of a prompt or completion with a single function call. Users can calculate prompt and completion costs for OpenAI requests, count tokens in prompts formatted as message lists or plain strings, and consult a cost table with current prices for many models. The tool also supports callback handlers for LLM wrapper and framework libraries such as LlamaIndex and LangChain.
README:
Clientside token counting + price estimation for LLM apps and AI agents.
Tokencost helps calculate the USD cost of using major Large Language Model (LLM) APIs by estimating the cost of prompts and completions.
Building AI agents? Check out AgentOps
- LLM price tracking: Major LLM providers frequently add new models and update pricing. This repo helps track the latest price changes.
- Token counting: Accurately count prompt tokens before sending OpenAI requests.
- Easy integration: Get the cost of a prompt or completion with a single function.
from tokencost import calculate_prompt_cost, calculate_completion_cost
model = "gpt-3.5-turbo"
prompt = [{ "role": "user", "content": "Hello world"}]
completion = "How may I assist you today?"
prompt_cost = calculate_prompt_cost(prompt, model)
completion_cost = calculate_completion_cost(completion, model)
print(f"{prompt_cost} + {completion_cost} = {prompt_cost + completion_cost}")
# 0.0000135 + 0.000014 = 0.0000275

Recommended: PyPI:
pip install tokencost

Calculating the cost of prompts and completions from OpenAI requests
from openai import OpenAI
from tokencost import calculate_prompt_cost, calculate_completion_cost
client = OpenAI()
model = "gpt-3.5-turbo"
prompt = [{ "role": "user", "content": "Say this is a test"}]
chat_completion = client.chat.completions.create(
messages=prompt, model=model
)
completion = chat_completion.choices[0].message.content
# "This is a test."
prompt_cost = calculate_prompt_cost(prompt, model)
completion_cost = calculate_completion_cost(completion, model)
print(f"{prompt_cost} + {completion_cost} = {prompt_cost + completion_cost}")
# 0.0000180 + 0.000010 = 0.0000280

Calculating cost using string prompts instead of messages:
from tokencost import calculate_prompt_cost
prompt_string = "Hello world"
response = "How may I assist you today?"
model = "gpt-3.5-turbo"
prompt_cost = calculate_prompt_cost(prompt_string, model)
print(f"Cost: ${prompt_cost}")
# Cost: $3e-06

Counting tokens
from tokencost import count_message_tokens, count_string_tokens
message_prompt = [{ "role": "user", "content": "Hello world"}]
# Counting tokens in prompts formatted as message lists
print(count_message_tokens(message_prompt, model="gpt-3.5-turbo"))
# 9
# Alternatively, counting tokens in string prompts
print(count_string_tokens(prompt="Hello world", model="gpt-3.5-turbo"))
# 2

Under the hood, strings and ChatML messages are tokenized using Tiktoken, OpenAI's official tokenizer. Tiktoken splits text into tokens (which can be parts of words or individual characters) and handles both raw strings and message formats with additional tokens for message formatting and roles.
For Anthropic models from version 3 onward (e.g. Sonnet 3.5, Haiku 3.5, and Opus 3), we use the Anthropic beta token counting API to ensure accurate token counts. For older Claude models, we approximate using Tiktoken with the cl100k_base encoding.
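The "additional tokens for message formatting and roles" can be illustrated with a minimal sketch. It is not tokencost's actual implementation: the overhead constants follow OpenAI's published guidance for recent gpt-3.5-turbo and gpt-4 chat models and are assumptions here, and `whitespace_tokenizer` and `count_chat_tokens` are hypothetical names. The stand-in tokenizer only splits on whitespace, which happens to agree with Tiktoken for this short input.

```python
from typing import Callable, Dict, List

# Assumed overhead constants (from OpenAI's guidance for recent chat models);
# other models and providers may use different values.
TOKENS_PER_MESSAGE = 3    # framing tokens wrapped around every message
TOKENS_PER_NAME = 1       # extra token when a message carries a "name" field
REPLY_PRIMING_TOKENS = 3  # the reply is primed with assistant framing tokens


def whitespace_tokenizer(text: str) -> int:
    """Hypothetical stand-in for a real tokenizer: one token per word."""
    return len(text.split())


def count_chat_tokens(
    messages: List[Dict[str, str]],
    count_text: Callable[[str], int] = whitespace_tokenizer,
) -> int:
    """Sum the text tokens of every field plus the per-message overhead."""
    total = 0
    for message in messages:
        total += TOKENS_PER_MESSAGE
        for key, value in message.items():
            total += count_text(value)
            if key == "name":
                total += TOKENS_PER_NAME
    return total + REPLY_PRIMING_TOKENS


message_prompt = [{"role": "user", "content": "Hello world"}]
# 3 (message framing) + 1 ("user") + 2 ("Hello world") + 3 (reply priming)
print(count_chat_tokens(message_prompt))
```

Under these assumptions the sketch yields 9 for the single "Hello world" message, while the bare string accounts for only 2 of those tokens. In practice, pass a real tokenizer as `count_text` rather than the whitespace stand-in.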
All prices are denominated in USD and can be found in model_prices.json.
- Prices last updated Jan 30, 2024 from LiteLLM's cost dictionary
| Model Name | Prompt Cost (USD) per 1M tokens | Completion Cost (USD) per 1M tokens | Max Prompt Tokens | Max Output Tokens |
|---|---|---|---|---|
| gpt-4 | $30.00 | $60.00 | 8,192 | 4096 |
| gpt-4o | $2.5 | $10.00 | 128,000 | 16384 |
| gpt-4o-audio-preview | $2.5 | $10.00 | 128,000 | 16384 |
| gpt-4o-audio-preview-2024-10-01 | $2.5 | $10.00 | 128,000 | 16384 |
| gpt-4o-mini | $0.15 | $0.6 | 128,000 | 16384 |
| gpt-4o-mini-2024-07-18 | $0.15 | $0.6 | 128,000 | 16384 |
| o1-mini | $1.1 | $4.4 | 128,000 | 65536 |
| o1-mini-2024-09-12 | $ 3.00 | $12.00 | 128,000 | 65536 |
| o1-preview | $15.00 | $60.00 | 128,000 | 32768 |
| o1-preview-2024-09-12 | $15.00 | $60.00 | 128,000 | 32768 |
| chatgpt-4o-latest | $ 5.00 | $15.00 | 128,000 | 4096 |
| gpt-4o-2024-05-13 | $ 5.00 | $15.00 | 128,000 | 4096 |
| gpt-4o-2024-08-06 | $2.5 | $10.00 | 128,000 | 16384 |
| gpt-4-turbo-preview | $10.00 | $30.00 | 128,000 | 4096 |
| gpt-4-0314 | $30.00 | $60.00 | 8,192 | 4096 |
| gpt-4-0613 | $30.00 | $60.00 | 8,192 | 4096 |
| gpt-4-32k | $60.00 | $120.00 | 32,768 | 4096 |
| gpt-4-32k-0314 | $60.00 | $120.00 | 32,768 | 4096 |
| gpt-4-32k-0613 | $60.00 | $120.00 | 32,768 | 4096 |
| gpt-4-turbo | $10.00 | $30.00 | 128,000 | 4096 |
| gpt-4-turbo-2024-04-09 | $10.00 | $30.00 | 128,000 | 4096 |
| gpt-4-1106-preview | $10.00 | $30.00 | 128,000 | 4096 |
| gpt-4-0125-preview | $10.00 | $30.00 | 128,000 | 4096 |
| gpt-4-vision-preview | $10.00 | $30.00 | 128,000 | 4096 |
| gpt-4-1106-vision-preview | $10.00 | $30.00 | 128,000 | 4096 |
| gpt-3.5-turbo | $1.5 | $ 2.00 | 16,385 | 4096 |
| gpt-3.5-turbo-0301 | $1.5 | $ 2.00 | 4,097 | 4096 |
| gpt-3.5-turbo-0613 | $1.5 | $ 2.00 | 4,097 | 4096 |
| gpt-3.5-turbo-1106 | $ 1.00 | $ 2.00 | 16,385 | 4096 |
| gpt-3.5-turbo-0125 | $0.5 | $1.5 | 16,385 | 4096 |
| gpt-3.5-turbo-16k | $ 3.00 | $ 4.00 | 16,385 | 4096 |
| gpt-3.5-turbo-16k-0613 | $ 3.00 | $ 4.00 | 16,385 | 4096 |
| ft:gpt-3.5-turbo | $ 3.00 | $ 6.00 | 16,385 | 4096 |
| ft:gpt-3.5-turbo-0125 | $ 3.00 | $ 6.00 | 16,385 | 4096 |
| ft:gpt-3.5-turbo-1106 | $ 3.00 | $ 6.00 | 16,385 | 4096 |
| ft:gpt-3.5-turbo-0613 | $ 3.00 | $ 6.00 | 4,096 | 4096 |
| ft:gpt-4-0613 | $30.00 | $60.00 | 8,192 | 4096 |
| ft:gpt-4o-2024-08-06 | $3.75 | $15.00 | 128,000 | 16384 |
| ft:gpt-4o-mini-2024-07-18 | $0.3 | $1.2 | 128,000 | 16384 |
| ft:davinci-002 | $ 2.00 | $ 2.00 | 16,384 | 4096 |
| ft:babbage-002 | $0.4 | $0.4 | 16,384 | 4096 |
| text-embedding-3-large | $0.13 | $ 0.00 | 8,191 | nan |
| text-embedding-3-small | $0.02 | $ 0.00 | 8,191 | nan |
| text-embedding-ada-002 | $0.1 | $ 0.00 | 8,191 | nan |
| text-embedding-ada-002-v2 | $0.1 | $ 0.00 | 8,191 | nan |
| text-moderation-stable | $ 0.00 | $ 0.00 | 32,768 | 0 |
| text-moderation-007 | $ 0.00 | $ 0.00 | 32,768 | 0 |
| text-moderation-latest | $ 0.00 | $ 0.00 | 32,768 | 0 |
| 256-x-256/dall-e-2 | -- | -- | nan | nan |
| 512-x-512/dall-e-2 | -- | -- | nan | nan |
| 1024-x-1024/dall-e-2 | -- | -- | nan | nan |
| hd/1024-x-1792/dall-e-3 | -- | -- | nan | nan |
| hd/1792-x-1024/dall-e-3 | -- | -- | nan | nan |
| hd/1024-x-1024/dall-e-3 | -- | -- | nan | nan |
| standard/1024-x-1792/dall-e-3 | -- | -- | nan | nan |
| standard/1792-x-1024/dall-e-3 | -- | -- | nan | nan |
| standard/1024-x-1024/dall-e-3 | -- | -- | nan | nan |
| whisper-1 | -- | -- | nan | nan |
| tts-1 | -- | -- | nan | nan |
| tts-1-hd | -- | -- | nan | nan |
| azure/tts-1 | -- | -- | nan | nan |
| azure/tts-1-hd | -- | -- | nan | nan |
| azure/whisper-1 | -- | -- | nan | nan |
| azure/o1-mini | $1.21 | $4.84 | 128,000 | 65536 |
| azure/o1-mini-2024-09-12 | $1.21 | $4.84 | 128,000 | 65536 |
| azure/o1-preview | $15.00 | $60.00 | 128,000 | 32768 |
| azure/o1-preview-2024-09-12 | $15.00 | $60.00 | 128,000 | 32768 |
| azure/gpt-4o | $2.5 | $10.00 | 128,000 | 16384 |
| azure/gpt-4o-2024-08-06 | $2.5 | $10.00 | 128,000 | 16384 |
| azure/gpt-4o-2024-05-13 | $ 5.00 | $15.00 | 128,000 | 4096 |
| azure/global-standard/gpt-4o-2024-08-06 | $2.5 | $10.00 | 128,000 | 16384 |
| azure/global-standard/gpt-4o-mini | $0.15 | $0.6 | 128,000 | 16384 |
| azure/gpt-4o-mini | $0.165 | $0.66 | 128,000 | 16384 |
| azure/gpt-4-turbo-2024-04-09 | $10.00 | $30.00 | 128,000 | 4096 |
| azure/gpt-4-0125-preview | $10.00 | $30.00 | 128,000 | 4096 |
| azure/gpt-4-1106-preview | $10.00 | $30.00 | 128,000 | 4096 |
| azure/gpt-4-0613 | $30.00 | $60.00 | 8,192 | 4096 |
| azure/gpt-4-32k-0613 | $60.00 | $120.00 | 32,768 | 4096 |
| azure/gpt-4-32k | $60.00 | $120.00 | 32,768 | 4096 |
| azure/gpt-4 | $30.00 | $60.00 | 8,192 | 4096 |
| azure/gpt-4-turbo | $10.00 | $30.00 | 128,000 | 4096 |
| azure/gpt-4-turbo-vision-preview | $10.00 | $30.00 | 128,000 | 4096 |
| azure/gpt-35-turbo-16k-0613 | $ 3.00 | $ 4.00 | 16,385 | 4096 |
| azure/gpt-35-turbo-1106 | $ 1.00 | $ 2.00 | 16,384 | 4096 |
| azure/gpt-35-turbo-0613 | $1.5 | $ 2.00 | 4,097 | 4096 |
| azure/gpt-35-turbo-0301 | $0.2 | $ 2.00 | 4,097 | 4096 |
| azure/gpt-35-turbo-0125 | $0.5 | $1.5 | 16,384 | 4096 |
| azure/gpt-35-turbo-16k | $ 3.00 | $ 4.00 | 16,385 | 4096 |
| azure/gpt-35-turbo | $0.5 | $1.5 | 4,097 | 4096 |
| azure/gpt-3.5-turbo-instruct-0914 | $1.5 | $ 2.00 | 4,097 | nan |
| azure/gpt-35-turbo-instruct | $1.5 | $ 2.00 | 4,097 | nan |
| azure/gpt-35-turbo-instruct-0914 | $1.5 | $ 2.00 | 4,097 | nan |
| azure/mistral-large-latest | $ 8.00 | $24.00 | 32,000 | nan |
| azure/mistral-large-2402 | $ 8.00 | $24.00 | 32,000 | nan |
| azure/command-r-plus | $ 3.00 | $15.00 | 128,000 | 4096 |
| azure/ada | $0.1 | $ 0.00 | 8,191 | nan |
| azure/text-embedding-ada-002 | $0.1 | $ 0.00 | 8,191 | nan |
| azure/text-embedding-3-large | $0.13 | $ 0.00 | 8,191 | nan |
| azure/text-embedding-3-small | $0.02 | $ 0.00 | 8,191 | nan |
| azure/standard/1024-x-1024/dall-e-3 | -- | $ 0.00 | nan | nan |
| azure/hd/1024-x-1024/dall-e-3 | -- | $ 0.00 | nan | nan |
| azure/standard/1024-x-1792/dall-e-3 | -- | $ 0.00 | nan | nan |
| azure/standard/1792-x-1024/dall-e-3 | -- | $ 0.00 | nan | nan |
| azure/hd/1024-x-1792/dall-e-3 | -- | $ 0.00 | nan | nan |
| azure/hd/1792-x-1024/dall-e-3 | -- | $ 0.00 | nan | nan |
| azure/standard/1024-x-1024/dall-e-2 | -- | $ 0.00 | nan | nan |
| azure_ai/jamba-instruct | $0.5 | $0.7 | 70,000 | 4096 |
| azure_ai/mistral-large | $ 4.00 | $12.00 | 32,000 | 8191 |
| azure_ai/mistral-small | $ 1.00 | $ 3.00 | 32,000 | 8191 |
| azure_ai/Meta-Llama-3-70B-Instruct | $1.1 | $0.37 | 8,192 | 2048 |
| azure_ai/Meta-Llama-3.1-8B-Instruct | $0.3 | $0.61 | 128,000 | 2048 |
| azure_ai/Meta-Llama-3.1-70B-Instruct | $2.68 | $3.54 | 128,000 | 2048 |
| azure_ai/Meta-Llama-3.1-405B-Instruct | $5.33 | $16.00 | 128,000 | 2048 |
| azure_ai/cohere-rerank-v3-multilingual | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| azure_ai/cohere-rerank-v3-english | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| azure_ai/Cohere-embed-v3-english | $0.1 | $ 0.00 | 512 | nan |
| azure_ai/Cohere-embed-v3-multilingual | $0.1 | $ 0.00 | 512 | nan |
| babbage-002 | $0.4 | $0.4 | 16,384 | 4096 |
| davinci-002 | $ 2.00 | $ 2.00 | 16,384 | 4096 |
| gpt-3.5-turbo-instruct | $1.5 | $ 2.00 | 8,192 | 4096 |
| gpt-3.5-turbo-instruct-0914 | $1.5 | $ 2.00 | 8,192 | 4097 |
| claude-instant-1 | $1.63 | $5.51 | 100,000 | 8191 |
| mistral/mistral-tiny | $0.25 | $0.25 | 32,000 | 8191 |
| mistral/mistral-small | $0.1 | $0.3 | 32,000 | 8191 |
| mistral/mistral-small-latest | $0.1 | $0.3 | 32,000 | 8191 |
| mistral/mistral-medium | $2.7 | $8.1 | 32,000 | 8191 |
| mistral/mistral-medium-latest | $2.7 | $8.1 | 32,000 | 8191 |
| mistral/mistral-medium-2312 | $2.7 | $8.1 | 32,000 | 8191 |
| mistral/mistral-large-latest | $ 2.00 | $ 6.00 | 128,000 | 128000 |
| mistral/mistral-large-2402 | $ 4.00 | $12.00 | 32,000 | 8191 |
| mistral/mistral-large-2407 | $ 3.00 | $ 9.00 | 128,000 | 128000 |
| mistral/pixtral-12b-2409 | $0.15 | $0.15 | 128,000 | 128000 |
| mistral/open-mistral-7b | $0.25 | $0.25 | 32,000 | 8191 |
| mistral/open-mixtral-8x7b | $0.7 | $0.7 | 32,000 | 8191 |
| mistral/open-mixtral-8x22b | $ 2.00 | $ 6.00 | 65,336 | 8191 |
| mistral/codestral-latest | $ 1.00 | $ 3.00 | 32,000 | 8191 |
| mistral/codestral-2405 | $ 1.00 | $ 3.00 | 32,000 | 8191 |
| mistral/open-mistral-nemo | $0.3 | $0.3 | 128,000 | 128000 |
| mistral/open-mistral-nemo-2407 | $0.3 | $0.3 | 128,000 | 128000 |
| mistral/open-codestral-mamba | $0.25 | $0.25 | 256,000 | 256000 |
| mistral/codestral-mamba-latest | $0.25 | $0.25 | 256,000 | 256000 |
| mistral/mistral-embed | $0.1 | -- | 8,192 | nan |
| deepseek-chat | $0.14 | $0.28 | 128,000 | 4096 |
| codestral/codestral-latest | $ 0.00 | $ 0.00 | 32,000 | 8191 |
| codestral/codestral-2405 | $ 0.00 | $ 0.00 | 32,000 | 8191 |
| text-completion-codestral/codestral-latest | $ 0.00 | $ 0.00 | 32,000 | 8191 |
| text-completion-codestral/codestral-2405 | $ 0.00 | $ 0.00 | 32,000 | 8191 |
| deepseek-coder | $0.14 | $0.28 | 128,000 | 4096 |
| groq/llama2-70b-4096 | $0.7 | $0.8 | 4,096 | 4096 |
| groq/llama3-8b-8192 | $0.05 | $0.08 | 8,192 | 8192 |
| groq/llama3-70b-8192 | $0.59 | $0.79 | 8,192 | 8192 |
| groq/llama-3.1-8b-instant | $0.05 | $0.08 | 8,192 | 8192 |
| groq/llama-3.1-70b-versatile | $0.59 | $0.79 | 8,192 | 8192 |
| groq/llama-3.1-405b-reasoning | $0.59 | $0.79 | 8,192 | 8192 |
| groq/mixtral-8x7b-32768 | $0.24 | $0.24 | 32,768 | 32768 |
| groq/gemma-7b-it | $0.07 | $0.07 | 8,192 | 8192 |
| groq/gemma2-9b-it | $0.2 | $0.2 | 8,192 | 8192 |
| groq/llama3-groq-70b-8192-tool-use-preview | $0.89 | $0.89 | 8,192 | 8192 |
| groq/llama3-groq-8b-8192-tool-use-preview | $0.19 | $0.19 | 8,192 | 8192 |
| cerebras/llama3.1-8b | $0.1 | $0.1 | 128,000 | 128000 |
| cerebras/llama3.1-70b | $0.6 | $0.6 | 128,000 | 128000 |
| friendliai/mixtral-8x7b-instruct-v0-1 | $0.4 | $0.4 | 32,768 | 32768 |
| friendliai/meta-llama-3-8b-instruct | $0.1 | $0.1 | 8,192 | 8192 |
| friendliai/meta-llama-3-70b-instruct | $0.8 | $0.8 | 8,192 | 8192 |
| claude-instant-1.2 | $0.163 | $0.551 | 100,000 | 8191 |
| claude-2 | $ 8.00 | $24.00 | 100,000 | 8191 |
| claude-2.1 | $ 8.00 | $24.00 | 200,000 | 8191 |
| claude-3-haiku-20240307 | $0.25 | $1.25 | 200,000 | 4096 |
| claude-3-haiku-latest | $0.25 | $1.25 | 200,000 | 4096 |
| claude-3-opus-20240229 | $15.00 | $75.00 | 200,000 | 4096 |
| claude-3-opus-latest | $15.00 | $75.00 | 200,000 | 4096 |
| claude-3-sonnet-20240229 | $ 3.00 | $15.00 | 200,000 | 4096 |
| claude-3-5-sonnet-20240620 | $ 3.00 | $15.00 | 200,000 | 8192 |
| claude-3-5-sonnet-20241022 | $ 3.00 | $15.00 | 200,000 | 8192 |
| claude-3-5-sonnet-latest | $ 3.00 | $15.00 | 200,000 | 8192 |
| text-bison | -- | -- | 8,192 | 2048 |
| text-bison@001 | -- | -- | 8,192 | 1024 |
| text-bison@002 | -- | -- | 8,192 | 1024 |
| text-bison32k | $0.125 | $0.125 | 8,192 | 1024 |
| text-bison32k@002 | $0.125 | $0.125 | 8,192 | 1024 |
| text-unicorn | $10.00 | $28.00 | 8,192 | 1024 |
| text-unicorn@001 | $10.00 | $28.00 | 8,192 | 1024 |
| chat-bison | $0.125 | $0.125 | 8,192 | 4096 |
| chat-bison@001 | $0.125 | $0.125 | 8,192 | 4096 |
| chat-bison@002 | $0.125 | $0.125 | 8,192 | 4096 |
| chat-bison-32k | $0.125 | $0.125 | 32,000 | 8192 |
| chat-bison-32k@002 | $0.125 | $0.125 | 32,000 | 8192 |
| code-bison | $0.125 | $0.125 | 6,144 | 1024 |
| code-bison@001 | $0.125 | $0.125 | 6,144 | 1024 |
| code-bison@002 | $0.125 | $0.125 | 6,144 | 1024 |
| code-bison32k | $0.125 | $0.125 | 6,144 | 1024 |
| code-bison-32k@002 | $0.125 | $0.125 | 6,144 | 1024 |
| code-gecko@001 | $0.125 | $0.125 | 2,048 | 64 |
| code-gecko@002 | $0.125 | $0.125 | 2,048 | 64 |
| code-gecko | $0.125 | $0.125 | 2,048 | 64 |
| code-gecko-latest | $0.125 | $0.125 | 2,048 | 64 |
| codechat-bison@latest | $0.125 | $0.125 | 6,144 | 1024 |
| codechat-bison | $0.125 | $0.125 | 6,144 | 1024 |
| codechat-bison@001 | $0.125 | $0.125 | 6,144 | 1024 |
| codechat-bison@002 | $0.125 | $0.125 | 6,144 | 1024 |
| codechat-bison-32k | $0.125 | $0.125 | 32,000 | 8192 |
| codechat-bison-32k@002 | $0.125 | $0.125 | 32,000 | 8192 |
| gemini-pro | $0.5 | $1.5 | 32,760 | 8192 |
| gemini-1.0-pro | $0.5 | $1.5 | 32,760 | 8192 |
| gemini-1.0-pro-001 | $0.5 | $1.5 | 32,760 | 8192 |
| gemini-1.0-ultra | $0.5 | $1.5 | 8,192 | 2048 |
| gemini-1.0-ultra-001 | $0.5 | $1.5 | 8,192 | 2048 |
| gemini-1.0-pro-002 | $0.5 | $1.5 | 32,760 | 8192 |
| gemini-1.5-pro | $1.25 | $ 5.00 | 2,097,152 | 8192 |
| gemini-1.5-pro-002 | $1.25 | $ 5.00 | 2,097,152 | 8192 |
| gemini-1.5-pro-001 | $1.25 | $ 5.00 | 1,000,000 | 8192 |
| gemini-1.5-pro-preview-0514 | $0.078125 | $0.3125 | 1,000,000 | 8192 |
| gemini-1.5-pro-preview-0215 | $0.078125 | $0.3125 | 1,000,000 | 8192 |
| gemini-1.5-pro-preview-0409 | $0.078125 | $0.3125 | 1,000,000 | 8192 |
| gemini-1.5-flash | $0.075 | $0.3 | 1,000,000 | 8192 |
| gemini-1.5-flash-exp-0827 | $0.004688 | $0.0046875 | 1,000,000 | 8192 |
| gemini-1.5-flash-002 | $0.075 | $0.3 | 1,048,576 | 8192 |
| gemini-1.5-flash-001 | $0.075 | $0.3 | 1,000,000 | 8192 |
| gemini-1.5-flash-preview-0514 | $0.075 | $0.0046875 | 1,000,000 | 8192 |
| gemini-pro-experimental | $ 0.00 | $ 0.00 | 1,000,000 | 8192 |
| gemini-flash-experimental | $ 0.00 | $ 0.00 | 1,000,000 | 8192 |
| gemini-pro-vision | $0.5 | $1.5 | 16,384 | 2048 |
| gemini-1.0-pro-vision | $0.5 | $1.5 | 16,384 | 2048 |
| gemini-1.0-pro-vision-001 | $0.5 | $1.5 | 16,384 | 2048 |
| medlm-medium | -- | -- | 32,768 | 8192 |
| medlm-large | -- | -- | 8,192 | 1024 |
| vertex_ai/claude-3-sonnet@20240229 | $ 3.00 | $15.00 | 200,000 | 4096 |
| vertex_ai/claude-3-5-sonnet@20240620 | $ 3.00 | $15.00 | 200,000 | 8192 |
| vertex_ai/claude-3-5-sonnet-v2@20241022 | $ 3.00 | $15.00 | 200,000 | 8192 |
| vertex_ai/claude-3-haiku@20240307 | $0.25 | $1.25 | 200,000 | 4096 |
| vertex_ai/claude-3-opus@20240229 | $15.00 | $75.00 | 200,000 | 4096 |
| vertex_ai/meta/llama3-405b-instruct-maas | $ 0.00 | $ 0.00 | 32,000 | 32000 |
| vertex_ai/meta/llama3-70b-instruct-maas | $ 0.00 | $ 0.00 | 32,000 | 32000 |
| vertex_ai/meta/llama3-8b-instruct-maas | $ 0.00 | $ 0.00 | 32,000 | 32000 |
| vertex_ai/meta/llama-3.2-90b-vision-instruct-maas | $ 0.00 | $ 0.00 | 128,000 | 2048 |
| vertex_ai/mistral-large@latest | $ 2.00 | $ 6.00 | 128,000 | 8191 |
| vertex_ai/mistral-large@2407 | $ 2.00 | $ 6.00 | 128,000 | 8191 |
| vertex_ai/mistral-nemo@latest | $0.15 | $0.15 | 128,000 | 128000 |
| vertex_ai/jamba-1.5-mini@001 | $0.2 | $0.4 | 256,000 | 256000 |
| vertex_ai/jamba-1.5-large@001 | $ 2.00 | $ 8.00 | 256,000 | 256000 |
| vertex_ai/jamba-1.5 | $0.2 | $0.4 | 256,000 | 256000 |
| vertex_ai/jamba-1.5-mini | $0.2 | $0.4 | 256,000 | 256000 |
| vertex_ai/jamba-1.5-large | $ 2.00 | $ 8.00 | 256,000 | 256000 |
| vertex_ai/mistral-nemo@2407 | $ 3.00 | $ 3.00 | 128,000 | 128000 |
| vertex_ai/codestral@latest | $0.2 | $0.6 | 128,000 | 128000 |
| vertex_ai/codestral@2405 | $0.2 | $0.6 | 128,000 | 128000 |
| vertex_ai/imagegeneration@006 | -- | -- | nan | nan |
| vertex_ai/imagen-3.0-generate-001 | -- | -- | nan | nan |
| vertex_ai/imagen-3.0-fast-generate-001 | -- | -- | nan | nan |
| text-embedding-004 | $0.1 | $ 0.00 | 2,048 | nan |
| text-multilingual-embedding-002 | $0.1 | $ 0.00 | 2,048 | nan |
| textembedding-gecko | $0.1 | $ 0.00 | 3,072 | nan |
| textembedding-gecko-multilingual | $0.1 | $ 0.00 | 3,072 | nan |
| textembedding-gecko-multilingual@001 | $0.1 | $ 0.00 | 3,072 | nan |
| textembedding-gecko@001 | $0.1 | $ 0.00 | 3,072 | nan |
| textembedding-gecko@003 | $0.1 | $ 0.00 | 3,072 | nan |
| text-embedding-preview-0409 | $0.00625 | $ 0.00 | 3,072 | nan |
| text-multilingual-embedding-preview-0409 | $0.00625 | $ 0.00 | 3,072 | nan |
| palm/chat-bison | $0.125 | $0.125 | 8,192 | 4096 |
| palm/chat-bison-001 | $0.125 | $0.125 | 8,192 | 4096 |
| palm/text-bison | $0.125 | $0.125 | 8,192 | 1024 |
| palm/text-bison-001 | $0.125 | $0.125 | 8,192 | 1024 |
| palm/text-bison-safety-off | $0.125 | $0.125 | 8,192 | 1024 |
| palm/text-bison-safety-recitation-off | $0.125 | $0.125 | 8,192 | 1024 |
| gemini/gemini-1.5-flash-002 | $0.075 | $0.3 | 1,048,576 | 8192 |
| gemini/gemini-1.5-flash-001 | $0.075 | $0.3 | 1,048,576 | 8192 |
| gemini/gemini-1.5-flash | $0.075 | $0.3 | 1,048,576 | 8192 |
| gemini/gemini-1.5-flash-latest | $0.075 | $0.3 | 1,048,576 | 8192 |
| gemini/gemini-1.5-flash-8b-exp-0924 | $ 0.00 | $ 0.00 | 1,048,576 | 8192 |
| gemini/gemini-1.5-flash-exp-0827 | $ 0.00 | $ 0.00 | 1,048,576 | 8192 |
| gemini/gemini-1.5-flash-8b-exp-0827 | $ 0.00 | $ 0.00 | 1,000,000 | 8192 |
| gemini/gemini-pro | $0.35 | $1.05 | 32,760 | 8192 |
| gemini/gemini-1.5-pro | $3.5 | $10.5 | 2,097,152 | 8192 |
| gemini/gemini-1.5-pro-002 | $3.5 | $10.5 | 2,097,152 | 8192 |
| gemini/gemini-1.5-pro-001 | $3.5 | $10.5 | 2,097,152 | 8192 |
| gemini/gemini-1.5-pro-exp-0801 | $3.5 | $10.5 | 2,097,152 | 8192 |
| gemini/gemini-1.5-pro-exp-0827 | $ 0.00 | $ 0.00 | 2,097,152 | 8192 |
| gemini/gemini-1.5-pro-latest | $3.5 | $1.05 | 1,048,576 | 8192 |
| gemini/gemini-pro-vision | $0.35 | $1.05 | 30,720 | 2048 |
| gemini/gemini-gemma-2-27b-it | $0.35 | $1.05 | nan | 8192 |
| gemini/gemini-gemma-2-9b-it | $0.35 | $1.05 | nan | 8192 |
| command-r | $0.15 | $0.6 | 128,000 | 4096 |
| command-r-08-2024 | $0.15 | $0.6 | 128,000 | 4096 |
| command-light | $0.3 | $0.6 | 4,096 | 4096 |
| command-r-plus | $2.5 | $10.00 | 128,000 | 4096 |
| command-r-plus-08-2024 | $2.5 | $10.00 | 128,000 | 4096 |
| command-nightly | $ 1.00 | $ 2.00 | 4,096 | 4096 |
| command | $ 1.00 | $ 2.00 | 4,096 | 4096 |
| rerank-english-v3.0 | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| rerank-multilingual-v3.0 | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| rerank-english-v2.0 | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| rerank-multilingual-v2.0 | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| embed-english-v3.0 | $0.1 | $ 0.00 | 1,024 | nan |
| embed-english-light-v3.0 | $0.1 | $ 0.00 | 1,024 | nan |
| embed-multilingual-v3.0 | $0.1 | $ 0.00 | 1,024 | nan |
| embed-english-v2.0 | $0.1 | $ 0.00 | 4,096 | nan |
| embed-english-light-v2.0 | $0.1 | $ 0.00 | 1,024 | nan |
| embed-multilingual-v2.0 | $0.1 | $ 0.00 | 768 | nan |
| replicate/meta/llama-2-13b | $0.1 | $0.5 | 4,096 | 4096 |
| replicate/meta/llama-2-13b-chat | $0.1 | $0.5 | 4,096 | 4096 |
| replicate/meta/llama-2-70b | $0.65 | $2.75 | 4,096 | 4096 |
| replicate/meta/llama-2-70b-chat | $0.65 | $2.75 | 4,096 | 4096 |
| replicate/meta/llama-2-7b | $0.05 | $0.25 | 4,096 | 4096 |
| replicate/meta/llama-2-7b-chat | $0.05 | $0.25 | 4,096 | 4096 |
| replicate/meta/llama-3-70b | $0.65 | $2.75 | 8,192 | 8192 |
| replicate/meta/llama-3-70b-instruct | $0.65 | $2.75 | 8,192 | 8192 |
| replicate/meta/llama-3-8b | $0.05 | $0.25 | 8,086 | 8086 |
| replicate/meta/llama-3-8b-instruct | $0.05 | $0.25 | 8,086 | 8086 |
| replicate/mistralai/mistral-7b-v0.1 | $0.05 | $0.25 | 4,096 | 4096 |
| replicate/mistralai/mistral-7b-instruct-v0.2 | $0.05 | $0.25 | 4,096 | 4096 |
| replicate/mistralai/mixtral-8x7b-instruct-v0.1 | $0.3 | $ 1.00 | 4,096 | 4096 |
| openrouter/deepseek/deepseek-coder | $0.14 | $0.28 | 66,000 | 4096 |
| openrouter/microsoft/wizardlm-2-8x22b:nitro | $ 1.00 | $ 1.00 | nan | nan |
| openrouter/google/gemini-pro-1.5 | $2.5 | $7.5 | 1,000,000 | 8192 |
| openrouter/mistralai/mixtral-8x22b-instruct | $0.65 | $0.65 | nan | nan |
| openrouter/cohere/command-r-plus | $ 3.00 | $15.00 | nan | nan |
| openrouter/databricks/dbrx-instruct | $0.6 | $0.6 | nan | nan |
| openrouter/anthropic/claude-3-haiku | $0.25 | $1.25 | nan | nan |
| openrouter/anthropic/claude-3-haiku-20240307 | $0.25 | $1.25 | 200,000 | 4096 |
| anthropic/claude-3-5-sonnet-20241022 | $ 3.00 | $15.00 | 200,000 | 8192 |
| anthropic/claude-3-5-sonnet-latest | $ 3.00 | $15.00 | 200,000 | 8192 |
| openrouter/anthropic/claude-3.5-sonnet | $ 3.00 | $15.00 | 200,000 | 8192 |
| openrouter/anthropic/claude-3.5-sonnet:beta | $ 3.00 | $15.00 | 200,000 | 8192 |
| openrouter/anthropic/claude-3-sonnet | $ 3.00 | $15.00 | nan | nan |
| openrouter/mistralai/mistral-large | $ 8.00 | $24.00 | nan | nan |
| openrouter/cognitivecomputations/dolphin-mixtral-8x7b | $0.5 | $0.5 | nan | nan |
| openrouter/google/gemini-pro-vision | $0.125 | $0.375 | nan | nan |
| openrouter/fireworks/firellava-13b | $0.2 | $0.2 | nan | nan |
| openrouter/meta-llama/llama-3-8b-instruct:free | $ 0.00 | $ 0.00 | nan | nan |
| openrouter/meta-llama/llama-3-8b-instruct:extended | $0.225 | $2.25 | nan | nan |
| openrouter/meta-llama/llama-3-70b-instruct:nitro | $0.9 | $0.9 | nan | nan |
| openrouter/meta-llama/llama-3-70b-instruct | $0.59 | $0.79 | nan | nan |
| openrouter/openai/o1-mini | $ 3.00 | $12.00 | 128,000 | 65536 |
| openrouter/openai/o1-mini-2024-09-12 | $ 3.00 | $12.00 | 128,000 | 65536 |
| openrouter/openai/o1-preview | $15.00 | $60.00 | 128,000 | 32768 |
| openrouter/openai/o1-preview-2024-09-12 | $15.00 | $60.00 | 128,000 | 32768 |
| openrouter/openai/gpt-4o | $ 5.00 | $15.00 | 128,000 | 4096 |
| openrouter/openai/gpt-4o-2024-05-13 | $ 5.00 | $15.00 | 128,000 | 4096 |
| openrouter/openai/gpt-4-vision-preview | $10.00 | $30.00 | nan | nan |
| openrouter/openai/gpt-3.5-turbo | $1.5 | $ 2.00 | nan | nan |
| openrouter/openai/gpt-3.5-turbo-16k | $ 3.00 | $ 4.00 | nan | nan |
| openrouter/openai/gpt-4 | $30.00 | $60.00 | nan | nan |
| openrouter/anthropic/claude-instant-v1 | $1.63 | $5.51 | nan | 8191 |
| openrouter/anthropic/claude-2 | $11.02 | $32.68 | nan | 8191 |
| openrouter/anthropic/claude-3-opus | $15.00 | $75.00 | 200,000 | 4096 |
| openrouter/google/palm-2-chat-bison | $0.5 | $0.5 | nan | nan |
| openrouter/google/palm-2-codechat-bison | $0.5 | $0.5 | nan | nan |
| openrouter/meta-llama/llama-2-13b-chat | $0.2 | $0.2 | nan | nan |
| openrouter/meta-llama/llama-2-70b-chat | $1.5 | $1.5 | nan | nan |
| openrouter/meta-llama/codellama-34b-instruct | $0.5 | $0.5 | nan | nan |
| openrouter/nousresearch/nous-hermes-llama2-13b | $0.2 | $0.2 | nan | nan |
| openrouter/mancer/weaver | $5.625 | $5.625 | nan | nan |
| openrouter/gryphe/mythomax-l2-13b | $1.875 | $1.875 | nan | nan |
| openrouter/jondurbin/airoboros-l2-70b-2.1 | $13.875 | $13.875 | nan | nan |
| openrouter/undi95/remm-slerp-l2-13b | $1.875 | $1.875 | nan | nan |
| openrouter/pygmalionai/mythalion-13b | $1.875 | $1.875 | nan | nan |
| openrouter/mistralai/mistral-7b-instruct | $0.13 | $0.13 | nan | nan |
| openrouter/mistralai/mistral-7b-instruct:free | $ 0.00 | $ 0.00 | nan | nan |
| j2-ultra | $15.00 | $15.00 | 8,192 | 8192 |
| jamba-1.5-mini@001 | $0.2 | $0.4 | 256,000 | 256000 |
| jamba-1.5-large@001 | $ 2.00 | $ 8.00 | 256,000 | 256000 |
| jamba-1.5 | $0.2 | $0.4 | 256,000 | 256000 |
| jamba-1.5-mini | $0.2 | $0.4 | 256,000 | 256000 |
| jamba-1.5-large | $ 2.00 | $ 8.00 | 256,000 | 256000 |
| j2-mid | $10.00 | $10.00 | 8,192 | 8192 |
| j2-light | $ 3.00 | $ 3.00 | 8,192 | 8192 |
| dolphin | $0.5 | $0.5 | 16,384 | 16384 |
| chatdolphin | $0.5 | $0.5 | 16,384 | 16384 |
| luminous-base | $30.00 | $33.00 | nan | nan |
| luminous-base-control | $37.5 | $41.25 | nan | nan |
| luminous-extended | $45.00 | $49.5 | nan | nan |
| luminous-extended-control | $56.25 | $61.875 | nan | nan |
| luminous-supreme | $175.00 | $192.5 | nan | nan |
| luminous-supreme-control | $218.75 | $240.625 | nan | nan |
| ai21.j2-mid-v1 | $12.5 | $12.5 | 8,191 | 8191 |
| ai21.j2-ultra-v1 | $18.8 | $18.8 | 8,191 | 8191 |
| ai21.jamba-instruct-v1:0 | $0.5 | $0.7 | 70,000 | 4096 |
| amazon.titan-text-lite-v1 | $0.3 | $0.4 | 42,000 | 4000 |
| amazon.titan-text-express-v1 | $1.3 | $1.7 | 42,000 | 8000 |
| amazon.titan-text-premier-v1:0 | $0.5 | $1.5 | 42,000 | 32000 |
| amazon.titan-embed-text-v1 | $0.1 | $ 0.00 | 8,192 | nan |
| amazon.titan-embed-text-v2:0 | $0.2 | $ 0.00 | 8,192 | nan |
| mistral.mistral-7b-instruct-v0:2 | $0.15 | $0.2 | 32,000 | 8191 |
| mistral.mixtral-8x7b-instruct-v0:1 | $0.45 | $0.7 | 32,000 | 8191 |
| mistral.mistral-large-2402-v1:0 | $ 8.00 | $24.00 | 32,000 | 8191 |
| mistral.mistral-large-2407-v1:0 | $ 3.00 | $ 9.00 | 128,000 | 8191 |
| mistral.mistral-small-2402-v1:0 | $ 1.00 | $ 3.00 | 32,000 | 8191 |
| bedrock/us-west-2/mistral.mixtral-8x7b-instruct-v0:1 | $0.45 | $0.7 | 32,000 | 8191 |
| bedrock/us-east-1/mistral.mixtral-8x7b-instruct-v0:1 | $0.45 | $0.7 | 32,000 | 8191 |
| bedrock/eu-west-3/mistral.mixtral-8x7b-instruct-v0:1 | $0.59 | $0.91 | 32,000 | 8191 |
| bedrock/us-west-2/mistral.mistral-7b-instruct-v0:2 | $0.15 | $0.2 | 32,000 | 8191 |
| bedrock/us-east-1/mistral.mistral-7b-instruct-v0:2 | $0.15 | $0.2 | 32,000 | 8191 |
| bedrock/eu-west-3/mistral.mistral-7b-instruct-v0:2 | $0.2 | $0.26 | 32,000 | 8191 |
| bedrock/us-east-1/mistral.mistral-large-2402-v1:0 | $ 8.00 | $24.00 | 32,000 | 8191 |
| bedrock/us-west-2/mistral.mistral-large-2402-v1:0 | $ 8.00 | $24.00 | 32,000 | 8191 |
| bedrock/eu-west-3/mistral.mistral-large-2402-v1:0 | $10.4 | $31.2 | 32,000 | 8191 |
| anthropic.claude-3-sonnet-20240229-v1:0 | $ 3.00 | $15.00 | 200,000 | 4096 |
| anthropic.claude-3-5-sonnet-20240620-v1:0 | $ 3.00 | $15.00 | 200,000 | 4096 |
| anthropic.claude-3-5-sonnet-20241022-v2:0 | $ 3.00 | $15.00 | 200,000 | 8192 |
| anthropic.claude-3-5-sonnet-latest-v2:0 | $ 3.00 | $15.00 | 200,000 | 4096 |
| anthropic.claude-3-haiku-20240307-v1:0 | $0.25 | $1.25 | 200,000 | 4096 |
| anthropic.claude-3-opus-20240229-v1:0 | $15.00 | $75.00 | 200,000 | 4096 |
| us.anthropic.claude-3-sonnet-20240229-v1:0 | $ 3.00 | $15.00 | 200,000 | 4096 |
| us.anthropic.claude-3-5-sonnet-20240620-v1:0 | $ 3.00 | $15.00 | 200,000 | 4096 |
| us.anthropic.claude-3-5-sonnet-20241022-v2:0 | $ 3.00 | $15.00 | 200,000 | 8192 |
| us.anthropic.claude-3-haiku-20240307-v1:0 | $0.25 | $1.25 | 200,000 | 4096 |
| us.anthropic.claude-3-opus-20240229-v1:0 | $15.00 | $75.00 | 200,000 | 4096 |
| eu.anthropic.claude-3-sonnet-20240229-v1:0 | $ 3.00 | $15.00 | 200,000 | 4096 |
| eu.anthropic.claude-3-5-sonnet-20240620-v1:0 | $ 3.00 | $15.00 | 200,000 | 4096 |
| eu.anthropic.claude-3-5-sonnet-20241022-v2:0 | $ 3.00 | $15.00 | 200,000 | 8192 |
| eu.anthropic.claude-3-haiku-20240307-v1:0 | $0.25 | $1.25 | 200,000 | 4096 |
| eu.anthropic.claude-3-opus-20240229-v1:0 | $15.00 | $75.00 | 200,000 | 4096 |
| anthropic.claude-v1 | $ 8.00 | $24.00 | 100,000 | 8191 |
| bedrock/us-east-1/anthropic.claude-v1 | $ 8.00 | $24.00 | 100,000 | 8191 |
| bedrock/us-west-2/anthropic.claude-v1 | $ 8.00 | $24.00 | 100,000 | 8191 |
| bedrock/ap-northeast-1/anthropic.claude-v1 | $ 8.00 | $24.00 | 100,000 | 8191 |
| bedrock/ap-northeast-1/1-month-commitment/anthropic.claude-v1 | -- | -- | 100,000 | 8191 |
| bedrock/ap-northeast-1/6-month-commitment/anthropic.claude-v1 | -- | -- | 100,000 | 8191 |
| bedrock/eu-central-1/anthropic.claude-v1 | $ 8.00 | $24.00 | 100,000 | 8191 |
| bedrock/eu-central-1/1-month-commitment/anthropic.claude-v1 | -- | -- | 100,000 | 8191 |
| bedrock/eu-central-1/6-month-commitment/anthropic.claude-v1 | -- | -- | 100,000 | 8191 |
| bedrock/us-east-1/1-month-commitment/anthropic.claude-v1 | -- | -- | 100,000 | 8191 |
| bedrock/us-east-1/6-month-commitment/anthropic.claude-v1 | -- | -- | 100,000 | 8191 |
| bedrock/us-west-2/1-month-commitment/anthropic.claude-v1 | -- | -- | 100,000 | 8191 |
| bedrock/us-west-2/6-month-commitment/anthropic.claude-v1 | -- | -- | 100,000 | 8191 |
| anthropic.claude-v2 | $ 8.00 | $24.00 | 100,000 | 8191 |
| bedrock/us-east-1/anthropic.claude-v2 | $ 8.00 | $24.00 | 100,000 | 8191 |
| bedrock/us-west-2/anthropic.claude-v2 | $ 8.00 | $24.00 | 100,000 | 8191 |
| bedrock/ap-northeast-1/anthropic.claude-v2 | $ 8.00 | $24.00 | 100,000 | 8191 |
| bedrock/ap-northeast-1/1-month-commitment/anthropic.claude-v2 | -- | -- | 100,000 | 8191 |
| bedrock/ap-northeast-1/6-month-commitment/anthropic.claude-v2 | -- | -- | 100,000 | 8191 |
| bedrock/eu-central-1/anthropic.claude-v2 | $ 8.00 | $24.00 | 100,000 | 8191 |
| bedrock/eu-central-1/1-month-commitment/anthropic.claude-v2 | -- | -- | 100,000 | 8191 |
| bedrock/eu-central-1/6-month-commitment/anthropic.claude-v2 | -- | -- | 100,000 | 8191 |
| bedrock/us-east-1/1-month-commitment/anthropic.claude-v2 | -- | -- | 100,000 | 8191 |
| bedrock/us-east-1/6-month-commitment/anthropic.claude-v2 | -- | -- | 100,000 | 8191 |
| bedrock/us-west-2/1-month-commitment/anthropic.claude-v2 | -- | -- | 100,000 | 8191 |
| bedrock/us-west-2/6-month-commitment/anthropic.claude-v2 | -- | -- | 100,000 | 8191 |
| anthropic.claude-v2:1 | $ 8.00 | $24.00 | 100,000 | 8191 |
| bedrock/us-east-1/anthropic.claude-v2:1 | $ 8.00 | $24.00 | 100,000 | 8191 |
| bedrock/us-west-2/anthropic.claude-v2:1 | $ 8.00 | $24.00 | 100,000 | 8191 |
| bedrock/ap-northeast-1/anthropic.claude-v2:1 | $ 8.00 | $24.00 | 100,000 | 8191 |
| bedrock/ap-northeast-1/1-month-commitment/anthropic.claude-v2:1 | -- | -- | 100,000 | 8191 |
| bedrock/ap-northeast-1/6-month-commitment/anthropic.claude-v2:1 | -- | -- | 100,000 | 8191 |
| bedrock/eu-central-1/anthropic.claude-v2:1 | $ 8.00 | $24.00 | 100,000 | 8191 |
| bedrock/eu-central-1/1-month-commitment/anthropic.claude-v2:1 | -- | -- | 100,000 | 8191 |
| bedrock/eu-central-1/6-month-commitment/anthropic.claude-v2:1 | -- | -- | 100,000 | 8191 |
| bedrock/us-east-1/1-month-commitment/anthropic.claude-v2:1 | -- | -- | 100,000 | 8191 |
| bedrock/us-east-1/6-month-commitment/anthropic.claude-v2:1 | -- | -- | 100,000 | 8191 |
| bedrock/us-west-2/1-month-commitment/anthropic.claude-v2:1 | -- | -- | 100,000 | 8191 |
| bedrock/us-west-2/6-month-commitment/anthropic.claude-v2:1 | -- | -- | 100,000 | 8191 |
| anthropic.claude-instant-v1 | $0.8 | $2.4 | 100,000 | 8191 |
| bedrock/us-east-1/anthropic.claude-instant-v1 | $0.8 | $2.4 | 100,000 | 8191 |
| bedrock/us-east-1/1-month-commitment/anthropic.claude-instant-v1 | -- | -- | 100,000 | 8191 |
| bedrock/us-east-1/6-month-commitment/anthropic.claude-instant-v1 | -- | -- | 100,000 | 8191 |
| bedrock/us-west-2/1-month-commitment/anthropic.claude-instant-v1 | -- | -- | 100,000 | 8191 |
| bedrock/us-west-2/6-month-commitment/anthropic.claude-instant-v1 | -- | -- | 100,000 | 8191 |
| bedrock/us-west-2/anthropic.claude-instant-v1 | $0.8 | $2.4 | 100,000 | 8191 |
| bedrock/ap-northeast-1/anthropic.claude-instant-v1 | $2.23 | $7.55 | 100,000 | 8191 |
| bedrock/ap-northeast-1/1-month-commitment/anthropic.claude-instant-v1 | -- | -- | 100,000 | 8191 |
| bedrock/ap-northeast-1/6-month-commitment/anthropic.claude-instant-v1 | -- | -- | 100,000 | 8191 |
| bedrock/eu-central-1/anthropic.claude-instant-v1 | $2.48 | $8.38 | 100,000 | 8191 |
| bedrock/eu-central-1/1-month-commitment/anthropic.claude-instant-v1 | -- | -- | 100,000 | 8191 |
| bedrock/eu-central-1/6-month-commitment/anthropic.claude-instant-v1 | -- | -- | 100,000 | 8191 |
| cohere.command-text-v14 | $1.5 | $ 2.00 | 4,096 | 4096 |
| bedrock/*/1-month-commitment/cohere.command-text-v14 | -- | -- | 4,096 | 4096 |
| bedrock/*/6-month-commitment/cohere.command-text-v14 | -- | -- | 4,096 | 4096 |
| cohere.command-light-text-v14 | $0.3 | $0.6 | 4,096 | 4096 |
| bedrock/*/1-month-commitment/cohere.command-light-text-v14 | -- | -- | 4,096 | 4096 |
| bedrock/*/6-month-commitment/cohere.command-light-text-v14 | -- | -- | 4,096 | 4096 |
| cohere.command-r-plus-v1:0 | $ 3.00 | $15.00 | 128,000 | 4096 |
| cohere.command-r-v1:0 | $0.5 | $1.5 | 128,000 | 4096 |
| cohere.embed-english-v3 | $0.1 | $ 0.00 | 512 | nan |
| cohere.embed-multilingual-v3 | $0.1 | $ 0.00 | 512 | nan |
| meta.llama2-13b-chat-v1 | $0.75 | $ 1.00 | 4,096 | 4096 |
| meta.llama2-70b-chat-v1 | $1.95 | $2.56 | 4,096 | 4096 |
| meta.llama3-8b-instruct-v1:0 | $0.3 | $0.6 | 8,192 | 8192 |
| bedrock/us-east-1/meta.llama3-8b-instruct-v1:0 | $0.3 | $0.6 | 8,192 | 8192 |
| bedrock/us-west-1/meta.llama3-8b-instruct-v1:0 | $0.3 | $0.6 | 8,192 | 8192 |
| bedrock/ap-south-1/meta.llama3-8b-instruct-v1:0 | $0.36 | $0.72 | 8,192 | 8192 |
| bedrock/ca-central-1/meta.llama3-8b-instruct-v1:0 | $0.35 | $0.69 | 8,192 | 8192 |
| bedrock/eu-west-1/meta.llama3-8b-instruct-v1:0 | $0.32 | $0.65 | 8,192 | 8192 |
| bedrock/eu-west-2/meta.llama3-8b-instruct-v1:0 | $0.39 | $0.78 | 8,192 | 8192 |
| bedrock/sa-east-1/meta.llama3-8b-instruct-v1:0 | $0.5 | $1.01 | 8,192 | 8192 |
| meta.llama3-70b-instruct-v1:0 | $2.65 | $3.5 | 8,192 | 8192 |
| bedrock/us-east-1/meta.llama3-70b-instruct-v1:0 | $2.65 | $3.5 | 8,192 | 8192 |
| bedrock/us-west-1/meta.llama3-70b-instruct-v1:0 | $2.65 | $3.5 | 8,192 | 8192 |
| bedrock/ap-south-1/meta.llama3-70b-instruct-v1:0 | $3.18 | $4.2 | 8,192 | 8192 |
| bedrock/ca-central-1/meta.llama3-70b-instruct-v1:0 | $3.05 | $4.03 | 8,192 | 8192 |
| bedrock/eu-west-1/meta.llama3-70b-instruct-v1:0 | $2.86 | $3.78 | 8,192 | 8192 |
| bedrock/eu-west-2/meta.llama3-70b-instruct-v1:0 | $3.45 | $4.55 | 8,192 | 8192 |
| bedrock/sa-east-1/meta.llama3-70b-instruct-v1:0 | $4.45 | $5.88 | 8,192 | 8192 |
| meta.llama3-1-8b-instruct-v1:0 | $0.22 | $0.22 | 128,000 | 2048 |
| meta.llama3-1-70b-instruct-v1:0 | $0.99 | $0.99 | 128,000 | 2048 |
| meta.llama3-1-405b-instruct-v1:0 | $5.32 | $16.00 | 128,000 | 4096 |
| meta.llama3-2-1b-instruct-v1:0 | $0.1 | $0.1 | 128,000 | 4096 |
| us.meta.llama3-2-1b-instruct-v1:0 | $0.1 | $0.1 | 128,000 | 4096 |
| eu.meta.llama3-2-1b-instruct-v1:0 | $0.13 | $0.13 | 128,000 | 4096 |
| meta.llama3-2-3b-instruct-v1:0 | $0.15 | $0.15 | 128,000 | 4096 |
| us.meta.llama3-2-3b-instruct-v1:0 | $0.15 | $0.15 | 128,000 | 4096 |
| eu.meta.llama3-2-3b-instruct-v1:0 | $0.19 | $0.19 | 128,000 | 4096 |
| meta.llama3-2-11b-instruct-v1:0 | $0.35 | $0.35 | 128,000 | 4096 |
| us.meta.llama3-2-11b-instruct-v1:0 | $0.35 | $0.35 | 128,000 | 4096 |
| meta.llama3-2-90b-instruct-v1:0 | $ 2.00 | $ 2.00 | 128,000 | 4096 |
| us.meta.llama3-2-90b-instruct-v1:0 | $ 2.00 | $ 2.00 | 128,000 | 4096 |
| 512-x-512/50-steps/stability.stable-diffusion-xl-v0 | -- | -- | 77 | nan |
| 512-x-512/max-steps/stability.stable-diffusion-xl-v0 | -- | -- | 77 | nan |
| max-x-max/50-steps/stability.stable-diffusion-xl-v0 | -- | -- | 77 | nan |
| max-x-max/max-steps/stability.stable-diffusion-xl-v0 | -- | -- | 77 | nan |
| 1024-x-1024/50-steps/stability.stable-diffusion-xl-v1 | -- | -- | 77 | nan |
| 1024-x-1024/max-steps/stability.stable-diffusion-xl-v1 | -- | -- | 77 | nan |
| sagemaker/meta-textgeneration-llama-2-7b | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| sagemaker/meta-textgeneration-llama-2-7b-f | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| sagemaker/meta-textgeneration-llama-2-13b | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| sagemaker/meta-textgeneration-llama-2-13b-f | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| sagemaker/meta-textgeneration-llama-2-70b | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| sagemaker/meta-textgeneration-llama-2-70b-b-f | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| together-ai-up-to-4b | $0.1 | $0.1 | nan | nan |
| together-ai-4.1b-8b | $0.2 | $0.2 | nan | nan |
| together-ai-8.1b-21b | $0.3 | $0.3 | nan | nan |
| together-ai-21.1b-41b | $0.8 | $0.8 | nan | nan |
| together-ai-41.1b-80b | $0.9 | $0.9 | nan | nan |
| together-ai-81.1b-110b | $1.8 | $1.8 | nan | nan |
| together-ai-embedding-up-to-150m | $0.008 | $ 0.00 | nan | nan |
| together-ai-embedding-151m-to-350m | $0.016 | $ 0.00 | nan | nan |
| together_ai/mistralai/Mixtral-8x7B-Instruct-v0.1 | $0.6 | $0.6 | nan | nan |
| together_ai/mistralai/Mistral-7B-Instruct-v0.1 | -- | -- | nan | nan |
| together_ai/togethercomputer/CodeLlama-34b-Instruct | -- | -- | nan | nan |
| ollama/codegemma | $ 0.00 | $ 0.00 | 8,192 | 8192 |
| ollama/codegeex4 | $ 0.00 | $ 0.00 | 32,768 | 8192 |
| ollama/deepseek-coder-v2-instruct | $ 0.00 | $ 0.00 | 32,768 | 8192 |
| ollama/deepseek-coder-v2-base | $ 0.00 | $ 0.00 | 8,192 | 8192 |
| ollama/deepseek-coder-v2-lite-instruct | $ 0.00 | $ 0.00 | 32,768 | 8192 |
| ollama/deepseek-coder-v2-lite-base | $ 0.00 | $ 0.00 | 8,192 | 8192 |
| ollama/internlm2_5-20b-chat | $ 0.00 | $ 0.00 | 32,768 | 8192 |
| ollama/llama2 | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| ollama/llama2:7b | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| ollama/llama2:13b | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| ollama/llama2:70b | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| ollama/llama2-uncensored | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| ollama/llama3 | $ 0.00 | $ 0.00 | 8,192 | 8192 |
| ollama/llama3:8b | $ 0.00 | $ 0.00 | 8,192 | 8192 |
| ollama/llama3:70b | $ 0.00 | $ 0.00 | 8,192 | 8192 |
| ollama/llama3.1 | $ 0.00 | $ 0.00 | 8,192 | 8192 |
| ollama/mistral-large-instruct-2407 | $ 0.00 | $ 0.00 | 65,536 | 8192 |
| ollama/mistral | $ 0.00 | $ 0.00 | 8,192 | 8192 |
| ollama/mistral-7B-Instruct-v0.1 | $ 0.00 | $ 0.00 | 8,192 | 8192 |
| ollama/mistral-7B-Instruct-v0.2 | $ 0.00 | $ 0.00 | 32,768 | 32768 |
| ollama/mixtral-8x7B-Instruct-v0.1 | $ 0.00 | $ 0.00 | 32,768 | 32768 |
| ollama/mixtral-8x22B-Instruct-v0.1 | $ 0.00 | $ 0.00 | 65,536 | 65536 |
| ollama/codellama | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| ollama/orca-mini | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| ollama/vicuna | $ 0.00 | $ 0.00 | 2,048 | 2048 |
| deepinfra/lizpreciatior/lzlv_70b_fp16_hf | $0.7 | $0.9 | 4,096 | 4096 |
| deepinfra/Gryphe/MythoMax-L2-13b | $0.22 | $0.22 | 4,096 | 4096 |
| deepinfra/mistralai/Mistral-7B-Instruct-v0.1 | $0.13 | $0.13 | 32,768 | 8191 |
| deepinfra/meta-llama/Llama-2-70b-chat-hf | $0.7 | $0.9 | 4,096 | 4096 |
| deepinfra/cognitivecomputations/dolphin-2.6-mixtral-8x7b | $0.27 | $0.27 | 32,768 | 8191 |
| deepinfra/codellama/CodeLlama-34b-Instruct-hf | $0.6 | $0.6 | 4,096 | 4096 |
| deepinfra/deepinfra/mixtral | $0.27 | $0.27 | 32,000 | 4096 |
| deepinfra/Phind/Phind-CodeLlama-34B-v2 | $0.6 | $0.6 | 16,384 | 4096 |
| deepinfra/mistralai/Mixtral-8x7B-Instruct-v0.1 | $0.27 | $0.27 | 32,768 | 8191 |
| deepinfra/deepinfra/airoboros-70b | $0.7 | $0.9 | 4,096 | 4096 |
| deepinfra/01-ai/Yi-34B-Chat | $0.6 | $0.6 | 4,096 | 4096 |
| deepinfra/01-ai/Yi-6B-200K | $0.13 | $0.13 | 200,000 | 4096 |
| deepinfra/jondurbin/airoboros-l2-70b-gpt4-1.4.1 | $0.7 | $0.9 | 4,096 | 4096 |
| deepinfra/meta-llama/Llama-2-13b-chat-hf | $0.22 | $0.22 | 4,096 | 4096 |
| deepinfra/amazon/MistralLite | $0.2 | $0.2 | 32,768 | 8191 |
| deepinfra/meta-llama/Llama-2-7b-chat-hf | $0.13 | $0.13 | 4,096 | 4096 |
| deepinfra/meta-llama/Meta-Llama-3-8B-Instruct | $0.08 | $0.08 | 8,191 | 4096 |
| deepinfra/meta-llama/Meta-Llama-3-70B-Instruct | $0.59 | $0.79 | 8,191 | 4096 |
| deepinfra/01-ai/Yi-34B-200K | $0.6 | $0.6 | 200,000 | 4096 |
| deepinfra/openchat/openchat_3.5 | $0.13 | $0.13 | 4,096 | 4096 |
| perplexity/codellama-34b-instruct | $0.35 | $1.4 | 16,384 | 16384 |
| perplexity/codellama-70b-instruct | $0.7 | $2.8 | 16,384 | 16384 |
| perplexity/llama-3.1-70b-instruct | $ 1.00 | $ 1.00 | 131,072 | 131072 |
| perplexity/llama-3.1-8b-instruct | $0.2 | $0.2 | 131,072 | 131072 |
| perplexity/llama-3.1-sonar-huge-128k-online | $ 5.00 | $ 5.00 | 127,072 | 127072 |
| perplexity/llama-3.1-sonar-large-128k-online | $ 1.00 | $ 1.00 | 127,072 | 127072 |
| perplexity/llama-3.1-sonar-large-128k-chat | $ 1.00 | $ 1.00 | 131,072 | 131072 |
| perplexity/llama-3.1-sonar-small-128k-chat | $0.2 | $0.2 | 131,072 | 131072 |
| perplexity/llama-3.1-sonar-small-128k-online | $0.2 | $0.2 | 127,072 | 127072 |
| perplexity/pplx-7b-chat | $0.07 | $0.28 | 8,192 | 8192 |
| perplexity/pplx-70b-chat | $0.7 | $2.8 | 4,096 | 4096 |
| perplexity/pplx-7b-online | $ 0.00 | $0.28 | 4,096 | 4096 |
| perplexity/pplx-70b-online | $ 0.00 | $2.8 | 4,096 | 4096 |
| perplexity/llama-2-70b-chat | $0.7 | $2.8 | 4,096 | 4096 |
| perplexity/mistral-7b-instruct | $0.07 | $0.28 | 4,096 | 4096 |
| perplexity/mixtral-8x7b-instruct | $0.07 | $0.28 | 4,096 | 4096 |
| perplexity/sonar-small-chat | $0.07 | $0.28 | 16,384 | 16384 |
| perplexity/sonar-small-online | $ 0.00 | $0.28 | 12,000 | 12000 |
| perplexity/sonar-medium-chat | $0.6 | $1.8 | 16,384 | 16384 |
| perplexity/sonar-medium-online | $ 0.00 | $1.8 | 12,000 | 12000 |
| fireworks_ai/accounts/fireworks/models/llama-v3p2-1b-instruct | $0.1 | $0.1 | 16,384 | 16384 |
| fireworks_ai/accounts/fireworks/models/llama-v3p2-3b-instruct | $0.1 | $0.1 | 16,384 | 16384 |
| fireworks_ai/accounts/fireworks/models/llama-v3p2-11b-vision-instruct | $0.2 | $0.2 | 16,384 | 16384 |
| accounts/fireworks/models/llama-v3p2-90b-vision-instruct | $0.9 | $0.9 | 16,384 | 16384 |
| fireworks_ai/accounts/fireworks/models/firefunction-v2 | $0.9 | $0.9 | 8,192 | 8192 |
| fireworks_ai/accounts/fireworks/models/mixtral-8x22b-instruct-hf | $1.2 | $1.2 | 65,536 | 65536 |
| fireworks_ai/accounts/fireworks/models/qwen2-72b-instruct | $0.9 | $0.9 | 32,768 | 32768 |
| fireworks_ai/accounts/fireworks/models/yi-large | $ 3.00 | $ 3.00 | 32,768 | 32768 |
| fireworks_ai/accounts/fireworks/models/deepseek-coder-v2-instruct | $1.2 | $1.2 | 65,536 | 8192 |
| fireworks_ai/nomic-ai/nomic-embed-text-v1.5 | $0.008 | $ 0.00 | 8,192 | nan |
| fireworks_ai/nomic-ai/nomic-embed-text-v1 | $0.008 | $ 0.00 | 8,192 | nan |
| fireworks_ai/WhereIsAI/UAE-Large-V1 | $0.016 | $ 0.00 | 512 | nan |
| fireworks_ai/thenlper/gte-large | $0.016 | $ 0.00 | 512 | nan |
| fireworks_ai/thenlper/gte-base | $0.008 | $ 0.00 | 512 | nan |
| fireworks-ai-up-to-16b | $0.2 | $0.2 | nan | nan |
| fireworks-ai-16.1b-to-80b | $0.9 | $0.9 | nan | nan |
| fireworks-ai-moe-up-to-56b | $0.5 | $0.5 | nan | nan |
| fireworks-ai-56b-to-176b | $1.2 | $1.2 | nan | nan |
| fireworks-ai-default | $ 0.00 | $ 0.00 | nan | nan |
| fireworks-ai-embedding-up-to-150m | $0.008 | $ 0.00 | nan | nan |
| fireworks-ai-embedding-150m-to-350m | $0.016 | $ 0.00 | nan | nan |
| anyscale/mistralai/Mistral-7B-Instruct-v0.1 | $0.15 | $0.15 | 16,384 | 16384 |
| anyscale/mistralai/Mixtral-8x7B-Instruct-v0.1 | $0.15 | $0.15 | 16,384 | 16384 |
| anyscale/mistralai/Mixtral-8x22B-Instruct-v0.1 | $0.9 | $0.9 | 65,536 | 65536 |
| anyscale/HuggingFaceH4/zephyr-7b-beta | $0.15 | $0.15 | 16,384 | 16384 |
| anyscale/google/gemma-7b-it | $0.15 | $0.15 | 8,192 | 8192 |
| anyscale/meta-llama/Llama-2-7b-chat-hf | $0.15 | $0.15 | 4,096 | 4096 |
| anyscale/meta-llama/Llama-2-13b-chat-hf | $0.25 | $0.25 | 4,096 | 4096 |
| anyscale/meta-llama/Llama-2-70b-chat-hf | $ 1.00 | $ 1.00 | 4,096 | 4096 |
| anyscale/codellama/CodeLlama-34b-Instruct-hf | $ 1.00 | $ 1.00 | 4,096 | 4096 |
| anyscale/codellama/CodeLlama-70b-Instruct-hf | $ 1.00 | $ 1.00 | 4,096 | 4096 |
| anyscale/meta-llama/Meta-Llama-3-8B-Instruct | $0.15 | $0.15 | 8,192 | 8192 |
| anyscale/meta-llama/Meta-Llama-3-70B-Instruct | $ 1.00 | $ 1.00 | 8,192 | 8192 |
| cloudflare/@cf/meta/llama-2-7b-chat-fp16 | $1.923 | $1.923 | 3,072 | 3072 |
| cloudflare/@cf/meta/llama-2-7b-chat-int8 | $1.923 | $1.923 | 2,048 | 2048 |
| cloudflare/@cf/mistral/mistral-7b-instruct-v0.1 | $1.923 | $1.923 | 8,192 | 8192 |
| cloudflare/@hf/thebloke/codellama-7b-instruct-awq | $1.923 | $1.923 | 4,096 | 4096 |
| voyage/voyage-01 | $0.1 | $ 0.00 | 4,096 | nan |
| voyage/voyage-lite-01 | $0.1 | $ 0.00 | 4,096 | nan |
| voyage/voyage-large-2 | $0.12 | $ 0.00 | 16,000 | nan |
| voyage/voyage-law-2 | $0.12 | $ 0.00 | 16,000 | nan |
| voyage/voyage-code-2 | $0.12 | $ 0.00 | 16,000 | nan |
| voyage/voyage-2 | $0.1 | $ 0.00 | 4,000 | nan |
| voyage/voyage-lite-02-instruct | $0.1 | $ 0.00 | 4,000 | nan |
| voyage/voyage-finance-2 | $0.12 | $ 0.00 | 32,000 | nan |
| databricks/databricks-meta-llama-3-1-405b-instruct | $ 5.00 | $15.00002 | 128,000 | 128000 |
| databricks/databricks-meta-llama-3-1-70b-instruct | $1.00002 | $2.99999 | 128,000 | 128000 |
| databricks/databricks-dbrx-instruct | $0.74998 | $2.24901 | 32,768 | 32768 |
| databricks/databricks-meta-llama-3-70b-instruct | $1.00002 | $2.99999 | 128,000 | 128000 |
| databricks/databricks-llama-2-70b-chat | $0.50001 | $1.5 | 4,096 | 4096 |
| databricks/databricks-mixtral-8x7b-instruct | $0.50001 | $0.99902 | 4,096 | 4096 |
| databricks/databricks-mpt-30b-instruct | $0.99902 | $0.99902 | 8,192 | 8192 |
| databricks/databricks-mpt-7b-instruct | $0.50001 | $ 0.00 | 8,192 | 8192 |
| databricks/databricks-bge-large-en | $0.10003 | $ 0.00 | 512 | nan |
| databricks/databricks-gte-large-en | $0.12999 | $ 0.00 | 8,192 | nan |
| azure/gpt-4o-mini-2024-07-18 | $0.165 | $0.66 | 128,000 | 16384 |
| amazon.titan-embed-image-v1 | $0.8 | $ 0.00 | 128 | nan |
| azure_ai/mistral-large-2407 | $ 2.00 | $ 6.00 | 128,000 | 4096 |
| azure_ai/ministral-3b | $0.04 | $0.04 | 128,000 | 4096 |
| azure_ai/Llama-3.2-11B-Vision-Instruct | $0.37 | $0.37 | 128,000 | 2048 |
| azure_ai/Llama-3.2-90B-Vision-Instruct | $2.04 | $2.04 | 128,000 | 2048 |
| azure_ai/Phi-3.5-mini-instruct | $0.13 | $0.52 | 128,000 | 4096 |
| azure_ai/Phi-3.5-vision-instruct | $0.13 | $0.52 | 128,000 | 4096 |
| azure_ai/Phi-3.5-MoE-instruct | $0.16 | $0.64 | 128,000 | 4096 |
| azure_ai/Phi-3-mini-4k-instruct | $0.13 | $0.52 | 4,096 | 4096 |
| azure_ai/Phi-3-mini-128k-instruct | $0.13 | $0.52 | 128,000 | 4096 |
| azure_ai/Phi-3-small-8k-instruct | $0.15 | $0.6 | 8,192 | 4096 |
| azure_ai/Phi-3-small-128k-instruct | $0.15 | $0.6 | 128,000 | 4096 |
| azure_ai/Phi-3-medium-4k-instruct | $0.17 | $0.68 | 4,096 | 4096 |
| azure_ai/Phi-3-medium-128k-instruct | $0.17 | $0.68 | 128,000 | 4096 |
| xai/grok-beta | $ 5.00 | $15.00 | 131,072 | 131072 |
| claude-3-5-haiku-20241022 | $0.8 | $ 4.00 | 200,000 | 8192 |
| vertex_ai/claude-3-5-haiku@20241022 | $ 1.00 | $ 5.00 | 200,000 | 8192 |
| openrouter/anthropic/claude-3-5-haiku | $ 1.00 | $ 5.00 | nan | nan |
| openrouter/anthropic/claude-3-5-haiku-20241022 | $ 1.00 | $ 5.00 | 200,000 | 8192 |
| anthropic.claude-3-5-haiku-20241022-v1:0 | $0.8 | $ 4.00 | 200,000 | 8192 |
| us.anthropic.claude-3-5-haiku-20241022-v1:0 | $0.8 | $ 4.00 | 200,000 | 8192 |
| eu.anthropic.claude-3-5-haiku-20241022-v1:0 | $0.25 | $1.25 | 200,000 | 8192 |
| stability.sd3-large-v1:0 | -- | -- | 77 | nan |
| gpt-4o-2024-11-20 | $2.5 | $10.00 | 128,000 | 16384 |
| ft:gpt-4o-2024-11-20 | $3.75 | $15.00 | 128,000 | 16384 |
| azure/gpt-4o-2024-11-20 | $2.75 | $11.00 | 128,000 | 16384 |
| azure/global-standard/gpt-4o-2024-11-20 | $2.5 | $10.00 | 128,000 | 16384 |
| groq/llama-3.2-1b-preview | $0.04 | $0.04 | 8,192 | 8192 |
| groq/llama-3.2-3b-preview | $0.06 | $0.06 | 8,192 | 8192 |
| groq/llama-3.2-11b-text-preview | $0.18 | $0.18 | 8,192 | 8192 |
| groq/llama-3.2-11b-vision-preview | $0.18 | $0.18 | 8,192 | 8192 |
| groq/llama-3.2-90b-text-preview | $0.9 | $0.9 | 8,192 | 8192 |
| groq/llama-3.2-90b-vision-preview | $0.9 | $0.9 | 8,192 | 8192 |
| vertex_ai/claude-3-sonnet | $ 3.00 | $15.00 | 200,000 | 4096 |
| vertex_ai/claude-3-5-sonnet | $ 3.00 | $15.00 | 200,000 | 8192 |
| vertex_ai/claude-3-5-sonnet-v2 | $ 3.00 | $15.00 | 200,000 | 8192 |
| vertex_ai/claude-3-haiku | $0.25 | $1.25 | 200,000 | 4096 |
| vertex_ai/claude-3-5-haiku | $ 1.00 | $ 5.00 | 200,000 | 8192 |
| vertex_ai/claude-3-opus | $15.00 | $75.00 | 200,000 | 4096 |
| gemini/gemini-exp-1114 | $ 0.00 | $ 0.00 | 1,048,576 | 8192 |
| openrouter/qwen/qwen-2.5-coder-32b-instruct | $0.18 | $0.18 | 33,792 | 33792 |
| us.meta.llama3-1-8b-instruct-v1:0 | $0.22 | $0.22 | 128,000 | 2048 |
| us.meta.llama3-1-70b-instruct-v1:0 | $0.99 | $0.99 | 128,000 | 2048 |
| us.meta.llama3-1-405b-instruct-v1:0 | $5.32 | $16.00 | 128,000 | 4096 |
| stability.stable-image-ultra-v1:0 | -- | -- | 77 | nan |
| fireworks_ai/accounts/fireworks/models/qwen2p5-coder-32b-instruct | $0.9 | $0.9 | 4,096 | 4096 |
| omni-moderation-latest | $ 0.00 | $ 0.00 | 32,768 | 0 |
| omni-moderation-latest-intents | $ 0.00 | $ 0.00 | 32,768 | 0 |
| omni-moderation-2024-09-26 | $ 0.00 | $ 0.00 | 32,768 | 0 |
| gpt-4o-audio-preview-2024-12-17 | $2.5 | $10.00 | 128,000 | 16384 |
| gpt-4o-mini-audio-preview-2024-12-17 | $0.15 | $0.6 | 128,000 | 16384 |
| o1 | $15.00 | $60.00 | 200,000 | 100000 |
| o1-2024-12-17 | $15.00 | $60.00 | 200,000 | 100000 |
| gpt-4o-realtime-preview-2024-10-01 | $ 5.00 | $20.00 | 128,000 | 4096 |
| gpt-4o-realtime-preview | $ 5.00 | $20.00 | 128,000 | 4096 |
| gpt-4o-realtime-preview-2024-12-17 | $ 5.00 | $20.00 | 128,000 | 4096 |
| gpt-4o-mini-realtime-preview | $0.6 | $2.4 | 128,000 | 4096 |
| gpt-4o-mini-realtime-preview-2024-12-17 | $0.6 | $2.4 | 128,000 | 4096 |
| azure/o1 | $15.00 | $60.00 | 200,000 | 100000 |
| azure_ai/Llama-3.3-70B-Instruct | $0.71 | $0.71 | 128,000 | 2048 |
| mistral/mistral-large-2411 | $ 2.00 | $ 6.00 | 128,000 | 128000 |
| mistral/pixtral-large-latest | $ 2.00 | $ 6.00 | 128,000 | 128000 |
| mistral/pixtral-large-2411 | $ 2.00 | $ 6.00 | 128,000 | 128000 |
| deepseek/deepseek-chat | $0.27 | $1.1 | 65,536 | 8192 |
| deepseek/deepseek-coder | $0.14 | $0.28 | 128,000 | 4096 |
| groq/llama-3.3-70b-versatile | $0.59 | $0.79 | 128,000 | 8192 |
| groq/llama-3.3-70b-specdec | $0.59 | $0.99 | 8,192 | 8192 |
| friendliai/meta-llama-3.1-8b-instruct | $0.1 | $0.1 | 8,192 | 8192 |
| friendliai/meta-llama-3.1-70b-instruct | $0.6 | $0.6 | 8,192 | 8192 |
| gemini-2.0-flash-exp | $ 0.00 | $ 0.00 | 1,048,576 | 8192 |
| gemini/gemini-2.0-flash-exp | $ 0.00 | $ 0.00 | 1,048,576 | 8192 |
| vertex_ai/mistral-large@2411-001 | $ 2.00 | $ 6.00 | 128,000 | 8191 |
| vertex_ai/mistral-large-2411 | $ 2.00 | $ 6.00 | 128,000 | 8191 |
| text-embedding-005 | $0.1 | $ 0.00 | 2,048 | nan |
| gemini/gemini-1.5-flash-8b | $ 0.00 | $ 0.00 | 1,048,576 | 8192 |
| gemini/gemini-exp-1206 | $ 0.00 | $ 0.00 | 2,097,152 | 8192 |
| command-r7b-12-2024 | $0.15 | $0.0375 | 128,000 | 4096 |
| rerank-v3.5 | $ 0.00 | $ 0.00 | 4,096 | 4096 |
| openrouter/deepseek/deepseek-chat | $0.14 | $0.28 | 65,536 | 8192 |
| openrouter/openai/o1 | $15.00 | $60.00 | 200,000 | 100000 |
| amazon.nova-micro-v1:0 | $0.035 | $0.14 | 300,000 | 4096 |
| amazon.nova-lite-v1:0 | $0.06 | $0.24 | 128,000 | 4096 |
| amazon.nova-pro-v1:0 | $0.8 | $3.2 | 300,000 | 4096 |
| meta.llama3-3-70b-instruct-v1:0 | $0.72 | $0.72 | 128,000 | 4096 |
| together_ai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo | $0.18 | $0.18 | nan | nan |
| together_ai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo | $0.88 | $0.88 | nan | nan |
| together_ai/meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo | $3.5 | $3.5 | nan | nan |
| deepinfra/meta-llama/Meta-Llama-3.1-405B-Instruct | $0.9 | $0.9 | 32,768 | 32768 |
| fireworks_ai/accounts/fireworks/models/deepseek-v3 | $0.9 | $0.9 | 128,000 | 8192 |
| voyage/voyage-3-large | $0.18 | $ 0.00 | 32,000 | nan |
| voyage/voyage-3 | $0.06 | $ 0.00 | 32,000 | nan |
| voyage/voyage-3-lite | $0.02 | $ 0.00 | 32,000 | nan |
| voyage/voyage-code-3 | $0.18 | $ 0.00 | 32,000 | nan |
| voyage/voyage-multimodal-3 | $0.12 | $ 0.00 | 32,000 | nan |
| voyage/rerank-2 | $0.05 | $ 0.00 | 16,000 | 16000 |
| voyage/rerank-2-lite | $0.02 | $ 0.00 | 8,000 | 8000 |
| databricks/meta-llama-3.3-70b-instruct | $1.00002 | $2.99999 | 128,000 | 128000 |
| sambanova/Meta-Llama-3.1-8B-Instruct | $0.1 | $0.2 | 16,000 | 16000 |
| sambanova/Meta-Llama-3.1-70B-Instruct | $0.6 | $1.2 | 128,000 | 128000 |
| sambanova/Meta-Llama-3.1-405B-Instruct | $ 5.00 | $10.00 | 16,000 | 16000 |
| sambanova/Meta-Llama-3.2-1B-Instruct | $0.4 | $0.8 | 16,000 | 16000 |
| sambanova/Meta-Llama-3.2-3B-Instruct | $0.8 | $1.6 | 4,000 | 4000 |
| sambanova/Meta-Llama-3.3-70B-Instruct | $0.6 | $1.2 | 128,000 | 128000 |
| sambanova/Qwen2.5-Coder-32B-Instruct | $1.5 | $ 3.00 | 8,000 | 8000 |
| sambanova/Qwen2.5-72B-Instruct | $ 2.00 | $ 4.00 | 8,000 | 8000 |
| o3-mini | $1.1 | $4.4 | 200,000 | 100000 |
| o3-mini-2025-01-31 | $1.1 | $4.4 | 200,000 | 100000 |
| azure/o3-mini-2025-01-31 | $1.1 | $4.4 | 200,000 | 100000 |
| azure/o3-mini | $1.1 | $4.4 | 200,000 | 100000 |
| azure/o1-2024-12-17 | $15.00 | $60.00 | 200,000 | 100000 |
| azure_ai/deepseek-r1 | $1.35 | $5.4 | 128,000 | 8192 |
| deepseek/deepseek-reasoner | $0.55 | $2.19 | 65,536 | 8192 |
| xai/grok-2-vision-1212 | $ 2.00 | $10.00 | 32,768 | 32768 |
| xai/grok-2-vision-latest | $ 2.00 | $10.00 | 32,768 | 32768 |
| xai/grok-2-vision | $ 2.00 | $10.00 | 32,768 | 32768 |
| xai/grok-vision-beta | $ 5.00 | $15.00 | 8,192 | 8192 |
| xai/grok-2-1212 | $ 2.00 | $10.00 | 131,072 | 131072 |
| xai/grok-2 | $ 2.00 | $10.00 | 131,072 | 131072 |
| xai/grok-2-latest | $ 2.00 | $10.00 | 131,072 | 131072 |
| groq/deepseek-r1-distill-llama-70b | $0.75 | $0.99 | 131,072 | 131072 |
| gemini/gemini-2.0-flash | $0.1 | $0.4 | 1,048,576 | 8192 |
| gemini-2.0-flash-001 | $0.15 | $0.6 | 1,048,576 | 8192 |
| gemini-2.0-flash-thinking-exp | $ 0.00 | $ 0.00 | 1,048,576 | 8192 |
| gemini-2.0-flash-thinking-exp-01-21 | $ 0.00 | $ 0.00 | 1,048,576 | 65536 |
| gemini/gemini-2.0-flash-001 | $0.1 | $0.4 | 1,048,576 | 8192 |
| gemini/gemini-2.0-flash-lite-preview-02-05 | $0.075 | $0.3 | 1,048,576 | 8192 |
| gemini/gemini-2.0-flash-thinking-exp | $ 0.00 | $ 0.00 | 1,048,576 | 65536 |
| vertex_ai/codestral-2501 | $0.2 | $0.6 | 128,000 | 128000 |
| openrouter/deepseek/deepseek-r1 | $0.55 | $2.19 | 65,336 | 8192 |
| ai21.jamba-1-5-large-v1:0 | $ 2.00 | $ 8.00 | 256,000 | 256000 |
| ai21.jamba-1-5-mini-v1:0 | $0.2 | $0.4 | 256,000 | 256000 |
| us.amazon.nova-micro-v1:0 | $0.035 | $0.14 | 300,000 | 4096 |
| us.amazon.nova-lite-v1:0 | $0.06 | $0.24 | 128,000 | 4096 |
| us.amazon.nova-pro-v1:0 | $0.8 | $3.2 | 300,000 | 4096 |
| stability.sd3-5-large-v1:0 | -- | -- | 77 | nan |
| stability.stable-image-core-v1:0 | -- | -- | 77 | nan |
| stability.stable-image-core-v1:1 | -- | -- | 77 | nan |
| stability.stable-image-ultra-v1:1 | -- | -- | 77 | nan |
| together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo | $0.88 | $0.88 | nan | nan |
| together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo-Free | $ 0.00 | $ 0.00 | nan | nan |
| fireworks_ai/accounts/fireworks/models/llama-v3p1-8b-instruct | $0.1 | $0.1 | 16,384 | 16384 |
| assemblyai/nano | -- | -- | nan | nan |
| assemblyai/best | -- | -- | nan | nan |
| azure/gpt-3.5-turbo-0125 | $0.5 | $1.5 | 16,384 | 4096 |
| azure/gpt-3.5-turbo | $0.5 | $1.5 | 4,097 | 4096 |
| gemini-2.0-pro-exp-02-05 | $ 0.00 | $ 0.00 | 2,097,152 | 8192 |
| us.meta.llama3-3-70b-instruct-v1:0 | $0.72 | $0.72 | 128,000 | 4096 |
| perplexity/sonar | $ 1.00 | $ 1.00 | 127,072 | 127072 |
| perplexity/sonar-pro | $ 3.00 | $15.00 | 200,000 | 8096 |
| openrouter/google/gemini-2.0-flash-001 | $0.1 | $0.4 | 1,048,576 | 8192 |
| gpt-4.5-preview | $75.00 | $150.00 | 128,000 | 16384 |
| gpt-4.5-preview-2025-02-27 | $75.00 | $150.00 | 128,000 | 16384 |
| azure_ai/Phi-4 | $0.125 | $0.5 | 16,384 | 16384 |
| cerebras/llama3.3-70b | $0.85 | $1.2 | 128,000 | 128000 |
| claude-3-5-haiku-latest | $ 1.00 | $ 5.00 | 200,000 | 8192 |
| claude-3-7-sonnet-latest | $ 3.00 | $15.00 | 200,000 | 128000 |
| claude-3-7-sonnet-20250219 | $ 3.00 | $15.00 | 200,000 | 128000 |
| vertex_ai/claude-3-7-sonnet@20250219 | $ 3.00 | $15.00 | 200,000 | 8192 |
| openrouter/anthropic/claude-3.7-sonnet | $ 3.00 | $15.00 | 200,000 | 8192 |
| openrouter/anthropic/claude-3.7-sonnet:beta | $ 3.00 | $15.00 | 200,000 | 8192 |
| amazon.rerank-v1:0 | $ 0.00 | $ 0.00 | 32,000 | 32000 |
| anthropic.claude-3-7-sonnet-20250219-v1:0 | $ 3.00 | $15.00 | 200,000 | 8192 |
| us.anthropic.claude-3-7-sonnet-20250219-v1:0 | $ 3.00 | $15.00 | 200,000 | 8192 |
| cohere.rerank-v3-5:0 | $ 0.00 | $ 0.00 | 32,000 | 32000 |
| jina-reranker-v2-base-multilingual | $0.018 | $0.018 | 1,024 | 1024 |
| bedrock/invoke/anthropic.claude-3-5-sonnet-20240620-v1:0 | $ 3.00 | $15.00 | 200,000 | 4096 |
| azure/gpt-4o-mini-realtime-preview-2024-12-17 | $0.6 | $2.4 | 128,000 | 4096 |
| azure/eu/gpt-4o-mini-realtime-preview-2024-12-17 | $0.66 | $2.64 | 128,000 | 4096 |
| azure/us/gpt-4o-mini-realtime-preview-2024-12-17 | $0.66 | $2.64 | 128,000 | 4096 |
| azure/gpt-4o-realtime-preview-2024-10-01 | $ 5.00 | $20.00 | 128,000 | 4096 |
| azure/us/gpt-4o-realtime-preview-2024-10-01 | $5.5 | $22.00 | 128,000 | 4096 |
| azure/eu/gpt-4o-realtime-preview-2024-10-01 | $5.5 | $22.00 | 128,000 | 4096 |
| azure/us/o3-mini-2025-01-31 | $1.21 | $4.84 | 200,000 | 100000 |
| azure/eu/o3-mini-2025-01-31 | $1.21 | $4.84 | 200,000 | 100000 |
| azure/us/o1-mini-2024-09-12 | $1.21 | $4.84 | 128,000 | 65536 |
| azure/eu/o1-mini-2024-09-12 | $1.21 | $4.84 | 128,000 | 65536 |
| azure/us/o1-2024-12-17 | $16.5 | $66.00 | 200,000 | 100000 |
| azure/eu/o1-2024-12-17 | $16.5 | $66.00 | 200,000 | 100000 |
| azure/us/o1-preview-2024-09-12 | $16.5 | $66.00 | 128,000 | 32768 |
| azure/eu/o1-preview-2024-09-12 | $16.5 | $66.00 | 128,000 | 32768 |
| azure/us/gpt-4o-2024-11-20 | $2.75 | $11.00 | 128,000 | 16384 |
| azure/eu/gpt-4o-2024-11-20 | $2.75 | $11.00 | 128,000 | 16384 |
| azure/us/gpt-4o-2024-08-06 | $2.75 | $11.00 | 128,000 | 16384 |
| azure/eu/gpt-4o-2024-08-06 | $2.75 | $11.00 | 128,000 | 16384 |
| azure/us/gpt-4o-mini-2024-07-18 | $0.165 | $0.66 | 128,000 | 16384 |
| azure/eu/gpt-4o-mini-2024-07-18 | $0.165 | $0.66 | 128,000 | 16384 |
| azure_ai/deepseek-v3 | $1.14 | $4.56 | 128,000 | 8192 |
| azure_ai/mistral-nemo | $0.15 | $0.15 | 131,072 | 4096 |
| azure_ai/Phi-4-mini-instruct | $ 0.00 | $ 0.00 | 131,072 | 4096 |
| azure_ai/Phi-4-multimodal-instruct | $ 0.00 | $ 0.00 | 131,072 | 4096 |
| gemini/gemini-2.0-pro-exp-02-05 | $ 0.00 | $ 0.00 | 2,097,152 | 8192 |
| gemini/gemini-2.0-flash-thinking-exp-01-21 | $ 0.00 | $ 0.00 | 1,048,576 | 65536 |
| gemini/gemma-3-27b-it | $ 0.00 | $ 0.00 | 131,072 | 8192 |
| gemini/learnlm-1.5-pro-experimental | $ 0.00 | $ 0.00 | 32,767 | 8192 |
| vertex_ai/imagen-3.0-generate-002 | -- | -- | nan | nan |
| jamba-large-1.6 | $ 2.00 | $ 8.00 | 256,000 | 256000 |
| jamba-mini-1.6 | $0.2 | $0.4 | 256,000 | 256000 |
| eu.amazon.nova-micro-v1:0 | $0.046 | $0.184 | 300,000 | 4096 |
| eu.amazon.nova-lite-v1:0 | $0.078 | $0.312 | 128,000 | 4096 |
| 1024-x-1024/50-steps/bedrock/amazon.nova-canvas-v1:0 | -- | -- | 2,600 | nan |
| eu.amazon.nova-pro-v1:0 | $1.05 | $4.2 | 300,000 | 4096 |
| us.deepseek.r1-v1:0 | $1.35 | $5.4 | 128,000 | 4096 |
| snowflake/deepseek-r1 | -- | -- | 32,768 | 8192 |
| snowflake/snowflake-arctic | -- | -- | 4,096 | 8192 |
| snowflake/claude-3-5-sonnet | -- | -- | 18,000 | 8192 |
| snowflake/mistral-large | -- | -- | 32,000 | 8192 |
| snowflake/mistral-large2 | -- | -- | 128,000 | 8192 |
| snowflake/reka-flash | -- | -- | 100,000 | 8192 |
| snowflake/reka-core | -- | -- | 32,000 | 8192 |
| snowflake/jamba-instruct | -- | -- | 256,000 | 8192 |
| snowflake/jamba-1.5-mini | -- | -- | 256,000 | 8192 |
| snowflake/jamba-1.5-large | -- | -- | 256,000 | 8192 |
| snowflake/mixtral-8x7b | -- | -- | 32,000 | 8192 |
| snowflake/llama2-70b-chat | -- | -- | 4,096 | 8192 |
| snowflake/llama3-8b | -- | -- | 8,000 | 8192 |
| snowflake/llama3-70b | -- | -- | 8,000 | 8192 |
| snowflake/llama3.1-8b | -- | -- | 128,000 | 8192 |
| snowflake/llama3.1-70b | -- | -- | 128,000 | 8192 |
| snowflake/llama3.3-70b | -- | -- | 128,000 | 8192 |
| snowflake/snowflake-llama-3.3-70b | -- | -- | 8,000 | 8192 |
| snowflake/llama3.1-405b | -- | -- | 128,000 | 8192 |
| snowflake/snowflake-llama-3.1-405b | -- | -- | 8,000 | 8192 |
| snowflake/llama3.2-1b | -- | -- | 128,000 | 8192 |
| snowflake/llama3.2-3b | -- | -- | 128,000 | 8192 |
| snowflake/mistral-7b | -- | -- | 32,000 | 8192 |
| snowflake/gemma-7b | -- | -- | 8,000 | 8192 |
| azure/global/gpt-4o-2024-11-20 | $2.5 | $10.00 | 128,000 | 16384 |
| azure/global/gpt-4o-2024-08-06 | $2.5 | $10.00 | 128,000 | 16384 |
| o1-pro | $150.00 | $600.00 | 200,000 | 100000 |
| o1-pro-2025-03-19 | $150.00 | $600.00 | 200,000 | 100000 |
| gpt-4o-search-preview-2025-03-11 | $2.5 | $10.00 | 128,000 | 16384 |
| gpt-4o-search-preview | $2.5 | $10.00 | 128,000 | 16384 |
| gpt-4o-mini-search-preview-2025-03-11 | $0.15 | $0.6 | 128,000 | 16384 |
| gpt-4o-mini-search-preview | $0.15 | $0.6 | 128,000 | 16384 |
| azure/gpt-4.5-preview | $75.00 | $150.00 | 128,000 | 16384 |
| azure_ai/mistral-small-2503 | $ 1.00 | $ 3.00 | 128,000 | 128000 |
| text-embedding-large-exp-03-07 | $0.1 | $ 0.00 | 8,192 | nan |
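
To read a row of the table as a cost, the arithmetic can be sketched as below. This is a minimal illustration rather than tokencost's own implementation: it assumes the two price columns are USD per 1,000,000 tokens (the table header is not repeated here), and `cost_from_rate` is a hypothetical helper, not a tokencost function. Rows whose price is listed as `--` have no rate to plug in.

```python
from decimal import Decimal

# Assumption: table prices are quoted in USD per 1,000,000 tokens.
TOKENS_PER_UNIT = Decimal(1_000_000)


def cost_from_rate(rate_per_million: str, num_tokens: int) -> Decimal:
    """Hypothetical helper: scale a per-million-token rate to a token count."""
    return Decimal(rate_per_million) * num_tokens / TOKENS_PER_UNIT


# Rates copied from the `o3-mini` row: $1.1 prompt, $4.4 completion.
prompt_cost = cost_from_rate("1.1", 2_000)      # 2,000 prompt tokens
completion_cost = cost_from_rate("4.4", 500)    # 500 completion tokens
total = prompt_cost + completion_cost

print(f"{prompt_cost} + {completion_cost} = {total}")
```

`Decimal` is used here so that very small per-request amounts are not distorted by binary floating-point rounding. In practice the `calculate_prompt_cost` and `calculate_completion_cost` functions shown earlier do the token counting and rate lookup for you.
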
Installation via GitHub:
git clone git@github.com:AgentOps-AI/tokencost.git
cd tokencost
pip install -e .
- Install `pytest` if you don't have it already
pip install pytest
- Run the `tests/` folder while in the parent directory
pytest tests
This repo also supports `tox`; simply run `python -m tox`.
Contributions to TokenCost are welcome! Feel free to create an issue for any bug reports, complaints, or feature suggestions.
TokenCost is released under the MIT License.
For Tasks:
Click tags to check more tools for each tasksFor Jobs:
Alternative AI tools for tokencost
