--- sidebar_label: Together AI description: "Deploy open-source models at scale using Together AI's optimized inference platform with serverless GPU infrastructure" --- # Together AI [Together AI](https://www.together.ai/) provides access to open-source models through an API compatible with OpenAI's interface. ## OpenAI Compatibility Together AI's API is compatible with OpenAI's API, which means all parameters available in the [OpenAI provider](/docs/providers/openai/) work with Together AI. ## Basic Configuration Configure a Together AI model in your promptfoo configuration: ```yaml title="promptfooconfig.yaml" # yaml-language-server: $schema=https://promptfoo.dev/config-schema.json providers: - id: togetherai:meta-llama/Llama-4-Scout-Instruct config: temperature: 0.7 ``` The provider requires an API key stored in the `TOGETHER_API_KEY` environment variable. ## Key Features ### Max Tokens Configuration ```yaml config: max_tokens: 4096 ``` ### Function Calling ```yaml config: tools: - type: function function: name: get_weather description: Get the current weather parameters: type: object properties: location: type: string description: City and state ``` ### JSON Mode ```yaml config: response_format: { type: 'json_object' } ``` ## Popular Models Together AI offers over 200 models. Here are some of the most popular models by category: ### Llama 4 Models - **Llama 4 Maverick**: `meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8` (524,288 context length, FP8) - **Llama 4 Scout**: `meta-llama/Llama-4-Scout-17B-16E-Instruct` (327,680 context length, FP16) ### DeepSeek Models - **DeepSeek R1**: `deepseek-ai/DeepSeek-R1` (128,000 context length, FP8) - **DeepSeek R1 Distill Llama 70B**: `deepseek-ai/DeepSeek-R1-Distill-Llama-70B` (131,072 context length, FP16) - **DeepSeek R1 Distill Qwen 14B**: `deepseek-ai/DeepSeek-R1-Distill-Qwen-14B` (131,072 context length, FP16) - **DeepSeek V3**: `deepseek-ai/DeepSeek-V3` (16,384 context length, FP8) ### Llama 3 Models - **Llama 3.3 70B Instruct Turbo**: `meta-llama/Llama-3.3-70B-Instruct-Turbo` (131,072 context length, FP8) - **Llama 3.1 70B Instruct Turbo**: `meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo` (131,072 context length, FP8) - **Llama 3.1 405B Instruct Turbo**: `meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo` (130,815 context length, FP8) - **Llama 3.1 8B Instruct Turbo**: `meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo` (131,072 context length, FP8) - **Llama 3.2 3B Instruct Turbo**: `meta-llama/Llama-3.2-3B-Instruct-Turbo` (131,072 context length, FP16) ### Mixtral Models - **Mixtral-8x7B Instruct**: `mistralai/Mixtral-8x7B-Instruct-v0.1` (32,768 context length, FP16) - **Mixtral-8x22B Instruct**: `mistralai/Mixtral-8x22B-Instruct-v0.1` (65,536 context length, FP16) - **Mistral Small 3 Instruct (24B)**: `mistralai/Mistral-Small-24B-Instruct-2501` (32,768 context length, FP16) ### Qwen Models - **Qwen 2.5 72B Instruct Turbo**: `Qwen/Qwen2.5-72B-Instruct-Turbo` (32,768 context length, FP8) - **Qwen 2.5 7B Instruct Turbo**: `Qwen/Qwen2.5-7B-Instruct-Turbo` (32,768 context length, FP8) - **Qwen 2.5 Coder 32B Instruct**: `Qwen/Qwen2.5-Coder-32B-Instruct` (32,768 context length, FP16) - **QwQ-32B**: `Qwen/QwQ-32B` (32,768 context length, FP16) ### Vision Models - **Llama 3.2 Vision**: `meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo` (131,072 context length, FP16) - **Qwen 2.5 Vision Language 72B**: `Qwen/Qwen2.5-VL-72B-Instruct` (32,768 context length, FP8) - **Qwen 2 VL 72B**: `Qwen/Qwen2-VL-72B-Instruct` (32,768 context length, FP16) ### Free Endpoints Together AI offers free tiers with reduced rate limits: - `meta-llama/Llama-3.3-70B-Instruct-Turbo-Free` - `meta-llama/Llama-Vision-Free` - `deepseek-ai/DeepSeek-R1-Distill-Llama-70B-Free` For a complete list of all 200+ available models and their specifications, refer to the [Together AI Models page](https://docs.together.ai/docs/inference-models). ## Example Configuration ```yaml title="promptfooconfig.yaml" # yaml-language-server: $schema=https://promptfoo.dev/config-schema.jsons providers: - id: togetherai:meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 config: temperature: 0.7 max_tokens: 4096 - id: togetherai:deepseek-ai/DeepSeek-R1 config: temperature: 0.0 response_format: { type: 'json_object' } tools: - type: function function: name: get_weather description: Get weather information parameters: type: object properties: location: { type: 'string' } unit: { type: 'string', enum: ['celsius', 'fahrenheit'] } ``` For more information, refer to the [Together AI documentation](https://docs.together.ai/docs/chat-models).