Model Comparison

Context Length: 1.05M

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta that activates 17 billion of its 109 billion total parameters. It accepts native multimodal input (text and image) and produces multilingual output (text and code) across 12 supported languages. Designed for assistant-style interaction and visual reasoning, Scout has 16 experts (the "16E" in its name) and features a context length of 10 million tokens. It was trained on a corpus of ~40 trillion tokens.

Built for high efficiency and local or commercial deployment, Llama 4 Scout incorporates early fusion for seamless modality integration. It is instruction-tuned for multilingual chat, captioning, and image understanding tasks. Released under the Llama 4 Community License, it has a training data cutoff of August 2024 and launched publicly on April 5, 2025.

Provider

Pricing

Input: $0.08 / M tokens
Output: $0.30 / M tokens
Images: –
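
As a rough illustration of how the per-million-token prices above translate into a per-request cost, here is a minimal Python sketch. The helper name `estimate_cost` and the example token counts are hypothetical, and the calculation assumes text-only requests billed linearly at the listed rates, with no image charges, discounts, or minimum fees.

```python
from decimal import Decimal

# Listed prices in USD per one million tokens (taken from the Pricing section).
INPUT_PRICE_PER_M = Decimal("0.08")
OUTPUT_PRICE_PER_M = Decimal("0.30")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimate the USD cost of one text-only request.

    Assumes simple linear billing at the listed per-million-token rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    return (input_cost + output_cost) / TOKENS_PER_M


if __name__ == "__main__":
    # Example: a 200,000-token prompt with a 4,000-token reply.
    print(estimate_cost(200_000, 4_000))
```

Decimal arithmetic is used here so that small per-token amounts are not distorted by binary floating-point rounding.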

Endpoint Features

Quantization: fp8
Max Tokens (input + output): 1.05M
Max Output Tokens: 1.05M
Stream cancellation
Supports Tools
No Prompt Training
Reasoning: –
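
Because the token limit listed above covers input and output combined, a long prompt reduces the room left for the reply. The sketch below shows that bookkeeping in Python. The function name `max_output_budget` is hypothetical, and the constant is an approximation: the listing rounds the limit to "1.05M" and does not state the exact value.

```python
# Approximation: the listing shows "1.05M"; the exact limit is not stated.
MAX_TOTAL_TOKENS = 1_050_000


def max_output_budget(prompt_tokens: int, limit: int = MAX_TOTAL_TOKENS) -> int:
    """Hypothetical helper: output tokens still available after the prompt.

    Treats the limit as a shared budget for input plus output tokens.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # A prompt at or above the limit leaves no room for output.
    return max(0, limit - prompt_tokens)


if __name__ == "__main__":
    print(max_output_budget(1_000_000))
```

A caller could use the returned value as an upper bound when choosing a maximum output length for a request.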
    OpenRouter