Model Comparison

Context Length: 3K

A 7.3B-parameter model that outperforms Llama 2 13B on all benchmarks, optimized for speed and context length.

Provider: OpenRouter

Pricing

Input: $0.11 / M tokens
Output: $0.19 / M tokens
Images: – –
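
For a rough sense of what these rates mean per request, the sketch below estimates cost from prompt and completion token counts using the input and output prices listed above. The token counts in the example are hypothetical.

```python
# Rough per-request cost estimate from the listed pricing.
# Prices are in USD per million tokens; token counts are hypothetical.

INPUT_PRICE_PER_M = 0.11   # $ per 1M input (prompt) tokens
OUTPUT_PRICE_PER_M = 0.19  # $ per 1M output (completion) tokens

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated cost in USD for a single request."""
    return (prompt_tokens * INPUT_PRICE_PER_M
            + completion_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# Example: a 2,000-token prompt with a 500-token completion
# costs about $0.00022 + $0.000095 ≈ $0.0003 (and stays within the 3K combined limit).
print(f"${request_cost(2_000, 500):.6f}")
```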

Endpoint Features

Quantization: – –
Max Tokens (input + output): 3K
Max Output Tokens: – –
Stream Cancellation
Supports Tools: – –
No Prompt Training
Reasoning: – –
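
As a minimal sketch of how the limits above shape a request, the example below calls OpenRouter's OpenAI-compatible chat completions endpoint with a max_tokens value chosen so that prompt plus completion stays inside the 3K combined budget. The model slug is a placeholder (the card does not name one here), and the OPENROUTER_API_KEY environment variable is an assumed convention for supplying the API key.

```python
# Minimal request sketch against OpenRouter's OpenAI-compatible
# chat completions endpoint. The model slug is a placeholder, and
# max_tokens is sized to keep prompt + completion within the 3K
# combined token limit listed above.
import os
import requests

API_URL = "https://openrouter.ai/api/v1/chat/completions"

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "<model-slug>",  # placeholder; not specified in this card
        "messages": [
            {"role": "user", "content": "Summarize the benefits of a 7B model."}
        ],
        "max_tokens": 1024,  # leaves roughly 2K of the 3K budget for the prompt
        "stream": False,     # set True for streamed responses
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```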