Beast Models
A deliberative reasoning system. Multiple internal reasoning paths — a multi-agent architecture under the hood — converge at inference time to deliver depth, accuracy, and reliability across reasoning, coding, and agentic workflows.
Compare Models
| Beast Nano | Beast Mini | Beast Max | |
|---|---|---|---|
| Tagline | Fast Deliberation | Efficient Deliberation | Full Deliberation |
| Context Window | 254K tokens | 196K tokens | 892K tokens |
| Max Output | 65K tokens | 128K tokens | 128K tokens |
| Input Modalities | Text, Image | Text, Image | Text, Image, File/PDF |
| Reasoning Effort | low, medium, high |
low, medium, high, xhigh |
low, medium, high, xhigh, yolo |
| Search Context Size | low, medium, high |
low, medium, high |
low, medium, high |
| Thinking Model | Yes | Yes | Yes |
| Tool Use | Yes | Yes | Yes |
| Structured Outputs | Yes | Yes | Yes |
| Seed / Reproducibility | Yes | Yes | Yes |
| Prompt Pricing | $8 / 1M tokens | $12 / 1M tokens | $70 / 1M tokens |
| Completion Pricing | $10 / 1M tokens | $15 / 1M tokens | $90 / 1M tokens |
Beast Nano beast-nano
Fast Deliberation
Beast Nano is the fastest model in the Beast lineup: a streamlined deliberative reasoning system tuned for cost-sensitive, high-volume workloads. It runs a lighter deliberation profile at inference time, preserving the core quality advantage of deliberative reasoning while reducing latency relative to Beast Mini and Beast Max.
Because Beast Nano still deliberates before answering, it is not a real-time model in the absolute sense. Deliberation inherently takes longer than a single forward pass through a conventional LLM. If sub-second response time is a hard requirement, Beast models are not the right fit.
Beast Nano supports multimodal inputs (text and images) with unified text-based output and a 254K-token context window. The reasoning_effort parameter lets you trade depth for speed within the deliberation pipeline. Ideal for everyday agentic workflows, batch jobs, classification, routing, and any task where you want disciplined reasoning at the lowest cost in the Beast lineup.
Specifications
| Model ID | beast-nano |
|---|---|
| Context Window | 254,279 tokens |
| Max Output | 65,536 tokens |
| Input Modalities | Text, Image |
| Output | Text |
| Reasoning Effort | low, medium, high |
| Search Context Size | low, medium, high |
Pricing
| Prompt (input) | $8 / 1M tokens |
|---|---|
| Completion (output) | $10 / 1M tokens |
Capabilities
- Thinking Model
- Tool Calling (streaming)
- Chat Completions API
- Responses API
- JSON Mode
- Structured Outputs
- Reasoning Control
- Vision
- Seed / Reproducibility
Supported Parameters
- include_reasoning
- max_tokens
- reasoning
- reasoning_effort
- response_format
- seed
- stop
- structured_outputs
- temperature
- tool_choice
- tools
- top_p
- web_search_options
Beast Mini beast-mini
Efficient Deliberation
Beast Mini brings BEAST’s deliberative architecture to cost-sensitive workloads. Internal reasoning runs at inference time with deliberation tuned for efficiency, delivering the core quality advantages of a deliberative system at a fraction of Beast Max’s compute cost.
Beast Mini supports multimodal inputs (text and images) with unified text-based output and a 196K-token context window. The reasoning_effort parameter lets you dial the depth of deliberation to match your quality, cost, and speed tradeoffs.
Ideal for everyday agentic workflows, high-volume applications, and development environments that need disciplined deliberative reasoning without full deliberation overhead.
Specifications
| Model ID | beast-mini |
|---|---|
| Context Window | 196,669 tokens |
| Max Output | 128,000 tokens |
| Input Modalities | Text, Image |
| Output | Text |
| Reasoning Effort | low, medium, high, xhigh |
| Search Context Size | low, medium, high |
Pricing
| Prompt (input) | $12 / 1M tokens |
|---|---|
| Completion (output) | $15 / 1M tokens |
Capabilities
- Thinking Model
- Tool Calling (streaming)
- Chat Completions API
- Responses API
- JSON Mode
- Structured Outputs
- Reasoning Control
- Web Search
- Vision
- Logprobs
- Seed / Reproducibility
Supported Parameters
- frequency_penalty
- include_reasoning
- logit_bias
- logprobs
- max_tokens
- min_p
- parallel_tool_calls
- presence_penalty
- reasoning
- reasoning_effort
- repetition_penalty
- response_format
- seed
- stop
- structured_outputs
- temperature
- tool_choice
- tools
- top_k
- top_logprobs
- top_p
- web_search_options
Beast Max beast-max
Full Deliberation
Beast Max is BEAST’s flagship model: a deliberative reasoning system built for depth, accuracy, and complex problem solving. Multiple internal reasoning paths converge at inference time to produce a high-quality response before delivery.
Beast Max supports multimodal inputs (text, images, and PDF documents) with unified text-based output and a 892K-token context window for sustained multi-turn reasoning. The reasoning_effort parameter controls the depth of deliberation, from rapid convergence to full inference-time depth.
Because Beast Max deliberates before responding, it trades latency for dramatically reduced hallucinations and measurably higher output quality. Ideal for research, complex coding, and agentic workflows where accuracy matters more than speed.
Specifications
| Model ID | beast-max |
|---|---|
| Context Window | 892,500 tokens |
| Max Output | 128,000 tokens |
| Input Modalities | Text, Image, File/PDF |
| Output | Text |
| Reasoning Effort | low, medium, high, xhigh, yolo |
| Search Context Size | low, medium, high |
Pricing
| Prompt (input) | $70 / 1M tokens |
|---|---|
| Completion (output) | $90 / 1M tokens |
Capabilities
- Thinking Model
- Tool Calling (streaming)
- Chat Completions API
- Responses API
- JSON Mode
- Structured Outputs
- Reasoning Control
- Seed / Reproducibility
- Web Search
- Vision
- Document / PDF Input
Supported Parameters
- include_reasoning
- max_completion_tokens
- max_tokens
- reasoning
- reasoning_effort
- response_format
- seed
- structured_outputs
- tool_choice
- tools
- web_search_options
API Access
BeastLab AI exposes an OpenAI-compatible API: both the Chat Completions and Responses APIs. Any tool that lets you configure a custom base URL and API key can connect.
Reference the models you want by their IDs:
- beast-nano
- beast-mini
- beast-max
See the Integrations guide for setup instructions with Cursor, Kilo Code CLI, OpenCode, and other OpenAI-compatible tools.
Get API Access