Beast Models

A multi-agent reasoning system. Multiple internal agents deliberate at inference time — converging on a single response — to deliver depth, accuracy, and reliability across reasoning, coding, and agentic workflows.

Compare Models

	Beast Nano	Beast Mini	Beast Max
Version	v3	v3	v3
Tagline	Fast Multi-Agent Reasoning	Efficient Multi-Agent Reasoning	Full Multi-Agent Deliberation
Context Window	254K tokens	196K tokens	892K tokens
Max Output	65K tokens	128K tokens	128K tokens
Input Modalities	Text, Image	Text, Image	Text, Image, File/PDF
Reasoning Effort	`low`, `medium`, `high`	`low`, `medium`, `high`, `xhigh`	`low`, `medium`, `high`, `xhigh`, `yolo`
Search Context Size	`low`, `medium`, `high`	`low`, `medium`, `high`	`low`, `medium`, `high`
Thinking Model	Yes	Yes	Yes
Tool Use	Yes	Yes	Yes
Structured Outputs	Yes	Yes	Yes
Seed / Reproducibility	Yes	Yes	Yes
Prompt Pricing	$8 / 1M tokens	$12 / 1M tokens	$70 / 1M tokens
Cached Input Pricing	$0.80 / 1M tokens	$1.20 / 1M tokens	$7 / 1M tokens
Completion Pricing	$10 / 1M tokens	$15 / 1M tokens	$90 / 1M tokens

Beast Nano beast-nano

Fast Multi-Agent Reasoning

Beast Nano is the fastest model in the Beast lineup: a streamlined multi-agent reasoning system tuned for cost-sensitive, high-volume workloads. It runs a lighter deliberation profile at inference time, preserving the core quality advantage of multi-agent reasoning while reducing latency relative to Beast Mini and Beast Max.

Because Beast Nano still deliberates before answering, it is not a real-time model in the absolute sense. Deliberation inherently takes longer than a single forward pass through a conventional LLM. If sub-second response time is a hard requirement, Beast models are not the right fit.

Beast Nano supports multimodal inputs (text and images) with unified text-based output and a 254K-token context window. The reasoning_effort parameter lets you trade depth for speed within the multi-agent pipeline. Ideal for everyday agentic workflows, batch jobs, classification, routing, and any task where you want disciplined reasoning at the lowest cost in the Beast lineup.

Specifications

Model ID	`beast-nano`
Version	v3
Context Window	254,279 tokens
Max Output	65,536 tokens
Input Modalities	Text, Image
Output	Text
Reasoning Effort	`low`, `medium`, `high`
Search Context Size	`low`, `medium`, `high`

Pricing

Prompt (input)	$8 / 1M tokens
Cached input	$0.80 / 1M tokens
Completion (output)	$10 / 1M tokens

Capabilities

Thinking Model
Tool Calling (streaming)
Chat Completions API
Responses API
JSON Mode
Structured Outputs
Reasoning Control
Vision
Seed / Reproducibility

Supported Parameters

include_reasoning
max_tokens
reasoning
reasoning_effort
response_format
seed
stop
structured_outputs
temperature
tool_choice
tools
top_p
web_search_options

Beast Mini beast-mini

Efficient Multi-Agent Reasoning

Beast Mini brings BEAST’s multi-agent architecture to cost-sensitive workloads. Internal agents collaborate at inference time with deliberation tuned for efficiency, delivering the core quality advantages of a multi-agent system at a fraction of Beast Max’s compute cost.

Beast Mini supports multimodal inputs (text and images) with unified text-based output and a 196K-token context window. The reasoning_effort parameter lets you dial the depth of multi-agent deliberation to match your quality, cost, and speed tradeoffs.

Ideal for everyday agentic workflows, high-volume applications, and development environments that need disciplined multi-agent reasoning without full deliberation overhead.

Specifications

Model ID	`beast-mini`
Version	v3
Context Window	196,669 tokens
Max Output	128,000 tokens
Input Modalities	Text, Image
Output	Text
Reasoning Effort	`low`, `medium`, `high`, `xhigh`
Search Context Size	`low`, `medium`, `high`

Pricing

Prompt (input)	$12 / 1M tokens
Cached input	$1.20 / 1M tokens
Completion (output)	$15 / 1M tokens

Capabilities

Thinking Model
Tool Calling (streaming)
Chat Completions API
Responses API
JSON Mode
Structured Outputs
Reasoning Control
Web Search
Vision
Logprobs
Seed / Reproducibility

Supported Parameters

frequency_penalty
include_reasoning
logit_bias
logprobs
max_tokens
min_p
parallel_tool_calls
presence_penalty
reasoning
reasoning_effort
repetition_penalty
response_format
seed
stop
structured_outputs
temperature
tool_choice
tools
top_k
top_logprobs
top_p
web_search_options

Beast Max beast-max

Full Multi-Agent Deliberation

Beast Max is BEAST’s flagship model: a multi-agent reasoning system built for depth, accuracy, and complex problem solving. Multiple internal agents deliberate at inference time, converging on a high-quality response before delivery.

Beast Max supports multimodal inputs (text, images, and PDF documents) with unified text-based output and a 892K-token context window for sustained multi-turn reasoning. The reasoning_effort parameter controls the depth of multi-agent deliberation, from rapid convergence to full inference-time depth.

Because Beast Max deliberates before responding, it trades latency for dramatically reduced hallucinations and measurably higher output quality. Ideal for research, complex coding, and agentic workflows where accuracy matters more than speed.

Specifications

Model ID	`beast-max`
Version	v3
Context Window	892,500 tokens
Max Output	128,000 tokens
Input Modalities	Text, Image, File/PDF
Output	Text
Reasoning Effort	`low`, `medium`, `high`, `xhigh`, `yolo`
Search Context Size	`low`, `medium`, `high`

Pricing

Prompt (input)	$70 / 1M tokens
Cached input	$7 / 1M tokens
Completion (output)	$90 / 1M tokens

Capabilities

Thinking Model
Tool Calling (streaming)
Chat Completions API
Responses API
JSON Mode
Structured Outputs
Reasoning Control
Seed / Reproducibility
Web Search
Vision
Document / PDF Input

Supported Parameters

include_reasoning
max_completion_tokens
max_tokens
reasoning
reasoning_effort
response_format
seed
structured_outputs
tool_choice
tools
web_search_options

API Access

BeastLab AI exposes an OpenAI-compatible API: both the Chat Completions and Responses APIs. Any tool that lets you configure a custom base URL and API key can connect.

Reference the models you want by their IDs:

beast-nano
beast-mini
beast-max

See the Integrations guide for setup instructions with Cursor, Kilo Code CLI, OpenCode, and other OpenAI-compatible tools.

Get API Access