Beast Models

A deliberative reasoning system. Multiple internal reasoning paths — a multi-agent architecture under the hood — converge at inference time to deliver depth, accuracy, and reliability across reasoning, coding, and agentic workflows.

Compare Models

Beast Nano Beast Mini Beast Max
Tagline Fast Deliberation Efficient Deliberation
Context Window 254K tokens 196K tokens
Max Output 65K tokens 128K tokens
Input Modalities Text, Image Text, Image
Reasoning Effort low, medium, high low, medium, high, xhigh
Search Context Size low, medium, high low, medium, high
Thinking Model Yes Yes
Tool Use Yes Yes
Structured Outputs Yes Yes
Seed / Reproducibility Yes Yes
Prompt Pricing $8 / 1M tokens $12 / 1M tokens
Completion Pricing $10 / 1M tokens $15 / 1M tokens

Beast Nano beast-nano

Fast Deliberation

Beast Nano is the fastest model in the Beast lineup: a streamlined deliberative reasoning system tuned for cost-sensitive, high-volume workloads. It runs a lighter deliberation profile at inference time, preserving the core quality advantage of deliberative reasoning while reducing latency relative to Beast Mini and Beast Max.

Because Beast Nano still deliberates before answering, it is not a real-time model in the absolute sense. Deliberation inherently takes longer than a single forward pass through a conventional LLM. If sub-second response time is a hard requirement, Beast models are not the right fit.

Beast Nano supports multimodal inputs (text and images) with unified text-based output and a 254K-token context window. The reasoning_effort parameter lets you trade depth for speed within the deliberation pipeline. Ideal for everyday agentic workflows, batch jobs, classification, routing, and any task where you want disciplined reasoning at the lowest cost in the Beast lineup.

Specifications

Model IDbeast-nano
Context Window254,279 tokens
Max Output65,536 tokens
Input ModalitiesText, Image
OutputText
Reasoning Effortlow, medium, high
Search Context Sizelow, medium, high

Pricing

Prompt (input)$8 / 1M tokens
Completion (output)$10 / 1M tokens

Capabilities

  • Thinking Model
  • Tool Calling (streaming)
  • Chat Completions API
  • Responses API
  • JSON Mode
  • Structured Outputs
  • Reasoning Control
  • Vision
  • Seed / Reproducibility

Supported Parameters

  • include_reasoning
  • max_tokens
  • reasoning
  • reasoning_effort
  • response_format
  • seed
  • stop
  • structured_outputs
  • temperature
  • tool_choice
  • tools
  • top_p
  • web_search_options

Beast Mini beast-mini

Efficient Deliberation

Beast Mini brings BEAST’s deliberative architecture to cost-sensitive workloads. Internal reasoning runs at inference time with deliberation tuned for efficiency, delivering the core quality advantages of a deliberative system at a fraction of Beast Max’s compute cost.

Beast Mini supports multimodal inputs (text and images) with unified text-based output and a 196K-token context window. The reasoning_effort parameter lets you dial the depth of deliberation to match your quality, cost, and speed tradeoffs.

Ideal for everyday agentic workflows, high-volume applications, and development environments that need disciplined deliberative reasoning without full deliberation overhead.

Specifications

Model IDbeast-mini
Context Window196,669 tokens
Max Output128,000 tokens
Input ModalitiesText, Image
OutputText
Reasoning Effortlow, medium, high, xhigh
Search Context Sizelow, medium, high

Pricing

Prompt (input)$12 / 1M tokens
Completion (output)$15 / 1M tokens

Capabilities

  • Thinking Model
  • Tool Calling (streaming)
  • Chat Completions API
  • Responses API
  • JSON Mode
  • Structured Outputs
  • Reasoning Control
  • Web Search
  • Vision
  • Logprobs
  • Seed / Reproducibility

Supported Parameters

  • frequency_penalty
  • include_reasoning
  • logit_bias
  • logprobs
  • max_tokens
  • min_p
  • parallel_tool_calls
  • presence_penalty
  • reasoning
  • reasoning_effort
  • repetition_penalty
  • response_format
  • seed
  • stop
  • structured_outputs
  • temperature
  • tool_choice
  • tools
  • top_k
  • top_logprobs
  • top_p
  • web_search_options

Beast Max beast-max

Full Deliberation

Beast Max is BEAST’s flagship model: a deliberative reasoning system built for depth, accuracy, and complex problem solving. Multiple internal reasoning paths converge at inference time to produce a high-quality response before delivery.

Beast Max supports multimodal inputs (text, images, and PDF documents) with unified text-based output and a 892K-token context window for sustained multi-turn reasoning. The reasoning_effort parameter controls the depth of deliberation, from rapid convergence to full inference-time depth.

Because Beast Max deliberates before responding, it trades latency for dramatically reduced hallucinations and measurably higher output quality. Ideal for research, complex coding, and agentic workflows where accuracy matters more than speed.

Specifications

Model IDbeast-max
Context Window892,500 tokens
Max Output128,000 tokens
Input ModalitiesText, Image, File/PDF
OutputText
Reasoning Effortlow, medium, high, xhigh, yolo
Search Context Sizelow, medium, high

Pricing

Prompt (input)$70 / 1M tokens
Completion (output)$90 / 1M tokens

Capabilities

  • Thinking Model
  • Tool Calling (streaming)
  • Chat Completions API
  • Responses API
  • JSON Mode
  • Structured Outputs
  • Reasoning Control
  • Seed / Reproducibility
  • Web Search
  • Vision
  • Document / PDF Input

Supported Parameters

  • include_reasoning
  • max_completion_tokens
  • max_tokens
  • reasoning
  • reasoning_effort
  • response_format
  • seed
  • structured_outputs
  • tool_choice
  • tools
  • web_search_options

API Access

BeastLab AI exposes an OpenAI-compatible API: both the Chat Completions and Responses APIs. Any tool that lets you configure a custom base URL and API key can connect.

Reference the models you want by their IDs:

  • beast-nano
  • beast-mini
  • beast-max

See the Integrations guide for setup instructions with Cursor, Kilo Code CLI, OpenCode, and other OpenAI-compatible tools.

Get API Access