Step LLM (`:llm`)

Quick Answer

Use this reference for the :llm step schema, prompt/message patterns, templates, and model-call configuration.

Use the :llm step to make model calls through an LLM-capable :http-api connection.

Canonical Shape

Core fields:

Field	Type	Required	Notes
`:type`	keyword	Yes	Must be `:llm`
`:expect`	map	No	Optional output expectation/assertion metadata
`:connection`	keyword/string	Recommended	Slot or connection id
`:messages`	vector	Yes*	Explicit chat messages
`:prompt`	string/map	Yes*	Prompt shorthand
`:system`	string	No	System prompt shorthand
`:input`	map	No	Canonical input envelope
`:template`	keyword	No	`:llm-prompt` template id
`:data`	map	No	Template data
`:model`	string	No	Model override
`:provider`	keyword/string	No	Provider override
`:temperature`, `:top-p`, `:stop`	scalar	No	Generation controls
`:max-tokens`	int	No	Response token cap
`:seed`, `:presence-penalty`, `:frequency-penalty`	scalar	No	Provider-supported generation controls
`:output` / `:response-format`	map/keyword	No	Structured output config; `:output :schema` accepts Malli schemas or raw JSON Schema maps
`:json-schema`	map	No	Legacy raw JSON Schema field
`:provider-opts`	map	No	Provider-specific escape hatch
`:base-url`, `:deployment`, `:api-version`	string	No	Custom/Azure/Chat Completions-compatible endpoint overrides
`:reasoning-effort`	keyword/string	No	Reasoning effort for providers that support it
`:prompt-cache-key`, `:previous-response-id`, `:cache-system?`	scalar/boolean	No	Provider-specific prompt caching and response continuation controls
`:tools`	map/vector	No	Agentic tools config; supports `:steps [...]` for packaged steps, `:agents [...]` for flow-level agent definitions, and `:mcp [...]` for flow-level MCP adapters
`:openai`	map	No	OpenAI Responses-specific options (`:responses`)
`:available-steps`	vector	No	Auto-tool step set
`:max-iterations`	int	No	Agentic loop bound; default `20`, max `100`
`:max-tool-calls`	int	No	Total executed tool-call cap; default `100`, max `1000`
`:max-repeated-tool-calls`	int	No	Optional cap for identical tool name+arguments calls; unset by default, max `100`
`:auth`	map	No	Explicit auth if not using connection
`:workspace-id`	string	No	Runtime context override; normally supplied by Breyta

* Provide one input form: :messages, :prompt/:system, or :input.
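
As a rough sketch of the three input forms, the steps below express the same request once per form. The step ids and prompt text are illustrative assumptions; the :input keys mirror the ones used in the full config example in this reference.

```clojure
;; Form 1: prompt shorthand, optionally paired with :system.
(flow/step :llm :summarize-prompt
  {:connection :ai
   :system "You are concise."
   :prompt "Summarize the release notes in 3 bullets."})

;; Form 2: explicit chat messages.
(flow/step :llm :summarize-messages
  {:connection :ai
   :messages [{:role "system" :content "You are concise."}
              {:role "user" :content "Summarize the release notes in 3 bullets."}]})

;; Form 3: canonical input envelope with template-style context.
(flow/step :llm :summarize-input
  {:connection :ai
   :input {:system "You are concise."
           :prompt "Summarize {{topic}} in 3 bullets."
           :context {:topic "the release notes"}}})
```

Pick one form per step; mixing them makes it harder to tell which text the model actually receives.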

Limits And Behavior

Use either :messages or :prompt/:system.
Prefer templates for long prompts.
:tools belongs on the step config, not inside templates.
Token cost estimates are derived from provider/model usage and platform token pricing, not from step :metering. See Run Cost Estimates.
For most production flows, bind an LLM-capable connection via :requires. New stored connections should use :http-api with an LLM backend.
If a connection-create UI still exposes :llm-provider, treat it as a legacy alias only. The canonical stored connection and authored :requires type are both :http-api.
:max-tokens policy:
- platform-managed keys are capped by platform limit
- user-provided keys are not platform-clamped (provider/account limits still apply)
OpenAI hosted shell can be configured via :openai {:responses ...}.
Vision-capable models can receive uploaded or persisted image resources via
multipart :messages content parts. See Image Resource Inputs.
Supported providers for agentic tool execution (:tools with :mode :execute):
- :openai - OpenAI Responses API (most feature-rich: reasoning effort, CoT, prompt caching, native tools)
- :anthropic - Anthropic Messages API (tool calling, prompt caching)
- :bedrock - AWS Bedrock Runtime for Anthropic Claude models, signed with AWS SigV4
- :google - Google Gemini API (tool calling)
- :deepseek - DeepSeek Chat Completions API (tool calling, agentic loop, reasoning effort mapped to thinking mode)
- :openrouter - OpenRouter Chat Completions API (tool calling, structured output, agentic multi-turn tool loop for compatible routed models)
- :chat-completions-compatible - generic Chat Completions wire format (Groq, Together, Fireworks, Ollama, Azure, Mistral). The built-in family is conservative: it supports tool calling in propose mode only and has no default agentic multi-turn tool loop.
- :openai-compatible - legacy alias for :chat-completions-compatible
OpenRouter connections use :backend :openrouter with base URL
https://openrouter.ai/api/v1. Model ids are OpenRouter model ids such as
"openai/gpt-4o-mini" or "google/gemini-2.5-flash". Optional
OpenRouter attribution headers can be supplied through :provider-opts, for
example {:http-referer "https://example.com" :x-title "My Flow"}. Breyta
sends the title as X-OpenRouter-Title.
For OpenRouter reasoning models, Breyta keeps provider reasoning metadata
separate from visible content and preserves it across tool turns when the
provider returns replay details.
AWS Bedrock connections use :backend :bedrock, an AWS SigV4 auth config,
and an Anthropic Claude Bedrock model id such as
"anthropic.claude-3-5-sonnet-20241022-v2:0" or an inference profile id
such as "us.anthropic.claude-3-5-sonnet-20241022-v2:0". See
AWS Bedrock Claude.
Custom providers can be registered via :providers {:llm [...]} in the flow
definition. See LLM Providers.
A verified Chat Completions-like endpoint can opt into agentic execution by
defining a flow-level provider with :family :chat-completions-compatible
and explicit
:capabilities #{:tool-calling :structured-output :agentic-tool-loop}.
This is author-owned compatibility: the endpoint must accept assistant
tool_calls replay followed by role=tool messages, keep tool call ids stable,
and support the selected model's tool-calling behavior.
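
A minimal sketch of such an opt-in might look like the fragment below. Only :family and :capabilities come from this reference; the :id key, the :my-gateway name, and selecting the provider through the step's :provider field are assumptions, so check LLM Providers for the actual entry shape.

```clojure
;; Flow-definition fragment. :id and :my-gateway are hypothetical.
{:providers {:llm [{:id :my-gateway
                    :family :chat-completions-compatible
                    :capabilities #{:tool-calling
                                    :structured-output
                                    :agentic-tool-loop}}]}}

;; Step that selects the custom provider (assumes :provider accepts its id).
(flow/step :llm :triage
  {:connection :ai
   :provider :my-gateway
   :prompt "Triage the open tickets."
   :tools {:mode :execute
           :allowed ["table"]}})
```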
When :tools {:steps [...]} lists qualified packaged step ids, those
flow-local packaged steps are also published as tools alongside built-in
step tools. See Packaged Steps.
When :tools {:mcp [...]} lists MCP adapter ids or tool refs, selected
remote MCP tools are published as ordinary agentic tools. MCP endpoints are
bound through :http-api requirements with :backend :mcp; see
Flow Definition — MCP Tool Adapters.
Agentic loop defaults:
- :max-iterations defaults to 20 and may be set up to 100.
- :max-tool-calls defaults to 100 and may be set up to 1000.
- :max-repeated-tool-calls is unset by default; set it to stop repeated identical tool calls.
- When a value is present both top-level and inside :tools, the :tools
  envelope value wins for that run.
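
The precedence described above can be modeled in a few lines of plain Clojure. This is an illustrative model only, not Breyta's implementation: effective-limits, limit-keys, and limit-defaults are hypothetical names, and the sketch does not model the documented upper bounds.

```clojure
(def limit-keys
  [:max-iterations :max-tool-calls :max-repeated-tool-calls])

;; Documented defaults; :max-repeated-tool-calls is unset by default.
(def limit-defaults
  {:max-iterations 20
   :max-tool-calls 100})

(defn effective-limits
  "Applies defaults, then top-level step values, then values inside the
  :tools envelope, so the envelope wins when both are present."
  [step-config]
  (merge limit-defaults
         (select-keys step-config limit-keys)
         (select-keys (:tools step-config) limit-keys)))

(effective-limits {:max-iterations 20
                   :max-repeated-tool-calls 4
                   :tools {:mode :execute
                           :max-iterations 12
                           :max-repeated-tool-calls 3}})
;; => {:max-iterations 12, :max-tool-calls 100, :max-repeated-tool-calls 3}
```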
Current limitations:
- :openai {:responses {:transport :websocket}} is not supported.
- :openai {:responses {:shell {:environment {:type :local}}}} is not supported.
- OpenAI Responses-only options (:previous-response-id, :openai.responses) are only available with the :openai provider.
- :reasoning-effort is available with providers that declare it, currently :openai and :deepseek.

Canonical Example

;; In the flow definition:
;; :templates [{:id :summary
;;              :type :llm-prompt
;;              :system "You are concise."
;;              :prompt "Summarize in 3 bullets:\n{{text}}"}]
;; :functions [{:id :llm-input
;;              :language :clojure
;;              :code "(fn [input] {:text (:text input)})"}]
'(let [prepared (flow/step :function :prepare-llm-input
                  {:ref :llm-input
                   :input (flow/input)})
       result (flow/step :llm :summarize
                {:connection :ai
                 :model "gpt-4o-mini"
                 :template :summary
                 :data prepared
                 :output {:format :json}
                 :tools {:mode :propose
                         :allowed ["files" "table" "search"]}
                 :max-iterations 20})]
   result)

MCP Tools

Use MCP tools when a remote MCP endpoint already exposes the operations you
want the model to call. The LLM step does not bind directly to a remote server;
it selects from top-level :mcp adapters.

;; Flow-level setup:
{:requires [{:slot :linear-mcp
             :type :http-api
             :label "Linear MCP"
             :backend :mcp
             :base-url "https://mcp.linear.app"
             :auth {:type :bearer}}]
 :mcp [{:id :linear/issues
        :connection :linear-mcp
        :transport :streamable-http
        :allow-tools ["list_issues"]
        :tool-prefix "linear"
        :tools [{:name "list_issues"
                 :description "List Linear issues."
                 :input-schema {:type "object"
                                :properties {:team_id {:type "string"}}
                                :required ["team_id"]}}]}]}

;; Step-level selection:
(flow/step :llm :find-issues
  {:connection :ai
   :prompt "Find current blocker issues for team ENG."
   :tools {:mode :execute
           :mcp [:linear/list_issues]}})

The model sees a sanitized tool name such as linear_list_issues. Runtime
arguments are sent to the MCP server as JSON-RPC tools/call over the existing
HTTP activity path, so connection auth, SSRF checks, timeout, and response-size
limits apply. The default 190-second MCP tool timeout gives headroom over
Breyta MCP interfaces' 180-second synchronous wait; set adapter :timeout-ms
when a remote MCP server has a tighter or longer expected response window.
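
As a sketch of the timeout override, the adapter below repeats the Linear adapter from the flow-level setup and adds :timeout-ms. The 60-second value is an assumption chosen for illustration.

```clojure
;; Flow-level :mcp fragment with a tighter tool timeout.
{:mcp [{:id :linear/issues
        :connection :linear-mcp
        :transport :streamable-http
        :allow-tools ["list_issues"]
        :tool-prefix "linear"
        ;; Assumed value: 60 seconds instead of the default 190 seconds.
        :timeout-ms 60000
        :tools [{:name "list_issues"
                 :description "List Linear issues."
                 :input-schema {:type "object"
                                :properties {:team_id {:type "string"}}
                                :required ["team_id"]}}]}]}
```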

AWS Bedrock Claude

For Amazon Bedrock Claude model calls, bind the LLM step to a canonical
:http-api connection with :backend :bedrock. The connection stores the
Bedrock Runtime base URL and an AWS SigV4 auth config; the secret stores AWS
credentials.

Requirement shape:

{:requires [{:slot :ai
             :type :http-api
             :provided-by :author
             :label "AWS Bedrock Claude"
             :backend :bedrock
             :base-url "https://bedrock-runtime.us-east-1.amazonaws.com"
             :auth {:type :aws-sigv4
                    :secret-ref :aws-bedrock
                    :region "us-east-1"
                    :service "bedrock"}}]}

Secret payload:

{
  "access-key-id": "AKIA...",
  "secret-access-key": "...",
  "session-token": "optional temporary credential token"
}

Step shape:

(flow/step :llm :extract
  {:connection :ai
   :model "anthropic.claude-3-5-sonnet-20241022-v2:0"
   ;; One input form only: explicit :messages, no :prompt/:system shorthand.
   :messages [{:role "system"
               :content "Extract the key facts."}
              {:role "user"
               :content [{:type "text"
                          :text "Summarize this image."}
                         ;; Uploaded or persisted image resource from the flow input.
                         {:type :image-resource
                          :uri (:uri (:uploaded-image (flow/input)))}
                         {:type "text"
                          :text "Return the result as JSON."}]}]
   :output {:format :json}})

Notes:

- The :llm Bedrock path targets Bedrock Runtime InvokeModel and sends the
  Anthropic Messages request shape with anthropic_version "bedrock-2023-05-31".
- :service should be "bedrock" for Bedrock Runtime.
- The hosted Breyta Bedrock backend currently supports AWS SigV4 auth. Amazon
  Bedrock API keys are a separate bearer-token auth surface and are not wired
  through this backend yet; do not configure Bedrock as :api-key or :bearer.
- For Bedrock Claude Sonnet 4.5 and Haiku 4.5 model ids, set either
  :temperature or :top-p, not both.
- Uploaded or persisted image resources can be used in :messages content
  parts for Bedrock Claude vision-capable models with :type :image-resource.
  Breyta reads the resource and sends the Bedrock-supported base64 image block;
  do not pass res://... values as raw image_url URLs.
- AWS Bedrock's Claude Messages API documents image input as base64 image bytes,
  while the direct Anthropic API also supports image URLs. For Bedrock, use
  :image-resource or an explicit data URL rather than an HTTPS image URL.
- Use the AWS SigV4 section of Step HTTP for raw Bedrock Runtime operations
  that are not normal Claude message calls.
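
To illustrate the :temperature / :top-p note, here is a minimal sketch of a Sonnet 4.5 call that sets only :temperature. The model id is an assumption; confirm the exact model id or inference profile id available in your AWS account. The step id and prompt text are also illustrative.

```clojure
(flow/step :llm :extract-facts
  {:connection :ai
   ;; Assumed model id; verify it against your Bedrock model list.
   :model "anthropic.claude-sonnet-4-5-20250929-v1:0"
   :system "Extract the key facts."
   :prompt "List the key facts from the report as JSON."
   ;; Set :temperature or :top-p for this model family, never both.
   :temperature 0.2
   :output {:format :json}})
```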

Full Config Example

This example covers the complete authored configuration surface. Prefer the
smaller canonical example unless you need the specific option.

(flow/step :llm :review
  {:connection :ai
   :provider :openai
   :model "gpt-5.2"
   :expect {:contains ["summary"]}

   ;; Input forms. Use one primary form in real flows.
   :input {:system "You are a precise reviewer."
           :prompt "Review {{topic}}."
           :context {:topic "billing"}}
   :messages [{:role "system" :content "You are a precise reviewer."}
              {:role "user"
               :content [{:type "text" :text "Review this screenshot."}
                         {:type "image_url"
                          :image_url {:url "https://example.com/screen.png"
                                      :detail "high"}}]}]
   :system "You are concise."
   :prompt "Summarize {{topic}}."
   :template :summary
   :data {:topic "billing"}

   ;; Generation controls.
   :temperature 0.2
   :top-p 0.9
   :stop ["\nDONE"]
   :max-tokens 2000
   :seed 42
   :presence-penalty 0.0
   :frequency-penalty 0.1

   ;; Structured output. Prefer Malli schemas in :output :schema.
   :output {:format :json
            :schema [:map
                     [:summary :string]
                     [:confidence :double]]
            :style :deterministic
            :strict? true}
   ;; Legacy compatibility fields. Omit these when using :output above unless
   ;; you intentionally want the top-level values to override canonical output.
   :response-format :json
   :json-schema {"type" "object"}

   ;; Provider and endpoint controls.
   :provider-opts {:openai {:responses {:store false}}}
   :base-url "https://api.openai.com/v1"
   :deployment "prod-reviewer"
   :api-version "2025-04-01-preview"
   :reasoning-effort :medium
   :prompt-cache-key "summary-v1"
   :previous-response-id "resp_previous"
   :cache-system? true
   :openai {:responses {:tool-choice "required"
                        :store false}}

   ;; Tool calling.
   :available-steps [:files :table :search]
   :tools {:mode :execute
           :allowed ["files" "table" "search"]
           :steps [:github/open-pr]
           :agents [:review/security]
           :require {:tool-names ["files"]
                     :steps [:github/open-pr]
                     :agents [:review/security]}
           :definitions {:custom_tool {:name "custom_tool"
                                       :type :files}}
           :max-iterations 12
           :max-tool-calls 80
           :max-repeated-tool-calls 3}
   :max-iterations 20
   :max-tool-calls 100
   :max-repeated-tool-calls 4

   ;; Usually prefer :connection over explicit :auth.
   :auth {:type :api-key :header "Authorization" :prefix "Bearer"}
   :workspace-id "ws-runtime-override"})

Structured Output Schemas

For new flows, prefer the canonical :output envelope with a Malli schema:

(flow/step :llm :extract
  {:connection :ai
   :prompt "Extract the support request."
   :output {:format :json
            :schema [:map
                     [:subject :string]
                     [:priority [:enum "low" "normal" "high"]]
                     [:tags [:vector :string]]]}})

Breyta validates the Malli schema form, converts it to provider-facing JSON
Schema before the model call, and keeps provider-specific schema capabilities
behind the normal LLM provider boundary.
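
For orientation, the Malli schema above corresponds roughly to the JSON Schema below. This is an approximation of what a Malli-to-JSON-Schema conversion typically yields, not Breyta's exact provider-facing output, which may add or adjust keywords per provider.

```clojure
;; Malli form, as authored:
;; [:map
;;  [:subject :string]
;;  [:priority [:enum "low" "normal" "high"]]
;;  [:tags [:vector :string]]]

;; Approximate JSON Schema equivalent, written as a raw schema map:
{"type" "object"
 "properties" {"subject" {"type" "string"}
               "priority" {"enum" ["low" "normal" "high"]}
               "tags" {"type" "array"
                       "items" {"type" "string"}}}
 "required" ["subject" "priority" "tags"]}
```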

Raw JSON Schema maps are still accepted and are passed through unchanged:

:output {:format :json
         :schema {"type" "object"
                  "properties" {"subject" {"type" "string"}}}}

The older top-level :response-format and :json-schema fields remain
supported for existing flows. Prefer :output {:format :json :schema ...} for
new authoring because the same shape works for both :llm and :agent. If
both forms are present, the top-level legacy fields win for backward
compatibility.

DeepSeek supports JSON mode with :output {:format :json}, but the standard
DeepSeek chat endpoint does not accept Breyta JSON Schema structured output.
Use JSON mode without :schema, or add a deterministic validation/repair step
after the DeepSeek call.
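
A deterministic validation step could be as small as the function below. It is a sketch under assumptions: validate-support-request is a hypothetical name, the expected shape mirrors the support-request schema shown earlier in this section, and the parsed JSON is assumed to arrive as a map with keyword keys. How the parsed value is read from the :llm step result is not covered here.

```clojure
(defn validate-support-request
  "Checks a parsed JSON-mode result against the expected shape.
  Returns {:ok? true :value m} or {:ok? false :errors [...]}."
  [m]
  (let [errors (cond-> []
                 (not (string? (:subject m)))
                 (conj "subject must be a string")

                 (not (contains? #{"low" "normal" "high"} (:priority m)))
                 (conj "priority must be low, normal, or high")

                 (not (and (sequential? (:tags m))
                           (every? string? (:tags m))))
                 (conj "tags must be a list of strings"))]
    (if (empty? errors)
      {:ok? true :value m}
      {:ok? false :errors errors})))

(validate-support-request {:subject "Refund" :priority "urgent" :tags []})
;; => {:ok? false, :errors ["priority must be low, normal, or high"]}
```

In a flow, the function body could be supplied as the :code of a :functions entry and called through a :function step, following the pattern in the canonical example.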

Image Resource Inputs

Use :image-resource content parts when a flow input or earlier step produced
an uploaded/persisted image resource. Breyta resolves these only through the
workspace resource system; the :llm step does not fetch arbitrary remote
image URLs. If the image starts on the web, first use an explicit
resource-producing step such as :http/:files/:persist.

(flow/step :llm :describe-screenshot
  {:connection :ai
   :model "gpt-4o"
   :input {:messages [{:role "user"
                       :content [{:type :text
                                  :text "Describe this screenshot."}
                                 {:type :image-resource
                                  :resource uploaded-image-ref
                                  :detail :high}]}]}})

The resource locator can be supplied as any of:

:resource uploaded-image-ref
:uri "res://..."
:res-uri "res://..."
:resource-uri "res://..."

Optional image fields:

Field	Type	Notes
`:detail`	`:auto`, `:low`, `:high`	Provider detail hint; defaults to `:auto`
`:transport`	`:url`, `:data-url`	`:url` requires signed URL support; `:data-url` forces worker-side inline transport
`:signed-url-ttl-seconds` / `:url-ttl-seconds`	int	Signed URL TTL, clamped to 60-3600 seconds
`:processing`	map	Optional downsample/re-encode before provider call

Processing options:

{:type :image-resource
 :resource uploaded-image-ref
 :transport :data-url
 :processing {:max-width 1600
              :max-height 1600
              :format :jpeg
              :quality 85}}

Behavior and guardrails:

- Image resources require a vision-capable provider/model. Unsupported models
  fail before the provider HTTP request.
- Resource reads stay inside the existing workspace resource/storage boundary.
- Data URL transport validates the declared content type, checks known image
  byte signatures, enforces encoded byte and decoded pixel limits, and
  re-encodes decoded images when processing is requested.
- When the provider/model supports URL image transport and no :processing is
  requested, Breyta prefers a short-lived signed storage URL to keep the
  provider request smaller.
- :transport :url is strict: if an HTTPS signed URL cannot be prepared, the
  step fails instead of silently falling back.
- Signed URLs and base64 image data are not returned in step outputs and should
  not be logged by authored flow code.

OpenAI Responses Hosted Web Search

OpenAI Responses supports hosted web-search tools. In Breyta, pass the native
Responses tool config through :openai.responses.tools on an OpenAI-backed
:llm or :agent step:

'(flow/step :llm :research
   {:connection :ai
    :model "gpt-5.4"
    :prompt "Search for the latest official pricing page and summarize only cited facts."
    :openai {:responses {:tools [{:type "web_search"}]
                         :tool-choice "auto"
                         :store false}}})

Provider naming can vary by OpenAI Responses API version and model. Use the
exact tool type supported by the provider, commonly web_search or
web_search_preview. This is different from Breyta's built-in :search step
tool and different from packaged/MCP/Breyta tools configured through
:tools {:steps [...]}, :tools {:mcp [...]}, or :tools {:breyta ...}.

Reference: OpenAI web search guide:
https://platform.openai.com/docs/guides/tools-web-search?api-mode=responses

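OpenAI Responses Hosted Shell

The OpenAI hosted shell is configured through :openai {:responses {:shell ...}}
on an OpenAI-backed step. Local shell environments ({:type :local}) are not
supported.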

'(flow/step :llm :support-agent
   {:connection :ai
    :model "gpt-5.2"
    :prompt "Analyze this support email and draft a reply."
    :openai {:responses {:shell {:environment {:type :container_auto
                                               :network-policy {:type :allowlist
                                                                :allowed-domains ["gmail.googleapis.com"]}}}
                         :tool-choice "required"
                         :store false}}})

Connection And Installation Notes

For installable flows, declare the LLM connection requirement in :requires
so the binding is resolved at install time:

;; Author provides the connection (installer never sees an API key):
{:requires [{:slot :ai :type :http-api :provided-by :author}]}

;; Installer provides the connection (installer enters their own key):
{:requires [{:slot :ai :type :http-api :label "LLM Provider" :auth {:type :api-key}}]}

Then reference the slot in the step config:

(flow/step :llm :summarize {:connection :ai :prompt "Summarize this."})

When the flow is installed, the platform resolves :connection :ai through
the selected installation binding. The step never hardcodes credentials.

DeepSeek connections should use a normal HTTP API requirement with the
DeepSeek backend:

{:requires [{:slot :deepseek-api
             :type :http-api
             :label "DeepSeek"
             :base-url "https://api.deepseek.com"
             :backends #{:deepseek}
             :auth {:type :api-key}}]}

(flow/step :llm :discover
  {:connection :deepseek-api
   :provider :deepseek
   :model "deepseek-v4-pro"
   :prompt "Find three candidate companies."
   :tools {:mode :execute
           :steps [:web/search]}})

OpenRouter connections use the same HTTP API requirement shape with the
OpenRouter backend. Tool and structured-output support depends on the routed
model, so pick OpenRouter models that advertise the features your flow needs.

{:requires [{:slot :openrouter-api
             :type :http-api
             :label "OpenRouter"
             :base-url "https://openrouter.ai/api/v1"
             :backends #{:openrouter}
             :auth {:type :api-key}}]}

(flow/step :llm :delegate-review
  {:connection :openrouter-api
   :provider :openrouter
   :model "deepseek/deepseek-v4-flash"
   :provider-opts {:http-referer "https://breyta.ai"
                   :x-title "Breyta Flow"}
   :prompt "Review this repository and ask both specialist agents for input."
   :tools {:mode :execute
           :agents [:review/deepseek :review/opus]
           :require {:agents [:review/deepseek :review/opus]}
           :max-iterations 5}})

Use :provided-by :author when the author wants to absorb LLM costs across
all installations. Use an installer-provided connection when each installer
should bring their own API key and control their own spend.

Related

Step Agent: objective/input-oriented wrapper over the :llm tool loop
Packaged Steps: flow-local step wrappers publishable as agent tools
Step Files: source-tree and changeset operations
Step Table: table-resource operations
Installations: installable agent flows
Flow Definition
Templates
Limits And Recovery
CLI Commands
