Persisted Results And Resource Refs
Breyta persistence guide for storing large step outputs as res:// references and retrieving artifacts safely.
Goal
Store large step outputs as resource refs and pass compact references across steps.
Quick Answer
Use :persist when output size is uncertain, return :ref in flow output, and inspect content with breyta resources read <res://...>.
In practice, :persist is a common default for data-producing steps because many real outputs exceed inline thresholds quickly.
For streaming HTTP downloads and other temporary HTTP response blobs, prefer :persist {:type :blob :tier :ephemeral} on the :http step instead of relying on the retained default.
Why Use :persist
Use persistence when a step can produce large or unbounded output:
- avoid bloating inline workflow state
- keep downstream step params shareable and small
- surface retrievable artifacts via breyta resources ...
- avoid frequent rework from crossing the 256 KB inline threshold during iteration
Default Posture
For non-trivial flows, default to :persist for steps that can return variable or growing payloads (:db, :http, :llm), then pass refs downstream.
Treat persist tier as a separate choice:
- use :tier :ephemeral for temporary streamed HTTP downloads, exports, and other short-lived response blobs created by :http
- keep the retained default for non-streaming persists and for artifacts that should remain durable, reusable, or user-discoverable beyond the immediate run
In practice, many flows need both decisions:
- :persist answers "should this stay out of inline workflow state?"
- :tier answers "is this a temporary artifact or a durable retained artifact?"
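Taken together, the two decisions can be sketched on a single :http step; the connection name and path here are illustrative placeholders, not real endpoints:

```clojure
;; Hypothetical streamed export: kept out of inline workflow state
;; (:persist) and marked temporary (:tier :ephemeral) because it only
;; feeds the next step.
(flow/step :http :download-export
  {:connection :source-api        ; assumed connection name
   :method :get
   :path "/exports/latest"
   :response-as :bytes
   :persist {:type :blob          ; decision 1: persist instead of inline state
             :tier :ephemeral}})  ; decision 2: temporary streamed artifact
```

A durable deliverable would keep the same :persist map but omit :tier, falling back to the retained default.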
Minimal Pattern
{:flow
'(let [rows (flow/step :db :query-orders
{:connection :warehouse
:database :postgres
:sql "select * from orders where created_at >= now() - interval '1 day'"})
persisted (flow/step :function :persist-rows
{:input {:rows rows}
:persist {:type :blob}})]
{:rows-ref (:ref persisted)})}
Persisted step returns:
{:ref "res://..."}
Loading Persisted HTTP Responses In Function Steps
When an :http step persists a large response as a blob, pass the whole step result into the downstream function step and mark that field in :load. This restores the persisted HTTP response before your function code runs.
'(let [resp (flow/step :http :generate-image
{:connection :image-api
:method :post
:path "/images/generations"
:persist {:type :blob}})
img (flow/step :function :decode-image
{:input {:resp resp}
:load [:resp]
:persist {:type :blob
:filename "image.jpeg"
:content-type "image/jpeg"}
:code '(fn [{:keys [resp]}]
(-> resp
:body
:data
first
:b64_json
breyta.sandbox/base64-decode-bytes))})]
{:image img})
Use this pattern when the HTTP response body is too large to survive inline transfer to the next step.
Prefer :tier :ephemeral on the persisted HTTP response when the response is a temporary workflow artifact. The downstream function step can still persist the derived file, but that derived persist stays on the retained tier today:
'(let [resp (flow/step :http :generate-image
{:connection :image-api
:method :post
:path "/images/generations"
:persist {:type :blob
:tier :ephemeral}})
img (flow/step :function :decode-image
{:input {:resp resp}
:load [:resp]
:persist {:type :blob
:filename "image.jpeg"
:content-type "image/jpeg"}
:code '(fn [{:keys [resp]}]
(-> resp
:body
:data
first
:b64_json
breyta.sandbox/base64-decode-bytes))})]
{:image img})
Blob Path Templates
Use :path to express the relative storage subpath and keep :filename as the leaf name:
(flow/step :function :persist-report
{:input {:tenant-id tenant-id
:report-id report-id
:rows rows}
:code '(fn [{:keys [rows]}] rows)
:persist {:type :blob
:path "exports/{{input.tenant-id}}"
:filename "report-{{input.report-id}}.json"}})
For plain :persist writes without :slot, runtime stores the artifact under its managed prefix:
workspaces/<ws>/persist/<flow>/<step>/<uuid>/exports/<tenant-id>/report-<report-id>.json
Notes:
- :path is relative only; do not include a leading / or ..
- :path and :filename support {{...}} interpolation from resolved step params (input.*, data.*, query.*, etc.) plus runtime fields like workspace-id, flow-slug, and step-id
- existing slash-bearing :filename flows still work, but new flows should prefer the explicit :path + :filename split
Installer-Configured Storage Scopes
When installers should control where persisted artifacts land, declare a :blob-storage slot and point :persist :slot at that slot.
For connected persists, the installer-configured storage root becomes the write base under the runtime workspace:
workspaces/<ws>/storage/<configured-root>/<persist-path>/<filename>
That is the full platform path shape for connected persists. Breyta does not add hidden <flow>/<step>/<uuid> segments after the configured storage root.
Author the slot once:
{:requires [{:slot :archive
:type :blob-storage
:label "Archive storage"
:config {:prefix {:default "reports"
:label "Folder prefix"
:description "Stored under this folder in the selected storage connection."
:placeholder "reports/customer-a"}}}]}
Use it from :persist:
(flow/step :http :download-report
{:connection :reports-api
:path "/exports/latest"
:response-as :bytes
:persist {:type :blob
:slot :archive
:path "{{input.tenant-id}}/{{input.run-date}}"
:filename "summary-{{input.report-id}}.pdf"}})
With storage root reports/customer-a, that write lands at:
workspaces/<ws>/storage/reports/customer-a/<tenant-id>/<run-date>/summary-<report-id>.pdf
Use the same slot from a runtime resource picker:
{:kind :form
:collect :run
:fields [{:key :report
:label "Archived report"
:field-type :resource
:slot :archive
:accept ["application/pdf"]}]}
Notes:
- every installer-owned :blob-storage slot automatically adds a required setup control for the storage root
- authors can customize that control with :config {:prefix ...}, but cannot disable it
- the chosen root is saved in bindings.<slot>.config.root
- :persist :path stays relative to the configured root rather than repeating it in the step
- connected persists write exactly under workspaces/<ws>/storage/<configured-root>/...
- runtime resource pickers reuse the same resolved connection + root, so the author does not wire the prefix twice
- current writes remain platform-backed; the slot is the authored contract and the runtime binding controls which storage target backs it
- end-user installations derive a private default root such as installations/<profile-id>/reports; shared roots require an explicit override
- sharing is an installer choice: two flows share when installers point them at the same backend and storage root
- slot names stay local to each flow; the concrete storage location is the actual sharing boundary
- persisted blob resources are canonical :file resources, so resource fields default correctly without an explicit :resource-types
- :source remains as a legacy/internal picker-routing field, but the preferred authored model is to bind pickers by :slot
End-To-End Producer And Consumer Example
Use this pattern when Flow A writes files and Flow B later works on those files.
Producer flow:
{:requires [{:slot :archive
:type :blob-storage
:label "Archive storage"
:config {:prefix {:default "reports"
:label "Folder prefix"}}}]
:flow
'(let [download (flow/step :http :download-report
{:connection :reports-api
:response-as :bytes
:persist {:type :blob
:slot :archive
:path "{{input.customer-id}}/{{input.run-date}}"
:filename "summary-{{input.report-id}}.pdf"}})]
{:download download})}
Consumer flow:
{:requires [{:slot :archive
:type :blob-storage
:label "Archive storage"
:prefers [{:flow :report-producer
:slot :archive}]
:config {:prefix {:default "reports"
:label "Folder prefix"}}}
{:kind :form
:collect :run
:fields [{:key :report
:label "Archived report"
:field-type :resource
:slot :archive
:accept ["application/pdf"]}]}]
:flow
'(let [input (flow/input)]
{:report (:report input)})}
By default, two end-user installations stay isolated because each one derives its own private root, such as installations/<producer-profile-id>/reports and installations/<consumer-profile-id>/reports.
They share only if both installations are explicitly configured with the same root:
{:bindings {:archive {:binding-type :connection
:connection-id "platform"
:config {:root "reports/acme"}}}}
and the producer run uses:
{:customer-id "cust-77"
:run-date "2026-03-24"
:report-id "rep-42"}
the stored object path is:
workspaces/<ws>/storage/reports/acme/cust-77/2026-03-24/summary-rep-42.pdf
The consumer does not need to know that path. Its runtime picker simply scopes to the same concrete storage location behind :slot :archive.
If you know one producer flow is the intended upstream lane, add :prefers to the consumer slot.
That records the intended sharing relationship, but it does not auto-select or persist the consumer root.
To share, the installer still must explicitly save the same connection + root on both installations.
Keep these boundaries in mind:
- the producer writes through its own local installer-owned slot such as :archive
- the consumer reads through its own local installer-owned slot such as :archive
- the installer decides whether those slots share by choosing the same or different storage roots
- if two flows point at the same storage location, they share both the utility and the overwrite risk
Resource Types
Use the resource-type split like this:
- :persist {:type :blob ...} creates :file resources
- uploads create :file resources
- :persist {:type :kv ...} creates :result resources
- captured run and step outputs stay :result
Because persisted blobs and uploads are both :file, most resource picker fields can omit :resource-types entirely.
Add :resource-types only when you need something narrower than the default file picker, for example [:result] for structured run outputs.
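For example, a run form can mix both defaults; the field keys and labels below are illustrative:

```clojure
;; File picker: persisted blobs and uploads are both :file, so no
;; :resource-types is needed for the default file picker.
{:kind :form
 :collect :run
 :fields [{:key :attachment
           :label "Input file"
           :field-type :resource}
          ;; Narrower picker over structured run outputs only.
          {:key :previous-output
           :label "Previous run output"
           :field-type :resource
           :resource-types [:result]}]}
```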
Reading Persisted Content
breyta resources workflow list <workflow-id>
breyta resources read <res://...>
breyta resources search "transcript"
Commands that help:
- breyta resources search "<query>" [--type result|file] [--content-sources file,result]
- breyta resources search "<query>" [--storage-backend gcs] [--storage-root reports/acme] [--path-prefix exports/2026]
- breyta resources list [--types file] [--storage-root reports/acme] [--path-prefix exports/2026]
- breyta resources workflow step <workflow-id> <step-id>
- breyta resources get <res://...>
- breyta resources url <res://...>
Use storage filters like this:
- storage-backend narrows by backend family, such as gcs
- storage-root narrows to the installer-configured root, such as reports/acme
- path-prefix narrows further inside that root, such as exports/2026
- path-prefix is relative to the configured root, not the full workspaces/<ws>/storage/... object path
That means a platform-backed persisted file stored at:
workspaces/ws-acme/storage/reports/acme/exports/2026/summary.pdf
is searchable with:
- --storage-backend gcs
- --storage-root reports/acme
- --path-prefix exports/2026
Resource Search Indexing
How persisted artifacts become searchable in breyta resources search:
- search indexes metadata fields (display name, URI/path context, tags, source label)
- connected persists also index normalized storage scope fields so search and pickers can filter by backend, root, and relative path
- text content indexing is enabled only for text-like payloads
- :tier :ephemeral blobs are metadata-indexed by default (raw content is not extracted)
- binary blobs are discoverable by metadata/path context, but raw binary content is not full-text indexed
- indexed text is bounded by size/character limits for stability
For connected persists, Breyta stores both the full path and normalized storage fields:
| Indexed field | Meaning | Example |
|---|---|---|
| path | Full physical object path, useful for broad search/debug context | workspaces/ws-acme/storage/reports/acme/exports/2026/summary.pdf |
| storage_backend | Backend family | platform |
| storage_root | Installer-configured root inside that backend | reports/acme |
| path_under_root | Relative path below the root | exports/2026/summary.pdf |
That split is intentional:
- free-text search can still match the full path
- storage-root and path-prefix use the normalized fields instead of requiring the full workspace storage path
- the same backend/root/relative-path contract can extend to future storage backends without changing authored filters
Persist Blob Tiers
- :retained (default): 50 MB max persisted write size, 12-month default retention
- :ephemeral: 4 GB max persisted write size, short-lived streaming tier (optimized for HTTP downloads)
Which Tier Should You Use?
Use :tier :ephemeral when the blob is a temporary streamed HTTP artifact:
- HTTP downloads and exports
- large API responses that are only being handed to a downstream step
- streamed response bodies that do not need durable retention
Keep the retained default when the blob is meant to last as a durable product, or when the persist is coming from a non-streaming step:
- user-facing deliverables that should remain searchable later
- derived files persisted from :function, :db, or other non-streaming steps
- installer-managed shared storage under a stable business path
- artifacts that another flow or operator is expected to revisit after the run completes
Short rule of thumb:
- :persist without :tier defaults to retained storage
- use :tier :ephemeral only on temporary streamed HTTP persists
- if you need the artifact to behave like a durable file, keep the retained default or use an explicit storage slot/root
:persist :search-index Overrides
Use :search-index under :persist to customize indexed text/metadata for persisted artifacts (especially binary blobs), without changing stored payload bytes.
Target shape:
{:persist {:type :blob
:path "invoices/{{input.customer-id}}"
:filename "invoice.pdf"
:search-index {:text "invoice-id=INV-123 vendor=Acme total=4500"
:tags ["invoice" "acme" "emea"]
:source-label "Invoice PDF from SAP import"
:include-raw-content? false}}}
Intended precedence:
- :search-index.text overrides the default indexed content text
- :search-index.tags overrides/augments indexed tags
- :search-index.source-label overrides the derived source label
- :search-index.include-raw-content? controls whether default extracted text is also included when available
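Assuming the :search-index override above was applied, the persisted PDF should then match a metadata search on the injected tokens, even though the binary payload itself is never full-text indexed:

```shell
# Tokens come from :search-index.text, not from the PDF bytes.
breyta resources search "INV-123" --type file
```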
Open The Same Artifact In Breyta Web
In API mode JSON output, resource responses can include optional webUrl links that point to the artifact context in Breyta Web:
- breyta resources workflow list <workflow-id> --format json -> data.items[].webUrl (and meta.webUrl for a primary destination)
- breyta resources search "<query>" --format json -> data.items[].webUrl and data.items[].display-name
- breyta resources get <res://...> --format json -> data.webUrl (and usually meta.webUrl)
- breyta resources url <res://...> --format json -> signed data.url plus optional data.webUrl/meta.webUrl
Quick extraction pattern:
breyta resources get <res://...> --format json | jq -r '.meta.webUrl // .data.webUrl // empty'
Cross-Flow State Handoff With KV
For shared state between runs/flows, pair result persistence with KV writes:
- persist large step output as res://...
- write a compact KV record that points to that ref
- read KV in downstream flows and resolve the ref only when needed
'(let [payload (flow/step :http :collect
{:connection :source-api
:method :get
:path "/records"
:persist {:type :blob}})
_kv (flow/step :kv :record-latest
{:type :kv
:operation :set
:key "records:latest"
:value {:ref (:ref payload)}
:ttl 604800})
latest (flow/step :kv :load-latest
{:type :kv
:operation :get
:key "records:latest"})]
{:latest (:value latest)})
This keeps orchestration payloads small while still giving operators a durable pointer to the latest artifact.
Design Rules
- persist early when output size is uncertain
- return refs instead of heavy payloads in final output
- pass refs explicitly; don’t hide them in nested structures
- treat persisted artifacts as durable run history
- persist when payloads can grow, are reused across steps, or need operator inspection after completion
- for cross-run/cross-flow lookup, store lightweight pointers in KV instead of duplicating large objects
Troubleshooting
- downstream steps fail with large payloads: persist the producer output and re-run
- resource not found: list workflow resources and match step-id to workflowId
- resources commands require authenticated API mode
- debug persisted refs by listing workflow resources, finding the producing step-id, and reading the target res:// URI
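The debugging loop in the last bullet can be run as a short CLI sequence; the angle-bracket placeholders stand in for real identifiers from your run:

```shell
# 1. List resources produced by the workflow run to find the producing step-id.
breyta resources workflow list <workflow-id>

# 2. Narrow to the resources of that specific step.
breyta resources workflow step <workflow-id> <step-id>

# 3. Read the persisted content behind the ref.
breyta resources read <res://...>
```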