Persisted Results And Resource Refs

Breyta persistence guide for storing large step outputs as res:// references and retrieving artifacts safely.

Goal

Store large step outputs as resource refs and pass compact references across steps.

Quick Answer

Use :persist when output size is uncertain, pass refs downstream, and inspect content with breyta resources read <res://...>.
For rows, :persist {:type :table ...} creates a queryable table resource for later :table steps and breyta resources table ....
For blobs, choose the tier deliberately: retained/default for durable or user-visible artifacts; :tier :ephemeral for temporary streamed HTTP responses.

Persistence is storage, not presentation. A res://... ref is a compact handle
for downstream steps and debugging. When a person should see the result, render
the resource through a final output viewer: usually a Markdown report with
breyta-resource fences, or a deliberate :table, :image, :video,
:download, or :raw viewer. Persisted JSON resources can also render through
:view :json inside Markdown. See Output Artifacts.

Why Use `:persist`

Use persistence when a step can produce large or unbounded output:

avoid bloating inline workflow state
keep downstream step params shareable and small
surface retrievable artifacts via breyta resources ...
avoid frequent rework from crossing the 512 KB inline threshold during iteration
keep row-oriented operational data in a queryable table resource instead of pushing whole rowsets through workflow history

The important hard numbers for authors are:

inline step results are intended to stay under 512 KB
unpersisted step results hard-fail around 1 MB
database result payloads are capped at 1 MB
retained blob persists can write up to 50 MB
ephemeral streamed blob persists can write up to 4 GB
HTTP body loads from refs are capped at 10 MB retained or 20 MB ephemeral

So for data-heavy outputs that can exceed inline limits, return :persist refs
instead of passing the whole value through workflow state.
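
The limits above can be folded into a small decision helper. This is a minimal sketch, not a Breyta API: `persist-advice` and its return keywords are hypothetical, 1 KB is assumed to be 1024 bytes, and the approximate limits are treated as exact thresholds.

```clojure
;; Hypothetical helper, not part of Breyta. Thresholds come from the list
;; above; 1 KB = 1024 bytes is an assumption, and because the ~1 MB hard-fail
;; is approximate, anything at or above 512 KB is simply treated as "persist".
(def kb 1024)
(def mb (* 1024 kb))
(def gb (* 1024 mb))

(defn persist-advice
  "Maps an estimated payload size in bytes to a storage recommendation."
  [size-bytes]
  (cond
    (< size-bytes (* 512 kb)) :inline-ok
    (<= size-bytes (* 50 mb)) :persist-retained
    (<= size-bytes (* 4 gb))  :persist-ephemeral-stream
    :else                     :exceeds-documented-caps))

(persist-advice (* 100 kb)) ; => :inline-ok
(persist-advice (* 2 mb))   ; => :persist-retained
(persist-advice (* 200 mb)) ; => :persist-ephemeral-stream
```

The `:persist-retained` result only means the payload fits the retained write cap; an ephemeral tier can still be the better choice for temporary streamed HTTP responses.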

Storage Tier Decision

:persist has two choices: :type chooses :blob, :table, or :kv; :tier chooses the blob storage tier where supported.

| Tier | How to request it | Use when | Typical cap |
| --- | --- | --- | --- |
| retained | omit `:tier` or use `:tier :retained` | durable, reusable, searchable, user-visible, or needed after the run | `50 MB` write cap |
| ephemeral | `:tier :ephemeral` | temporary streamed HTTP downloads, exports, generated media, or API response bodies | `4 GB` streaming write cap |

Current support boundary: :tier :ephemeral is for streaming HTTP blob persists.
Function, table, and KV persists use the retained/default path today. Retain the
final curated artifact or table; keep intermediate HTTP blobs ephemeral.

Document Fetch, Preserve, Extract, Display

For PDFs and other document files, separate four jobs:

fetch: use :http with :response-as :bytes
preserve: add :persist {:type :blob ...} so the file has a res:// ref
extract: use an external document extraction API or an LLM/tool that explicitly supports that file type
display: return a Markdown report with a download/resource embed

Breyta does not currently expose a built-in PDF text extraction primitive.
Do not assume that persisting a PDF makes its raw text available to functions or
search. Persisted PDFs are discoverable by metadata/path context; add
:persist {:search-index {:text ...}} when you already have trusted extracted
text.

If an HTTP step returns binary bytes, do not pass those bytes to table rows as
text. Persist the blob, pass the resource ref, then call an explicit extraction
service before writing extracted fields to a table.

Default Posture

For non-trivial flows, default to :persist for steps that can return variable or growing payloads (:db, :http, :llm), then pass refs downstream.
When the result is a collection of rows that should stay editable or queryable later, prefer :persist {:type :table ...} over keeping the full rowset inline.

For derived tables, prefer flow/step :table with {:op :materialize-join ...}
over pulling rows into :function and hand-writing joins.

Minimal Pattern

Blob persist:

{:flow
 '(let [download (flow/step :http :download-orders
                   {:url "https://api.example.com/orders.csv"
                    :response-as :bytes
                    :persist {:type :blob
                              :tier :ephemeral
                              :filename "orders.csv"
                              :content-type "text/csv"}})]
    {:download-uri (:uri download)})}

Persisted blob results include the resource ref fields used by resource APIs,
viewers, tables, and downstream loaders:

{:uri "res://..."
 :resource-uri "res://..."
 :blob-ref {...}}

Table persist:

{:flow
 '(let [orders (flow/step :http :fetch-orders
                 {:url "https://api.example.com/orders"
                  :accept :json
                  :persist {:type :table
                            :table "orders"
                            :rows-path [:body :items]
                            :write-mode :upsert
                            :key-fields [:id]}})]
    {:orders-table orders
     :orders-table-uri (:uri orders)})}

Table persists return a resource ref with table/write metadata:

{:type :resource-ref
 :uri "res://v1/ws/ws-123/result/table/tbl_..."
 :content-type "application/vnd.breyta.table+json"
 :preview {:table-name "orders"
           :write-mode :upsert
           :rows-written 100}
 :write {:mode :upsert
         :rows-written 100}}

upsert is incremental: it updates matching key rows and inserts new key rows,
but it does not remove rows omitted from a later write. For "latest snapshot"
tables, include a run/batch key in :key-fields or :partitioning, then query
the current batch. There is no scoped replace/delete-by-group mode yet.
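
The upsert behavior described above can be modeled in memory to see why a smaller rerun leaves old rows behind. This is a sketch under stated assumptions: `upsert-rows` and `batch-rows` are hypothetical helpers, the table is modeled as a plain map from key vector to row, and nothing here calls Breyta.

```clojure
;; In-memory model only: a "table" is a map from key vector to row.
(defn upsert-rows
  "Updates rows whose key already exists and inserts rows with new keys.
  Rows that are absent from `rows` are left untouched."
  [table key-fields rows]
  (reduce (fn [acc row]
            (assoc acc (mapv #(get row %) key-fields) row))
          table
          rows))

;; Plain :id key: the second, smaller write leaves row 2 in place.
(def by-id
  (-> {}
      (upsert-rows [:id] [{:id 1 :status "open"} {:id 2 :status "open"}])
      (upsert-rows [:id] [{:id 1 :status "closed"}])))

(count by-id)             ; => 2
(:status (get by-id [1])) ; => "closed"

;; Batch key: each run writes its own rows and readers filter to one batch.
(def by-batch
  (-> {}
      (upsert-rows [:run_id :id] [{:run_id "run-1" :id 1} {:run_id "run-1" :id 2}])
      (upsert-rows [:run_id :id] [{:run_id "run-2" :id 1}])))

(defn batch-rows
  "Returns only the rows written by the given run."
  [table run-id]
  (filterv #(= run-id (:run_id %)) (vals table)))

(count (batch-rows by-batch "run-2")) ; => 1
```

With the batch key in place, the "latest snapshot" is whatever the newest `run_id` returns, even though earlier batches are still stored.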

Human-Readable Table Output Recipe

Use this pattern when the final output should open as a real Breyta table artifact, not just show a text report with a table inside it.

Build row maps with stable storage keys.
Add human-facing column metadata with :columns.
Persist the rows with :persist {:type :table ...}.
Return the persisted table step result as the :breyta.viewer/value for a :table viewer.

'(let [run-id (str "run-" (flow/now-ms))
       comparison-table
       (flow/step :function :build-comparison-table
                  {:input {:rows comparison-rows
                           :run-id run-id}
                   :code '(fn [{:keys [rows run-id]}]
                            {:rows
                             (mapv (fn [row]
                                     {:run_id run-id
                                      :paragraph (:paragraph row)
                                      :original (:original row)
                                      :cleaned (:cleaned row)
                                      :changed (:changed row)})
                                   rows)})
                   :persist {:type :table
                             :table (str "transcript-comparison-" run-id)
                             :rows-path [:rows]
                             :write-mode :upsert
                             :key-fields [:run_id :paragraph]
                             :indexes [{:field :run_id}
                                       {:field :changed}]
                             :columns [{:column :paragraph
                                        :display-name "Paragraph"}
                                       {:column :original
                                        :display-name "Original"}
                                       {:column :cleaned
                                        :display-name "Cleaned"}
                                       {:column :changed
                                        :display-name "Changed"}]}})]
   {:breyta.viewer/kind :table
    :breyta.viewer/options {:title "Original vs cleaned"}
    :breyta.viewer/value comparison-table})

Inline maps like {:rows [...] :columns [...]} are not table artifacts. A real table artifact has a table content type and a res://.../result/table/... URI.

When the table belongs inside a narrative report, keep the final output as a
Markdown viewer and embed the persisted table with a breyta-resource fence.
That lets the surrounding text, filtered table snapshot, aggregate chart, and
download affordance render in document order without exposing the res:// URI
to end users. See Output Artifacts for the full
Markdown resource embed syntax.

Verification loop:

breyta runs show <workflow-id> --pretty
breyta resources read <table-uri> --limit 25 --offset 0

Check that the final output table item contains :type :resource-ref, :content-type "application/vnd.breyta.table+json", and non-zero :preview :rows-written or :preview :row-count.

materialize-join remains incremental in v1:

destination writes use :append or :upsert
there is no snapshot/replace mode yet
joins read the current materialized row state of source tables, so :recompute first when derived source values must be refreshed

The same rule applies to ordinary table persists: a smaller rerun does not
delete rows from an earlier larger run unless the flow models each run as its own
batch/partition and reads the latest batch.

Loading Persisted HTTP Responses In Function Steps

For large persisted HTTP responses, pass the whole step result into the
downstream function input and mark that field in :load. Use
:tier :ephemeral for temporary intermediates; omit :tier when the HTTP blob
itself is durable.

'(let [resp (flow/step :http :generate-image
               {:connection :image-api
                :method :post
                :path "/images/generations"
                :persist {:type :blob
                          :tier :ephemeral}})
       img (flow/step :function :decode-image
              {:input {:resp resp}
               :load [:resp]
               :persist {:type :blob
                         :filename "image.jpeg"
                         :content-type "image/jpeg"}
               :code '(fn [{:keys [resp]}]
                        (-> resp
                            :body
                            :data
                            first
                            :b64_json
                            breyta.sandbox/base64-decode-bytes))})]
   {:image img})

Use this pattern when the HTTP response body is too large to survive inline transfer to the next step.

Blob Path Templates

Use :path to express the relative storage subpath and keep :filename as the leaf name:

(flow/step :function :persist-report
  {:input {:tenant-id tenant-id
           :report-id report-id
           :rows rows}
   :code '(fn [{:keys [rows]}] rows)
   :persist {:type :blob
             :path "exports/{{input.tenant-id}}"
             :filename "report-{{input.report-id}}.json"}})

For plain :persist writes without :slot, runtime stores the artifact under its managed prefix:

workspaces/<ws>/persist/<flow>/<step>/<uuid>/exports/<tenant-id>/report-<report-id>.json

Notes:

:path is relative only; do not include a leading / or any .. segment
:path and :filename support {{...}} interpolation from resolved step params (input.*, data.*, query.*, etc.) plus runtime fields like workspace-id, flow-slug, and step-id
Existing slash-bearing :filename flows still work, but new flows should prefer the explicit :path + :filename split
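
To make the path rules above concrete, the following sketch models the interpolation and the managed path shape with plain Clojure. Every function here (`render-template`, `relative-path?`, `managed-object-path`) is hypothetical and only approximates what the runtime does; the sample workspace, flow, and uuid values are placeholders.

```clojure
(require '[clojure.string :as str])

;; Hypothetical model of {{...}} interpolation: a dotted name such as
;; input.tenant-id is looked up as [:input :tenant-id] in the params map.
(defn render-template
  [template params]
  (str/replace template
               #"\{\{\s*([^}\s]+)\s*\}\}"
               (fn [[_ dotted]]
                 (str (get-in params
                              (mapv keyword (str/split dotted #"\.")))))))

;; Mirrors the note above: no leading slash and no ".." segment.
(defn relative-path?
  [path]
  (and (not (str/starts-with? path "/"))
       (not-any? #{".."} (str/split path #"/"))))

;; Joins the segments in the managed-prefix shape shown above.
(defn managed-object-path
  [{:keys [ws flow step uuid path filename]}]
  (str/join "/" ["workspaces" ws "persist" flow step uuid path filename]))

(def params {:input {:tenant-id "acme" :report-id "r-42"}})

(render-template "exports/{{input.tenant-id}}" params)     ; => "exports/acme"
(render-template "report-{{input.report-id}}.json" params) ; => "report-r-42.json"
(relative-path? "exports/acme")                            ; => true
(relative-path? "../secrets")                              ; => false
```

Calling `managed-object-path` with the two rendered values reproduces the `workspaces/<ws>/persist/<flow>/<step>/<uuid>/...` layout for any placeholder workspace, flow, step, and uuid.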

Installer-Configured Storage Scopes

When installers should control where persisted artifacts land, declare a :blob-storage slot and point :persist :slot at that slot.
For connected persists, the installer-configured storage root becomes the write base under the runtime workspace:

workspaces/<ws>/storage/<configured-root>/<persist-path>/<filename>

That is the full platform path shape for connected persists. Breyta does not add hidden <flow>/<step>/<uuid> segments after the configured storage root.

Author the slot once:

{:requires [{:slot :archive
             :type :blob-storage
             :label "Archive storage"
             :config {:prefix {:default "reports"
                               :label "Folder prefix"
                               :description "Stored under this folder in the selected storage connection."
                               :placeholder "reports/customer-a"}}}]}

Use it from :persist:

(flow/step :http :download-report
  {:connection :reports-api
   :path "/exports/latest"
   :response-as :bytes
   :persist {:type :blob
             :slot :archive
             :path "{{input.tenant-id}}/{{input.run-date}}"
             :filename "summary-{{input.report-id}}.pdf"}})

With storage root reports/customer-a, that write lands at:

workspaces/<ws>/storage/reports/customer-a/<tenant-id>/<run-date>/summary-<report-id>.pdf

Use the same slot from a runtime resource picker:

{:invocations {:default
               {:inputs [{:name :report
                          :label "Archived report"
                          :type :resource
                          :slot :archive
                          :accept ["application/pdf"]}]}}}

Notes:

installer-owned :blob-storage slots add a required setup control for the storage root
authors can customize the default/prefix with :config {:prefix ...}, but cannot disable setup
:persist :path stays relative to the configured root
resource pickers reuse the same resolved connection/root, so authors do not wire the prefix twice
end-user installations default to isolated roots; shared roots require explicit installer configuration
persisted blob resources are canonical :file resources
prefer binding invocation resource inputs by :slot

End-To-End Producer And Consumer Example

Use this pattern when Flow A writes files and Flow B later works on those files.
For example, an influencer research flow can write a retained CSV to a private
installer-scoped folder, and an outreach flow can let the same user pick that
CSV from the run form resource picker instead of downloading and uploading CSV files manually.

Producer flow:

{:requires [{:slot :archive
             :type :blob-storage
             :label "Archive storage"
             :config {:prefix {:default "reports"
                               :label "Folder prefix"}}}]
 :flow
 '(let [download (flow/step :http :download-report
                   {:connection :reports-api
                    :response-as :bytes
                    :persist {:type :blob
                              :slot :archive
                              :path "{{input.customer-id}}/{{input.run-date}}"
                              :filename "summary-{{input.report-id}}.pdf"}})]
    {:download download})}

Consumer flow:

{:requires [{:slot :archive
             :type :blob-storage
             :label "Archive storage"
             :prefers [{:flow :report-producer
                        :slot :archive}]
             :config {:prefix {:default "reports"
                               :label "Folder prefix"}}}]
 :invocations {:default
               {:inputs [{:name :report
                          :label "Archived report"
                          :type :resource
                          :slot :archive
                          :accept ["application/pdf"]}]}}
 :flow
 '(let [input (flow/input)]
    {:report (:report input)})}

By default, two end-user installations stay isolated because each one derives its own private root, such as installations/<producer-installation-id>/reports and installations/<consumer-installation-id>/reports.
They share only if both installations are explicitly configured with the same root. If both installations use:

{:bindings {:archive {:binding-type :connection
                      :connection-id "platform"
                      :config {:root "reports/acme"}}}}

and the producer run uses:

{:customer-id "cust-77"
 :run-date "2026-03-24"
 :report-id "rep-42"}

the stored object path is:

workspaces/<ws>/storage/reports/acme/cust-77/2026-03-24/summary-rep-42.pdf

The consumer does not need to know that path. Its runtime picker simply scopes to the same concrete storage location behind :slot :archive.
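
The worked path above can be reproduced with a one-line join, which also shows that nothing is inserted between the configured root and the persist path. `connected-object-path` is a hypothetical helper for illustration, and it takes already-resolved values rather than templates.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: joins the documented connected-persist path shape
;; workspaces/<ws>/storage/<configured-root>/<persist-path>/<filename>.
(defn connected-object-path
  [ws root persist-path filename]
  (str/join "/" ["workspaces" ws "storage" root persist-path filename]))

(connected-object-path "<ws>"
                       "reports/acme"
                       "cust-77/2026-03-24"
                       "summary-rep-42.pdf")
;; => "workspaces/<ws>/storage/reports/acme/cust-77/2026-03-24/summary-rep-42.pdf"
```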

For public UX, prefer this resource picker handoff when a downstream flow should
reuse an artifact from a prior run. Keep manual upload as a fallback, but do not
make users browse all workspace resources or copy res:// values by hand.

If you know one producer flow is the intended upstream lane, add :prefers to the consumer slot.
That records the intended sharing relationship, but it does not auto-select or persist the consumer root.
To share, the installer still must explicitly save the same connection + root on both installations.

Keep these boundaries in mind:

the producer writes through its own local installer-owned slot such as :archive
the consumer reads through its own local installer-owned slot such as :archive
the installer decides whether those slots share by choosing the same or different storage roots
if two flows point at the same storage location, they share both the utility and the overwrite risk

Resource Types

Use the resource-type split like this:

:persist {:type :blob ...} creates :file resources
uploads create :file resources
:persist {:type :kv ...} creates :result resources
:persist {:type :table ...} creates :result resources backed by the :persist-table adapter
captured run and step outputs stay :result

Because persisted blobs and uploads are both :file, most resource picker fields can omit :resource-types entirely.
Add :resource-types only when you need something narrower than the default file picker, for example [:result] for structured run outputs or persisted table resources.

Table Resources

Table resources are persisted results with a bounded table-like query/edit surface.

In the resource panel, partitioned table families render as grouped table resources:

the family remains the primary resource identity
when tablePartition is omitted, the panel defaults to the newest :date-bucket table or the first bounded table for other strategies
if an explicit tablePartition is missing, the panel shows a clear warning instead of silently falling back
the panel itself is read-only; use breyta resources table ... or flow/step :table for imports and mutations
the panel keeps CSV export for the currently previewed table
the panel keeps Copy Markdown for the currently visible preview page
table selection rerenders in place inside the current panel or sidepeek
most family metadata sits behind a compact info tooltip so the preview stays focused on rows and columns

On run pages, table resource refs open the primary table preview by default. Use artifactUri=... for another resource.

Human-readable table output should include the persisted table ref. Inline :rows / :columns / :schema / :query maps are not table resources.

Create on first write:

(flow/step :http :fetch-orders
  {:url "https://example.com/orders"
   :persist {:type :table
             :table "orders"
             :rows-path [:body :items]
             :write-mode :upsert
             :key-fields [:order-id]
             :indexes [{:field :status}
                       {:field :customer-id}]}})

Use the dedicated :table step later:

(flow/step :table :open-orders
  {:op :query
   :table {:ref orders-ref}
   :where [[:status := "open"]]
   :sort [[:order-id :asc]]
   :page {:mode :offset
          :limit 25
          :offset 0}})

Query paging contract:

:page is required for :table {:op :query ...}
:table {:ref <resource-ref>} is canonical; bare refs work for simple reads.
:page.mode must be explicit as :offset or :cursor
cursor paging requires explicit :sort
the first cursor page omits :page.cursor

You can also author or evolve logical columns later:

(flow/step :table :define-customer-name
  {:op :set-column
   :table {:ref orders-ref}
   :column :customer-name
   :definition {:semantic-type :text
                :computed {:type :lookup
                           :reference-column :customer-id
                           :field :name}}})

set-column backfills bounded tables. Use :recompute only to rerun derived/reference values. For partitioned families, pass partition scope on :table.

Dynamic enum columns keep stored values stable while letting authors control rendered labels:

(flow/step :table :define-status-enum
  {:op :set-column
   :table {:ref orders-ref}
   :column :status
   :definition {:display-name "Status"
                :enum {:options [{:id "open"
                                  :name "Open"
                                  :aliases ["OPEN" "Open"]}
                                 {:id "in-progress"
                                  :name "In progress"
                                  :aliases ["IN_PROGRESS" "In Progress"]}]}}})

Enum behavior:

:enum implies type-hint "enum"
writes, :update-cell, CSV import, and :recompute normalize incoming scalar values to stable ids
matching accepts existing ids, names, and aliases
unknown values dynamically grow the enum definition with a normalized id and a derived display name
stored rows, :query, :get-row, and CSV export keep the normalized ids
the web table preview and Copy Markdown render enum names instead of raw ids
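
The matching half of the enum behavior can be sketched as a lookup over the options from the example above. `enum-id` and `enum-name` are hypothetical helpers, exact string equality is an assumption, and the growth rule for unknown values is not modeled because the text does not specify how ids and names are derived.

```clojure
;; Options copied from the :define-status-enum example above.
(def status-options
  [{:id "open" :name "Open" :aliases ["OPEN" "Open"]}
   {:id "in-progress" :name "In progress" :aliases ["IN_PROGRESS" "In Progress"]}])

(defn enum-id
  "Returns the stable id whose id, name, or aliases equal `value`, else nil.
  Exact string equality is an assumption of this sketch."
  [options value]
  (some (fn [{:keys [id aliases] display-name :name}]
          (when (contains? (set (concat [id display-name] aliases)) value)
            id))
        options))

(defn enum-name
  "Returns the display name for a stored id, as a rendering surface would."
  [options id]
  (some #(when (= id (:id %)) (:name %)) options))

(enum-id status-options "IN_PROGRESS")   ; => "in-progress"
(enum-id status-options "Open")          ; => "open"
(enum-name status-options "in-progress") ; => "In progress"
(enum-id status-options "archived")      ; => nil (growth is not modeled here)
```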

Display formatting is render-only:

column :format metadata and sparse :update-cell-format overrides can render relative-time, date, timestamp / date-time, and currency
the web table preview and Copy Markdown apply those formats to the currently visible page
CLI/API query surfaces and CSV export keep canonical raw values

Resource refs are also first-class cell values:

store canonical {:type :resource-ref :uri ...} maps in row data when a cell should point at another resource
the web table preview renders those cells as clickable resource chips and opens the target resource in the same panel or sidepeek
Copy Markdown uses the rendered label for the currently visible page
CLI/API query surfaces and CSV export keep the canonical raw resource-ref value

Table resources can also be used as invocation inputs. Declare a :resource
input filtered to :result resources and, when you want only persisted tables,
add the table MIME type:

{:invocations {:default
               {:inputs [{:name :source-table
                          :label "Source table"
                          :type :resource
                          :resource-types [:result]
                          :accept ["application/vnd.breyta.table+json"]}]}}
 :flow
 '(let [input (flow/input)
        preview (flow/step :table :preview-source
                  {:op :query
                   :table (:source-table input)
                   :page {:mode :offset
                          :limit 25}})]
    {:source-table (:source-table input)
     :preview preview})}

Important boundaries:

paged by default
query-like operations stay bounded and scoped to one table or an explicit bounded partition subset
no implicit all-partitions scans from the family root
joins only through bounded :materialize-join
no raw SQL
no cross-workspace reads

Key v1 table-family limits:

500 table resources (families) per workspace
50_000 live rows per concrete table inside a family
200 columns per table
16 promoted/index fields per table
128 partitions per family
16 partitions touched per write
12 selected partitions per query/aggregate/export
24 selected partitions per preview/read/schema
256 max partition key bytes
64 KB max cell size
256 KB max row payload
256 MB max table size
2 GB max workspace table DB size
1_000 rows per write
1_000 rows per query page
10_000 max query scan window via page.offset + page.limit
200 max aggregate groups

If you need materially more than that, use a dedicated database/query backend instead of expanding table-resource workarounds flow-by-flow.
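
Two of the limits above shape everyday flow code: 1_000 rows per write and the 10_000-row offset scan window. The sketch below shows one way to stay inside both; `write-batches` and `offset-pages` are hypothetical helpers, and how each batch or page is then handed to a step is left out.

```clojure
;; Limits copied from the list above.
(def max-rows-per-write 1000)
(def max-rows-per-page 1000)
(def max-scan-window 10000)

(defn write-batches
  "Splits rows into vectors of at most 1_000 rows, one per write."
  [rows]
  (mapv vec (partition-all max-rows-per-write rows)))

(defn offset-pages
  "Returns the :page maps of size `limit` that fit inside the scan window."
  [limit]
  {:pre [(<= 1 limit max-rows-per-page)]}
  (for [offset (range 0 max-scan-window limit)
        :when (<= (+ offset limit) max-scan-window)]
    {:mode :offset :limit limit :offset offset}))

(mapv count (write-batches (range 2500))) ; => [1000 1000 500]
(count (offset-pages 1000))               ; => 10
(last (offset-pages 1000))                ; => {:mode :offset, :limit 1000, :offset 9000}
```

Rows beyond the last page returned by `offset-pages` are outside the offset scan window, which is one signal that cursor paging or a narrower `:where` is the better fit.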

Current design guidance when one logical dataset approaches bounded-table limits:

keep 50_000 live rows per concrete table or partition as a real boundary
use first-class :partitioning when the data naturally partitions by region, tenant, source, or a date bucket and most reads/writes stay within one partition or a small bounded subset
keep the family root as the schema/metadata owner and select partition scope explicitly for query-like operations instead of expecting implicit all-partitions scans
use separate explicit tables when the data truly represents different datasets or lifecycles, not just as a workaround for missing partition support
if the workload mainly needs wide cross-partition scans, arbitrary joins, or general database behavior, prefer a dedicated :db step and external database/query backend

Reading Persisted Content

breyta resources workflow list <workflow-id>
breyta resources read <res://...>
breyta resources search "transcript" --limit 10

Commands that help:

breyta resources search "<query>" [--limit 10] [--type result|file] [--content-sources file,result]
breyta resources search "<query>" [--limit 10] [--storage-backend gcs] [--storage-root reports/acme] [--path-prefix exports/2026]
breyta resources list [--types file] [--storage-root reports/acme] [--path-prefix exports/2026]
breyta resources workflow step <workflow-id> <step-id>
breyta resources get <res://...>
breyta resources read <res://table-uri> [--limit 100] [--offset 0]
breyta resources table query <res://table-uri> --page-mode offset --limit 100
breyta resources table query <res://table-uri> --page-mode cursor --sort-json '[["order-id","asc"]]'
breyta resources table get-row <res://table-uri> --row-id <row-id>
breyta resources table get-row <res://table-uri> --key order-id=ord-1
breyta resources table get-row <res://table-uri> --key meeting-key=m1 --key agenda-item-number=1
breyta resources table aggregate <res://table-uri> --group-by currency --metrics-json '[...]'
breyta resources table aggregate <res://table-uri> --group-by-json '[...]' --metrics-json '[...]'
breyta resources table schema <res://table-uri>
breyta resources table export <res://table-uri> [--out orders.csv]
breyta resources table import <res://table-uri> --file orders.csv --write-mode append|upsert
breyta resources table import orders-import --file orders.csv --write-mode upsert --key-fields order-id [--index-fields status]
breyta resources table update-cell <res://table-uri> --key order-id=ord-1 --column status --value closed
breyta resources table update-cell-format <res://table-uri> --key order-id=ord-1 --column amount --format-json '{"display":"currency","currency":"USD"}'
breyta resources table set-column <res://table-uri> --column customer-name --computed-json '{...}'
breyta resources table set-column <res://table-uri> --column status --enum-json '{...}'
breyta resources table recompute <res://table-uri> --limit 1000 --offset 0
breyta resources url <res://...>

For blobs, resources read returns a compact content preview by default. For table URIs, it returns a bounded preview page and pagination metadata. Use --full only when the complete payload is required.
Switch to resources table ... when you need the richer query, export, import, aggregate, or single-cell edit surface.
For enum columns, CLI/API query and export surfaces return the stored normalized ids; the web table preview and Copy Markdown render the configured names.
For partitioned families, single-cell edits stay within the selected partition and cannot change the partition-driving field; use a normal write/upsert when a row should land in a different table.

The bounded aggregate surface also supports:

group ordering via order-by-json
truncation visibility via hasMore
metric-local filters via where
scalar arg-max / arg-min metrics for "latest/highest row value per group" cases
having-json for post-group filtering
bounded collect-set metrics
group-by-json for date bucket and numeric-bin specs
percentile and median for bounded distribution reporting

Use storage filters like this:

storage-backend narrows by backend family, such as gcs
storage-root narrows to the installer-configured root, such as reports/acme
path-prefix narrows further inside that root, relative to it, such as exports/2026
path-prefix is relative to the configured root, not the full workspaces/<ws>/storage/... object path

That means a platform-backed persisted file stored at:

workspaces/<ws>/storage/reports/acme/exports/2026/summary.pdf

is searchable with:

--storage-backend gcs
--storage-root reports/acme
--path-prefix exports/2026

Resource Search Indexing

How persisted artifacts become searchable in breyta resources search:

search indexes metadata fields (display name, URI/path context, tags, source label)
connected persists also index normalized storage scope fields so search and pickers can filter by backend, root, and relative path
text content indexing is enabled only for text-like payloads
:tier :ephemeral blobs are metadata-indexed by default (raw content is not extracted)
binary blobs are discoverable by metadata/path context, but raw binary content is not full-text indexed
indexed text is bounded by size/character limits for stability

For connected persists, Breyta stores both the full path and normalized storage fields:

| Indexed field | Meaning | Example |
| --- | --- | --- |
| `path` | Full physical object path, useful for broad search/debug context | `workspaces/<ws>/storage/reports/acme/exports/2026/summary.pdf` |
| `storage_backend` | Backend family | `platform` |
| `storage_root` | Installer-configured root inside that backend | `reports/acme` |
| `path_under_root` | Relative path below the root | `exports/2026/summary.pdf` |

That split is intentional:

free-text search can still match the full path
storage-root and path-prefix use the normalized fields instead of requiring the full workspace storage path
the same backend/root/relative-path contract can extend to future storage backends without changing authored filters
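
The split between the full path and the normalized fields can be illustrated with a small model. Both functions are hypothetical: `storage-fields` derives the fields from a full object path and a known root, and `matches-filters?` assumes that storage-root is an exact match and path-prefix is a plain string prefix on the relative path, which the text above does not spell out.

```clojure
(require '[clojure.string :as str])

;; Hypothetical model of the normalized storage fields for connected persists.
(defn storage-fields
  "Returns the indexed fields for an object path under the given root, or nil
  when the path does not live under workspaces/<ws>/storage/<root>/."
  [ws root object-path]
  (let [prefix (str "workspaces/" ws "/storage/" root "/")]
    (when (str/starts-with? object-path prefix)
      {:path object-path
       :storage-root root
       :path-under-root (subs object-path (count prefix))})))

;; Assumed filter semantics: exact root match, prefix match below the root.
(defn matches-filters?
  [fields {:keys [storage-root path-prefix]}]
  (and (or (nil? storage-root) (= storage-root (:storage-root fields)))
       (or (nil? path-prefix)
           (str/starts-with? (:path-under-root fields) path-prefix))))

(def summary
  (storage-fields "ws-1" "reports/acme"
                  "workspaces/ws-1/storage/reports/acme/exports/2026/summary.pdf"))

(:path-under-root summary) ; => "exports/2026/summary.pdf"
(matches-filters? summary {:storage-root "reports/acme" :path-prefix "exports/2026"})
;; => true
(matches-filters? summary {:path-prefix "workspaces/ws-1/storage"})
;; => false, because the prefix is relative to the root
```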

`:persist :search-index` Overrides

Use :search-index under :persist to customize indexed text/metadata for persisted artifacts (especially binary blobs), without changing stored payload bytes.

Target shape:

{:persist {:type :blob
           :path "invoices/{{input.invoice.customer-id}}"
           :filename "invoice.pdf"
           :search-index {:text "invoice-id=INV-123 vendor=Acme total=4500"
                          :tags ["invoice" "acme" "emea"]
                          :source-label "Invoice PDF from SAP import"
                          :include-raw-content? false}}}

Intended precedence:

:search-index.text overrides default indexed content text
:search-index.tags overrides/augments indexed tags
:search-index.source-label overrides derived source label
:search-index.include-raw-content? controls whether default extracted text is also included when available

Find the same persisted artifact later with flow/step :search:

'(let [artifact (flow/step :function :persist-invoice
                  {:input {:invoice invoice}
                   :code '(fn [{:keys [invoice]}] invoice)
                   :persist {:type :blob
                             :path "invoices/{{input.invoice.customer-id}}"
                             :filename "invoice.json"
                             :content-type "application/json"
                             :search-index {:text "invoice-id=INV-123 vendor=Acme total=4500"
                                            :tags ["invoice" "acme" "emea"]
                                            :source-label "Invoice bundle for Acme"
                                            :include-raw-content? true}}})
       hits (flow/step :search :find-invoice
              {:query "invoice-id=INV-123"
               :targets [:resources]
               :limit 5
               :hydrate {:enabled true
                         :top-k 1
                         :max-chars 12000}})]
   {:artifact artifact
    :hits hits})

Quick operator/debug loop:

breyta resources search "invoice-id=INV-123" --limit 10

Open The Same Artifact In Breyta Web

In API mode JSON output, resource responses can include optional webUrl links that point to the artifact context in Breyta Web:

breyta resources workflow list <workflow-id> --format json -> data.items[].webUrl (and meta.webUrl for a primary destination)
breyta resources search "<query>" --format json -> data.items[].webUrl and data.items[].displayName
breyta resources get <res://...> --format json -> data.webUrl (and usually meta.webUrl)
breyta resources url <res://...> --format json -> signed data.url plus optional data.webUrl/meta.webUrl

Quick extraction pattern:

breyta resources get <res://...> --format json | jq -r '.meta.webUrl // .data.webUrl // empty'

Signed URLs Vs Public Preview Links

breyta resources url <res://...> returns a temporary signed URL for direct
resource access. It is useful for operators, runtime handoff, and debugging, but
it is not the production outreach/share mechanism for external viewers.

Use public artifact preview links when someone should open a read-only artifact
without logging in:

POST /api/resources/shares
GET /public/artifact-previews/:token
GET /public/artifact-previews/:token/download
DELETE /api/resources/shares/:token

Send X-Breyta-Workspace: <workspace-id> on the authenticated create and
revoke API calls.

Public preview links are unlisted, revocable, optionally expiring, and render a
sanitized artifact page. The page hides workspace/run/debug metadata, raw
resource refs, private resource-content proxy URLs, common signed storage URLs,
and private resource actions. Use this for creator outreach or external review
flows where the recipient should see the opportunity or report, not the
workspace internals.

When the share request sets allowDownload: true and the shared artifact is
text-like, such as EDN, Markdown, CSV, JSON, XML, JavaScript, form data, or
plain text, the share response includes a token-scoped publicDownloadUrl. That route serves a
bounded attachment through the same revocable/expiring token. It does not expose
private resource-content URLs or signed storage URLs, and it does not serve
binary media or sandboxed HTML previews.

Cross-Flow State Handoff With KV

For shared state between runs/flows, pair result persistence with KV writes:

persist large step output as res://...
write a compact KV record that points to that ref
read KV in downstream flows and resolve ref only when needed

'(let [payload (flow/step :http :collect
                 {:connection :source-api
                  :method :get
                  :path "/records"
                  :persist {:type :blob}})
       _kv (flow/step :kv :record-latest
             {:operation :set
              :key "records:latest"
              :value {:uri (:uri payload)}
              :ttl 604800})
       latest (flow/step :kv :load-latest
                {:operation :get
                 :key "records:latest"})]
   {:latest (:value latest)})

This keeps orchestration payloads small while still giving operators a durable pointer to the latest artifact.

Design Rules

persist early when output size is uncertain
return refs instead of heavy payloads in final output
pass refs explicitly; don’t hide them in nested structures
treat persisted artifacts as durable run history
persist when payloads can grow, are reused across steps, or need operator inspection after completion
use :persist {:type :table ...} when row-shaped data should stay queryable/editable as a resource later
for cross-run/cross-flow lookup, store lightweight pointers in KV instead of duplicating large objects

Troubleshooting

downstream steps fail with large payloads: persist the producer output and re-run
resource not found: list workflow resources and confirm the step-id belongs to the expected workflowId
resources commands require authenticated API mode
debug persisted refs by listing workflow resources, finding the producing step-id, and reading the target res:// URI