Step Table (:table)
Quick Answer
Use flow/step :table to query, inspect, export, edit, and evolve an existing table resource without exposing raw SQL.
Table creation happens through :persist {:type :table ...} on another step. The :table step is the bounded runtime surface for working with that persisted table later.
Worker mental model:
- a normal flow step returns row-shaped data
- :persist {:type :table ...} tells the runtime to write those rows into a table family, creating it on first write
- the persisted step result becomes the canonical table {:type :resource-ref :uri ...} handle
- later :table steps consume that ref; include it in final flow output when you want the run/resource UI to expose it directly
For partitioned table families, query-like ops should select partition scope explicitly:
- use the family root for :schema
- use {:ref <resource-uri> :partitions {:key "..."}} or {:ref <resource-uri> :partitions {:keys ["..." "..."]}} for :query, :get-row, :aggregate, :export, :update-cell, :update-cell-format, :set-column, and :recompute
- do not rely on implicit all-partitions scans
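The partition-scope shapes above can be built with a small helper before the map is handed to a :table step. This is a minimal sketch in plain Clojure: partition-scoped-table and max-selected-partitions are hypothetical names, not part of the flow API, and the cap of 12 is the selected-partitions limit this page lists for query/aggregate/export.

```clojure
;; Hypothetical helper (not part of the flow API): builds the :table target
;; map for a partitioned family and refuses implicit all-partitions scope.
(def max-selected-partitions 12) ; query/aggregate/export cap listed on this page

(defn partition-scoped-table
  "Returns {:ref uri :partitions {:key k}} for one partition key, or
  {:ref uri :partitions {:keys [...]}} for several."
  [resource-uri partition-keys]
  (let [ks (vec (distinct partition-keys))]
    (cond
      (empty? ks)
      (throw (ex-info "select at least one partition explicitly" {}))

      (> (count ks) max-selected-partitions)
      (throw (ex-info "too many partitions selected"
                      {:selected (count ks) :max max-selected-partitions}))

      (= 1 (count ks))
      {:ref resource-uri :partitions {:key (first ks)}}

      :else
      {:ref resource-uri :partitions {:keys ks}})))

;; Usage: the result goes under :table in a :query, :aggregate, or :export step.
(partition-scoped-table "res://v1/ws/ws-123/result/table/tbl_..."
                        ["month-2026-03" "month-2026-04"])
```

Failing early in the flow keeps the error close to the code that chose the partitions, instead of surfacing later as a rejected query.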
Canonical Shape
Common fields:
| Field | Type | Required | Notes |
|---|---|---|---|
| :type | keyword | Yes | Must be :table |
| :op | keyword/string | Yes | One of :query, :get-row, :aggregate, :schema, :export, :update-cell, :update-cell-format, :set-column, :recompute, :materialize-join |
| :table | map/ref | Usually | Single-table target. Use {:ref <resource-uri>}; add :partitions when needed. Bare refs work for simple ops. :materialize-join uses :left, :right, and :into. |
| :expect | keyword | No | Standard step output expectation |
| :provider-opts | map | No | Escape hatch options for advanced runtime integration |
Per-op fields:
| Op | Required fields | Optional fields | Notes |
|---|---|---|---|
| :query | :table, :page | :select, :where, :sort | Paged by default; :page.mode is explicit |
| :get-row | :table plus :row-id or :key | none | Fetch one row by stable id or key fields |
| :aggregate | :table | :where, :group-by, :metrics, :having, :order-by, :limit | Single-table aggregates only |
| :schema | :table | none | Returns columns, key/index fields, and stats |
| :export | :table | :format, :select, :where, :sort | V1 export format is :csv |
| :update-cell | :table, :column, :value, plus :row-id or :key | none | Updates one canonical cell value |
| :update-cell-format | :table, :column, plus :row-id or :key | :format | Sparse formatting override; omit/clear format to remove override |
| :set-column | :table, :column | :definition | Create/update one logical column definition, including semantic/computed/reference/enum metadata; existing rows are backfilled automatically |
| :recompute | :table | :where, :limit, :offset | Recompute materialized computed/reference columns for existing rows |
| :materialize-join | :left, :right, :on, :into | :join-type, :project, :op-id | Build or refresh a destination table from a bounded key-based join; the same contract is also exposed through breyta resources table materialize-join for end-to-end validation |
Predicate, Sort, And Metric Shapes
;; Predicates
[:status := "open"]
[:amount :>= 100]
[:title :contains "invoice"]
;; Agent/tool JSON may use {"field":"status","op":"=","value":"open"}.
;; Sort
[:updated-at :desc]
;; Agent/tool JSON may use {"field":"updated-at","direction":"desc"}.
;; Metrics
{:op :count :as :count}
{:op :sum :field :amount :as :total-amount}
{:op :count :where [[:status := "open"]] :as :open-count}
{:op :arg-max :field :order-id :order-field :amount :as :largest-order-id}
{:op :collect-set :field :currency :limit 5 :as :currencies}
{:op :percentile :field :amount :p 0.95 :as :p95-amount}
{:op :median :field :amount :as :median-amount}
;; Group-by bucket spec
{:field :created-at
 :bucket {:op :date-trunc :unit :month}
 :as :created-month}
{:field :amount
 :bucket {:op :numeric-bin :size 10}
 :as :amount-bin}
Supported predicate ops:
- :=
- :!=
- :>
- :>=
- :<
- :<=
- :contains
Supported aggregate metrics:
- :count
- :sum
- :avg
- :min
- :max
- :count-distinct
- :arg-max
- :arg-min
- :collect-set
- :percentile
- :median
Aggregate notes:
- :order-by can reference group keys and metric aliases
- :having can reference group keys and metric aliases
- aggregate responses include :limit and :has-more when truncation is possible
- metric-local :where enables bounded conditional metrics such as count-if and sum-if
- :arg-max / :arg-min return the metric :field value from the row with the highest/lowest :order-field, with deterministic row-id tie-breaking
- :collect-set returns bounded distinct values in deterministic order
- :percentile uses continuous interpolation over the numeric values in the metric field with :p in the range 0.0..1.0
- :median is the 0.5 percentile over the numeric metric field
- bucketed :group-by supports {:bucket {:op :date-trunc :unit :day|:week|:month}} and {:bucket {:op :numeric-bin :size <positive-number>}}
- numeric-bin group keys return the inclusive lower bound of the bucket as a number
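To make the percentile, median, and numeric-bin notes concrete, here is a minimal sketch in plain Clojure. It assumes "continuous interpolation" means linear interpolation between the two closest ranks (as in SQL PERCENTILE_CONT) and that bins are anchored at zero; neither detail is confirmed by this page. The function names are illustrative only and are not runtime functions.

```clojure
;; Illustrative model only; the runtime's exact interpolation rule and bin
;; anchoring are assumptions, not documented behavior.
(defn percentile
  "Linear interpolation between the closest ranks of the sorted values.
  p must be in 0.0..1.0."
  [values p]
  {:pre [(seq values) (<= 0.0 p 1.0)]}
  (let [sorted (vec (sort values))
        rank   (* p (dec (count sorted)))   ; fractional index into sorted
        lo     (long (Math/floor rank))
        hi     (long (Math/ceil rank))
        frac   (- rank lo)]
    (+ (* (- 1.0 frac) (nth sorted lo))
       (* frac (nth sorted hi)))))

(defn median
  "The 0.5 percentile, matching the :median note above."
  [values]
  (percentile values 0.5))

(defn numeric-bin
  "Inclusive lower bound of the size-wide bucket that contains x."
  [x size]
  {:pre [(pos? size)]}
  (* size (Math/floor (/ x (double size)))))

(median [10 20 30 40])  ; => 25.0
(numeric-bin 37 10)     ; => 30.0
```

With :size 10, amounts 30 through 39.99 share the group key 30, which is why the aggregate example orders by :amount-bin as a plain number.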
Limits And Behavior
- Table resources (families) per workspace max: 500
- Live rows per concrete table max: 50_000
- Columns per table max: 200
- Promoted/index fields per table max: 16
- Partitions per family max: 128
- Partitions touched per write max: 16
- Selected partitions per query/aggregate/export max: 12
- Selected partitions per read/schema max: 24
- Partition key bytes max: 256
- Cell max: 64 KB
- Row payload max: 256 KB
- Table max: 256 MB
- Workspace table DB max: 2 GB
- Rows per write max: 1000
- Query page size max: 1000
- Query scan window max: 10000 rows via page.offset + page.limit
- Aggregate group max: 200
- collect-set default item max per metric: 10
- collect-set absolute item max per metric: 25
- percentile requires numeric :p between 0.0 and 1.0
- Single-table only
- No arbitrary joins
- :materialize-join is the only bounded join-like exception, and it always materializes into a destination table
- No cross-workspace reads
- No arbitrary SQL
Dedicated :materialize-join limits:
- Inline :left {:rows ...} max: 1000 rows
- Table-source window max per side: 10000 rows via :limit and :offset
- Output row max: 10000
- Join key max: 4
- Projected right-field max: 64
The :table step is intentionally bounded. It is meant to feel like a table resource primitive, not a general database query engine.
Partitioned table families are first-class. Use :partitioning on :persist {:type :table ...} when data naturally splits by region, tenant, source, or date bucket and most reads/writes stay in one small partition set.
Design guidance when a dataset approaches bounded-table limits:
- keep 50_000 live rows per concrete table or partition as a real boundary
- keep the family root as the schema/metadata owner and select partition scope explicitly for query-like operations instead of expecting implicit all-partitions scans
- use separate explicit tables when the data truly represents different datasets or lifecycles, not just as a workaround for missing partition support
- if the workload mainly needs wide cross-partition scans, arbitrary joins, or general database behavior, prefer a dedicated :db step and an external database/query backend
The most common failures here are:
- write rejected because the table would exceed 50_000 rows
- write rejected because the table family would exceed 128 partitions
- write rejected because a new observed column would exceed 200 columns
- cell rejected because it exceeds 64 KB
- query rejected because limit > 1000
- query rejected because page.offset + page.limit > 10000
- query/export rejected because the selected partition subset exceeds the bounded family limit
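The two query-paging rejections in the list above can be checked before a step runs. The following is a minimal sketch in plain Clojure, assuming a :page map shaped like the ones used by :op :query. page-problems and both constants are hypothetical names; the runtime's own validation and error messages may differ.

```clojure
;; Hypothetical pre-flight check; constants mirror the limits on this page.
(def max-page-limit 1000)    ; Query page size max
(def max-scan-window 10000)  ; Query scan window max (offset + limit)

(defn page-problems
  "Returns a vector of problems for a :page map; empty when it fits.
  Missing :limit or :offset are treated as 0 for the purpose of this check."
  [{:keys [limit offset] :or {limit 0 offset 0}}]
  (cond-> []
    (> limit max-page-limit)
    (conj (str "limit > " max-page-limit))

    (> (+ offset limit) max-scan-window)
    (conj (str "page.offset + page.limit > " max-scan-window))))

(page-problems {:mode :offset :limit 25 :offset 0})      ; => []
(page-problems {:mode :offset :limit 1000 :offset 9500}) ; offset window exceeded
```

A page deep into a large table fails the second check even when :limit is small, which is a hint to narrow :where or the partition scope, or to switch to cursor paging, rather than raise the offset.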
Create-On-Write Companion Pattern
Create the table by persisting rows on first write:
(flow/step :http :fetch-orders
  {:url "https://example.com/orders"
   :persist {:type :table
             :table "orders"
             :rows-path [:body :items]
             :write-mode :upsert
             :key-fields [:order-id]
             :indexes [{:field :status}
                       {:field :customer-id}]
             :columns [{:column :customer-id
                        :semantic-type :reference
                        :reference {:table "customers"
                                    :remote-field :customer-id}}
                       {:column :customer-name
                        :semantic-type :text
                        :computed {:type :lookup
                                   :reference-column :customer-id
                                   :field :name}}]}})
That step returns a table resource ref:
{:type :resource-ref
 :uri "res://v1/ws/ws-123/result/table/tbl_..."
 :preview {:table-name "orders"
           :write-mode :upsert
           :rows-written 100}
 :write {:mode :upsert
         :rows-written 100}}
Creation notes:
- there is no separate "create table resource" step or registration call for workers
- the runtime creates or updates the table family as part of the persisted write
- downstream steps should keep passing the returned resource ref, not the original raw rows, when they mean "the persisted table"
Return A Table As Final Output
When the user should see a real table artifact in the run output, return the persisted table step result wrapped in a :breyta.viewer/kind :table viewer envelope. This is different from returning a map that happens to contain :rows or :columns.
'(let [run-id (str "run-" (flow/now-ms))
       comparison-table
       (flow/step :function :build-comparison-table
         {:input {:rows comparison-rows
                  :run-id run-id}
          :code '(fn [{:keys [rows run-id]}]
                   {:rows
                    (mapv (fn [row]
                            {:run_id run-id
                             :paragraph (:paragraph row)
                             :original (:original row)
                             :cleaned (:cleaned row)
                             :changed (:changed row)})
                          rows)})
          :persist {:type :table
                    :table (str "transcript-comparison-" run-id)
                    :rows-path [:rows]
                    :write-mode :upsert
                    :key-fields [:run_id :paragraph]
                    :columns [{:column :paragraph
                               :display-name "Paragraph"}
                              {:column :original
                               :display-name "Original"}
                              {:column :cleaned
                               :display-name "Cleaned"}
                              {:column :changed
                               :display-name "Changed"}]}})]
   {:breyta.viewer/kind :table
    :breyta.viewer/options {:title "Original vs cleaned"}
    :breyta.viewer/value comparison-table})
The table viewer value should be the resource ref returned by the persisted step:
{:type :resource-ref
 :uri "res://v1/ws/ws-123/result/table/tbl_..."
 :content-type "application/vnd.breyta.table+json"
 :preview {:rows-written 44}}
If the output page or run sidepeek says that no tables are available, verify the final output is not just inline preview data. Use breyta resources read <table-uri> to confirm that the table resource has rows.
If the table is part of a larger written report, use the Markdown output
pattern instead: return a :markdown viewer envelope and embed the persisted
table with a fenced breyta-resource block. Markdown table embeds can select
columns, filter/sort rows, render aggregate charts, and add a separate
:view :download fence for CSV source export. See
Output Artifacts.
Query it later with :table:
(flow/step :table :open-orders
  {:op :query
   :table {:ref orders-ref}
   :select [:order-id :status :amount]
   :where [[:status := "open"]]
   :sort [[:order-id :asc]]
   :page {:mode :offset
          :limit 25
          :offset 0}})
Cursor-paged forward scan:
(flow/step :table :scan-orders
  {:op :query
   :table {:ref orders-ref}
   :select [:order-id :status]
   :sort [[:order-id :asc]]
   :page {:mode :cursor
          :limit 250}})
Query rules:
- :page is required for :op :query
- :table {:ref <resource-ref>} is canonical; bare refs work for simple ops
- :page.mode must be :offset or :cursor
- :page.mode :offset accepts :offset and does not accept :cursor
- :page.mode :cursor accepts :cursor, requires explicit :sort, and does not accept :offset
- the first cursor page omits :page.cursor
- cursor-paged :page.total-count is optional
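The cursor rules above imply a simple forward-scan loop: omit :cursor on the first page, then pass each response's :next-cursor until :has-more is false. Here is a minimal sketch in plain Clojure. scan-all-rows and fetch-page are hypothetical; fetch-page stands in for whatever issues the :op :query step, and the stub below only imitates the cursor-page result shape shown under Result Shapes. Whether a flow may loop over flow/step this way is not covered here.

```clojure
;; Hypothetical driver; `fetch-page` takes a :page map and returns a
;; :query-shaped result with :rows and :page {:has-more .. :next-cursor ..}.
(defn scan-all-rows
  "Follows :next-cursor until :has-more is false and returns all rows.
  Rows accumulate in memory, so keep the scanned set bounded."
  [fetch-page page-limit]
  (loop [cursor nil
         rows   []]
    (let [page   (cond-> {:mode :cursor :limit page-limit}
                   cursor (assoc :cursor cursor)) ; first page omits :cursor
          result (fetch-page page)
          rows   (into rows (:rows result))
          {:keys [has-more next-cursor]} (:page result)]
      (if (and has-more next-cursor)
        (recur next-cursor rows)
        rows))))

;; Stubbed pages keyed by the cursor that requests them (nil = first page).
(def fake-pages
  {nil  {:rows [{:order-id "ord-1"}]
         :page {:mode :cursor :has-more true :next-cursor "c1"}}
   "c1" {:rows [{:order-id "ord-2"}]
         :page {:mode :cursor :has-more false}}})

(defn fake-fetch [page]
  (get fake-pages (:cursor page)))

(scan-all-rows fake-fetch 250)
;; => [{:order-id "ord-1"} {:order-id "ord-2"}]
```

Injecting fetch-page keeps the paging logic testable without a runtime; the real call would also carry the explicit :sort that cursor mode requires.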
Canonical Examples
Get one row:
(flow/step :table :load-order
  {:op :get-row
   :table {:ref orders-ref}
   :key {:order-id "ord-1"}})
Aggregate:
(flow/step :table :sales-by-currency
  {:op :aggregate
   :table {:ref orders-ref}
   :group-by [:currency]
   :metrics [{:op :count
              :where [[:status := "open"]]
              :as :open-count}
             {:op :sum :field :amount :as :total-amount}
             {:op :arg-max
              :field :order-id
              :order-field :amount
              :as :largest-order-id}]
   :order-by [[:total-amount :desc]]
   :limit 20})
(flow/step :table :sales-by-month
  {:op :aggregate
   :table {:ref orders-ref}
   :group-by [{:field :created-at
               :bucket {:op :date-trunc
                        :unit :month}
               :as :created-month}]
   :metrics [{:op :count :as :count}
             {:op :collect-set
              :field :currency
              :limit 5
              :as :currencies}]
   :having [[:count :>= 2]]
   :order-by [[:created-month :asc]]
   :limit 12})
(flow/step :table :amount-distribution
  {:op :aggregate
   :table {:ref orders-ref}
   :group-by [{:field :amount
               :bucket {:op :numeric-bin
                        :size 10}
               :as :amount-bin}]
   :metrics [{:op :count :as :count}
             {:op :percentile
              :field :amount
              :p 0.95
              :as :p95-amount}
             {:op :median
              :field :amount
              :as :median-amount}]
   :order-by [[:amount-bin :asc]]
   :limit 12})
Export:
(flow/step :table :export-orders
  {:op :export
   :table {:ref orders-ref}
   :format :csv
   :select [:order-id :status :amount]})
By default, in-flow :export returns the CSV text inline. Add top-level
:persist {:type :blob ...} when the export should become a downloadable
resource ref:
(flow/step :table :export-orders-csv
  {:op :export
   :table {:ref orders-ref}
   :format :csv
   :select [:order-id :status :amount]
   :persist {:type :blob
             :filename "orders.csv"
             :content-type "text/csv"}})
Materialize a joined destination table:
(flow/step :table :materialize-orders-with-customers
  {:op :materialize-join
   :left {:rows [{:order-id "ord-1" :customer-id "cust-1" :status "open"}]}
   :right {:table "customers"
           :select [:customer-id :name :domain]}
   :join-type :left
   :on [{:left-field :customer-id
         :right-field :customer-id}]
   :project {:keep-left :all
             :right-fields [{:field :name :as :customer-name}
                            {:field :domain :as :customer-domain}]}
   :into {:table "orders-enriched"
          :write-mode :upsert
          :key-fields [:order-id]
          :index-fields [:customer-name :status]}})
materialize-join is incremental materialization in v1:
- :into :write-mode is :append or :upsert
- there is no snapshot or :replace mode yet
- existing destination rows are not deleted automatically when the source set shrinks
The same applies to normal table persists with :write-mode :upsert: matching
keys are updated, new keys are inserted, and omitted rows stay in the table. If a
flow represents a current snapshot, include a run/batch key in the table keys or
partitioning and query the latest batch. Do not expect a smaller rerun to delete
rows from an earlier larger extraction.
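The upsert behavior described above can be modeled with a plain map keyed by the key fields. This is a minimal sketch in plain Clojure, not the runtime's implementation: upsert-rows, with-batch, and batch-rows are hypothetical helpers, :batch-id is an illustrative column name, and the model replaces the whole row on a key match (whether the runtime merges or replaces individual columns is not specified here).

```clojure
;; Pure model of :write-mode :upsert; `table` maps a key tuple to a row.
(defn upsert-rows
  "Inserts new keys and overwrites matching keys. Never deletes."
  [table key-fields rows]
  (reduce (fn [acc row]
            (assoc acc (mapv row key-fields) row))
          table
          rows))

(def run-1 [{:order-id "ord-1" :status "open"}
            {:order-id "ord-2" :status "open"}])
(def run-2 [{:order-id "ord-1" :status "closed"}]) ; smaller rerun

;; Keyed by :order-id only: ord-2 from the first run stays in the table.
(def after-both
  (-> {}
      (upsert-rows [:order-id] run-1)
      (upsert-rows [:order-id] run-2)))

;; Snapshot pattern: add a batch key to the key fields, then read one batch.
(defn with-batch [batch-id rows]
  (mapv #(assoc % :batch-id batch-id) rows))

(def snapshots
  (-> {}
      (upsert-rows [:batch-id :order-id] (with-batch "b1" run-1))
      (upsert-rows [:batch-id :order-id] (with-batch "b2" run-2))))

(defn batch-rows [table batch-id]
  (filterv #(= batch-id (:batch-id %)) (vals table)))

(count after-both)           ; => 2, ord-2 was not removed
(batch-rows snapshots "b2")  ; only the rows written by the latest run
```

In a real flow the batch filter corresponds to a :where predicate on the batch column, or to a partition selection when the batch key drives partitioning.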
materialize-join also uses the current materialized row state of source tables:
- computed/reference column values already materialized into the source table are joinable/projectable
- the join does not re-evaluate computed expressions dynamically
- run :recompute first if source-table derived values need to be refreshed before the join
Update one value:
(flow/step :table :close-order
  {:op :update-cell
   :table {:ref orders-ref}
   :key {:order-id "ord-1"}
   :column :status
   :value "closed"})
For partitioned table families, :update-cell cannot modify the partition-driving field. Single-cell updates stay within the explicitly selected partition; moving a row to a different partition requires a normal write/upsert into the target table.
Update one formatting override:
(flow/step :table :format-amount
  {:op :update-cell-format
   :table {:ref orders-ref}
   :key {:order-id "ord-1"}
   :column :amount
   :format {:display "currency"
            :currency "USD"}})
Display formatting is render-only:
- column :format metadata and sparse :update-cell-format overrides can render relative-time, date, timestamp/date-time, and currency
- the web table preview and Copy Markdown apply those formats to the currently visible page
- :query, :get-row, and CSV export keep canonical raw values
Resource refs are also first-class cell values:
- store canonical {:type :resource-ref :uri ...} maps in row data when a cell should point at another resource
- the web table preview renders those cells as clickable resource chips and opens the target resource in the same panel or sidepeek
- Copy Markdown uses the rendered label for the currently visible page
- :query, :get-row, and CSV export keep the canonical raw resource-ref value
Author one logical column:
(flow/step :table :define-order-summary
  {:op :set-column
   :table {:ref orders-ref}
   :column :order-summary
   :definition {:semantic-type :text
                :computed {:type :expr
                           :expr {:op :concat
                                  :args [{:field :customer-name}
                                         " / "
                                         {:field :status}]}}}})
set-column automatically recomputes existing rows for bounded tables. Use :recompute later only when you want to rerun derived/reference values after some other change.
Dynamic enum columns:
(flow/step :table :define-status-enum
  {:op :set-column
   :table {:ref orders-ref}
   :column :status
   :definition {:display-name "Status"
                :enum {:options [{:id "open"
                                  :name "Open"
                                  :aliases ["OPEN" "Open"]}
                                 {:id "in-progress"
                                  :name "In progress"
                                  :aliases ["IN_PROGRESS" "In Progress"]}]}}})
Enum behavior:
- :enum implies type-hint "enum"
- writes, :update-cell, CSV import, and :recompute normalize incoming scalar values to stable ids
- matching accepts existing ids, names, and aliases
- unknown values dynamically grow the enum definition with a normalized id and a derived display name
- stored row values, :query, :get-row, and CSV export keep the normalized ids
- the web table preview and Copy Markdown render enum names instead of raw ids
Result Shapes
Representative responses:
;; :query
{:table-name "orders"
 :rows [{:order-id "ord-1" :status "open"}]
 :count 1
 :page {:mode :offset
        :limit 25
        :offset 0
        :total-count 107
        :has-more true
        :next-offset 25
        :prev-offset nil}}
;; :query cursor page
{:table-name "orders"
 :rows [{:order-id "ord-1" :status "open"}]
 :count 1
 :page {:mode :cursor
        :limit 250
        :page-size 1
        :has-more true
        :next-cursor "..."}}
;; :aggregate
{:results [{:currency "USD" :count 2 :total-amount 150.0}]
 :count 1}
;; :schema
{:table-name "orders"
 :key-fields ["order-id"]
 :index-fields ["status" "customer-id"]
 :columns [...]}
;; :update-cell
{:row {:order-id "ord-1" :status "closed"}}
;; :update-cell-format
{:format {:display "currency" :currency "USD"}}
;; :set-column
{:column {:column-name "order-summary"
          :semantic-type "text"
          :computed {:type "expr"}}
 :rows-updated 100
 :auto-recomputed true}
;; :recompute
{:rows-scanned 100
 :rows-updated 100
 :limit 1000
 :offset 0}
;; :materialize-join
{:table-name "orders-enriched"
 :rows-written 100
 :join {:join-type :left
        :matched-rows 90
        :unmatched-left-rows 10}
 :preflight {:left-row-count 100
             :right-row-count 50
             :right-duplicate-key-count 0
             :destination-duplicate-key-count 0}}
CLI Pairing
The CLI mirrors the shipped table operations:
breyta resources read <res://table-uri> --limit 25 --offset 0 --partition-key month-2026-03
breyta resources table query <res://table-uri> --limit 25 --offset 0 --partition-keys month-2026-03,month-2026-04
breyta resources table query <res://table-uri> --page-mode cursor --limit 250 --sort-json '[["order-id","asc"]]' --partition-key month-2026-03
breyta resources table get-row <res://table-uri> --key order-id=ord-1 --partition-key month-2026-03
breyta resources table aggregate <res://table-uri> --group-by currency --metrics-json '[{"op":"count","where":[["status","=","open"]],"as":"open-count"},{"op":"sum","field":"amount","as":"total-amount"},{"op":"arg-max","field":"order-id","order-field":"amount","as":"largest-order-id"}]' --order-by-json '[["total-amount","desc"]]' --partition-key month-2026-03
breyta resources table aggregate <res://table-uri> --group-by-json '[{"field":"created-at","bucket":{"op":"date-trunc","unit":"month"},"as":"created-month"}]' --metrics-json '[{"op":"count","as":"count"},{"op":"collect-set","field":"currency","limit":5,"as":"currencies"}]' --having-json '[["count",">=",2]]' --order-by-json '[["created-month","asc"]]'
breyta resources table aggregate <res://table-uri> --group-by-json '[{"field":"amount","bucket":{"op":"numeric-bin","size":10},"as":"amount-bin"}]' --metrics-json '[{"op":"count","as":"count"},{"op":"percentile","field":"amount","p":0.95,"as":"p95-amount"},{"op":"median","field":"amount","as":"median-amount"}]' --order-by-json '[["amount-bin","asc"]]'
breyta resources table schema <res://table-uri> --partition-key month-2026-03
breyta resources table export <res://table-uri> --out orders.csv --partition-key month-2026-03
breyta resources table import <res://table-uri> --file orders.csv --write-mode append --partition-key month-2026-03
breyta resources table import orders-import --file orders.csv --write-mode upsert --key-fields order-id --index-fields status
breyta resources table update-cell <res://table-uri> --key order-id=ord-1 --column status --value closed --partition-key month-2026-03
breyta resources table update-cell-format <res://table-uri> --key order-id=ord-1 --column amount --format-json '{"display":"currency","currency":"USD"}' --partition-key month-2026-03
breyta resources table set-column <res://table-uri> --column order-summary --semantic-type text --computed-json '{"type":"expr","expr":{"op":"concat","args":[{"field":"customer-name"}," / ",{"field":"status"}]}}' --partition-keys month-2026-03,month-2026-04
breyta resources table set-column <res://table-uri> --column status --enum-json '{"options":[{"id":"open","name":"Open","aliases":["OPEN","Open"]},{"id":"in-progress","name":"In progress","aliases":["IN_PROGRESS","In Progress"]}]}' --partition-keys month-2026-03,month-2026-04
breyta resources table recompute <res://table-uri> --limit 1000 --offset 0 --partition-key month-2026-03
breyta resources table materialize-join --left-json '{"table":{"ref":"res://...orders"}}' --right-json '{"table":{"ref":"res://...customers"}}' --on-json '[{"left-field":"customer-id","right-field":"customer-id"}]' --project-json '[{"field":"name","as":"customer-name"}]' --into-json '{"table":"joined-orders","write-mode":"upsert","key-fields":["order-id"]}'