Step Table (:table)

Quick Answer

Use flow/step :table to query, inspect, export, edit, and evolve an existing table resource without exposing raw SQL.

Table creation happens through :persist {:type :table ...} on another step. The :table step is the bounded runtime surface for working with that persisted table later.

Worker mental model:

  • a normal flow step returns row-shaped data
  • :persist {:type :table ...} tells the runtime to write those rows into a table family, creating it on first write
  • the persisted step result becomes the canonical table {:type :resource-ref :uri ...} handle
  • later :table steps consume that ref; include it in final flow output when you want the run/resource UI to expose it directly

For partitioned table families, query-like ops should select partition scope explicitly:

  • use the family root for :schema
  • use {:ref <resource-uri> :partitions {:key "..."}} or {:ref <resource-uri> :partitions {:keys ["..." "..."]}} for :query, :get-row, :aggregate, :export, :update-cell, :update-cell-format, :set-column, and :recompute
  • do not rely on implicit all-partitions scans
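A minimal sketch of building the partition-scoped target shapes listed above. `partition-target` is a hypothetical helper, not a runtime API; only the `{:ref ... :partitions ...}` shapes come from this page, and the partition key names are illustrative.

```clojure
;; Hypothetical helper (not part of the runtime): builds a partition-scoped
;; :table target in one of the two shapes listed above.
(defn partition-target
  [table-ref partition-keys]
  (let [ks (vec (distinct partition-keys))]
    (when (empty? ks)
      ;; Mirrors the guidance to never rely on implicit all-partitions scans.
      (throw (ex-info "Select partition scope explicitly"
                      {:partition-keys partition-keys})))
    (if (= 1 (count ks))
      {:ref table-ref :partitions {:key (first ks)}}
      {:ref table-ref :partitions {:keys ks}})))

;; Placeholder URI copied from this page; partition keys are made up.
(def orders-uri "res://v1/ws/ws-123/result/table/tbl_...")

(partition-target orders-uri ["month-2026-03" "month-2026-04"])
```

The result would be passed as the `:table` value of a `:query`, `:aggregate`, or similar step, assuming the flow body can build that map before the step runs.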

Canonical Shape

Common fields:

Field | Type | Required | Notes
--- | --- | --- | ---
:type | keyword | Yes | Must be :table
:op | keyword/string | Yes | One of :query, :get-row, :aggregate, :schema, :export, :update-cell, :update-cell-format, :set-column, :recompute, :materialize-join
:table | map/ref | Usually | Single-table target. Use {:ref <resource-uri>}; add :partitions when needed. Bare refs work for simple ops. :materialize-join uses :left, :right, and :into.
:expect | keyword | No | Standard step output expectation
:provider-opts | map | No | Escape hatch options for advanced runtime integration

Per-op fields:

Op | Required fields | Optional fields | Notes
--- | --- | --- | ---
:query | :table, :page | :select, :where, :sort | Paged by default; :page.mode is explicit
:get-row | :table plus :row-id or :key | none | Fetch one row by stable id or key fields
:aggregate | :table | :where, :group-by, :metrics, :having, :order-by, :limit | Single-table aggregates only
:schema | :table | none | Returns columns, key/index fields, and stats
:export | :table | :format, :select, :where, :sort | V1 export format is :csv
:update-cell | :table, :column, :value, plus :row-id or :key | none | Updates one canonical cell value
:update-cell-format | :table, :column, plus :row-id or :key | :format | Sparse formatting override; omit/clear format to remove override
:set-column | :table, :column | :definition | Create/update one logical column definition, including semantic/computed/reference/enum metadata; existing rows are backfilled automatically
:recompute | :table | :where, :limit, :offset | Recompute materialized computed/reference columns for existing rows
:materialize-join | :left, :right, :on, :into | :join-type, :project, :op-id | Build or refresh a destination table from a bounded key-based join; the same contract is also exposed through breyta resources table materialize-join for end-to-end validation

Predicate, Sort, And Metric Shapes

;; Predicates
[:status := "open"]
[:amount :>= 100]
[:title :contains "invoice"]
;; Agent/tool JSON may use {"field":"status","op":"=","value":"open"}.

;; Sort
[:updated-at :desc]
;; Agent/tool JSON may use {"field":"updated-at","direction":"desc"}.

;; Metrics
{:op :count :as :count}
{:op :sum :field :amount :as :total-amount}
{:op :count :where [[:status := "open"]] :as :open-count}
{:op :arg-max :field :order-id :order-field :amount :as :largest-order-id}
{:op :collect-set :field :currency :limit 5 :as :currencies}
{:op :percentile :field :amount :p 0.95 :as :p95-amount}
{:op :median :field :amount :as :median-amount}

;; Group-by bucket spec
{:field :created-at
 :bucket {:op :date-trunc :unit :month}
 :as :created-month}
{:field :amount
 :bucket {:op :numeric-bin :size 10}
 :as :amount-bin}

Supported predicate ops:

  • :=
  • :!=
  • :>
  • :>=
  • :<
  • :<=
  • :contains

Supported aggregate metrics:

  • :count
  • :sum
  • :avg
  • :min
  • :max
  • :count-distinct
  • :arg-max
  • :arg-min
  • :collect-set
  • :percentile
  • :median

Aggregate notes:

  • :order-by can reference group keys and metric aliases
  • :having can reference group keys and metric aliases
  • aggregate responses include :limit and :has-more when truncation is possible
  • metric-local :where enables bounded conditional metrics such as count-if and sum-if
  • :arg-max / :arg-min return the metric :field value from the row with the highest/lowest :order-field, with deterministic row-id tie-breaking
  • :collect-set returns bounded distinct values in deterministic order
  • :percentile uses continuous interpolation over the numeric values in the metric field with :p in the range 0.0..1.0
  • :median is the 0.5 percentile over the numeric metric field
  • bucketed :group-by supports {:bucket {:op :date-trunc :unit :day|:week|:month}} and {:bucket {:op :numeric-bin :size <positive-number>}}
  • numeric-bin group keys return the inclusive lower bound of the bucket as a number
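To make the bucket and percentile notes concrete, here is a small model in plain Clojure. It is a sketch, not the runtime's implementation: the page only says "continuous interpolation", so the linear interpolation between the two closest ranks used below is an assumption, and both function names are hypothetical.

```clojure
(defn numeric-bin
  "Inclusive lower bound of the bucket of width `size` that contains `x`."
  [x size]
  (* size (Math/floor (/ (double x) size))))

(defn percentile
  "Assumed method: linear interpolation between the two closest ranks
   of the sorted values, with p in the range 0.0..1.0."
  [values p]
  {:pre [(seq values) (<= 0.0 p 1.0)]}
  (let [sorted (vec (sort values))
        rank   (double (* p (dec (count sorted))))
        lo     (long (Math/floor rank))
        hi     (long (Math/ceil rank))
        frac   (- rank lo)]
    (+ (nth sorted lo)
       (* frac (- (nth sorted hi) (nth sorted lo))))))

(defn median
  "The 0.5 percentile, as described in the notes above."
  [values]
  (percentile values 0.5))

(numeric-bin 37 10)        ; => 30.0
(median [40 10 30 20])     ; => 25.0
```

With a bin size of 10, an amount of 37 lands in the bucket whose group key is 30, which matches the "inclusive lower bound" rule.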

Limits And Behavior

  • Table resources (families) per workspace max: 500
  • Live rows per concrete table max: 50_000
  • Columns per table max: 200
  • Promoted/index fields per table max: 16
  • Partitions per family max: 128
  • Partitions touched per write max: 16
  • Selected partitions per query/aggregate/export max: 12
  • Selected partitions per read/schema max: 24
  • Partition key bytes max: 256
  • Cell max: 64 KB
  • Row payload max: 256 KB
  • Table max: 256 MB
  • Workspace table DB max: 2 GB
  • Rows per write max: 1000
  • Query page size max: 1000
  • Query scan window max: 10000 rows via page.offset + page.limit
  • Aggregate group max: 200
  • collect-set default item max per metric: 10
  • collect-set absolute item max per metric: 25
  • percentile requires numeric :p between 0.0 and 1.0
  • Single-table only
  • No arbitrary joins
  • :materialize-join is the only bounded join-like exception, and it always materializes into a destination table
  • No cross-workspace reads
  • No arbitrary SQL

Dedicated :materialize-join limits:

  • Inline :left {:rows ...} max: 1000 rows
  • Table-source window max per side: 10000 rows via :limit and :offset
  • Output row max: 10000
  • Join key max: 4
  • Projected right-field max: 64

The :table step is intentionally bounded. It is meant to feel like a table resource primitive, not a general database query engine.

Partitioned table families are first-class. Use :partitioning on :persist {:type :table ...} when data naturally splits by region, tenant, source, or date bucket and most reads/writes stay in one small partition set.

Design guidance when a dataset approaches bounded-table limits:

  • treat 50_000 live rows per concrete table or partition as a hard boundary
  • keep the family root as the schema/metadata owner and select partition scope explicitly for query-like operations instead of expecting implicit all-partitions scans
  • use separate explicit tables when the data truly represents different datasets or lifecycles, not just as a workaround for missing partition support
  • if the workload mainly needs wide cross-partition scans, arbitrary joins, or general database behavior, prefer a dedicated :db step and an external database/query backend

The most common failures here are:

  • write rejected because the table would exceed 50_000 rows
  • write rejected because the table family would exceed 128 partitions
  • write rejected because a new observed column would exceed 200 columns
  • cell rejected because it exceeds 64 KB
  • query rejected because limit > 1000
  • query rejected because page.offset + page.limit > 10000
  • query/export rejected because the selected partition subset exceeds the per-request limit (12 selected partitions for :query, :aggregate, and :export)
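Several of these rejections can be anticipated before a step is submitted. The following is a minimal client-side sketch using the limits listed above; `query-preflight` and its problem keywords are hypothetical, and the runtime remains the authority on what is accepted.

```clojure
;; Values copied from the limits list on this page.
(def limits
  {:page-limit          1000
   :scan-window         10000
   :selected-partitions 12})

(defn query-preflight
  "Returns a vector of problem keywords; an empty vector means the
   request is within the documented bounds. Hypothetical helper."
  [{:keys [limit offset partition-count]
    :or   {offset 0 partition-count 1}}]
  (cond-> []
    (> limit (:page-limit limits))
    (conj :limit-exceeds-page-size)

    (> (+ offset limit) (:scan-window limits))
    (conj :scan-window-exceeded)

    (> partition-count (:selected-partitions limits))
    (conj :too-many-partitions)))

(query-preflight {:limit 1000 :offset 9500})
;; => [:scan-window-exceeded]
```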

Create-On-Write Companion Pattern

Create the table by persisting rows on first write:

(flow/step :http :fetch-orders
  {:url "https://example.com/orders"
   :persist {:type :table
             :table "orders"
             :rows-path [:body :items]
             :write-mode :upsert
             :key-fields [:order-id]
             :indexes [{:field :status}
                       {:field :customer-id}]
             :columns [{:column :customer-id
                        :semantic-type :reference
                        :reference {:table "customers"
                                    :remote-field :customer-id}}
                       {:column :customer-name
                        :semantic-type :text
                        :computed {:type :lookup
                                   :reference-column :customer-id
                                   :field :name}}]}})

That step returns a table resource ref:

{:type :resource-ref
 :uri "res://v1/ws/ws-123/result/table/tbl_..."
 :preview {:table-name "orders"
           :write-mode :upsert
           :rows-written 100}
 :write {:mode :upsert
         :rows-written 100}}

Creation notes:

  • there is no separate "create table resource" step or registration call for workers
  • the runtime creates or updates the table family as part of the persisted write
  • downstream steps should keep passing the returned resource ref, not the original raw rows, when they mean "the persisted table"

Return A Table As Final Output

When the user should see a real table artifact in the run output, return the persisted table step result inside a viewer envelope with :breyta.viewer/kind :table. This is different from returning a map that happens to contain :rows or :columns.

'(let [run-id (str "run-" (flow/now-ms))
       comparison-table
       (flow/step :function :build-comparison-table
                  {:input {:rows comparison-rows
                           :run-id run-id}
                   :code '(fn [{:keys [rows run-id]}]
                            {:rows
                             (mapv (fn [row]
                                     {:run_id run-id
                                      :paragraph (:paragraph row)
                                      :original (:original row)
                                      :cleaned (:cleaned row)
                                      :changed (:changed row)})
                                   rows)})
                   :persist {:type :table
                             :table (str "transcript-comparison-" run-id)
                             :rows-path [:rows]
                             :write-mode :upsert
                             :key-fields [:run_id :paragraph]
                             :columns [{:column :paragraph
                                        :display-name "Paragraph"}
                                       {:column :original
                                        :display-name "Original"}
                                       {:column :cleaned
                                        :display-name "Cleaned"}
                                       {:column :changed
                                        :display-name "Changed"}]}})]
   {:breyta.viewer/kind :table
    :breyta.viewer/options {:title "Original vs cleaned"}
    :breyta.viewer/value comparison-table})

The table viewer value should be the resource ref returned by the persisted step:

{:type :resource-ref
 :uri "res://v1/ws/ws-123/result/table/tbl_..."
 :content-type "application/vnd.breyta.table+json"
 :preview {:rows-written 44}}

If the output page or run sidepeek says that no tables are available, verify the final output is not just inline preview data. Use breyta resources read <table-uri> to confirm that the table resource has rows.

If the table is part of a larger written report, use the Markdown output
pattern instead: return a :markdown viewer envelope and embed the persisted
table with a fenced breyta-resource block. Markdown table embeds can select
columns, filter/sort rows, render aggregate charts, and add a separate
:view :download fence for CSV source export. See
Output Artifacts.

Query the persisted table later with a :table step:

(flow/step :table :open-orders
  {:op :query
   :table {:ref orders-ref}
   :select [:order-id :status :amount]
   :where [[:status := "open"]]
   :sort [[:order-id :asc]]
   :page {:mode :offset
          :limit 25
          :offset 0}})

Cursor-paged forward scan:

(flow/step :table :scan-orders
  {:op :query
   :table {:ref orders-ref}
   :select [:order-id :status]
   :sort [[:order-id :asc]]
   :page {:mode :cursor
          :limit 250}})

Query rules:

  • :page is required for :op :query
  • :table {:ref <resource-ref>} is canonical; bare refs work for simple ops
  • :page.mode must be :offset or :cursor
  • :page.mode :offset accepts :offset and does not accept :cursor
  • :page.mode :cursor accepts :cursor, requires explicit :sort, and does not accept :offset
  • the first cursor page omits :page.cursor
  • cursor-paged :page.total-count is optional
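The mode rules above can be expressed as a small check over a `:query` config. This is a sketch for illustration only; `page-problems` and its result keywords are hypothetical and not something the runtime exposes.

```clojure
(defn page-problems
  "Checks a :query config map against the paging rules listed above.
   Returns a vector with the first problem found, or an empty vector."
  [{:keys [page] sort-spec :sort}]
  (let [mode (:mode page)]
    (cond
      (nil? page)                          [:page-required]
      (not (#{:offset :cursor} mode))      [:mode-must-be-offset-or-cursor]
      (and (= mode :offset)
           (contains? page :cursor))       [:offset-mode-rejects-cursor]
      (and (= mode :cursor)
           (contains? page :offset))       [:cursor-mode-rejects-offset]
      (and (= mode :cursor)
           (empty? sort-spec))             [:cursor-mode-requires-sort]
      :else                                [])))

;; A first cursor page: no :cursor yet, explicit :sort present.
(page-problems {:sort [[:order-id :asc]]
                :page {:mode :cursor :limit 250}})
;; => []
```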

Canonical Examples

Get one row:

(flow/step :table :load-order
  {:op :get-row
   :table {:ref orders-ref}
   :key {:order-id "ord-1"}})

Aggregate:

(flow/step :table :sales-by-currency
  {:op :aggregate
   :table {:ref orders-ref}
   :group-by [:currency]
   :metrics [{:op :count
              :where [[:status := "open"]]
              :as :open-count}
             {:op :sum :field :amount :as :total-amount}
             {:op :arg-max
              :field :order-id
              :order-field :amount
              :as :largest-order-id}]
   :order-by [[:total-amount :desc]]
   :limit 20})

(flow/step :table :sales-by-month
  {:op :aggregate
   :table {:ref orders-ref}
   :group-by [{:field :created-at
               :bucket {:op :date-trunc
                        :unit :month}
               :as :created-month}]
   :metrics [{:op :count :as :count}
             {:op :collect-set
              :field :currency
              :limit 5
              :as :currencies}]
   :having [[:count :>= 2]]
   :order-by [[:created-month :asc]]
   :limit 12})

(flow/step :table :amount-distribution
  {:op :aggregate
   :table {:ref orders-ref}
   :group-by [{:field :amount
               :bucket {:op :numeric-bin
                        :size 10}
               :as :amount-bin}]
   :metrics [{:op :count :as :count}
             {:op :percentile
              :field :amount
              :p 0.95
              :as :p95-amount}
             {:op :median
              :field :amount
              :as :median-amount}]
   :order-by [[:amount-bin :asc]]
   :limit 12})

Export:

(flow/step :table :export-orders
  {:op :export
   :table {:ref orders-ref}
   :format :csv
   :select [:order-id :status :amount]})

By default, in-flow :export returns the CSV text inline. Add top-level
:persist {:type :blob ...} when the export should become a downloadable
resource ref:

(flow/step :table :export-orders-csv
  {:op :export
   :table {:ref orders-ref}
   :format :csv
   :select [:order-id :status :amount]
   :persist {:type :blob
             :filename "orders.csv"
             :content-type "text/csv"}})

Materialize a joined destination table:

(flow/step :table :materialize-orders-with-customers
  {:op :materialize-join
   :left {:rows [{:order-id "ord-1" :customer-id "cust-1" :status "open"}]}
   :right {:table "customers"
           :select [:customer-id :name :domain]}
   :join-type :left
   :on [{:left-field :customer-id
         :right-field :customer-id}]
   :project {:keep-left :all
             :right-fields [{:field :name :as :customer-name}
                            {:field :domain :as :customer-domain}]}
   :into {:table "orders-enriched"
          :write-mode :upsert
          :key-fields [:order-id]
          :index-fields [:customer-name :status]}})

:materialize-join materializes incrementally in v1:

  • :into :write-mode is :append or :upsert
  • there is no snapshot or :replace mode yet
  • existing destination rows are not deleted automatically when the source set shrinks

The same applies to normal table persists with :write-mode :upsert: matching
keys are updated, new keys are inserted, and omitted rows stay in the table. If a
flow represents a current snapshot, include a run/batch key in the table keys or
partitioning and query the latest batch. Do not expect a smaller rerun to delete
rows from an earlier larger extraction.
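The upsert behavior described above can be modeled with an in-memory map keyed by the key fields. This is an illustrative sketch, not the runtime's storage logic; `upsert-rows` is a hypothetical function, and it replaces the whole row on a key match because the page does not say whether unspecified columns are preserved.

```clojure
(defn upsert-rows
  "Model of :write-mode :upsert. `table` maps a key map to its row.
   Matching keys are overwritten, new keys are inserted, and rows that
   are absent from `rows` are left untouched."
  [table key-fields rows]
  (reduce (fn [acc row]
            (assoc acc (select-keys row key-fields) row))
          table
          rows))

;; A larger first extraction followed by a smaller rerun.
(def first-run
  (upsert-rows {} [:order-id]
               [{:order-id "ord-1" :status "open"}
                {:order-id "ord-2" :status "open"}
                {:order-id "ord-3" :status "open"}]))

(def second-run
  (upsert-rows first-run [:order-id]
               [{:order-id "ord-1" :status "closed"}]))

(count second-run) ; => 3, the rows omitted from the rerun are still present
```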

:materialize-join also uses the current materialized row state of source tables:

  • computed/reference column values already materialized into the source table are joinable/projectable
  • the join does not re-evaluate computed expressions dynamically
  • run :recompute first if source-table derived values need to be refreshed before the join

Update one value:

(flow/step :table :close-order
  {:op :update-cell
   :table {:ref orders-ref}
   :key {:order-id "ord-1"}
   :column :status
   :value "closed"})

For partitioned table families, :update-cell cannot modify the partition-driving field. Single-cell updates stay within the explicitly selected partition; moving a row to a different partition requires a normal write/upsert into the target table.

Update one formatting override:

(flow/step :table :format-amount
  {:op :update-cell-format
   :table {:ref orders-ref}
   :key {:order-id "ord-1"}
   :column :amount
   :format {:display "currency"
            :currency "USD"}})

Display formatting is render-only:

  • column :format metadata and sparse :update-cell-format overrides can render relative-time, date, timestamp / date-time, and currency
  • the web table preview and Copy Markdown apply those formats to the currently visible page
  • :query, :get-row, and CSV export keep canonical raw values

Resource refs are also first-class cell values:

  • store canonical {:type :resource-ref :uri ...} maps in row data when a cell should point at another resource
  • the web table preview renders those cells as clickable resource chips and opens the target resource in the same panel or sidepeek
  • Copy Markdown uses the rendered label for the currently visible page
  • :query, :get-row, and CSV export keep the canonical raw resource-ref value

Author one logical column:

(flow/step :table :define-order-summary
  {:op :set-column
   :table {:ref orders-ref}
   :column :order-summary
   :definition {:semantic-type :text
                :computed {:type :expr
                           :expr {:op :concat
                                  :args [{:field :customer-name}
                                         " / "
                                         {:field :status}]}}}})

:set-column automatically recomputes existing rows for bounded tables. Use :recompute later only when you want to rerun derived/reference values after some other change.

Dynamic enum columns:

(flow/step :table :define-status-enum
  {:op :set-column
   :table {:ref orders-ref}
   :column :status
   :definition {:display-name "Status"
                :enum {:options [{:id "open"
                                  :name "Open"
                                  :aliases ["OPEN" "Open"]}
                                 {:id "in-progress"
                                  :name "In progress"
                                  :aliases ["IN_PROGRESS" "In Progress"]}]}}})

Enum behavior:

  • :enum implies type-hint "enum"
  • writes, :update-cell, CSV import, and :recompute normalize incoming scalar values to stable ids
  • matching accepts existing ids, names, and aliases
  • unknown values dynamically grow the enum definition with a normalized id and a derived display name
  • stored row values, :query, :get-row, and CSV export keep the normalized ids
  • the web table preview and Copy Markdown render enum names instead of raw ids

Result Shapes

Representative responses:

;; :query
{:table-name "orders"
 :rows [{:order-id "ord-1" :status "open"}]
 :count 1
 :page {:mode :offset
        :limit 25
        :offset 0
        :total-count 107
        :has-more true
        :next-offset 25
        :prev-offset nil}}

;; :query cursor page
{:table-name "orders"
 :rows [{:order-id "ord-1" :status "open"}]
 :count 1
 :page {:mode :cursor
        :limit 250
        :page-size 1
        :has-more true
        :next-cursor "..."}}

;; :aggregate
{:results [{:currency "USD" :count 2 :total-amount 150.0}]
 :count 1}

;; :schema
{:table-name "orders"
 :key-fields ["order-id"]
 :index-fields ["status" "customer-id"]
 :columns [...]}

;; :update-cell
{:row {:order-id "ord-1" :status "closed"}}

;; :update-cell-format
{:format {:display "currency" :currency "USD"}}

;; :set-column
{:column {:column-name "order-summary"
          :semantic-type "text"
          :computed {:type "expr"}}
 :rows-updated 100
 :auto-recomputed true}

;; :recompute
{:rows-scanned 100
 :rows-updated 100
 :limit 1000
 :offset 0}

;; :materialize-join
{:table-name "orders-enriched"
 :rows-written 100
 :join {:join-type :left
        :matched-rows 90
        :unmatched-left-rows 10}
 :preflight {:left-row-count 100
             :right-row-count 50
             :right-duplicate-key-count 0
             :destination-duplicate-key-count 0}}
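The cursor-page response above implies a simple consumption loop: start without a cursor, then follow :next-cursor while :has-more is true. The sketch below models that contract against an in-memory stand-in. `scan-all`, `fetch-page`, and `fake-pages` are hypothetical; whether a flow body can loop like this is not covered on this page.

```clojure
(defn scan-all
  "Collects rows across cursor pages. `fetch-page` is a hypothetical
   function that takes a cursor (nil for the first page) and returns a
   map shaped like the cursor-page response above."
  [fetch-page]
  (loop [cursor nil
         rows   []]
    (let [{page-rows :rows
           {:keys [has-more next-cursor]} :page} (fetch-page cursor)
          rows (into rows page-rows)]
      (if (and has-more next-cursor)
        (recur next-cursor rows)
        rows))))

;; In-memory stand-in for two pages of results, keyed by cursor.
(def fake-pages
  {nil  {:rows [{:order-id "ord-1"}]
         :page {:mode :cursor :has-more true :next-cursor "c1"}}
   "c1" {:rows [{:order-id "ord-2"}]
         :page {:mode :cursor :has-more false}}})

(scan-all (fn [cursor] (get fake-pages cursor)))
;; => [{:order-id "ord-1"} {:order-id "ord-2"}]
```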

CLI Pairing

The CLI mirrors the shipped table operations:

breyta resources read <res://table-uri> --limit 25 --offset 0 --partition-key month-2026-03
breyta resources table query <res://table-uri> --limit 25 --offset 0 --partition-keys month-2026-03,month-2026-04
breyta resources table query <res://table-uri> --page-mode cursor --limit 250 --sort-json '[["order-id","asc"]]' --partition-key month-2026-03
breyta resources table get-row <res://table-uri> --key order-id=ord-1 --partition-key month-2026-03
breyta resources table aggregate <res://table-uri> --group-by currency --metrics-json '[{"op":"count","where":[["status","=","open"]],"as":"open-count"},{"op":"sum","field":"amount","as":"total-amount"},{"op":"arg-max","field":"order-id","order-field":"amount","as":"largest-order-id"}]' --order-by-json '[["total-amount","desc"]]' --partition-key month-2026-03
breyta resources table aggregate <res://table-uri> --group-by-json '[{"field":"created-at","bucket":{"op":"date-trunc","unit":"month"},"as":"created-month"}]' --metrics-json '[{"op":"count","as":"count"},{"op":"collect-set","field":"currency","limit":5,"as":"currencies"}]' --having-json '[["count",">=",2]]' --order-by-json '[["created-month","asc"]]'
breyta resources table aggregate <res://table-uri> --group-by-json '[{"field":"amount","bucket":{"op":"numeric-bin","size":10},"as":"amount-bin"}]' --metrics-json '[{"op":"count","as":"count"},{"op":"percentile","field":"amount","p":0.95,"as":"p95-amount"},{"op":"median","field":"amount","as":"median-amount"}]' --order-by-json '[["amount-bin","asc"]]'
breyta resources table schema <res://table-uri> --partition-key month-2026-03
breyta resources table export <res://table-uri> --out orders.csv --partition-key month-2026-03
breyta resources table import <res://table-uri> --file orders.csv --write-mode append --partition-key month-2026-03
breyta resources table import orders-import --file orders.csv --write-mode upsert --key-fields order-id --index-fields status
breyta resources table update-cell <res://table-uri> --key order-id=ord-1 --column status --value closed --partition-key month-2026-03
breyta resources table update-cell-format <res://table-uri> --key order-id=ord-1 --column amount --format-json '{"display":"currency","currency":"USD"}' --partition-key month-2026-03
breyta resources table set-column <res://table-uri> --column order-summary --semantic-type text --computed-json '{"type":"expr","expr":{"op":"concat","args":[{"field":"customer-name"}," / ",{"field":"status"}]}}' --partition-keys month-2026-03,month-2026-04
breyta resources table set-column <res://table-uri> --column status --enum-json '{"options":[{"id":"open","name":"Open","aliases":["OPEN","Open"]},{"id":"in-progress","name":"In progress","aliases":["IN_PROGRESS","In Progress"]}]}' --partition-keys month-2026-03,month-2026-04
breyta resources table recompute <res://table-uri> --limit 1000 --offset 0 --partition-key month-2026-03
breyta resources table materialize-join --left-json '{"table":{"ref":"res://...orders"}}' --right-json '{"table":{"ref":"res://...customers"}}' --on-json '[{"left-field":"customer-id","right-field":"customer-id"}]' --project-json '[{"field":"name","as":"customer-name"}]' --into-json '{"table":"joined-orders","write-mode":"upsert","key-fields":["order-id"]}'

As of May 15, 2026