Step DB BigQuery
Quick Answer
Use :db with :database :bigquery, a bound :connection, and either :sql or :template.
Use named parameters in :params (map) and @name placeholders in SQL.
Canonical Shape
| Field | Type | Required | Notes |
|---|---|---|---|
:database | keyword/string | Yes | Must be :bigquery |
:connection | keyword/string | Yes | Bound BigQuery connection |
:sql | string | Yes* | Inline SQL |
:template | keyword | Yes* | :db-query template id |
:params | map | No | Named query params ({:date "2026-02-01"}) |
:dry-run | boolean | No | Validate and estimate without executing |
:timeout | int | No | Query timeout in seconds |
:max-results | int | No | Row cap (up to platform max) |
* Provide exactly one of :sql or :template.
Canonical Example (Template + Named Params)
{:templates [{:id :daily-usage
:type :db-query
:sql "select workspace_id, sum(tokens) as total_tokens from `billing.usage` where usage_date = @usage_date group by workspace_id"}]
:functions [{:id :to-bigquery-params
:language :clojure
:code "(fn [input] {:usage_date (:usage-date input)})"}]
:flow
'(let [params (flow/step :function :build-params
{:ref :to-bigquery-params
:input (flow/input)})
rows (flow/step :db :query-usage
{:database :bigquery
:connection :billing-bq
:template :daily-usage
:params params
:max-results 2000})]
{:rows rows
:row-count (count rows)})}
Limits And Behavior
- Prefer templates for longer SQL and easier review
- Use named params instead of inline literals
:dry-run trueis useful during authoring to validate shape and cost before full execution
Connection And Binding Recipe
{:requires [{:slot :billing-bq
:type :database
:backends #{:bigquery}
:label "Billing BigQuery"}]}
breyta connections list --type database
breyta flows configure <slug> --set billing-bq.conn=conn-...
Parameter Pitfalls
- use
:paramsas a map (named params), not a positional vector - query placeholder names must match map keys (
@usage_date<->{:usage_date ...}) - avoid inline literals for dynamic values; keep values in
:params - prepare params via a function step so your SQL templates stay static and reusable
Query And Cost Design Checklist
- filter by partition/date early to reduce bytes scanned
- aggregate in BigQuery before returning to flows
- use
:dry-run trueduring authoring to validate and estimate cost - cap result volume with
:max-results, then persist downstream artifacts instead of returning large row sets
Failure Triage
- permission/auth failures: verify service account roles and project access
- project/config failures: ensure bound connection has the target GCP project
- syntax/params failures: verify
@paramusage and map keys - timeouts/slow queries: reduce scan size first, then tune timeout
Output Shape
BigQuery DB steps return rows as vectors in flow output.