Reliability And Concurrency Reference
Load this reference before creating or editing fanout, concurrency, waits,
retries, idempotency, large artifacts, checkpoints, or failure behavior.
Reliability Planning
Before push:
- define every side effect and what must happen exactly once
- name the idempotency or duplicate-protection key for each side-effectful path
- classify every external call as retryable or fail-fast
- set timeout expectations for each external boundary
- choose concurrency intentionally and state why
- define payload strategy: inline small data, persist large artifacts, pass refs for large blobs
- define cursor/checkpoint behavior for partial failure and replay
- define exact runtime proof: result fields, counters, child runs, or resources that prove success
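The idempotency and duplicate-protection items above can be sketched generically. This is a minimal Python illustration, not Breyta API: `idempotency_key`, `SideEffectLedger`, and the in-memory store are hypothetical names, and a real flow would need a durable store in place of the dict.

```python
import hashlib
import json


def idempotency_key(operation: str, payload: dict) -> str:
    """Derive a stable key from the operation name and the identifying fields."""
    # Sorting keys makes the key independent of field order in the payload.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{operation}:{canonical}".encode("utf-8")).hexdigest()


class SideEffectLedger:
    """In-memory stand-in for a durable record of completed side effects (hypothetical)."""

    def __init__(self):
        self._done = {}

    def run_once(self, key, effect):
        # Return the recorded result instead of repeating the side effect.
        if key in self._done:
            return {"status": "duplicate", "result": self._done[key]}
        result = effect()
        self._done[key] = result
        return {"status": "executed", "result": result}
```

Running the same keyed effect twice should report "executed" once and "duplicate" once, which is the kind of runtime proof the last item asks for.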
Concurrency Defaults
- Use sequential behavior when order matters, when mutating shared state, when moving large artifacts, or when external systems are slow or fragile.
- Use fanout only for independent, bounded, side-effect-safe items with explicit timeout awareness.
- Use keyed concurrency when work must serialize per entity while allowing cross-entity parallelism.
- If concurrency is not clearly beneficial, default to sequential.
flows run returns concurrencyDecision and concurrencyKey; use those
fields plus run timing to prove whether a run started, coexisted, queued,
blocked, or was rejected.
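To make the keyed-concurrency default concrete, here is a conceptual Python sketch of "serialize per entity, parallelize across entities". It is an analogy only, assuming in-process threads; `KeyedSerializer` is a hypothetical name and says nothing about how the platform implements keyed concurrency.

```python
import threading
from collections import defaultdict


class KeyedSerializer:
    """Serialize work per key while letting different keys run in parallel."""

    def __init__(self):
        self._guard = threading.Lock()            # protects the lock table itself
        self._locks = defaultdict(threading.Lock)

    def run(self, key, fn):
        with self._guard:
            lock = self._locks[key]               # one lock per entity key
        with lock:                                # at most one run per key at a time
            return fn()
```

Two calls with the same key never overlap, while calls with different keys can proceed at the same time.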
Answer these before build:
- What can run in parallel safely?
- What must never overlap?
- What shared state or side effect could be corrupted?
- What is the per-item timeout?
- What happens when one item stalls?
- How are partial successes summarized without losing failed items?
Keyed Concurrency And Raw Entrypoints
For webhook, event, and interface-started flows, :concurrency {:type :keyed ...} is
evaluated before the flow body can normalize input. The :key-field must exist
in the raw entrypoint payload that starts the run, not only in a derived
request_key or normalized function output.
Use a root field when the provider gives one. For a GitHub pull request webhook,
:number is a safer key than a later normalized :request-key. If the key is
nested, use the raw nested path, such as [:pull_request :id], and prove it
with a real webhook-shaped payload before release.
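One way to prove a key path before release is to check it against a raw, webhook-shaped payload. The Python sketch below is illustrative only: `get_in` and `require_raw_key` are hypothetical helpers, and the payload is a trimmed stand-in with made-up values.

```python
def get_in(payload, path):
    """Walk a nested dict along path; return None if any segment is missing."""
    current = payload
    for segment in path:
        if not isinstance(current, dict) or segment not in current:
            return None
        current = current[segment]
    return current


def require_raw_key(raw_payload, path):
    """Fail loudly when the concurrency key is absent from the raw payload."""
    value = get_in(raw_payload, path)
    if value is None:
        raise KeyError(f"concurrency key path {path!r} missing from raw payload")
    return value


# Trimmed, webhook-shaped payload (illustrative values only).
raw = {"action": "opened", "number": 17, "pull_request": {"id": 9001}}
```

A path that exists only after normalization, such as a derived request key, fails this check against the raw payload, which is the failure to catch before release.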
Fanout
Use fanout for bounded independent work. Avoid fanout for large file transfer,
cursor advancement, or shared-state mutation unless docs and tests show it is
safe.
For fanout flows:
- keep item payloads small
- persist large artifacts and pass refs
- include a per-item result shape with success/failure fields
- summarize counts and failures in final output
- prove the chosen mode with evidence: counts, failures, child runs, and no skipped/reprocessed items
- when fanout starts child workflows, group the parent and child flows with shared flow metadata so users can see the relationship; grouping is only display metadata, not runtime orchestration
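The per-item result shape and final summary can be sketched as follows. This is a generic Python illustration using a thread pool, not the platform's fanout; `run_item`, `fanout`, and the field names are assumptions, and per-item timeouts are left to the handler.

```python
from concurrent.futures import ThreadPoolExecutor


def run_item(item, handler):
    """Run one item and always return the same result shape."""
    try:
        return {"item": item, "ok": True, "value": handler(item), "error": None}
    except Exception as exc:  # record the failure instead of losing the item
        return {"item": item, "ok": False, "value": None, "error": str(exc)}


def fanout(items, handler, max_workers=4):
    """Process a bounded list in parallel and summarize counts and failures."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with items.
        results = list(pool.map(lambda it: run_item(it, handler), items))
    failures = [r for r in results if not r["ok"]]
    return {
        "total": len(results),
        "succeeded": len(results) - len(failures),
        "failed": len(failures),
        "failures": failures,
        "results": results,
    }
```

Because every item yields a result record, the totals can be checked against the input count to show that nothing was skipped or reprocessed.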
Paging And Loops
Breyta can handle paging and looping, but loops must be explicit, bounded, and
checkpoint-aware. Do not assume a single provider call is enough for large
datasets, and do not build unbounded "keep fetching until done" flows.
Preferred paging shapes:
- one packaged :steps wrapper or :function-backed step that pages a provider
API up to max-pages/max-items, persists each page or final rows, and
returns counts plus resource refs
- table resource paging with :table {:op :query ... :page {:mode :cursor ...}}
and explicit :sort
- multiple small runs or a child flow when each page is heavy or side-effectful
- cursor/checkpoint state that advances only after the page's durable writes or
side effects have succeeded
Use flow/poll for external async job completion, not generic data paging.
If a loop includes repeated flow/step calls, keep the bound small and consider
rate-limit pauses with :sleep. For unknown/unbounded data, persist page
results and pass resource refs instead of carrying accumulated bodies inline.
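A bounded, checkpoint-aware paging loop might look like the sketch below. It is plain Python under stated assumptions: `fetch_page`, `persist_page`, the page shape, and the checkpoint dict are hypothetical stand-ins for a provider call, a durable write, and stored cursor state.

```python
def page_through(fetch_page, persist_page, checkpoint, max_pages=10):
    """Fetch at most max_pages pages, advancing the cursor only after a durable write."""
    pages = 0
    rows = 0
    refs = []
    cursor = checkpoint.get("cursor")
    while pages < max_pages:
        page = fetch_page(cursor)         # assumed shape: {"rows": [...], "next": cursor or None}
        ref = persist_page(page["rows"])  # durable write; raises on failure
        refs.append(ref)
        rows += len(page["rows"])
        pages += 1
        cursor = page["next"]
        checkpoint["cursor"] = cursor     # reached only if the write succeeded
        if cursor is None:
            break
    # Return counts and refs rather than the accumulated row bodies.
    return {"pages": pages, "rows": rows, "refs": refs, "done": cursor is None}
```

If the write for a page fails, the exception propagates before the checkpoint moves, so a later run resumes from the same cursor.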
Fixed Delays
Use :sleep for timer delays, deployment-lag buffers, polling gaps, or
rate-limit spacing. Use :wait only when the run should pause for an external
signal, webhook, CLI action, or human action.
Example:
(flow/step :sleep :deployment-lag {:duration "10m"})
Retries And Checkpoints
- Retries should be bounded and reserved for transient failures.
- Never advance cursors/checkpoints past failed work.
- Resume/replay behavior must be explicit for partial success paths.
- Rerun once when feasible to verify idempotency and duplicate protection.
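Bounded retries reserved for transient failures can be sketched like this. It is a generic Python illustration, not a platform retry setting; `TransientError` and `call_with_retry` are hypothetical names, and which errors count as transient is an assumption to decide per external call.

```python
import time


class TransientError(Exception):
    """Marks failures worth retrying, such as timeouts (hypothetical classification)."""


def call_with_retry(fn, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Retry fn on TransientError up to max_attempts; fail fast on anything else."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts:
                raise  # bound reached: surface the failure
            sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
```

Any exception other than `TransientError` propagates on the first attempt, which matches the retryable versus fail-fast split.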
Large Artifacts
- Persist large artifacts and pass resource refs.
- Prefer child flows to isolate heavyweight artifact creation and handoff.
- Avoid moving large bodies through many steps.
- Use breyta resources get, breyta resources read, and breyta resources url to prove artifact availability.
Verification
Before release, verify:
- happy path
- no-item or no-op path when applicable
- partial failure or retry path when feasible
- replay/rerun behavior for duplicate protection
- output/resource evidence for the user-facing result
After release, capture live smoke proof when side effects are safe.