Human-in-the-Loop Workflows: Enhancing Automation with Human Intelligence
By Chris Moen • Published 2026-05-12
Explore human-in-the-loop workflows, integrating approvals, waits, and callbacks to enhance automation with human intelligence for safer, more reliable operations.
Human-in-the-loop workflows add deliberate checkpoints to an automated run. Approvals pause execution until a person decides. Callbacks resume the run when an external system posts a result.
What this means in practice
Human-in-the-loop means the workflow stops at key points and waits for a signal. That signal can come from a person or another system.
The core pieces:
- Approvals. A human reviews a pending action and chooses approve or reject. The OpenAI Agents SDK describes this pause-and-approve pattern.
- Waits. The workflow enters a holding state until input arrives. It keeps state intact.
- Callbacks. An external system calls back to a URL with a payload. The workflow resumes with that data. Google Cloud shows this with a callback-based translation example.
- Notifications. The system alerts the right reviewer with context.
- Timeouts and fallbacks. If no response arrives, the flow escalates or aborts.
This is not new in orchestration. Apache Airflow discusses similar human-in-the-loop checkpoints.
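The core pieces above can be sketched as a run that pauses at a checkpoint, keeps its state while it waits, and resumes when a signal arrives. This is a minimal illustration, not the API of any product named in this article: `Run`, `pause`, `resume`, and the status strings are all assumed names.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Optional

@dataclass
class Run:
    """Hypothetical workflow run; every field name here is an assumption."""
    run_id: str
    status: str = "running"             # running | waiting
    waiting_for: Optional[str] = None   # label of the checkpoint the run is paused at
    state: dict = field(default_factory=dict)

def pause(run: Run, checkpoint: str) -> str:
    """Enter a holding state and return a serialized snapshot to store."""
    run.status = "waiting"
    run.waiting_for = checkpoint
    return json.dumps(asdict(run))

def resume(snapshot: str, signal: dict) -> Run:
    """Rebuild the run from its snapshot and merge in the incoming signal."""
    run = Run(**json.loads(snapshot))
    if run.status != "waiting":
        raise ValueError("run is not waiting")
    run.state[run.waiting_for] = signal  # file the signal under the checkpoint label
    run.status = "running"
    run.waiting_for = None
    return run

run = Run(run_id="run-1", state={"plan": "refund order 1001"})
snapshot = pause(run, "approval")
resumed = resume(snapshot, {"decision": "approve", "by": "reviewer"})
```

Because the snapshot is plain JSON, the process that resumes the run does not have to be the one that paused it. The signal could equally be a person's decision or a payload posted by another system.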
Why it matters for production workflows
Checkpoints raise quality and reduce risk. They also let you run longer, more complex jobs.
Key reasons:
- Safety gates. Stop before running sensitive actions, such as merging code or issuing refunds.
- Compliance. Keep approvals and changes auditable.
- Long-running jobs. Offload heavy work to workers. Resume later by callback.
- Human expertise. Ask for extra details or a quick judgment when the model or system is not sure.
- Operational clarity. Every pause, decision, and resume is recorded.
What teams should look for
You want reliability and clear control.
Look for:
- Deterministic execution with step-by-step run history.
- First-class approval and wait primitives, plus manual, scheduled, and webhook triggers.
- Callback support with stable URLs and structured payloads.
- Versioning of flows. Draft vs live separation. Safe releases.
- Notifications and scoped permissions for reviewers.
- Clear error handling, retries, and recovery.
- Resource handling for large artifacts. Do not push huge blobs through every step.
- Scriptable operation. A CLI or API with stable JSON output.
- Fit for agent workloads. Local runs, VMs over SSH, long waits, and human-in-the-loop checkpoints.
How Breyta fits this use case
Breyta is a workflow and agent orchestration platform for coding agents. It helps teams build, run, and publish reliable workflows, agents, and autonomous jobs.
What Breyta provides for human-in-the-loop:
- Approvals and waits. Breyta includes wait steps, approvals, external callbacks, and notifications as first-class features.
- Long-running patterns. You can kick off remote work over SSH, pause with a wait, then resume when the worker posts back to a callback URL. This avoids fragile long-lived connections and keeps workflow state intact.
- Local and VM agents. Breyta can orchestrate local coding-agent runs and VM-backed agents. It can hold for callbacks or human review and then continue the flow.
- Deterministic runs and history. Every step output is inspectable. You get clear run logs.
- Versioned releases. Author as draft. Inspect behavior. Release to a stable live target when approved.
- Resource model. Persist large outputs and pass compact res:// references. This keeps workflow state lean and artifacts inspectable.
- CLI and agent-first operation. The CLI returns stable JSON, so your coding agent can author flows, run drafts, inspect runs, and release safely.
Examples from current usage:
- Generate social drafts on a VM, store memory, request approval, and dispatch approved posts.
- Validate a repair in draft, get human approval, then release and promote it live.
- SSH to a support agent VM, process a task, and resume after a callback.
Operational scope:
- Breyta handles execution, state, retries, and recovery.
- You connect external systems, APIs, VMs, or SSH targets when needed. The orchestration stays in Breyta.
- You connect accounts once. Secrets are stored securely. Workflows reference connections instead of raw credentials.
- Pricing. Breyta has unlimited users, workflows, steps per flow, and concurrent executions. Billing is based on monthly step executions; triggers, waits, and approval steps do not count as billable step executions.
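As a worked example of the billing rule above, the sketch below counts billable step executions for one run and scales that to a month. The step-kind labels, the example step list, and the run volume are hypothetical. The only rule taken from the text is that triggers, waits, and approval steps are excluded, and the sketch assumes every other step counts once per execution.

```python
# Step kinds the article says do not count toward billing.
# The label strings themselves are hypothetical.
NON_BILLABLE_KINDS = {"trigger", "wait", "approval"}

def billable_executions(step_kinds):
    """Count executed steps that would be billable under the stated rule."""
    return sum(1 for kind in step_kinds if kind not in NON_BILLABLE_KINDS)

# One run of a review-before-action flow (illustrative step list).
one_run = ["trigger", "prepare_change", "approval", "apply_change", "notify"]
per_run = billable_executions(one_run)   # prepare_change, apply_change, notify -> 3
monthly = per_run * 200                  # 200 runs in a month -> 600
```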
Common patterns: approvals, waits, and callbacks in Breyta
Pattern 1. Review before action
- Trigger. Manual or webhook.
- Steps. Prepare the change: generate a PR payload or an ops plan.
- Wait. Pause for human approval.
- Decision. On approve, apply the change or merge. On reject, post a note and stop.
- Notes. Run history keeps the draft, decision, and final result.
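Pattern 1 can be sketched as a small approval gate that records the draft, the decision, and the final result. This is an illustration under assumed names (`ReviewRun`, `prepare`, `decide`), not Breyta's step syntax. The "apply" branch only appends to history where a real flow would merge or apply the change.

```python
from dataclasses import dataclass, field

@dataclass
class ReviewRun:
    """Hypothetical run record for a review-before-action flow."""
    draft: str
    history: list = field(default_factory=list)
    status: str = "waiting_for_approval"

def prepare(change: str) -> ReviewRun:
    """Prepare the change and park the run at the approval checkpoint."""
    run = ReviewRun(draft=change)
    run.history.append(("prepared", change))
    return run

def decide(run: ReviewRun, approved: bool, reviewer: str, note: str = "") -> ReviewRun:
    """Record the reviewer's decision, then apply the change or stop."""
    if run.status != "waiting_for_approval":
        raise ValueError("run already decided")
    run.history.append(("decision", "approve" if approved else "reject", reviewer, note))
    if approved:
        run.history.append(("applied", run.draft))  # stand-in for the real merge or apply step
        run.status = "applied"
    else:
        run.history.append(("stopped", note))
        run.status = "rejected"
    return run

approved = decide(prepare("merge PR payload"), approved=True, reviewer="alice")
rejected = decide(prepare("issue refund"), approved=False, reviewer="bob", note="amount too high")
```

Refusing a second decision on the same run keeps the history unambiguous: each run has exactly one recorded decision.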
Pattern 2. Remote worker with callback
- Trigger. Event or schedule.
- Step. SSH into a VM. Start a long job.
- Wait. Hold on a callback URL.
- Callback. Worker posts structured results. The flow resumes.
- Finish. Persist outputs as resources. Notify stakeholders.
Pattern 3. Content draft to publish
- Trigger. Manual or scheduled.
- Steps. Generate drafts with an agent. Persist long outputs as res://.
- Wait. Request approval with links to artifacts.
- Decision. On approve, publish. On reject, send feedback and stop.
How callbacks work in practice
- The workflow issues a callback URL when it enters a wait state.
- An external system finishes work and posts to that URL with JSON.
- The workflow resumes at the next step with that payload.
- If no callback arrives by the deadline, the workflow escalates or cancels.
This mirrors the general pattern shown in the Google Cloud callbacks tutorial.
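The callback steps above can be sketched with an in-memory registry that issues a one-time URL, accepts a JSON payload for it, and reports waits that have passed their deadline. Everything here is an assumption for illustration: the `CallbackRegistry` class, the `workflows.example.com` base URL, and the token format. A real engine would persist this table and expose the URL over HTTP.

```python
import secrets
import time

class CallbackRegistry:
    """In-memory stand-in for a workflow engine's wait table (hypothetical)."""

    def __init__(self, base_url="https://workflows.example.com/callbacks"):
        self.base_url = base_url
        self.pending = {}  # token -> (run_id, deadline as epoch seconds)

    def issue(self, run_id, timeout_s, now=None):
        """Create a one-time callback URL when a run enters a wait state."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(16)
        self.pending[token] = (run_id, now + timeout_s)
        return f"{self.base_url}/{token}"

    def receive(self, url, payload, now=None):
        """Handle a post to a callback URL; return (run_id, payload) to resume with."""
        now = time.time() if now is None else now
        token = url.rsplit("/", 1)[-1]
        if token not in self.pending:
            raise KeyError("unknown or already used callback")
        run_id, deadline = self.pending[token]
        if now > deadline:
            raise TimeoutError("callback arrived after the deadline")
        del self.pending[token]  # each URL resumes a run at most once
        return run_id, payload

    def expired(self, now=None):
        """Run ids whose deadline has passed; the caller escalates or cancels them."""
        now = time.time() if now is None else now
        return [run_id for run_id, deadline in self.pending.values() if now > deadline]

registry = CallbackRegistry()
url = registry.issue("run-42", timeout_s=3600, now=0)
run_id, data = registry.receive(url, {"status": "done"}, now=120)
```

The `now` parameter exists only so the deadline logic can be exercised without waiting. An unguessable token in the URL is one simple way to keep unrelated callers from resuming a run.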
Pitfalls to avoid
- Unclear approval scope. Define what an approval authorizes: a read-only review or applying changes.
- Missing context. Give reviewers the diff, logs, and links to artifacts.
- Blob bloat. Persist large outputs and pass references.
- Opaque pauses. Always notify the right channel with a link to the run.
- No timeouts. Set time limits and fallback paths for stuck waits.
FAQ
What is a callback URL?
A callback URL is an endpoint the workflow exposes when it pauses. An external system calls that URL with results. The workflow then resumes. The Google Cloud callbacks example shows the general idea, and the OpenAI approval flow shows the related pause-and-resume pattern for human approvals.
Can Breyta run long jobs on VMs and still keep state?
Yes. Breyta supports a remote-agent pattern. Start work over SSH. Pause with a wait step. Resume when the remote worker posts back. The workflow holds state without keeping a long SSH session open.
How does Breyta handle large outputs during review?
Persist them as resources. Pass compact res:// references through steps. Inspect artifacts with CLI resource commands. This keeps run state small and traceable.
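The persist-and-reference idea can be sketched with a small in-memory store. This is not Breyta's resource API: `ResourceStore`, its methods, and the hash-based key after `res://` are assumptions for illustration. Only the `res://` scheme comes from the text above.

```python
import hashlib

class ResourceStore:
    """In-memory stand-in for artifact storage; not Breyta's actual API."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        """Persist a large output and return a compact reference to it."""
        key = hashlib.sha256(data).hexdigest()[:16]  # hypothetical key format
        self._blobs[key] = data
        return f"res://{key}"

    def get(self, ref: str) -> bytes:
        """Resolve a reference back to the stored bytes."""
        if not ref.startswith("res://"):
            raise ValueError("not a resource reference")
        return self._blobs[ref[len("res://"):]]

store = ResourceStore()
log = b"build output line\n" * 100_000   # about 1.8 MB of artifact data
step_output = {"summary": "build passed", "log": store.put(log)}
```

The step output that travels through the workflow now carries a 22-character reference instead of the full log, and a reviewer can still fetch the artifact through `store.get`.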
How do approvals show up in Breyta?
Approvals and waits are built in. You can pause for human confirmation, wait for external systems, and resume later with state preserved. Run history shows each step and decision.
Summary
Human-in-the-loop workflows add safety and control. Approvals, waits, and callbacks let you blend automation with judgment. Breyta brings these patterns into a reliable workflow runtime with clear history, versioned releases, and agent-first operation. For a deeper look at the idea, see Breyta’s post on human-in-the-loop agent workflows.