How We Use Breyta to Run a Daily AI Coding Agent on Our Codebase

By Andreas Flakstad • Published 2026-03-16

How we use Breyta to run an AI coding agent overnight, so each morning brings one small, reviewable pull request that makes our codebase a little better.

[Illustration: Breyta workflow automation]

Every morning we wake up to a small pull request that improves our codebase.

That PR was created overnight by an AI coding agent running against our repository. We use Breyta to run that agent every night, and whenever we want to trigger it manually. Breyta connects to a VM we control over SSH, starts a Codex agent against the repo, and asks it to find one small worthwhile improvement. The agent only opens a pull request for a change that stays small, preserves behavior, and passes the relevant repo checks.

The pattern is simple, but useful. A lot of code quality work is important and easy to postpone. There is always something more urgent than cleanup, simplification, or making a piece of code a little more idiomatic. This flow gives that work a place to happen, so we get a small, reviewable PR in the morning instead of hoping someone remembers to circle back later.

Small and steady beats big and vague

The flow picks a small random set of seed files as entry points, and the agent starts from there. Those files are hints, not mandatory edit targets. The agent may decide there is nothing worth changing in them, or follow the trail into nearby code and tests instead. The goal is not to send an AI off to redesign the system. The goal is to make one solid improvement at a time.
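The seed selection itself can be as simple as sampling from the file list. A minimal sketch, assuming a `MAX_FILES` budget and plain `find` in place of whatever filter the real script uses (the demo files are illustrative):

```shell
# Hypothetical seed selection; MAX_FILES and the demo files are illustrative.
MAX_FILES=3
DEMO=$(mktemp -d)
touch "$DEMO/a.clj" "$DEMO/b.clj" "$DEMO/c.clj" "$DEMO/d.clj" "$DEMO/e.clj"

# Sample a few random entry points from the candidate file list.
SEED_FILES=$(find "$DEMO" -name '*.clj' | shuf -n "$MAX_FILES")
echo "$SEED_FILES" | wc -l   # prints 3
```

The point of the random sample is to vary where the agent starts each night without ever telling it that those files must change.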

Over time, that adds up. Most healthy codebases improve through many small good decisions, not one dramatic cleanup effort.

One recent PR is a good example of the kind of change this produces. It refactored a Firestore subscription cleanup path so listener registrations were properly tracked and removed through a shared cleanup function. That is not the sort of change anyone would put on a roadmap, but it is exactly the sort of thing you are happy to review when it arrives as a small, well-explained PR.

Breyta is the workflow system around the AI

Breyta is not the model here. The AI part comes from the agent we run inside the flow. Breyta is the workflow system around it. It handles when the job runs, where it runs, how it starts, how it waits for completion, and how the result gets back to us.

That distinction matters. Without that workflow layer, this kind of setup tends to live as a pile of scripts, prompts, cron jobs, and half-remembered manual steps. With Breyta, it becomes something explicit and repeatable.

Setting it up was straightforward. Once the repository access and VM connection were in place, the flow itself became the durable part: trigger the run, connect over SSH, launch the job, wait for the callback, and return the result.

The flow runs on a schedule and can also be started manually. It connects to a VM over SSH, pulls the latest repository state, creates an isolated worktree, starts the agent, waits for completion, and reports the result back into Breyta. If the agent finds a small, behavior-preserving improvement and the checks pass, a pull request is opened. No change is also an acceptable outcome.

The flow definition itself stays small: two triggers, a few helper functions to normalize inputs and build the SSH command, one SSH step that starts the remote job, one wait step that pauses until the callback arrives, and a final step that returns the result. The flow is mostly orchestration.

The repository-side assets are where the behavior lives. There is a worker script that does the practical work: get the latest repository state, create an isolated git worktree, pick a bounded starting point, run the agent, run the repository checks, push a branch, open a pull request, and report the result back. There are also prompt templates. One prompt tells the agent what kind of code improvement to look for and what constraints to respect. Another prompt generates the pull request title and explanation directly from the staged changes, so the PR describes what really happened instead of sounding generic.

The actual shape

The VM side is intentionally simple. We already had a machine we control, with repository access and the basic tools needed to run the job. Breyta just needed an SSH connection, a user on that machine, and a trusted host entry. We chose that route on purpose because it keeps the execution environment predictable and avoids introducing a separate hosted runtime just for the agent.
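None of that requires anything exotic on the VM. A hypothetical sketch of the pieces involved (a user, an authorized key, a trusted host entry); paths and names are illustrative, and a real setup would use a proper user account rather than a temp directory:

```shell
# Illustrative only: a temp dir stands in for the agent user's home.
AGENT_HOME=$(mktemp -d)
mkdir -p "$AGENT_HOME/.ssh"

# Key pair for Breyta's SSH connection; the private half goes into Breyta's
# connection config, the public half onto the VM.
ssh-keygen -t ed25519 -N '' -f "$AGENT_HOME/breyta_key" -q
cat "$AGENT_HOME/breyta_key.pub" >> "$AGENT_HOME/.ssh/authorized_keys"
chmod 700 "$AGENT_HOME/.ssh"; chmod 600 "$AGENT_HOME/.ssh/authorized_keys"

# ssh-keyscan your-vm-host >> ~/.ssh/known_hosts  # trusted host entry (network call, shown commented)
wc -l < "$AGENT_HOME/.ssh/authorized_keys"   # prints 1
```

Keeping the execution environment to one machine you already administer means the agent inherits whatever repository access and tooling that machine has, with nothing new to provision.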

The flow definition

The Clojure flow definition is mostly an orchestration wrapper around that SSH job. In simplified form:

{:triggers [{:type :manual}      ; allow an on-demand run
            {:type :schedule      ; run automatically every night
             :config {:cron "30 1 * * *"
                      :timezone "UTC"
                      :input {...}}}]
 :flow
 '(let [input (flow/input) ; read activation input for this run
        job (flow/step :function :normalize-input ...) ; resolve paths, timeouts, branches, and callback info
        command (flow/step :function :build-ssh-command ...) ; build the remote command we want to execute over SSH
        kickoff-result (flow/step :ssh :kickoff-codex-analysis
                          {:type :ssh ; run on the remote VM
                           :connection "..." ; SSH connection configured in Breyta
                           :command command}) ; start the detached worker remotely
        callback-result (flow/step :wait :await-codex-callback ...) ; wait until the worker reports back
        output (flow/step :function :finalize-result ...)] ; turn worker output into the final flow result
    output)}

The flow itself stays readable. You can see the whole shape at a glance: prepare input, start remote work, wait, collect the result.

The worker script

The worker script in the repository does the heavier lifting. It also picks the random seed files that the prompt passes to the agent as starting points. In simplified form:

git fetch origin "$BASE_BRANCH" # get the latest repository state
git worktree add -B "$TARGET_BRANCH" "$WORKTREE_DIR" "origin/$BASE_BRANCH" # create an isolated branch + worktree
cd "$WORKTREE_DIR" # do all work in the isolated checkout

SEED_FILES=$(rg --files ... | shuf -n "$MAX_FILES") # pick a few random entry points
codex exec --model "$CODEX_MODEL" "$(cat "$PROMPT_FILE")" # ask the agent to make one bounded improvement

# run repo checks here # make sure the result is worth opening as a PR

git add -A # stage whatever the agent edited
if git diff --cached --quiet; then # no change is also an acceptable outcome
  curl -X POST "$CALLBACK_URL" ... # still report back so the flow's wait step resolves
  exit 0
fi
git commit -m "$PR_TITLE" # save the result as a normal git commit
git push -u origin "$TARGET_BRANCH" # publish the branch
gh pr create --base "$BASE_BRANCH" --head "$TARGET_BRANCH" ... # open the pull request
curl -X POST "$CALLBACK_URL" ... # tell Breyta that the job finished

That is where the boundary of the job becomes real. The script creates an isolated worktree, keeps the scope small, runs the checks, and only opens a pull request for a change that is worth reviewing.
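The callback at the end is just an HTTP POST. The exact payload schema is Breyta's and not shown here; a hypothetical sketch of building one, with field names we made up for illustration:

```shell
# Hypothetical payload; the field names are assumptions, not Breyta's actual schema.
STATUS="success"
PR_URL="https://github.com/example/repo/pull/123"   # illustrative PR URL

# Assemble a small JSON body for the callback request.
PAYLOAD=$(printf '{"status":"%s","pr_url":"%s"}' "$STATUS" "$PR_URL")
echo "$PAYLOAD"

# curl -X POST "$CALLBACK_URL" -H 'Content-Type: application/json' -d "$PAYLOAD"
```

Whatever the schema, the important property is that the worker always reports back, success or not, so the flow's wait step never hangs on a silent failure.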

The prompts

We split the prompts too. One tells the agent how to behave as a code reviewer and refactoring partner: keep the change small, preserve behavior, stay close to the existing style, and make sure the result is worth reviewing. The other is only about the pull request itself. It turns the staged change into a real title and explanation based on what was actually edited and validated.

A simplified version of the main prompt looks something like this:

You are an expert engineer reviewing this codebase.
Use these seed files only as starting points.
Find one small, behavior-preserving improvement.
Prefer simpler, more idiomatic code.
Stay close to the existing style and patterns.
Run the relevant checks before finishing.
If there is no meaningful improvement, make no edit.

That prompt boundary matters more than people expect. It is where you decide whether the agent produces reviewable improvements or low-value churn.
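The PR-description prompt can be sketched in the same spirit. This is a hypothetical version, not the real template from the repo:

```
You are writing a pull request description.
Look only at the staged diff.
Write a short, specific title that names what changed.
Explain what was changed, why it is an improvement,
and which checks were run to validate it.
Do not speculate beyond the diff.
```

Generating the description from the staged changes, rather than from the original instructions, is what keeps the PR text honest about what actually happened.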

Reuse is one of the best parts

A flow like this does not have to stay a one-off internal setup. In Breyta, flows can be marked as approved for reuse, which means other users can read them, search for them, and adapt them to their own repositories. Once a pattern is useful, it becomes something other people can build on instead of something trapped inside one workspace.

For us, the result is simple: we wake up to a small, thoughtful PR that makes the code a little better than it was the day before.

If you want the same pattern for your own codebase, start from the reusable template for this flow. If you are already using the CLI, you can also find it directly with:

breyta flows search "autonomous-code-improvement-agent-codex-cli-vm"