Version Control · Ship agent changes with confidence
Version Control for Every AI Agent
Every edit to your agents is saved as a version. Test new versions in staging against real data. Review the diff before you ship. Roll back to any prior version in a single click — no redeploy, no broken production.
The category
What Is Version Control for AI Agents?
It is Git-style discipline for the parts of an AI agent that actually change behavior — prompt, tools, permissions, workflow, approvals. Every change is saved as a version, every version can be tested in staging, diffed against production, promoted safely, and rolled back instantly. The full edit history is attributed, searchable, and exportable for audit.
Every edit is a version
The moment you hit save — on a prompt, a tool, a permission, a workflow step — Arahi snapshots the full agent definition. Nothing is lost, nothing is overwritten, and nothing requires a separate commit step.
Diff before you ship
Review any two versions side-by-side — added tools, tightened prompts, removed connectors, approval rules added. See exactly what changes between staging and production before you promote.
Promote or roll back
Promote staging to production with one click and a clean blue-green swap. If anything regresses, the same timeline restores the previous version in seconds — with the full audit trail of who changed what, and why.
The workflow
From edit to production, safely
The same four steps, every time — whether you're tweaking a single word in a prompt or restructuring a 12-step workflow.
Edit
Change a prompt, swap a tool, tighten a permission, or redraw a step in the workflow. The moment you save, Arahi snapshots the full agent definition as a new version — labeled with your name, a timestamp, and an optional commit message so you can find it again later.
Test in staging
Your edit lands in staging, not prod. Chat with the new version, run your saved eval scenarios, let teammates kick the tires against fixture data or read-only live connections. Production keeps serving the previous version untouched until you decide otherwise.
Review the diff
Open the side-by-side view of staging vs. production. Prompt changes show as line-level diffs, added or removed tools highlight, permission scopes flag if they widened, and workflow graph deltas render visually. Request review from a teammate right from the diff.
Promote or roll back
Promote staging to production with one click — blue-green swap, in-flight runs finish on the old version, new runs start on the new one. If anything regresses, pick any prior version from the timeline and roll back in seconds. Every promotion and rollback is logged.
What is included
Built for teams who ship agents to production
The controls you already expect from a modern codebase, wired into the agent layer where the real risk lives.
Diff view
Side-by-side comparison of any two versions — line-level prompt diffs, tool and permission deltas, and a visual diff of the workflow graph. Request review from a teammate without leaving the page.
Staging environments
Every agent ships with a staging twin. Point it at real connected data in read-only mode or a fixture dataset you define once. Spin up ephemeral preview environments per branch for bigger changes.
Instant rollback
Pick any prior version, preview the diff against live, and restore it in a single click. In-flight runs finish cleanly, queued runs pick up the restored version, and the rollback itself is logged.
Change attribution
Every version is tied to an SSO identity with timestamp, IP, and commit message. Filter the timeline by author, agent, or date range, and export the full audit log to CSV, Datadog, Splunk, or S3.
Scheduled releases
Queue a promotion for a quiet window — 2am on Tuesday, end of quarter, after the weekly sync. Pair with auto-rollback on health-check regressions so an unattended release can unwind itself.
Branch per agent
Work on a major rewrite without blocking small fixes. Branch the agent, iterate independently, merge when ready. Conflicts — two people editing the same prompt block — surface with a merge UI.
Frequently asked questions
Every save creates a new version, and Arahi retains the full history by default — not the last ten, not the last hundred. Every prompt tweak, tool change, schema edit, permission update, and connection swap is captured as an immutable snapshot, timestamped, attributed to the person who made it, and labeled with an optional commit message. On paid plans, version history is retained for the life of the workspace; on the free plan, the most recent 90 days are retained with older versions compacted into a summary record. You can pin specific versions as named releases (for example, v1.4-stable or pre-onboarding-migration) so they survive compaction regardless of age. If your compliance posture demands longer retention with tamper-evident hashing, the Enterprise plan writes every version to a customer-owned S3 bucket with object lock enabled. In practice, most teams never hit a retention wall — they reach for the diff view, roll back a regression, and move on.
Yes. Every agent in Arahi has at least two environments — staging and production — and you can spin up ephemeral preview environments on demand. Staging runs the same agent logic against either real connected data (read-only mode by default) or a sandboxed fixture dataset you define once and reuse. You can chat with the staging version, run the test suite, and invite teammates to click through scenarios, all while production continues to serve real traffic off the previous version. When you're satisfied, you promote staging to production with one click — Arahi performs a blue-green swap so in-flight conversations finish on the old version and new sessions start on the new one. Nothing in production moves until you explicitly promote, and if the promotion introduces a regression, one-click rollback restores the prior version within seconds.
Everything that defines agent behavior is versioned, not just the system prompt. That includes the prompt and instructions, the list of tools and connectors the agent can call, the permission scopes on each connector, the workflow graph and branching logic, input and output schemas, variables and secrets references (values are vaulted separately), the knowledge sources and retrieval settings, guardrails and content filters, pricing and rate-limit settings, and any human approval rules attached to sensitive steps. Each version is a complete, self-contained definition — you can roll back to last Tuesday's version and get the prompt, tools, permissions, and approvals from that moment, not a Frankenstein mix of old prompt and new tools. Secrets and API credentials are referenced by name rather than copied into each version, so rotating a credential doesn't force a re-version of every agent that uses it.
Rollback is one click from the history timeline. Pick any prior version, preview the diff against whatever is live, and confirm — Arahi makes that version the new production version within a couple of seconds. Under the hood, rollback is a forward-only operation: instead of rewriting history, Arahi creates a new version whose definition matches the target, preserving the full audit trail. That means you can roll back the rollback if you change your mind, and every promotion (forward or backward) is logged with a reason field. In-flight runs started on the broken version are allowed to finish or, for destructive-action agents, paused and surfaced for manual review so nothing half-executes. Scheduled and queued runs automatically pick up the restored version on their next tick. If you want rollback to happen automatically when error rates or eval scores regress, you can wire Arahi's health checks to trigger an auto-rollback with a Slack notification.
Yes — every version records the author, timestamp, commit message, IP address, and a git-style diff of the change. Open any version in the history timeline and you get a side-by-side view of what moved: added tools highlighted green, removed permissions in red, prompt edits shown line-by-line, workflow graph nodes diffed by shape and connection. Filter by author to see everything a specific teammate has shipped, or filter by agent to audit a single surface end-to-end. The activity log is exportable to CSV for compliance reviews and streams to SIEM targets (Datadog, Splunk, S3) on the Business plan and above. For regulated teams, change attribution is tied to SSO identity rather than a local Arahi account, so offboarding a user through your IdP immediately revokes their ability to push new versions while preserving their historical authorship for the audit trail.
Yes, and the two features are designed to reinforce each other. Human approval rules are part of the versioned definition, so you can see exactly which version required approval for which action and roll that policy back if it was loosened too aggressively. When a new version is promoted to production, in-flight approval requests started on the old version continue to resolve against the old rules — Arahi won't retroactively change the gate a reviewer agreed to. Going the other direction, you can require human approval on the promotion itself: configure any environment to block promotions until a designated reviewer signs off in Slack or email, which catches risky changes before they ever reach production. Rollbacks can be made approval-gated too, so a panicked revert at 2am still goes through the same review path as a forward deploy. See the full reviewer workflow on Human Approval.
Move fast. Without breaking production agents.
Every change snapshotted. Every release reviewable. Every rollback one click away. Ship agent changes with the same discipline you ship code.

