ADR 0007: Provider Action Execution (gated, executor-isolated)
Status
Proposed; Accepted for experimental implementation — 2026-05-30.
This ADR builds directly on ADR 0006: CloudEdge Event Federation and the Selective Address Mobility dataplane. It is experimental.
Phase 5.0 (this chunk) lands the design, the ProviderActionPolicy Kind, and
the action_executions journal only. Phase 5.0 contains no execution state
machine, no routerctl action commands, no executor invocation, and no real
provider CLI/SDK calls — a fake executor and the execution path arrive in
later chunks.
Context
- Phase 4.1 landed dry-run actionPlans. Planner plugins (capability propose.providerAction) emit display-only provider operations recorded on a DynamicConfigPart. routerd never executes an actionPlan and never invokes a provider CLI/SDK from them; pkg/plugin.ValidateActionPlan rejects mode=execute. They exist purely so EventSubscription-driven runs stay reviewable.
- The SAM dataplane is real-cloud validated. Selective Address Mobility has passed clean smokes across AWS, Azure, and OCI (3-cloud parity). The on-prem side delivers a claimed address over the overlay; the cloud side still needs the provider to actually attach/detach the secondary IP on its NIC. Today that attach/detach is a manual operator step.
- The missing piece is gated execution. We want routerd to be able to drive the approved provider mutation, but provider credentials must never enter routerd core, and execution must be off by default, explicitly approved, and fully journaled.
Decision
Two plugin roles
- Planner (Phase 4.1, capability propose.providerAction): emits dry-run actionPlans; holds no credentials.
- Executor (Phase 5, capability execute.providerAction, a new enum value on PluginSpec.Capabilities): performs the action in its own process with its own credentials, using cloud-native identity (AWS instance profile, Azure managed identity, OCI instance principal) or its own environment.
Credential model (hard invariant)
routerd core NEVER holds, reads, or passes provider credentials. routerd
passes the executor only the approved actionPlan (no secrets) plus the
Phase-4.0 allowlisted/redacted context. The executor authenticates itself to the
cloud. Credentials never traverse routerd core or the action_executions
journal.
Flow
- A planner emits an actionPlan on a DynamicConfigPart (dry-run, as today).
- The plan is imported into the action_executions journal as status=pending, keyed by idempotencyKey.
- Approval: an operator approves it, OR policy auto-approves it (only when requireApproval=false AND enabled=true AND not dryRunOnly AND the allowlists match).
- Execute: routerd invokes the matching executor plugin, handing it the approved plan (no secrets).
- The result is journaled: succeeded / failed / skipped / rolledBack.
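The lifecycle implied by the flow above can be sketched as a small transition table. This is only an illustration: the status names come from this ADR, but the transition table, the Status type, and CanTransition are assumptions, since Phase 5.0 ships no execution state machine.

```go
package main

import "fmt"

// Status is a lifecycle state of one action_executions row.
// The names come from the ADR; everything else here is hypothetical.
type Status string

const (
	Pending    Status = "pending"
	Approved   Status = "approved"
	Succeeded  Status = "succeeded"
	Failed     Status = "failed"
	Skipped    Status = "skipped"
	RolledBack Status = "rolledBack"
)

// allowed maps each state to the states it may move to.
// This table is an assumption, not the routerd implementation.
var allowed = map[Status][]Status{
	Pending:   {Approved, Skipped},
	Approved:  {Succeeded, Failed, Skipped},
	Succeeded: {RolledBack},
}

// CanTransition reports whether from -> to appears in the table above.
func CanTransition(from, to Status) bool {
	for _, s := range allowed[from] {
		if s == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(Pending, Approved))     // true
	fmt.Println(CanTransition(Pending, Succeeded))    // false: approval cannot be skipped
	fmt.Println(CanTransition(Succeeded, RolledBack)) // true
}
```

Keeping the table as data rather than scattered conditionals makes it easy to review that no path reaches succeeded without passing through approved.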
ProviderActionPolicy Kind
A new Kind (apiVersion: hybrid.routerd.net/v1alpha1) gates execution. It is
defined in the hybrid group to sit alongside RemoteAddressClaim and
CloudProviderProfile, which it governs. Its zero value is the safe locked-down
state:
- enabled (bool, default false): execution is disabled unless true.
- dryRunOnly (*bool, default true when nil): only dry-run permitted.
- requireApproval (*bool, default true when nil).
- allowedProviders / allowedProviderRefs / allowedActions: empty means none (default-deny).
- allowedCIDRs: the action target address must fall within one.
- maxActionsPerRun (int, default 0 = no actions; the operator must set a positive bound).
- allowUndo (bool, default false).
- executionWindow (string, validated leniently).
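A minimal Go sketch of how the zero value stays locked down, assuming a subset of the fields above. The field names follow this ADR; the Go layout, the helper names, the AutoApproves method, and the example provider and action strings ("aws", "attachSecondaryIP") are hypothetical and not taken from the routerd source.

```go
package main

import (
	"fmt"
	"net/netip"
)

// ProviderActionPolicySpec holds a subset of the policy fields.
// Types and layout are assumptions made for illustration.
type ProviderActionPolicySpec struct {
	Enabled          bool
	DryRunOnly       *bool // nil is treated as true
	RequireApproval  *bool // nil is treated as true
	AllowedProviders []string
	AllowedActions   []string
	AllowedCIDRs     []string
}

// boolOr returns *p, or def when p is nil.
func boolOr(p *bool, def bool) bool {
	if p == nil {
		return def
	}
	return *p
}

// contains reports whether v is in list; an empty list matches nothing.
func contains(list []string, v string) bool {
	for _, s := range list {
		if s == v {
			return true
		}
	}
	return false
}

// inAnyCIDR reports whether addr falls inside at least one prefix.
// Unparseable input counts as no match, which keeps default-deny.
func inAnyCIDR(cidrs []string, addr string) bool {
	ip, err := netip.ParseAddr(addr)
	if err != nil {
		return false
	}
	for _, c := range cidrs {
		if p, err := netip.ParsePrefix(c); err == nil && p.Contains(ip) {
			return true
		}
	}
	return false
}

// AutoApproves reports whether policy alone may approve the action.
func (s ProviderActionPolicySpec) AutoApproves(provider, action, target string) bool {
	return s.Enabled &&
		!boolOr(s.DryRunOnly, true) &&
		!boolOr(s.RequireApproval, true) &&
		contains(s.AllowedProviders, provider) &&
		contains(s.AllowedActions, action) &&
		inAnyCIDR(s.AllowedCIDRs, target)
}

func main() {
	var zero ProviderActionPolicySpec
	fmt.Println(zero.AutoApproves("aws", "attachSecondaryIP", "10.0.1.5")) // false

	f := false
	open := ProviderActionPolicySpec{
		Enabled: true, DryRunOnly: &f, RequireApproval: &f,
		AllowedProviders: []string{"aws"},
		AllowedActions:   []string{"attachSecondaryIP"},
		AllowedCIDRs:     []string{"10.0.1.0/24"},
	}
	fmt.Println(open.AutoApproves("aws", "attachSecondaryIP", "10.0.1.5")) // true
	fmt.Println(open.AutoApproves("aws", "attachSecondaryIP", "10.0.2.5")) // false
}
```

Using pointer booleans for dryRunOnly and requireApproval is what lets an unset field default to the safe value while still allowing an operator to set it explicitly to false.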
routerctl action UX surface (later chunks, documented here)
routerctl action list, show, approve, execute --dry-run|--approved,
journal, and rollback --dry-run. These are the operator surface; Phase 5.0
ships none of them.
Phasing
- Phase 5.0 (framework + data model): ProviderActionPolicy Kind, the action_executions journal, schema/validation. A fake executor (no real cloud) arrives in Phase 5.0's later chunk to exercise the path end-to-end. Phase 5.0 calls no real provider CLI/SDK.
- Live mutation smoke: gated, one provider at a time, against the SAM-validated cloud.
- Phase 5.x (hardening): windows, rate limits, richer rollback, audit.
Hard safety stops
- Execution disabled by default. ProviderActionPolicy.enabled defaults false; dryRunOnly defaults true.
- Explicit approval required. An action executes only if approved (operator approval, OR policy requireApproval=false with enabled + not dryRunOnly + allowlist match).
- mode=execute is rejected unless there is an approved action_execution that policy permits.
- idempotencyKey required; a key that already succeeded is not executed again (skipped / duplicate). Import is ON CONFLICT DO NOTHING, so a repeated key never creates a second execution row.
- All execution results are journaled: succeeded / failed / skipped / rolledBack, plus the pending / approved lifecycle states.
- Undo/rollback is best-effort: an executor may not support it; rollback is gated by allowUndo.
- Provider credentials never traverse routerd core: the executor holds and uses its own cloud-native identity.
- Phase 5.0 calls no real provider CLI/SDK: fake executor only.
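The idempotency stop can be illustrated with an in-memory stand-in for the journal. This is a sketch under stated assumptions: the Journal type, its methods, and the example key are hypothetical, and a map takes the place of the real store so that Import only mimics the ON CONFLICT DO NOTHING behaviour.

```go
package main

import "fmt"

// Execution is one row of a simplified action_executions journal.
type Execution struct {
	IdempotencyKey string
	Status         string
}

// Journal is a hypothetical in-memory stand-in for the real store.
type Journal struct {
	rows map[string]*Execution
}

func NewJournal() *Journal {
	return &Journal{rows: map[string]*Execution{}}
}

// Import inserts a pending row unless the key already exists,
// mirroring ON CONFLICT DO NOTHING. It reports whether a row was created.
func (j *Journal) Import(key string) bool {
	if _, ok := j.rows[key]; ok {
		return false
	}
	j.rows[key] = &Execution{IdempotencyKey: key, Status: "pending"}
	return true
}

// SetStatus updates an existing row; unknown keys are ignored.
func (j *Journal) SetStatus(key, status string) {
	if r, ok := j.rows[key]; ok {
		r.Status = status
	}
}

// ShouldExecute is true only for an approved row, so a key that
// has already succeeded is never run a second time.
func (j *Journal) ShouldExecute(key string) bool {
	r, ok := j.rows[key]
	return ok && r.Status == "approved"
}

func main() {
	j := NewJournal()
	fmt.Println(j.Import("claim-1/attach")) // true: first import creates a pending row
	fmt.Println(j.Import("claim-1/attach")) // false: repeated key is a no-op

	j.SetStatus("claim-1/attach", "approved")
	fmt.Println(j.ShouldExecute("claim-1/attach")) // true

	j.SetStatus("claim-1/attach", "succeeded")
	fmt.Println(j.ShouldExecute("claim-1/attach")) // false: already succeeded
}
```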
Consequences
- routerd gains a reviewable, default-off path to drive the cloud-side SAM attach/detach without ever holding cloud credentials.
- The journal is the audit trail and the idempotency guard; it is the single source of truth for what was executed.
- The asymmetry between provision and de-provision (TTL teardown with hysteresis, per ADR 0006) is honoured by keeping execution gated and journaled rather than reactive to every event.