
Federation Release Readiness

Entry point for the CloudEdge Event Federation release status.

Phase completion

| Phase | Scope | Status | Evidence |
|---|---|---|---|
| Phase 1 | Event envelope, EventGroup, SQLite store, CLI | done | checkpoint |
| Phase 1.5 | EventPeer, EventSubscription Kinds + validation | done | checkpoint |
| Phase 2 | Peer delivery, HMAC, retry, prune | done | transport evidence |
| Phase 3 | Subscription → plugin → RemoteAddressClaim | done | subscription evidence |
| Phase 4 | Provider actionPlan plugins, dry-run | done | ADR 0007 |
| Phase 5 | Provider action execution (gated) | done | AWS, Azure, OCI |
| P1 | Federation pipeline observability (14 OTel metrics) | done | observability how-to |
| P2 | Doctor federation checks, delivery summary | done | changelog |
| P3 | FederationSLO Kind, SLO JSON, remediation plan | done | PR #541 |
| P4 | Operational qualification & release candidate | in progress | this document |

Architecture references

Qualification harness

The reusable qualification harness is at scripts/cloudedge-federation-qualification.sh.

    scripts/cloudedge-federation-qualification.sh \
      --evidence-dir /tmp/fed-qual \
      --cycles 2 \
      --duration 300 \
      --scenarios healthy,partition,ttl-refresh,restart,subscription,config-fault,security,multi-group

The harness defines eight scenarios:

  1. healthy — baseline delivery + doctor PASS
  2. partition — peer network partition → SLO violation → recovery
  3. ttl-refresh — TTL refresh re-push across partition boundary
  4. restart — eventd restart recovery (sender + receiver)
  5. subscription — subscription plugin failure + recovery
  6. config-fault — expected-peer / config fault detection via doctor
  7. security — HMAC / timestamp / malformed event rejection
  8. multi-group — per-group SLO isolation
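
A typo in the --scenarios value is easy to make, so a pre-flight check can be useful before a long qualification run. The sketch below validates a comma-separated value against the eight names listed above. validate_scenarios is a hypothetical helper, not part of the harness, and the harness may well perform its own validation.

```shell
# Sketch only: validate_scenarios is a hypothetical helper, not part of the
# qualification harness. It checks a comma-separated list against the eight
# scenario names documented above.
validate_scenarios() {
  local rest="$1" name
  if [ -z "$rest" ]; then
    echo "no scenarios given" >&2
    return 1
  fi
  while [ -n "$rest" ]; do
    name=${rest%%,*}            # text before the first comma
    case "$rest" in
      *,*) rest=${rest#*,} ;;   # drop the first element and its comma
      *)   rest="" ;;           # last element consumed
    esac
    case "$name" in
      healthy|partition|ttl-refresh|restart|subscription|config-fault|security|multi-group)
        ;;                      # known scenario, keep going
      *)
        echo "unknown scenario: '$name'" >&2
        return 1
        ;;
    esac
  done
  return 0
}
```

A wrapper could call validate_scenarios "$SCENARIOS" first and only pass the value on to scripts/cloudedge-federation-qualification.sh via --scenarios when it succeeds.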

Evidence template: evidence/federation-p4-operational-qualification-TEMPLATE.md

Auto-remediation readiness

See federation-remediation-readiness-matrix.md for the P5+ readiness classification of all 7 remediation actions.

Summary: 2 actions are ready for auto-execute (retry-failed-deliveries, force-repush-stale-ttl), 4 are inspect-only, 1 is not ready (configure-peer-endpoint requires operator approval).
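
The classification above can be sketched as a small gate that an automation wrapper consults before running anything. This is illustrative only: remediation_mode and may_auto_execute are hypothetical names, the class labels are taken from the summary, and because the four inspect-only actions are not named here, the sketch assumes that any action it does not recognise falls back to inspect-only. The readiness matrix remains the authoritative source.

```shell
# Sketch only: hypothetical helpers, not part of the CloudEdge tooling.
# Maps a remediation action name to its readiness class.
remediation_mode() {
  case "$1" in
    retry-failed-deliveries|force-repush-stale-ttl)
      echo "auto-execute" ;;
    configure-peer-endpoint)
      echo "not-ready" ;;       # requires operator approval
    *)
      # Assumption: unlisted actions are treated as inspect-only,
      # the safe default when the class is unknown.
      echo "inspect-only" ;;
  esac
}

# Succeeds only for actions classified as ready for auto-execute.
may_auto_execute() {
  [ "$(remediation_mode "$1")" = "auto-execute" ]
}
```

Defaulting to inspect-only means a newly added action cannot be executed automatically until someone classifies it explicitly.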

Documentation convergence

| Document | Status |
|---|---|
| ADR 0006 | Updated: P1-P3 reflected, FederationSLO Kind listed |
| ADR 0007 | Updated: Phases 5.0-5.1 marked DONE |
| Checkpoint | Historical note added |
| Changelog | P1-P3 + Phase 5 entries added to Unreleased |
| Observability how-to | Updated with P3 per-group SLO contract |

Release criteria

  • All 8 qualification scenarios PASS on at least one provider pair
  • Doctor JSON output matches FederationSLO contract
  • Remediation plan output is deterministic and diff-stable
  • No secrets in evidence files
  • Documentation converged (every document in the Documentation convergence table has its update recorded)
  • CI green on qualification branch
  • Evidence committed to docs/releases/evidence/
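
Two of these criteria lend themselves to mechanical checks: diff-stable remediation plan output and the absence of secrets in evidence files. The sketch below shows one possible shape for each. Both function names are hypothetical, the command that emits the remediation plan is left for the caller to supply, and the two secret patterns (PEM private key headers and AWS-style access key IDs) are illustrative examples rather than a complete scan.

```shell
# Sketch only: hypothetical helpers for two of the release criteria.

# is_diff_stable CMD [ARGS...]
# Runs the command twice and succeeds only if both runs exit 0 and
# produce byte-identical stdout.
is_diff_stable() {
  local first second status
  first=$(mktemp) || return 1
  second=$(mktemp) || { rm -f "$first"; return 1; }
  status=0
  "$@" > "$first" || status=1
  "$@" > "$second" || status=1
  if [ "$status" -eq 0 ]; then
    cmp -s "$first" "$second" || status=1
  fi
  rm -f "$first" "$second"
  return "$status"
}

# evidence_has_no_secrets DIR
# Succeeds if no file under DIR matches the illustrative patterns.
# Returns 2 if DIR is not a directory, 1 if a pattern matches.
evidence_has_no_secrets() {
  local dir="$1"
  [ -d "$dir" ] || return 2
  if grep -rEq \
      -e 'BEGIN [A-Z ]*PRIVATE KEY' \
      -e 'AKIA[0-9A-Z]{16}' \
      "$dir"; then
    return 1
  fi
  return 0
}
```

In CI, is_diff_stable would wrap whatever command prints the remediation plan, and evidence_has_no_secrets would run against the evidence directory before it is committed. A dedicated secret scanner is likely a better fit for the second check once the pattern list grows.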