Shard Management and Online Split Lifecycle
Shard Management and Online Split Lifecycle
Section titled “Shard Management and Online Split Lifecycle”Context
Section titled “Context”WorkerSQL supports live shard rebalancing without downtime. The split lifecycle
is orchestrated inside the edge worker by ShardSplitService
, with data
export/import delegated to the TableShard
durable objects. This document
captures the operational contract, component interactions, and acceptance
criteria for the shard split workflow delivered in this iteration.
Actors and Responsibilities
Section titled “Actors and Responsibilities”- ShardSplitService: owns plan state, phase transitions, progress metrics, and validation against routing policies.
- RoutingVersionManager: persists routing policy history in KV, guards version bumps, and handles rollback.
- ConfigService: supplies table policies (primary keys, shard-by columns) consumed during backfill iteration.
- TableShard Durable Objects: expose
admin/export
,admin/import
, andadmin/events
admin endpoints used for bulk backfill and tail replay. - Cloudflare KV (APP_CACHE): durable storage for split plans, routing versions, and idempotency cursors.
Lifecycle Phases
Section titled “Lifecycle Phases”Phase | Trigger | Key Work | Exit Criteria |
---|---|---|---|
planning | planSplit | Validate tenants, snapshot table policies, persist plan skeleton | Plan persisted with backfill /tail initialized |
dual_write | startDualWrite | Enable dual-write routing; clear previous error state | Ready for backfill launcher |
backfill | runBackfill | Execute async export/import loops per tenant & table; update row counters and cursors | All tables copied, tail.status reset to pending , transition to tailing |
tailing | Backfill completes | Maintain dual-write until tail replay catches up | replayTail finishes with tail.status = caught_up , phase → cutover_pending |
cutover_pending | Tail caught up | Await routing update approval | cutover bumps routing version, phase → completed |
completed | Cutover committed | Tenants now routed to target shard | Metrics frozen, plan retained for audit |
rolled_back | rollback | Revert routing pointer, reset plan state | Ready to re-run workflow |
Data Flow Summary
Section titled “Data Flow Summary”-
Planning
- Inputs: source shard, target shard, tenant IDs.
- Validations: tenant list non-empty, source ≠ target, tenants currently routed to source shard, no overlapping active plan.
- Outcomes: KV record
shard_split:plan:{id}
with canonical backfill/tail structures.
-
Dual Write
- Flip plan phase to
dual_write
, clear residual error state, markdualWriteStartedAt
(used for tail replay window).
- Flip plan phase to
-
Backfill Execution
runBackfill
persists running status then schedulesexecuteBackfill
viaExecutionContext.waitUntil
.backfillTable
iteratesadmin/export
results using cursor + limit 200; each batch invokes target shardadmin/import
in upsert mode.- Progress metrics:
totalRowsCopied
, per-table cursor map,backfill.startedAt/completedAt
.
-
Tail Replay
- Fetch events via
admin/events
withafterId
cursor to ensure idempotency. - Filter events to target tenants, ignore
SELECT
statements, route DDL to/ddl
endpoint, other mutations to/mutation
. - Persist
lastEventId
andlastEventTs
after each event to support resumability. - When batch size < limit, mark
tail.status = caught_up
, phase →cutover_pending
.
- Fetch events via
-
Cutover
- Copy routing policy, update tenant → target shard assignments, increment
version, persist via
RoutingVersionManager.updateCurrentPolicy
. - Plan captures
routingVersionCutover
for audit and metrics reporting.
- Copy routing policy, update tenant → target shard assignments, increment
version, persist via
-
Rollback
- Reset routing pointer to
routingVersionAtStart
. - Clear backfill/tail progress markers and error message, phase →
rolled_back
.
- Reset routing pointer to
Background Task Contract
Section titled “Background Task Contract”- Backfill work is always scheduled via
ExecutionContext.waitUntil
to avoid worker request timeouts. - Each export/import batch yields to event loop (
yieldToEventLoop
) to keep cooperative concurrency responsive. - Failures capture an
errorMessage
and flip the associated phase status (backfill.status
ortail.status
) tofailed
for observability.
Metrics
Section titled “Metrics”ShardSplitService.getMetrics()
surfaces per-plan metrics:
splitId
,sourceShard
,targetShard
phase
,totalRowsCopied
backfillStatus
,tailStatus
tenants
,startedAt
,updatedAt
These are suitable for dashboarding and alerting on stalled phases.
Testing Strategy
Section titled “Testing Strategy”Test Suite | Location | Coverage |
---|---|---|
Unit | tests/services/ShardSplitService.test.ts | Planning validation, async backfill execution, status persistence |
Integration | tests/integration/shardSplit.integration.test.ts | Full lifecycle: plan → dual-write → backfill → tail replay → cutover |
Smoke | tests/smoke/shardSplit.smoke.test.ts | Service boot sanity check |
Fuzz | tests/fuzz/shardSplit.fuzz.test.ts | Randomized tenant lists verifying validation invariants |
Browser (Playwright) | tests/browser/shardSplit.spec.ts | Acceptance documentation rendered and key lifecycle sections visible |
Acceptance Criteria
Section titled “Acceptance Criteria”- Valid Planning: A split cannot be planned unless every tenant is currently routed to the source shard and no other active plan overlaps (unit + fuzz coverage).
- Backfill Safety: Each export/import batch updates cursors atomically and marks success, retrying from persisted state (unit + integration coverage).
- Tail Idempotency: Events replay strictly increasing
id
and skip non-applicable writes, guaranteeing no duplicate mutations (integration coverage). - Cutover Discipline: Routing version increments exactly once and records the new version on the plan (integration coverage).
- Rollback Preparedness: Rolling back restores pending statuses allowing operators to re-run the pipeline (covered by unit smoke assertions of plan normalization).
- Operator Visibility: Metrics expose backfill/tail status transitions for observability (implicitly covered by unit tests verifying persisted state changes).
Operational Runbook
Section titled “Operational Runbook”- Use
planSplit
via admin API (payload: source, target, tenants, description). - After validation, call
startDualWrite
to begin mirrored writes. - Trigger
runBackfill
(worker automatically schedules background execution); monitor metrics forbackfill.status
. - Poll and execute
replayTail
until phase becomescutover_pending
withtail.status = caught_up
. - Call
cutover
to move tenants to the new shard. - If anomalies surface before cutover, call
rollback
to revert routing and reset state.
Follow-up Enhancements
Section titled “Follow-up Enhancements”- Emit structured logs for each phase transition to feed observability pipelines.
- Persist per-table success/failure counters for detailed reporting.
- Add queue-based notification when tail replay completes to reduce manual polling.