# Shard Management and Online Split Lifecycle
## Context

WorkerSQL supports live shard rebalancing without downtime. The split lifecycle
is orchestrated inside the edge worker by ShardSplitService, with data
export/import delegated to the TableShard durable objects. This document
captures the operational contract, component interactions, and acceptance
criteria for the shard split workflow delivered in this iteration.
## Actors and Responsibilities

- ShardSplitService: owns plan state, phase transitions, progress metrics, and validation against routing policies.
- RoutingVersionManager: persists routing policy history in KV, guards version bumps, and handles rollback.
- ConfigService: supplies table policies (primary keys, shard-by columns) consumed during backfill iteration.
- TableShard Durable Objects: expose the `admin/export`, `admin/import`, and `admin/events` admin endpoints used for bulk backfill and tail replay.
- Cloudflare KV (`APP_CACHE`): durable storage for split plans, routing versions, and idempotency cursors.
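
The plan record referenced throughout this document lives in KV under `shard_split:plan:{id}`. As a reading aid, here is one plausible TypeScript shape for that record, assembled from the fields this document mentions; the exact schema is an assumption, not the actual source.

```ts
// Illustrative shape of the split plan stored at shard_split:plan:{id}
// in APP_CACHE. Built from fields named in this document; the real
// ShardSplitService schema may differ.
type SplitPhase =
  | "planning"
  | "dual_write"
  | "backfill"
  | "tailing"
  | "cutover_pending"
  | "completed"
  | "rolled_back";

interface ShardSplitPlan {
  id: string;
  phase: SplitPhase;
  sourceShard: string;
  targetShard: string;
  tenants: string[];
  routingVersionAtStart: number;
  routingVersionCutover?: number; // recorded during cutover
  dualWriteStartedAt?: string; // bounds the tail replay window
  backfill: {
    status: "pending" | "running" | "completed" | "failed";
    totalRowsCopied: number;
    cursors: Record<string, string | null>; // per-table export cursor
    startedAt?: string;
    completedAt?: string;
  };
  tail: {
    status: "pending" | "running" | "caught_up" | "failed";
    lastEventId?: number;
    lastEventTs?: string;
  };
  errorMessage?: string;
}
```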
## Lifecycle Phases

| Phase | Trigger | Key Work | Exit Criteria |
|---|---|---|---|
| `planning` | `planSplit` | Validate tenants, snapshot table policies, persist plan skeleton | Plan persisted with backfill/tail state initialized |
| `dual_write` | `startDualWrite` | Enable dual-write routing; clear previous error state | Ready for backfill launcher |
| `backfill` | `runBackfill` | Execute async export/import loops per tenant & table; update row counters and cursors | All tables copied, `tail.status` reset to `pending`, transition to `tailing` |
| `tailing` | Backfill completes | Maintain dual-write until tail replay catches up | `replayTail` finishes with `tail.status = caught_up`, phase → `cutover_pending` |
| `cutover_pending` | Tail caught up | Await routing update approval | `cutover` bumps routing version, phase → `completed` |
| `completed` | Cutover committed | Tenants now routed to target shard | Metrics frozen, plan retained for audit |
| `rolled_back` | `rollback` | Revert routing pointer, reset plan state | Ready to re-run workflow |
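
The table reads as a strict forward progression, with rollback available from any phase before cutover commits. A minimal transition guard over these phases could look like the following sketch; the helper and its name are illustrative, not the actual ShardSplitService code.

```ts
// Hypothetical guard enforcing the phase ordering from the table above.
const ALLOWED_TRANSITIONS: Record<string, readonly string[]> = {
  planning: ["dual_write", "rolled_back"],
  dual_write: ["backfill", "rolled_back"],
  backfill: ["tailing", "rolled_back"],
  tailing: ["cutover_pending", "rolled_back"],
  cutover_pending: ["completed", "rolled_back"],
  completed: [], // terminal: plan retained for audit
  rolled_back: [], // terminal: a fresh plan re-runs the workflow
};

function assertTransition(from: string, to: string): void {
  if (!ALLOWED_TRANSITIONS[from]?.includes(to)) {
    throw new Error(`illegal shard-split transition: ${from} -> ${to}`);
  }
}
```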
## Data Flow Summary

- Planning
  - Inputs: source shard, target shard, tenant IDs.
  - Validations: tenant list non-empty, source ≠ target, tenants currently routed to the source shard, no overlapping active plan.
  - Outcomes: KV record `shard_split:plan:{id}` with canonical backfill/tail structures.
- Dual Write
  - Flip plan phase to `dual_write`, clear residual error state, mark `dualWriteStartedAt` (used for the tail replay window).
- Backfill Execution
  - `runBackfill` persists a running status, then schedules `executeBackfill` via `ExecutionContext.waitUntil`.
  - `backfillTable` iterates `admin/export` results using a cursor with limit 200; each batch invokes the target shard's `admin/import` in upsert mode (see the first sketch after this list).
  - Progress metrics: `totalRowsCopied`, per-table cursor map, `backfill.startedAt`/`completedAt`.
- Tail Replay
  - Fetch events via `admin/events` with an `afterId` cursor to ensure idempotency (see the second sketch after this list).
  - Filter events to target tenants, ignore `SELECT` statements, route DDL to the `/ddl` endpoint and other mutations to `/mutation`.
  - Persist `lastEventId` and `lastEventTs` after each event to support resumability.
  - When the batch size < limit, mark `tail.status = caught_up`, phase → `cutover_pending`.
- Cutover
  - Copy the routing policy, update tenant → target shard assignments, increment `version`, persist via `RoutingVersionManager.updateCurrentPolicy`.
  - The plan captures `routingVersionCutover` for audit and metrics reporting.
- Rollback
  - Reset the routing pointer to `routingVersionAtStart`.
  - Clear backfill/tail progress markers and the error message, phase → `rolled_back`.
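
As referenced in the Backfill Execution step, the cursor-driven export/import loop might be sketched as follows. It reuses the illustrative `ShardSplitPlan` type from above; `exportBatch`, `importBatch`, and `savePlan` are assumed helpers wrapping the `admin/export` / `admin/import` endpoints and the KV plan write, not real APIs.

```ts
// Assumed helper signatures, not the actual WorkerSQL implementation.
declare function exportBatch(
  shard: string,
  tenantId: string,
  table: string,
  cursor: string | null,
  limit: number
): Promise<{ rows: unknown[]; nextCursor: string | null }>;
declare function importBatch(
  shard: string,
  tenantId: string,
  table: string,
  rows: unknown[]
): Promise<void>;
declare function savePlan(plan: ShardSplitPlan): Promise<void>;
declare function yieldToEventLoop(): Promise<void>;

// Sketch of the per-table backfill loop: cursor + limit 200 exports,
// idempotent upsert imports, and a cursor persisted after every batch
// so a retry resumes from KV state instead of restarting the copy.
async function backfillTable(
  plan: ShardSplitPlan,
  tenantId: string,
  table: string
): Promise<void> {
  const LIMIT = 200;
  let cursor = plan.backfill.cursors[table] ?? null;
  for (;;) {
    const batch = await exportBatch(plan.sourceShard, tenantId, table, cursor, LIMIT);
    if (batch.rows.length > 0) {
      await importBatch(plan.targetShard, tenantId, table, batch.rows);
      plan.backfill.totalRowsCopied += batch.rows.length;
    }
    cursor = batch.nextCursor;
    plan.backfill.cursors[table] = cursor;
    await savePlan(plan); // checkpoint before the next batch
    if (batch.rows.length < LIMIT || cursor === null) break;
    await yieldToEventLoop(); // stay cooperative between batches
  }
}
```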
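
Likewise, one pass of tail replay could look like the sketch below. The event shape, `fetchEvents`, `applyDdl`, and `applyMutation` are assumed wrappers around `admin/events` and the target shard's `/ddl` and `/mutation` endpoints; the batch size is invented for illustration.

```ts
// Assumed event shape and helper signatures for illustration only.
interface TailEvent {
  id: number;
  ts: string;
  tenantId: string;
  kind: "ddl" | "mutation" | "select";
  sql: string;
}
declare function fetchEvents(
  shard: string,
  afterId: number,
  limit: number
): Promise<TailEvent[]>;
declare function applyDdl(shard: string, tenantId: string, sql: string): Promise<void>;
declare function applyMutation(shard: string, tenantId: string, sql: string): Promise<void>;
declare function savePlan(plan: ShardSplitPlan): Promise<void>;

// One replay pass: events arrive in increasing id order, reads and
// non-target tenants are skipped, and the cursor is persisted after
// every event so a crashed pass resumes without duplicating writes.
async function replayTailOnce(plan: ShardSplitPlan): Promise<boolean> {
  const LIMIT = 100; // illustrative batch size
  const events = await fetchEvents(plan.sourceShard, plan.tail.lastEventId ?? 0, LIMIT);
  for (const ev of events) {
    if (plan.tenants.includes(ev.tenantId) && ev.kind !== "select") {
      if (ev.kind === "ddl") await applyDdl(plan.targetShard, ev.tenantId, ev.sql);
      else await applyMutation(plan.targetShard, ev.tenantId, ev.sql);
    }
    plan.tail.lastEventId = ev.id;
    plan.tail.lastEventTs = ev.ts;
    await savePlan(plan);
  }
  // A short batch means the tail has caught up.
  return events.length < LIMIT;
}
```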
## Background Task Contract

- Backfill work is always scheduled via `ExecutionContext.waitUntil` to avoid worker request timeouts.
- Each export/import batch yields to the event loop (`yieldToEventLoop`) to keep cooperative concurrency responsive.
- Failures capture an `errorMessage` and flip the associated phase status (`backfill.status` or `tail.status`) to `failed` for observability.
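
A minimal sketch of this contract inside a Worker handler, assuming a hypothetical `executeBackfill` entry point, would pair `ctx.waitUntil` with a cooperative yield between batches:

```ts
// executeBackfill is a stand-in for the service's background entry point.
declare function executeBackfill(planId: string): Promise<void>;

export default {
  async fetch(req: Request, _env: unknown, ctx: ExecutionContext): Promise<Response> {
    const { planId } = (await req.json()) as { planId: string };
    // Respond immediately; the copy runs past the request's lifetime.
    // Failures inside should flip backfill.status to "failed" and
    // record errorMessage rather than rejecting the request.
    ctx.waitUntil(executeBackfill(planId));
    return Response.json({ status: "running", planId });
  },
};

// One plausible implementation of the cooperative yield: a zero-delay
// timer hands control back to the event loop between batches.
function yieldToEventLoop(): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, 0));
}
```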
## Metrics

`ShardSplitService.getMetrics()` surfaces per-plan metrics:

- `splitId`, `sourceShard`, `targetShard`
- `phase`, `totalRowsCopied`
- `backfillStatus`, `tailStatus`
- `tenants`, `startedAt`, `updatedAt`
These are suitable for dashboarding and alerting on stalled phases.
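
For orientation, a single entry from `getMetrics()` might look like the following; the values are invented for illustration.

```ts
// Example metrics payload; all values are made up.
const exampleSplitMetrics = {
  splitId: "split-001",
  sourceShard: "shard-a",
  targetShard: "shard-b",
  phase: "tailing",
  totalRowsCopied: 184_200,
  backfillStatus: "completed",
  tailStatus: "running",
  tenants: ["tenant-1", "tenant-2"],
  startedAt: "2024-01-01T00:00:00Z",
  updatedAt: "2024-01-01T00:12:34Z",
};
```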
## Testing Strategy

| Test Suite | Location | Coverage |
|---|---|---|
| Unit | `tests/services/ShardSplitService.test.ts` | Planning validation, async backfill execution, status persistence |
| Integration | `tests/integration/shardSplit.integration.test.ts` | Full lifecycle: plan → dual-write → backfill → tail replay → cutover |
| Smoke | `tests/smoke/shardSplit.smoke.test.ts` | Service boot sanity check |
| Fuzz | `tests/fuzz/shardSplit.fuzz.test.ts` | Randomized tenant lists verifying validation invariants |
| Browser (Playwright) | `tests/browser/shardSplit.spec.ts` | Acceptance documentation renders and key lifecycle sections are visible |
## Acceptance Criteria

- Valid Planning: A split cannot be planned unless every tenant is currently routed to the source shard and no other active plan overlaps (unit + fuzz coverage).
- Backfill Safety: Each export/import batch updates cursors atomically and marks success, so retries resume from persisted state (unit + integration coverage).
- Tail Idempotency: Events replay in strictly increasing `id` order and non-applicable writes are skipped, guaranteeing no duplicate mutations (integration coverage).
- Cutover Discipline: The routing version increments exactly once, and the plan records the new version (integration coverage).
- Rollback Preparedness: Rolling back restores pending statuses, allowing operators to re-run the pipeline (covered by unit smoke assertions of plan normalization).
- Operator Visibility: Metrics expose backfill/tail status transitions for observability (implicitly covered by unit tests verifying persisted state changes).
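
As a concrete illustration of the Valid Planning criterion, a unit test could assert the rejection paths. The vitest style matches the suites listed above, but `makeTestService` and the `planSplit` input shape are assumptions.

```ts
import { describe, expect, it } from "vitest";

// Hypothetical factory wiring ShardSplitService to in-memory KV and
// routing stubs; the real test setup may differ.
declare function makeTestService(): {
  planSplit(input: {
    sourceShard: string;
    targetShard: string;
    tenants: string[];
  }): Promise<unknown>;
};

describe("ShardSplitService.planSplit validation", () => {
  it("rejects a plan whose source and target shards match", async () => {
    const service = makeTestService();
    await expect(
      service.planSplit({ sourceShard: "shard-a", targetShard: "shard-a", tenants: ["t1"] })
    ).rejects.toThrow();
  });

  it("rejects an empty tenant list", async () => {
    const service = makeTestService();
    await expect(
      service.planSplit({ sourceShard: "shard-a", targetShard: "shard-b", tenants: [] })
    ).rejects.toThrow();
  });
});
```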
## Operational Runbook

- Use `planSplit` via the admin API (payload: source, target, tenants, description).
- After validation, call `startDualWrite` to begin mirrored writes.
- Trigger `runBackfill` (the worker automatically schedules background execution); monitor metrics for `backfill.status`.
- Poll and execute `replayTail` until the phase becomes `cutover_pending` with `tail.status = caught_up`.
- Call `cutover` to move tenants to the new shard.
- If anomalies surface before cutover, call `rollback` to revert routing and reset state.
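
Assuming the admin API exposes these operations over HTTP, a scripted run of the runbook might look like the sketch below. The base URL and route paths are invented for illustration; only the operation names come from this document.

```ts
// Illustrative operator script; BASE and the routes are assumptions
// about the admin API surface, not documented endpoints.
const BASE = "https://workersql-admin.example.com";

async function call(path: string, body?: unknown): Promise<any> {
  const res = await fetch(`${BASE}${path}`, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: body === undefined ? undefined : JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`${path} failed with ${res.status}`);
  return res.json();
}

async function runSplit(): Promise<void> {
  const { splitId } = await call("/admin/shard-splits", {
    sourceShard: "shard-a",
    targetShard: "shard-b",
    tenants: ["tenant-1", "tenant-2"],
    description: "rebalance hot tenants",
  });
  await call(`/admin/shard-splits/${splitId}/dual-write`); // startDualWrite
  await call(`/admin/shard-splits/${splitId}/backfill`); // runBackfill
  // Poll replayTail until the plan reports cutover_pending / caught_up.
  for (;;) {
    const { phase, tailStatus } = await call(`/admin/shard-splits/${splitId}/replay-tail`);
    if (phase === "cutover_pending" && tailStatus === "caught_up") break;
    await new Promise((r) => setTimeout(r, 5_000));
  }
  await call(`/admin/shard-splits/${splitId}/cutover`); // cutover
}
```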
## Follow-up Enhancements

- Emit structured logs for each phase transition to feed observability pipelines.
- Persist per-table success/failure counters for detailed reporting.
- Add queue-based notification when tail replay completes to reduce manual polling.