Source: docs/operations/staging-rls-hardening-plan.md

# Staging RLS Hardening Plan (Worker + Async Paths)

Date: 2026-02-17

Owner: Platform engineering

Scope: non-request async execution paths that currently use direct Prisma calls without transaction-scoped tenant GUC.

## Implementation status

- Queue/job lifecycle hardening: implemented.
- Worker handler hardening (`wayfair-upload`, `sima-validation`): implemented.
- Notification async stack hardening (scheduler/engine/hooks/templates): implemented.
- FORCE RLS re-enable migration: `apps/api/prisma/migrations/20260302000000_reenable_force_rls_tenant_tables`.
- Verification checklist: `docs/operations/staging-force-rls-verification-checklist.md`.

## Why this existed

Staging previously had `FORCE ROW LEVEL SECURITY` disabled on non-guest tables because worker and async paths did not run inside `withTenant()` transactions.

The gaps below have been remediated (see Implementation status above). The inventory is preserved for audit trail.

## Gap inventory (resolved)

### Job claiming and lifecycle

- `apps/api/src/workers/job-processor.ts:70`
- `apps/api/src/workers/job-processor.ts:114`
- `apps/api/src/workers/job-processor.ts:128`
- `apps/api/src/lib/jobs.ts:57`
- `apps/api/src/lib/jobs.ts:73`
- `apps/api/src/lib/jobs.ts:82`
- `apps/api/src/lib/jobs.ts:94`
- `apps/api/src/lib/jobs.ts:109`

Risk: queue cannot claim/update jobs under forced RLS.
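
The entries in this inventory are Prisma calls made outside a tenant-scoped transaction. As a rough illustration of what a "transaction-scoped tenant GUC" means, the sketch below opens a transaction, sets a setting that lasts only for that transaction, and runs the caller's queries on the same connection. It is not the repository's `withTenant()` implementation: the GUC name `app.tenant_id` and the structural types `TenantTx` and `TenantClient` are assumptions made so the example is self-contained. Only the `withTenant(prisma, tenantId, ...)` argument order is taken from this plan.

```typescript
// Structural stand-ins (hypothetical) so the sketch runs without a database.
// They mirror the two Prisma features the pattern relies on: the
// `$executeRaw` tagged template and callback-style `$transaction`.
interface TenantTx {
  $executeRaw(query: TemplateStringsArray, ...values: unknown[]): Promise<number>;
}

interface TenantClient<Tx extends TenantTx> {
  $transaction<T>(fn: (tx: Tx) => Promise<T>): Promise<T>;
}

// Runs `fn` inside one transaction after setting the tenant GUC.
// `app.tenant_id` is an assumed name; the real policies may read a different setting.
async function withTenant<Tx extends TenantTx, T>(
  client: TenantClient<Tx>,
  tenantId: string,
  fn: (tx: Tx) => Promise<T>,
): Promise<T> {
  return client.$transaction(async (tx) => {
    // The third argument to set_config (is_local = true) limits the setting
    // to the current transaction, so it cannot leak to the next user of a
    // pooled connection.
    await tx.$executeRaw`SELECT set_config('app.tenant_id', ${tenantId}, true)`;
    // Every query issued through `tx` now runs with the tenant setting visible
    // to RLS policies.
    return fn(tx);
  });
}
```

With `FORCE ROW LEVEL SECURITY` enabled, policies also apply to the table owner, so a query issued outside such a wrapper would typically see no rows or have its writes rejected, which is the failure mode the Risk lines in this inventory describe.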
### Core ingest handlers (worker-executed)

- `apps/api/src/workers/handlers/wayfair-upload.ts:71`
- `apps/api/src/workers/handlers/wayfair-upload.ts:103`
- `apps/api/src/workers/handlers/wayfair-upload.ts:123`
- `apps/api/src/workers/handlers/wayfair-upload.ts:152`
- `apps/api/src/workers/handlers/wayfair-upload.ts:389`
- `apps/api/src/workers/handlers/sima-validation.ts:104`
- `apps/api/src/workers/handlers/sima-validation.ts:110`
- `apps/api/src/workers/handlers/sima-validation.ts:135`
- `apps/api/src/workers/handlers/sima-validation.ts:153`
- `apps/api/src/workers/handlers/sima-validation.ts:182`
- `apps/api/src/workers/handlers/sima-validation.ts:381`

Risk: async catalog/SIMA lanes fail on first DB call.

### SHIP / ORDERS async libs called by worker

- `apps/api/src/lib/ship/ingest.ts:78`
- `apps/api/src/lib/ship/ingest.ts:196`
- `apps/api/src/lib/orders/ingest.ts:133`
- `apps/api/src/lib/orders/ingest.ts:209`
- `apps/api/src/lib/orders/linkage.ts:23`
- `apps/api/src/lib/orders/linkage.ts:58`
- `apps/api/src/lib/orders/linkage.ts:91`
- `apps/api/src/lib/orders/linkage.ts:143`
- `apps/api/src/lib/orders/linkage.ts:179`
- `apps/api/src/lib/orders/linkage.ts:213`
- `apps/api/src/lib/ship/carrier-agreement-import.ts:90`

Risk: SHIP/Orders ingest path breaks under force.

### Shared config loaders used during processing

- `apps/api/src/lib/trade/detection.ts:20`
- `apps/api/src/lib/ship/tenant-carrier-overrides.ts:24`

Risk: workers fail before business logic runs.
### Notifications async stack (job processor + schedulers + templates)

- `apps/api/src/lib/notifications/engine.ts:39`
- `apps/api/src/lib/notifications/engine.ts:80`
- `apps/api/src/lib/notifications/scheduler.ts:50`
- `apps/api/src/lib/notifications/scheduler.ts:71`
- `apps/api/src/lib/notifications/spike-detection.ts:39`
- `apps/api/src/lib/notifications/caught-order.ts:26`
- `apps/api/src/lib/notifications/exception-alert.ts:29`
- `apps/api/src/lib/notifications/templates/trade-digest.ts:47`
- `apps/api/src/lib/notifications/templates/trade-caught.ts:35`
- `apps/api/src/lib/notifications/templates/trade-spike.ts:40`
- `apps/api/src/lib/notifications/templates/ship-recon.ts:47`
- `apps/api/src/lib/notifications/templates/ship-exception.ts:35`
- `apps/api/src/lib/notifications/templates/ship-stale.ts:46`

Risk: digest/event jobs and schedule fan-out fail under force.

## Remediation sequence

1. Harden queue primitives first
   - Make job claim/reclaim and job lifecycle updates tenant-scoped with `withTenant()`.
   - Keep claim semantics atomic (guarded update) while scoping per tenant.
2. Harden worker handlers next
   - Ensure each worker-executed DB path runs inside tenant-scoped transactions.
   - Eliminate direct `prisma.*` in handlers unless inside `withTenant(prisma, tenantId, ...)`.
3. Harden notifications async stack
   - Wrap engine, scheduler, and template queries in tenant context.
   - Keep idempotency/rate-limit semantics unchanged.
4. Re-enable FORCE RLS (DB)
   - Add migration to set `FORCE ROW LEVEL SECURITY` back on staging-deferred tables.
   - Run smoke flows and parity workflow post-migration.

## Verification gates

- `pnpm --filter @rgl8r/api test`
- Worker smoke: enqueue and process one job for each lane:
  - `ship_upload`
  - `order_upload`
  - `sima_validation`
  - `catalog_upload` (legacy `wayfair_upload` rows count under this lane)
  - `notification_event`
  - `notification_digest`
- Re-enable force migration applies cleanly in staging.
- Parity workflow publishes artifact even on failures.

## Notes

- Guest flow is already using `withTenant()` and is not the blocker.
- This plan intentionally avoids `BYPASSRLS` as a runtime dependency.
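
Remediation step 1 asks that job claims stay atomic (guarded update) while being scoped per tenant. One way to read that requirement is sketched below: the claim is a single conditional update whose `where` clause includes the expected status, so two workers racing for the same job cannot both succeed. The model name `job`, the columns `tenantId`, `status`, and `claimedBy`, and the status values are all hypothetical. `tx` is assumed to be the transaction client handed to the `withTenant()` callback. The only Prisma behaviour relied on is that `updateMany` resolves to `{ count }`.

```typescript
// Hypothetical status values and job columns, for illustration only.
type JobStatus = "PENDING" | "PROCESSING";

// Structural stand-in for the part of a transaction client the claim needs.
interface JobTx {
  job: {
    updateMany(args: {
      where: { id: string; tenantId: string; status: JobStatus };
      data: { status: JobStatus; claimedBy: string };
    }): Promise<{ count: number }>;
  };
}

// Attempts to claim one job. Resolves to true only if this call changed the row.
async function claimJob(
  tx: JobTx,
  tenantId: string,
  jobId: string,
  workerId: string,
): Promise<boolean> {
  // The status predicate is the guard: if another worker already moved the
  // job out of PENDING, no row matches and count is 0.
  const { count } = await tx.job.updateMany({
    where: { id: jobId, tenantId, status: "PENDING" },
    data: { status: "PROCESSING", claimedBy: workerId },
  });
  return count === 1;
}
```

Because the guard lives in the update itself rather than in a separate read followed by a write, wrapping the call in a tenant-scoped transaction does not change the claim semantics. The RLS policy adds tenant filtering on top of the same statement.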