Rls Prisma Spike

Source: docs/spikes/rls-prisma-spike.md

# P5-SPIKE: Row-Level Security with Prisma — Feasibility Analysis
 
**Status:** Complete (spike only — no implementation)
**Plan ID:** P5-SPIKE
**Date:** 2026-02-11
 
---
 
## Summary
 
This spike evaluates the feasibility of adding PostgreSQL Row-Level Security (RLS) to the RGL8R platform as a defense-in-depth layer for tenant isolation. The recommended approach uses Prisma's `$transaction` with `SET LOCAL` to scope RLS policies per request.
 
**Verdict:** Feasible. Proceed with P5-A implementation using the transaction wrapper approach.
 
---
 
## 1. Approach: Prisma + `SET LOCAL` Transaction Wrapper
 
### How It Works
 
Each API request wraps its database operations in a Prisma interactive transaction. The first statement within the transaction sets a PostgreSQL session variable scoped to that transaction:
 
```typescript
async function withTenant<T>(
  prisma: PrismaClient,
  tenantId: string,
  fn: (tx: Prisma.TransactionClient) => Promise<T>
): Promise<T> {
  return prisma.$transaction(async (tx) => {
    // SET LOCAL scopes to current transaction only
    await tx.$executeRawUnsafe(
      `SET LOCAL app.current_tenant_id = $1`,
      tenantId
    );
    return fn(tx);
  });
}
```
 
RLS policies reference this variable to filter rows:
 
```sql
-- Current main table is app.sima_outcomes (Prisma model: SIMAOutcome).
-- P5-B later renames this table to app.sima_results.
CREATE POLICY tenant_isolation ON app.sima_outcomes
  USING (tenant_id = current_setting('app.current_tenant_id')::uuid);
```
 
### Why `SET LOCAL`
 
`SET LOCAL` is the critical detail. Unlike `SET`, which persists for the entire database session (connection), `SET LOCAL` only affects the current transaction. When the transaction commits or rolls back, the setting is automatically discarded. This makes it safe with connection pooling because no tenant context leaks between requests sharing the same connection.
 
---
 
## 2. Feasibility Analysis
 
### Prisma Compatibility
 
- **`$transaction` + `$executeRawUnsafe`:** Fully supported. Prisma interactive transactions run all statements on the same database connection within a `BEGIN`/`COMMIT` block. `$executeRawUnsafe` executes arbitrary SQL, including `SET LOCAL`.
- **Query scoping:** All Prisma queries (findMany, create, update, delete) issued within the transaction callback see the RLS filter automatically. No changes to individual queries are needed.
- **Client generation:** No changes to the Prisma schema or client generation process. RLS is enforced at the PostgreSQL level, transparent to the ORM.
 
### Connection Pooling Safety
 
| Pooling Mode | Safe? | Reason |
|-------------|-------|--------|
| **PgBouncer transaction mode** | Yes | Each transaction gets a dedicated connection for its duration. `SET LOCAL` is discarded at `COMMIT`/`ROLLBACK`. No cross-request leakage. |
| **PgBouncer session mode** | No | A connection may serve multiple requests in sequence. Although `SET LOCAL` itself resets at transaction end, session mode introduces risk if any code path uses `SET` (without `LOCAL`) by mistake. |
| **Prisma built-in pool** | Yes | Prisma's internal pool assigns connections per transaction. Same safety model as PgBouncer transaction mode. |
 
**Our configuration:** Prisma built-in connection pool (no external PgBouncer). Safe for `SET LOCAL`.
 
### What Happens If `current_tenant_id` Is Not Set
 
If a query runs outside a transaction wrapper (i.e., `current_setting('app.current_tenant_id')` is not set), PostgreSQL raises an error:
 
```
ERROR: unrecognized configuration parameter "app.current_tenant_id"
```
 
This is a **fail-closed** behavior, which is desirable. No query can accidentally bypass tenant isolation by forgetting the wrapper.
 
If you prefer "empty result set" behavior instead of a hard error when context is missing, use `current_setting(..., true)` in the policy and coerce missing/empty values to a nil UUID:
 
```sql
USING (
  tenant_id = COALESCE(
    NULLIF(current_setting('app.current_tenant_id', true), ''),
    '00000000-0000-0000-0000-000000000000'
  )::uuid
)
```
 
This avoids cast errors from empty strings while still failing closed (no rows matched).
 
---
 
## 3. Performance
 
### Overhead Estimates
 
| Component | Estimated Overhead | Notes |
|-----------|-------------------|-------|
| `SET LOCAL` statement | ~0.5-1ms | Simple parameter set, no disk I/O |
| Transaction wrapper | Negligible | Prisma already uses implicit transactions for writes; interactive transactions add only the `BEGIN`/`COMMIT` round-trips for reads |
| RLS policy evaluation | ~0.1ms per query | Policy is a simple equality check on an indexed column |
| **Total per request** | **~1-2ms** | Acceptable for an API with typical response times of 50-500ms |
 
### Index Impact
 
No new indexes are needed. RLS policies filter on `tenantId`, which already has indexes on every table (required for the existing application-level WHERE clauses). The query planner uses these indexes for RLS policy evaluation automatically.
 
### Benchmark Plan (for P5-A implementation)
 
Before and after enabling RLS on a representative table (e.g., `SIMAOutcome` / `app.sima_outcomes` on current main):
1. Measure p50/p95/p99 latency on `GET /api/sima/results` with 1000 rows across 3 tenants
2. Measure job processing throughput (jobs/minute) for a batch upload
3. Target: < 5% latency increase at p95
 
---
 
## 4. Background Job Processor
 
The job processor (`apps/api/src/workers/job-processor.ts`) runs outside the HTTP request lifecycle. It polls for jobs and processes them in-process.
 
### Option A: Same Transaction Wrapper (Recommended)
 
Jobs have `tenantId` on the `Job` record. The processor can use the same `withTenant()` wrapper:
 
```typescript
// In job-processor.ts
const job = await prisma.job.findFirst({ where: { status: 'PENDING' } });
if (job) {
  await withTenant(prisma, job.tenantId, async (tx) => {
    // All DB operations within this job use RLS
    await processJob(tx, job);
  });
}
```
 
**Pros:** Consistent tenant isolation across HTTP and background contexts. Same code path, same guarantees.
 
**Cons:** Slightly more complex job processor code. The job polling query itself (`findFirst` for pending jobs) needs to run outside RLS (it queries across tenants to find work).
 
### Option B: Bypass RLS for Job Processor Role
 
Create a separate database role for the job processor with `BYPASSRLS`:
 
```sql
CREATE ROLE rgl8r_worker BYPASSRLS;
GRANT ALL ON SCHEMA app TO rgl8r_worker;
```
 
The job processor connects with this role and relies on existing application-level `WHERE tenantId = ...` clauses.
 
**Pros:** Simpler implementation. No transaction wrapper needed in the processor.
 
**Cons:** Loses defense-in-depth for background jobs. Acceptable trade-off since the job processor is internal code with a single entry point.
 
### Recommendation
 
Start with **Option A** for consistency. If it creates unacceptable complexity in the job processor, fall back to Option B. The job polling query can use a separate Prisma client instance without RLS, or run with `BYPASSRLS` just for the polling step.
 
---
 
## 5. Rejected Alternatives
 
### Prisma Client Extensions
 
Prisma client extensions (`$extends`) can theoretically intercept all queries and inject tenant context. However:
 
- **Maturity:** Client extensions are still marked as "Preview" for some features. The `query` extension component needed for this is GA, but the interaction with `$transaction` is not well-documented.
- **Debugging:** Extension-injected behavior is invisible in query logs and hard to trace.
- **Portability:** Ties tenant isolation logic to Prisma-specific abstractions rather than PostgreSQL-native RLS.
 
**Verdict:** Not recommended. The transaction wrapper approach is simpler and uses well-understood PostgreSQL primitives.
 
### Application-Level WHERE Only (Current State)
 
The current approach uses application-level `WHERE tenantId = ...` on all queries. This works and is correct today.
 
- **Risk:** A single forgotten WHERE clause or a raw query without tenant filtering exposes cross-tenant data.
- **Audit burden:** Every new query and every code review must verify tenant filtering.
- **Scale concern:** As the codebase and team grow, the probability of a missed filter increases.
 
**Verdict:** Acceptable for the current scale (2-5 tenants, single developer). RLS adds defense-in-depth that reduces audit burden and prevents a class of bugs entirely.
 
### Different ORM (e.g., Drizzle, Knex)
 
Some ORMs have more native support for RLS or raw SQL within transactions. However:
 
- **Rewrite cost:** Migrating from Prisma would touch every data access layer in the application.
- **Prisma works:** The `$transaction` + `$executeRawUnsafe` approach is fully functional with Prisma.
- **Ecosystem:** Prisma has the best TypeScript type safety and schema management tooling.
 
**Verdict:** Not worth the rewrite cost. Prisma's transaction API is sufficient for the RLS approach.
 
---
 
## 6. Recommendation
 
**Proceed with P5-A: Implement RLS using the Prisma `$transaction` + `SET LOCAL` wrapper.**
 
Rationale:
- **Overhead is minimal.** ~1-2ms per request is negligible relative to typical API response times.
- **Approach is well-understood.** `SET LOCAL` with RLS is a standard PostgreSQL pattern used in production by many multi-tenant applications (Supabase, Citus, etc.).
- **Defense-in-depth is valuable.** Even with careful application-level filtering, RLS eliminates an entire class of cross-tenant data leakage bugs at the database level.
- **Incremental adoption.** RLS can be enabled table-by-table. Start with `SIMAOutcome` (`app.sima_outcomes`) and `Job`, then expand to all tenant-scoped tables.
- **Fallback is trivial.** If RLS causes issues, it can be disabled without data loss (see Section 7).
 
### Implementation Sequence (for P5-A)
 
1. Create `withTenant()` utility in `apps/api/src/lib/rls.ts`
2. Enable RLS on `SIMAOutcome` table (`app.sima_outcomes`) with a tenant isolation policy
3. Wrap SIMA API routes in `withTenant()` using tenant from JWT
4. Add integration tests verifying cross-tenant isolation
5. Benchmark before/after on SIMA endpoints
6. If benchmarks pass, expand to remaining tenant-scoped tables
 
---
 
## 7. Rollback
 
RLS can be disabled at any time without data loss or schema changes:
 
```sql
-- Disable RLS on a specific table (keeps policies but stops enforcing them)
ALTER TABLE app.sima_outcomes DISABLE ROW LEVEL SECURITY;
 
-- Or drop the policy entirely
DROP POLICY tenant_isolation ON app.sima_outcomes;
```
 
Disabling RLS does not affect data, indexes, or application queries. The `withTenant()` wrapper becomes a no-op (the `SET LOCAL` still executes but has no effect without active policies). The wrapper can be removed later in a cleanup pass, or left in place for easy re-enablement.
 
---
 
## References
 
- [PostgreSQL RLS Documentation](https://www.postgresql.org/docs/current/ddl-rowsecurity.html)
- [SET LOCAL](https://www.postgresql.org/docs/current/sql-set.html) — transaction-scoped parameter setting
- [Prisma Interactive Transactions](https://www.prisma.io/docs/concepts/components/prisma-client/transactions#interactive-transactions)
- [Supabase RLS Guide](https://supabase.com/docs/guides/auth/row-level-security) — production reference for the same pattern