The clinics on Callidus had every reason to demand a database each: patient records, GDPR, real regulators. I gave them shared Firestore and slept fine — here is why physical isolation is usually the wrong answer to a real fear.
Before that story becomes useful, you need the map. Three patterns dominate multi-tenant SaaS database design, and they are not treated as equivalent in practice no matter how many architecture diagrams present them side by side.
Shared database, shared schema: all tenants in the same tables, separated by a tenant_id column. Isolation is enforced by Row-Level Security at the database layer or query-level filtering in the application. Cheapest to operate, simplest to migrate, easiest to get wrong silently.
Shared database, schema-per-tenant: one Postgres database, a separate schema per tenant (tenant_a.users, tenant_b.users). Looks stronger than shared tables, but the search_path mechanism that enforces it is advisory, not a security boundary.
Database-per-tenant: a separate Postgres (or Firestore, or MySQL) instance per customer. Physical isolation. The model enterprises invoke in procurement and architects invoke out of caution.
Here is what each costs at production scale.
Database-per-tenant is usually fear dressed up as architecture

Database-per-tenant is usually fear dressed up as architecture: isolation you will never need, plus an ops burden you cannot staff.
Microsoft's Azure SaaS patterns documentation calls database-per-tenant "the most expensive solution from an overall database cost perspective." Microsoft sells those database services and profits when you run more of them, so a warning from that direction should mean something.
The isolation concern behind the pattern is almost always this: tenant A might see tenant B's data if something goes wrong. Real fear. The misdiagnosis is assuming physical separation is the only answer — it is the most expensive answer, not the only one.
When you provision a separate database per tenant, your ops surface scales linearly. At 100 tenants, you are managing 100 backup schedules, 100 connection pools, 100 migration runs for every schema change. At 1,000 tenants, you have 1,000 database instances needing coordinated migration, where "coordinated" means orchestration tooling and retry logic for the instance that was cold when the script ran. One analysis puts database-per-tenant at 3-5 times more expensive to maintain than shared models, using 300% more CPU and 200% more memory.
The harder cost to quantify is engineering velocity. Schema migrations in a shared-schema world run once. In a database-per-tenant world, they run N times. Tuesday morning, planning to add a soft-delete column to the patients table: in shared schema, one migration file, one run, done. In database-per-tenant at 50 clinics, that is 50 migration runs with a shell script to iterate them, a retry queue for the instance that was cold, and someone on-call until the fleet is consistent. Every feature that touches the data model now ships across a fleet.
That is the hidden cost nobody puts in the architecture diagram.
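To make the fleet cost concrete, here is a minimal sketch of what "run the migration N times with a retry queue" tends to look like. The `apply_migration` callback and the tenant names are hypothetical; a real runner would also need locking, logging, and a way to resume after a crash.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class FleetResult:
    succeeded: list[str] = field(default_factory=list)
    failed: list[str] = field(default_factory=list)


def migrate_fleet(
    tenants: Iterable[str],
    apply_migration: Callable[[str], None],
    max_attempts: int = 3,
) -> FleetResult:
    """Apply one migration to every tenant database, retrying failures.

    `apply_migration` is a hypothetical callback that runs the migration
    against a single tenant's database and raises on failure.
    """
    result = FleetResult()
    pending = list(tenants)
    for _attempt in range(max_attempts):
        retry_queue: list[str] = []
        for tenant in pending:
            try:
                apply_migration(tenant)
                result.succeeded.append(tenant)
            except Exception:
                # For example, the instance was cold: retry on the next pass.
                retry_queue.append(tenant)
        pending = retry_queue
        if not pending:
            break
    # Anything still pending after the last pass leaves the fleet inconsistent.
    result.failed = pending
    return result
```

In shared schema none of this exists: the migration runs once. Here, every entry left in `failed` is a tenant on an older schema than the rest of the fleet, and that is what keeps someone on call.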
Why schema-per-tenant fails in production faster than you expect

Schema-per-tenant is the middle option nobody should pick. The intuition behind it is appealing: stronger isolation than shared tables without the full cost of separate databases. Tenant A's data in tenant_a.users, tenant B's in tenant_b.users. Different namespaces, same database instance.
The problem is Postgres's own catalog. PlanetScale's analysis of tenancy in Postgres is unambiguous: with hundreds of schemas, each containing tables and indexes, "these catalogs grow into millions of rows" and slow the query planner on every query. The practical ceiling before serious performance degradation is a few hundred tenants — precisely the scale where your SaaS business is starting to matter.
You have felt something like this before, even if not from this cause: query planning suddenly slower, migrations that took two seconds now taking two minutes, connection startup dragging. The cause is not your queries. It is that the Postgres system catalog has grown enormous and the planner consults it on every operation.
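A back-of-envelope estimate shows how quickly the catalog grows. This is a rough lower bound that counts only pg_class entries for tables and indexes and pg_attribute entries for user-defined table columns; the tenant, table, column, and index counts are illustrative assumptions, not figures from PlanetScale.

```python
def estimate_catalog_rows(
    tenants: int,
    tables_per_schema: int,
    columns_per_table: int,
    indexes_per_table: int,
) -> dict[str, int]:
    """Rough lower bound on catalog rows under schema-per-tenant.

    Ignores system columns, index columns, sequences, TOAST tables,
    constraints, and statistics, so real catalogs are larger.
    """
    # One pg_class row per table and one per index.
    relations = tenants * tables_per_schema * (1 + indexes_per_table)
    # One pg_attribute row per user-defined table column.
    attributes = tenants * tables_per_schema * columns_per_table
    return {"pg_class": relations, "pg_attribute": attributes}


# Hypothetical mid-sized SaaS: 500 tenants, 60 tables each.
estimate = estimate_catalog_rows(
    tenants=500, tables_per_schema=60, columns_per_table=25, indexes_per_table=3
)
```

With those assumed inputs the estimate is 120,000 pg_class rows and 750,000 pg_attribute rows before counting anything the function deliberately ignores.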
The security argument for schema-per-tenant collapses under inspection. SET search_path to a tenant schema is advisory, not a security boundary. An application bug that misconfigures search_path leaks cross-tenant data exactly as easily as a missing WHERE tenant_id = ? clause in the shared-schema world. You carry the operational overhead without a meaningful security gain.
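A toy simulation of unqualified name resolution makes the point. It is not Postgres, only a sketch of the lookup rule: the first schema on search_path that contains the table wins. The catalog contents and addresses are made up.

```python
from typing import Optional

# Toy catalog: schema name -> table name -> rows. All names are illustrative.
CATALOG = {
    "tenant_a": {"users": ["alice@a.example"]},
    "tenant_b": {"users": ["bob@b.example"]},
}


def resolve(table: str, search_path: list[str]) -> Optional[str]:
    """Return the first schema on search_path that contains `table`."""
    for schema in search_path:
        if table in CATALOG.get(schema, {}):
            return schema
    return None


def select_all(table: str, search_path: list[str]) -> list[str]:
    """Simulate `SELECT * FROM <table>` with an unqualified table name."""
    schema = resolve(table, search_path)
    if schema is None:
        raise LookupError(f'relation "{table}" does not exist')
    return CATALOG[schema][table]


# The same query text returns a different tenant's rows when the path is wrong.
intended = select_all("users", ["tenant_a"])
leaked = select_all("users", ["tenant_b"])
```

Nothing in the query itself names a tenant, so nothing in the query can catch the mistake. The only thing separating `intended` from `leaked` is a piece of session state the application set earlier.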
Shared database with RLS: the default that actually scales

Shared schema with Row-Level Security is PlanetScale's recommended default and the pattern I reach for first on a new multi-tenant build. Not because it is the simplest — RLS policy subtleties are real — but because it does not blow up your ops budget before you have meaningful revenue.
Four things that must hold in production:
One indexed equality check per policy. The failure mode in Postgres RLS is the subquery in the policy expression: USING (tenant_id IN (SELECT tenant_id FROM memberships WHERE user_id = auth.uid())). That subquery can end up evaluated for every row the policy checks, creating nested-loop regressions invisible in a 50-row dev database but visible at 500,000 rows with uneven tenant distribution. The correct pattern is a single equality check against a JWT claim: USING (tenant_id = (auth.jwt() ->> 'tenant_id')::uuid). One check, indexed, predictable at scale.
RLS-aware composite indexes. Every table needs composite indexes with tenant_id as the leading column: (tenant_id, ...). The query planner uses them only when the policy expression is simple enough to evaluate at plan time, which is another reason the subquery pattern is doubly harmful: it prevents index use in addition to its own evaluation cost.
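PgBouncer transaction mode. In PgBouncer transaction mode, tenant context set with SET or set_config does not carry over from one transaction to the next: session-level settings can land on a different server connection, and SET LOCAL lasts only until the current transaction ends. If you are using set_config to inject tenant context, set it inside the same transaction as the queries it guards, or switch to JWT claims instead. This catches teams that have a working local dev setup and then discover the context is not persisting in production under the pooler.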
Service role isolation. Any Postgres connection using the service role key bypasses RLS entirely. Admin functions — webhook handlers, cron jobs, internal tooling — must use a restricted role with a tenant_id claim baked in, not the bypass key. The react-supabase-rls stack guide covers this in depth.
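The first two requirements can be sketched as a small DDL generator. It assumes a Supabase-style `auth.jwt()` helper, as in the policy expression above; the policy and index naming scheme and the `patients` and `created_at` names are hypothetical.

```python
import re

# Plain lowercase identifiers only, so names are safe to interpolate into DDL.
_IDENT = re.compile(r"[a-z_][a-z0-9_]*")


def _ident(name: str) -> str:
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def tenant_rls_ddl(table: str, index_columns: list[str]) -> list[str]:
    """Build DDL for a single-equality RLS policy plus a tenant-leading index."""
    t = _ident(table)
    extra = [_ident(c) for c in index_columns]
    cols = ", ".join(["tenant_id", *extra])
    suffix = "_".join(["tenant_id", *extra])
    return [
        f"ALTER TABLE {t} ENABLE ROW LEVEL SECURITY;",
        # One equality check against the JWT claim, no subquery.
        f"CREATE POLICY {t}_tenant_isolation ON {t} "
        "USING (tenant_id = (auth.jwt() ->> 'tenant_id')::uuid);",
        # tenant_id leads the composite index so the policy filter can use it.
        f"CREATE INDEX {t}_{suffix}_idx ON {t} ({cols});",
    ]


statements = tenant_rls_ddl("patients", ["created_at"])
```

Generating the policy from one template keeps every table on the same single-equality shape, which makes a stray subquery policy easy to spot in review.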
Here's how that played out on Callidus.
Callidus is the exact case people invoke for database-per-tenant: UK aesthetic clinics, patient records, GDPR, real regulatory isolation pressure. I went shared Firestore instead — tenants/{tenantId}/... paths, JWT-scoped role claims, and the mutation guard enforced at both the client routes and the Firestore rules. Three years earlier I would have reflexively spun up a database per clinic and burned half my prod ops budget on it. The decision that saved me was not a clever pattern; it was admitting clinics never actually look at each other's data, so the isolation guarantee just needs to be enforceable, not physical.
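As a rough illustration of "enforceable, not physical", here is a sketch of the application-side half of such a guard. It is not the Callidus code: the role names and claim shape are assumptions, and the same check would still have to be mirrored in the Firestore rules.

```python
from dataclasses import dataclass

WRITE_ROLES = {"owner", "clinician"}  # hypothetical role names


@dataclass(frozen=True)
class Claims:
    tenant_id: str
    role: str


def path_tenant(path: str) -> str:
    """Extract the tenant id from a tenants/{tenantId}/... document path."""
    parts = path.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "tenants" or not parts[1]:
        raise ValueError(f"not a tenant-scoped path: {path!r}")
    return parts[1]


def can_read(claims: Claims, path: str) -> bool:
    """A caller may only touch documents under its own tenant prefix."""
    return path_tenant(path) == claims.tenant_id


def can_mutate(claims: Claims, path: str) -> bool:
    """Writes additionally require a role that is allowed to mutate."""
    return can_read(claims, path) and claims.role in WRITE_ROLES


nurse = Claims(tenant_id="clinic-1", role="clinician")
viewer = Claims(tenant_id="clinic-1", role="viewer")
```

With these claims, `nurse` can mutate `tenants/clinic-1/patients/p1`, `viewer` can only read it, and neither can read anything under `tenants/clinic-2/`.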
When physical isolation actually earns its cost
Three cases genuinely justify it: contractual data residency requirements that prohibit commingling on shared infrastructure, one tenant consuming 90%+ of database resources (the answer is isolating that tenant, not everyone), and deep per-tenant schema customization that shared tables cannot cleanly handle.
In all three cases, the correct architecture is hybrid — shared schema for the long tail, physical isolation for the exception. The Callidus case study is the closest production reference I have for making this call under real regulatory pressure. The multi-tenant tooling overview covers what makes the routing layer that drives hybrid architectures tractable.
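A hybrid setup needs very little routing logic, as this minimal sketch suggests. The connection strings and the tenant id in the exception table are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    dsn: str
    shared: bool  # True means queries still rely on tenant_id isolation


SHARED_DSN = "postgresql://shared-cluster.internal/app"  # hypothetical

# Hypothetical exceptions: tenants with residency, load, or schema needs.
DEDICATED = {
    "tenant-eu-bank": "postgresql://eu-bank.internal/app",
}


def route_for(tenant_id: str) -> Route:
    """Send the long tail to the shared database and exceptions elsewhere."""
    dedicated = DEDICATED.get(tenant_id)
    if dedicated is not None:
        return Route(dsn=dedicated, shared=False)
    return Route(dsn=SHARED_DSN, shared=True)
```

The exception table stays short by design. Each entry added to it is an explicit decision to take on one more database's worth of backups and migrations.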
The ops math nobody puts in the architecture diagram
200 separate Postgres databases means: 200 backup jobs needing monitoring for silent failure, 200 migration targets every time you ship a schema change with retry queues for cold instances, 200 connection pool configurations each holding minimum idle connections, and 200 monitoring dashboards — or a tooling investment to aggregate them.
Let me correct the framing on that list: the backup count understates the problem. It is migration orchestration that breaks teams at 2am. Schema changes that feel trivial in shared schema become fleet operations in database-per-tenant, and fleet operations are where deployment discipline breaks down when something goes wrong at night.
The connection pool problem is sharp. Postgres has a hard max_connections limit per instance. PgBouncer keeps a separate pool for each database and user pair, so in transaction mode the pools multiply with the tenant count. PlanetScale explicitly calls this the "fatal flaw" of database-per-tenant at scale: minimum pool sizes across 200 databases add up faster than anyone projects at architecture-decision time.
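The arithmetic is short enough to write down. This sketch assumes the tenant databases sit behind one Postgres instance with the stock max_connections of 100, and a minimum pool size of 5 per database; both numbers are illustrative.

```python
def pool_budget(
    databases: int, min_pool_size: int, max_connections: int = 100
) -> dict[str, int]:
    """Connections held open while idle, and what is left of the limit."""
    idle = databases * min_pool_size
    return {"idle_connections": idle, "headroom": max_connections - idle}


# 200 tenant databases, each pool keeping 5 server connections warm.
budget = pool_budget(databases=200, min_pool_size=5)
```

Under those assumptions the pools want 1,000 idle connections against a limit of 100, so headroom is 900 below zero before a single query runs. Raising max_connections moves the ceiling but not the multiplication.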
One thing that has shifted this calculation: Neon built branch-per-tenant Postgres on copy-on-write storage where creating a new branch is instantaneous regardless of database size, and idle branches scale to zero — you pay for storage only. Databricks acquired Neon for approximately $1 billion in May 2025, which signals where the market expects this to go. Branch-per-tenant on Neon is the closest available approximation of database-per-tenant economics that work at startup scale. It still carries different operational overhead than shared schema, but it meaningfully lowered the threshold at which physical isolation becomes affordable.
Which one are you building for: 30 enterprise tenants with hard contractual isolation requirements, or 3,000 SMB tenants who need affordable scale and fast schema iteration? That question determines the architecture — before you touch a diagram.
