If you're building a B2B SaaS product and your notification system is a sendEmail() call inside a route handler, this post is about the exact moment that breaks and what you replace it with.
Notification systems look simple until they're not. The actual architecture problem has four moving parts: how events trigger notifications, how you route them across channels, how you respect user preferences and suppression rules, and what you deliver into each channel's inbox, feed, or push payload. Get one of these wrong and users report bugs that aren't bugs — they're just invisible, undelivered, or duplicated notifications.
This post walks the full architecture — from the simplest setup that actually works to the trade-offs at scale. It covers channel fan-out, digest grouping, preference management, and where the build-vs-buy decision on Knock, Courier, and Novu actually lands.
Why Does the In-App Inbox Beat Every Other Notification Channel?

The in-app inbox is the one notification surface you fully own — push, email, and SMS all route through someone else's infrastructure.
Mobile push requires the user to have granted permission. iOS enterprise opt-in rates sit around 40–60% depending on the product category; in some B2B tools the real number is much lower because users dismiss the permission modal on first load and never see it again. Email arrives sorted by spam filters, promotions tabs, and category rules you have no control over. SMS costs money per message and is regulated under TCPA in the US. The in-app feed, by contrast, lives inside your product — every user who opens the app sees it, unconditionally.
That ownership changes how you should prioritize. Build the in-app feed first. Even a simple database table (notifications with userId, type, content, readAt) that your frontend polls every 30 seconds is better than leading with email and treating in-app as an afterthought.
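A minimal sketch of that table and the query a polling endpoint would run, assuming an in-memory array stands in for the database. The `id` and `createdAt` columns and the `NotificationStore` class are illustrative additions, not part of the four-column schema named above.

```typescript
// Row shape mirroring the columns above; id and createdAt are assumed extras.
interface NotificationRow {
  id: number;
  userId: string;
  type: string;
  content: string;
  createdAt: Date;
  readAt: Date | null; // null means unread
}

// In-memory stand-in for the notifications table (hypothetical helper).
class NotificationStore {
  private rows: NotificationRow[] = [];
  private nextId = 1;

  create(userId: string, type: string, content: string, now: Date = new Date()): NotificationRow {
    const row: NotificationRow = {
      id: this.nextId++,
      userId,
      type,
      content,
      createdAt: now,
      readAt: null,
    };
    this.rows.push(row);
    return row;
  }

  // What the polling endpoint returns: unread rows newer than the last id the client saw.
  listUnread(userId: string, afterId = 0): NotificationRow[] {
    return this.rows.filter(r => r.userId === userId && r.readAt === null && r.id > afterId);
  }

  // Returns false if the row does not belong to the user or is already read.
  markRead(userId: string, id: number, now: Date = new Date()): boolean {
    const row = this.rows.find(r => r.id === id && r.userId === userId);
    if (!row || row.readAt !== null) return false;
    row.readAt = now;
    return true;
  }
}
```

Passing the last seen `id` on each poll keeps the 30-second request cheap: the client only receives rows it has not rendered yet.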
The real-time delivery question comes next. A polling approach is fine for low-volume notification load — say, under 50 notifications per user per day. Once you're building activity feeds for comment threads or multi-user collaboration, polling adds latency and unnecessary request pressure. At that point you want WebSockets or a pub/sub layer.
Ably delivers in-app messages with ~65ms latency across 15 global data centers and is the lowest-friction path for adding real-time delivery to an existing Next.js backend. Pusher is cheaper at low volume (free tier: 200,000 messages/day), but each app is pinned to the cluster (region) chosen when it is created, and moving regions means creating a new app and migrating to it. You can self-host Socket.io on your own infrastructure, though scaling it past a single server requires Redis pub/sub coordination that quickly becomes its own operational problem.
One thing worth noting: real-time delivery services handle message transport, not notification logic. They don't manage read states, digest batching, or user preference lookups. That's a separate concern — either your own service layer or something like Knock.
How Do You Route the Same Event Across Email, Push, and In-App?

Fan-out routing means taking a single application event and delivering it to multiple channels based on per-user preferences and notification category rules.
The naïve implementation is synchronous fan-out directly in your route handler. It works at 100 users. At 10,000 concurrent users it creates request latency whenever any single provider is slow. Wednesday at 11pm, your email provider responds in 4 seconds instead of 200ms — and every API request that triggers a notification sits blocked, waiting. At 100,000 users, that single slow call cascades across your entire API surface.
The correct pattern for growth-stage B2B SaaS is async fan-out through a queue:
- Your application publishes an event: notification.created with userId, eventType, and data payload.
- A worker consumes the event, resolves per-user preferences, and determines which channels should receive this notification type.
- The worker fans out to channel workers: email.worker, push.worker, inapp.worker — each independently retryable.
- Each channel worker attempts delivery with exponential backoff and routes failed messages to a dead-letter queue.
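The four steps above can be sketched as plain functions, independent of any particular queue. This is an illustration under assumptions, not the Inngest or BullMQ API: `eventId`, `MAX_ATTEMPTS`, `BASE_DELAY_MS`, and every function name here are hypothetical, and the queue itself is left out so the retry and dead-letter decisions stay visible.

```typescript
type Channel = "email" | "push" | "inapp";

interface NotificationEvent {
  userId: string;
  eventType: string;
  data: Record<string, unknown>;
}

interface ChannelJob {
  key: string; // idempotency key so a redelivered job can be deduplicated
  channel: Channel;
  event: NotificationEvent;
  attempt: number; // delivery attempts that have already failed
}

type FailureOutcome =
  | { action: "retry"; job: ChannelJob; delayMs: number }
  | { action: "dead-letter"; job: ChannelJob };

const MAX_ATTEMPTS = 5; // assumed retry budget
const BASE_DELAY_MS = 1000; // assumed first retry delay

// Steps 2 and 3: one independently retryable job per channel the preferences allow.
function fanOut(eventId: string, event: NotificationEvent, channels: Channel[]): ChannelJob[] {
  return channels.map(channel => ({
    key: `${eventId}:${channel}`,
    channel,
    event,
    attempt: 0,
  }));
}

// Exponential backoff: 1s, 2s, 4s, 8s after the 1st, 2nd, 3rd, 4th failure.
function backoffDelayMs(failedAttempts: number): number {
  return BASE_DELAY_MS * 2 ** (failedAttempts - 1);
}

// Step 4: called by a channel worker when a delivery attempt fails.
function onDeliveryFailure(job: ChannelJob): FailureOutcome {
  const failed: ChannelJob = { ...job, attempt: job.attempt + 1 };
  if (failed.attempt >= MAX_ATTEMPTS) {
    return { action: "dead-letter", job: failed };
  }
  return { action: "retry", job: failed, delayMs: backoffDelayMs(failed.attempt) };
}
```

Because each channel gets its own job and key, a slow email provider only delays the email job. The in-app and push jobs for the same event complete on their own schedule.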
This is the architecture Inngest and Trigger.dev are built to handle without you managing queue infrastructure yourself. Inngest's durable functions model handles the retry graph automatically — write the fan-out logic as a TypeScript function and Inngest persists and retries across steps. The equivalent on Redis + BullMQ works but requires you to explicitly wire idempotency keys, DLQ processing, and backoff configuration.
Fan-out at scale adds one more wrinkle: high-subscriber events. A comment in a channel with 100,000 subscribers means 100,000 notification records need to be written, and writing them all at event time is expensive. According to a systems architecture analysis by Codelit, platforms at scale use a hybrid model: fan-out on write for small subscriber counts (under roughly 1,000) and fan-out on read for high-subscriber events, where the notification service stores the event once and computes per-user relevance at query time. It's the same pattern Twitter uses for high-follower accounts.
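A rough sketch of the hybrid model, assuming in-memory maps in place of real tables. `HybridFeed`, `FAN_OUT_ON_WRITE_LIMIT`, and `storedRowCount` are hypothetical names, and "relevance" is reduced here to a subscription check.

```typescript
const FAN_OUT_ON_WRITE_LIMIT = 1000; // the rough threshold mentioned above

interface FeedEvent {
  id: number;
  channelId: string;
  content: string;
}

class HybridFeed {
  private subscribers = new Map<string, Set<string>>(); // channelId -> userIds
  private inboxes = new Map<string, FeedEvent[]>(); // per-user rows (fan-out on write)
  private sharedEvents: FeedEvent[] = []; // stored once (fan-out on read)

  subscribe(userId: string, channelId: string): void {
    const subs = this.subscribers.get(channelId) ?? new Set<string>();
    subs.add(userId);
    this.subscribers.set(channelId, subs);
  }

  // Small audiences get one row per subscriber; large ones get a single shared row.
  publish(event: FeedEvent): "write" | "read" {
    const subs = this.subscribers.get(event.channelId) ?? new Set<string>();
    if (subs.size >= FAN_OUT_ON_WRITE_LIMIT) {
      this.sharedEvents.push(event);
      return "read";
    }
    subs.forEach(userId => {
      const inbox = this.inboxes.get(userId) ?? [];
      inbox.push(event);
      this.inboxes.set(userId, inbox);
    });
    return "write";
  }

  // Query time: merge the user's own rows with shared events they are subscribed to.
  feedFor(userId: string): FeedEvent[] {
    const own = this.inboxes.get(userId) ?? [];
    const shared = this.sharedEvents.filter(
      e => this.subscribers.get(e.channelId)?.has(userId) ?? false
    );
    return [...own, ...shared].sort((a, b) => a.id - b.id);
  }

  // Total rows written, to make the storage difference visible.
  storedRowCount(): number {
    let rows = this.sharedEvents.length;
    this.inboxes.forEach(inbox => {
      rows += inbox.length;
    });
    return rows;
  }
}
```

The trade-off is that read state for shared events has to live somewhere else (for example a per-user "last seen" marker), since there is no per-user row to stamp with `readAt`.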
Webhooks are a fourth channel worth designing into your fan-out layer early. Enterprise customers want to pipe your notifications into their own Slack channels, Jira boards, or internal tooling. A webhook job in the fan-out queue works exactly like any other channel worker: attempt delivery, retry on failure, dead-letter on exhaustion.
Should You Build Your Notification System or Buy Knock, Courier, or Novu?

Build your own when notification requirements are simple, stable, and unlikely to add per-user channel preferences or digest batching within the next six months.
Buy when multi-channel routing, preference management, and a hosted notification feed consume more engineering time than your actual product features. That threshold arrives sooner than most teams expect.
The honest economics: building Scale-tier notification infrastructure — multi-vendor fallback, per-channel preference management, observability, GDPR/CAN-SPAM compliance, and hosted preference pages — typically requires 6–12 months, a team of 3–5 engineers, and costs exceeding $500,000 in the first year when salaries, infrastructure, and opportunity cost are included. That's not a reason to immediately reach for a vendor. It IS a reason to be honest about what stage you're at.
To be clear, that figure is for Scale-tier (100k+ users, five-plus channels, compliance infrastructure). Most teams evaluating Knock or Courier are at Growth stage, where the calculation is different.
For MVP (under ~1,000 users): call provider APIs directly. Resend or Postmark for email — the Next.js + Resend integration is the lowest-friction path on a Next.js backend. FCM for Android push, APNs for iOS push. A simple notifications table in your database for in-app. No orchestration layer.
At growth stage (1,000–100,000 users), the build-vs-buy question gets real:
| Scenario | Recommended approach |
|---|---|
| Engineering team owns everything, TypeScript stack | Knock: 2–4 hour setup, pre-built React feed components, 10k free/month |
| Design or growth team edits notification templates | Courier: no-code template builder, 50+ provider integrations |
| Self-hosting required, open-source preferred | Novu: MIT-licensed, 20k+ GitHub stars, eliminates per-notification costs |
| MVP, email + in-app only | Build: Resend + simple notifications table |
Knock is the vendor worth evaluating first for a developer-native implementation. According to Knock's implementation comparison, basic multi-channel delivery takes 2–4 hours to set up. It ships production-ready React components for the notification feed — real-time updates, read states, and a preference center modal — which removes a real chunk of UI work. The pricing jump from the free tier to the next plan is steep, so model your expected monthly notification volume before committing.
Courier has a drag-and-drop template builder that lets non-engineers modify notification content without a deployment. If your growth team will own notification campaigns, that matters. If only engineers touch notifications, that surface is overhead.
Novu is the open-source option (MIT-licensed, 20k+ GitHub stars). Self-host on your own infrastructure and you eliminate per-notification costs entirely at scale. You're responsible for uptime, database migrations, and version updates — true ownership cuts both ways.
Notification Preferences: The Architecture Decision Teams Always Defer

Per-user notification preferences control which channels receive which notification categories. Most teams bolt this on after the first wave of "I'm getting too many emails" tickets. By then the preference model is already fighting an established data shape.
The minimum viable preference model has two dimensions: category (comment notifications, billing alerts, security events) and channel (email, push, in-app). A user can turn off push for comment notifications while keeping email on for billing. Store this as a flat object in your user profile — the exact schema matters less than having a consistent lookup path your router calls on every fan-out event.
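One possible shape for that flat object, sketched with hypothetical category names and a `"category:channel"` key format. The default of "enabled unless explicitly turned off" is an assumption; your product may want the opposite for some channels.

```typescript
type Category = "comments" | "billing" | "security";
type Channel = "email" | "push" | "inapp";
type PrefKey = `${Category}:${Channel}`;

// Flat object stored on the user profile; missing keys fall back to the default.
type Preferences = Partial<Record<PrefKey, boolean>>;

const ALL_CHANNELS: Channel[] = ["email", "push", "inapp"];

// Single lookup path the router calls on every fan-out event.
function isEnabled(prefs: Preferences, category: Category, channel: Channel): boolean {
  const key = `${category}:${channel}` as PrefKey;
  return prefs[key] ?? true; // assumption: opted in by default
}

function channelsFor(prefs: Preferences, category: Category): Channel[] {
  return ALL_CHANNELS.filter(channel => isEnabled(prefs, category, channel));
}
```

With `{ "comments:push": false }` stored on the profile, `channelsFor(prefs, "comments")` yields email and in-app, while billing still goes to all three channels.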
The suppression rule that almost always gets overlooked: some notifications must bypass user preferences entirely. Password reset emails, payment failure alerts, security breach notices. A user who opted out of all email still needs to receive "your payment method failed." Add a mandatory flag on notification categories from day one and make the router skip preference lookup for those categories.
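A small sketch of how the mandatory flag might sit in the router, assuming a static category registry. The category names, the `CATEGORIES` map, and the opt-out shape are all illustrative.

```typescript
type Channel = "email" | "push" | "inapp";

interface CategoryConfig {
  mandatory: boolean; // true: user preferences are never consulted
  channels: Channel[]; // channels this category can be delivered on
}

// Hypothetical registry; in practice this would live in config or a table.
const CATEGORIES: Record<string, CategoryConfig> = {
  comments: { mandatory: false, channels: ["email", "push", "inapp"] },
  payment_failed: { mandatory: true, channels: ["email", "inapp"] },
  password_reset: { mandatory: true, channels: ["email"] },
};

// category -> channels the user has turned off
type OptOuts = Record<string, Channel[]>;

function resolveChannels(category: string, optOuts: OptOuts): Channel[] {
  const config = CATEGORIES[category];
  if (!config) throw new Error(`Unknown notification category: ${category}`);

  // Mandatory categories skip the preference lookup entirely.
  if (config.mandatory) return config.channels;

  const disabled = optOuts[category] ?? [];
  return config.channels.filter(channel => disabled.indexOf(channel) === -1);
}
```

Putting the flag on the category rather than on individual notifications means a preference UI can also use it to hide or lock toggles the user is not allowed to change.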
Digest grouping is the preference variant most teams underinvest in. Instead of 15 separate "new comment" push notifications over an hour, a digest batches them: "You have 15 new comments." Building this yourself requires a time window (batch for 15 minutes, then deliver), a deduplication check, and a count-aware template. Knock has digest workflows as a built-in primitive. On BullMQ, it's a delayed job keyed per user per notification category — the job accumulates events until the window expires, then delivers once.
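The three pieces named above (time window, deduplication check, count-aware template) can be sketched without a queue. This is not Knock's or BullMQ's API: `DigestBuffer` is a hypothetical in-memory accumulator, the window is fixed from the first event rather than sliding, and the pluralization in `renderDigest` is deliberately naive.

```typescript
interface DigestMessage {
  userId: string;
  category: string;
  count: number;
  text: string;
}

interface PendingBatch {
  userId: string;
  category: string;
  eventIds: Set<string>; // doubles as the deduplication check
  windowEndsAt: number; // epoch milliseconds
}

// Count-aware template (naive English plural).
function renderDigest(category: string, count: number): string {
  const noun = count === 1 ? category : `${category}s`;
  return `You have ${count} new ${noun}`;
}

class DigestBuffer {
  private pending = new Map<string, PendingBatch>();
  private windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  // Returns false when the event was already counted in the open batch.
  add(userId: string, category: string, eventId: string, nowMs: number): boolean {
    const key = `${userId}:${category}`; // one batch per user per category
    let batch = this.pending.get(key);
    if (!batch) {
      batch = { userId, category, eventIds: new Set<string>(), windowEndsAt: nowMs + this.windowMs };
      this.pending.set(key, batch);
    }
    if (batch.eventIds.has(eventId)) return false;
    batch.eventIds.add(eventId);
    return true;
  }

  // Removes and returns one message for every batch whose window has expired.
  flushDue(nowMs: number): DigestMessage[] {
    const due: DigestMessage[] = [];
    this.pending.forEach((batch, key) => {
      if (batch.windowEndsAt > nowMs) return;
      const count = batch.eventIds.size;
      due.push({
        userId: batch.userId,
        category: batch.category,
        count,
        text: renderDigest(batch.category, count),
      });
      this.pending.delete(key);
    });
    return due;
  }
}
```

In a real deployment the buffer would live in Redis or the database and `flushDue` would be driven by the delayed job, so that a worker restart does not drop an open batch.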
One hard constraint worth knowing: FCM (Android) and APNs (iOS) both limit push payloads to approximately 4 KB. Your push channel worker must truncate or summarize content before delivery. This rarely shows up at MVP scale, but it becomes a real bug in the field once you're sending content-rich notifications.
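A sketch of the truncation step, assuming the limit is measured against the UTF-8 size of the serialized JSON. The `PushPayload` shape is a simplified placeholder rather than the real FCM or APNs envelope, and `fitPushPayload` is a hypothetical helper.

```typescript
const PUSH_PAYLOAD_LIMIT_BYTES = 4096;

// Simplified payload shape for illustration only.
interface PushPayload {
  title: string;
  body: string;
  data?: Record<string, string>;
}

// UTF-8 size of a string, counted per code point.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (const ch of text) {
    const cp = ch.codePointAt(0) ?? 0;
    bytes += cp <= 0x7f ? 1 : cp <= 0x7ff ? 2 : cp <= 0xffff ? 3 : 4;
  }
  return bytes;
}

function payloadSize(payload: PushPayload): number {
  return utf8ByteLength(JSON.stringify(payload));
}

// Shortens body (never title or data) until the serialized payload fits.
function fitPushPayload(payload: PushPayload, limit = PUSH_PAYLOAD_LIMIT_BYTES): PushPayload {
  if (payloadSize(payload) <= limit) return payload;

  // Split by code point so emoji and other surrogate pairs are never cut in half.
  const chars = Array.from(payload.body);
  const withBody = (length: number): PushPayload => ({
    ...payload,
    body: chars.slice(0, length).join("") + "…",
  });

  // Binary search for the longest prefix that fits with an ellipsis appended.
  let lo = 0;
  let hi = chars.length;
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2);
    if (payloadSize(withBody(mid)) <= limit) lo = mid;
    else hi = mid - 1;
  }

  const fitted = withBody(lo);
  if (payloadSize(fitted) > limit) {
    throw new Error("Payload exceeds the limit even with an empty body");
  }
  return fitted;
}
```

Measuring the whole serialized payload rather than just the body matters because titles, deep-link data, and JSON escaping all count toward the same budget.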
What does your current notification architecture look like? If it's still synchronous provider calls inside a route handler and you're past 1,000 active users, the queue migration is the highest-return infrastructure change available. The SaaS MVP stack guide covers the rest of the core infrastructure — auth, database, billing. Notifications are the piece that gets deferred longest and breaks loudest when it does.
