Tech Stack · 12 June 2026 · 10 min read

Building Usage-Based Billing with Stripe Metering in 2026

Production-grade usage-based billing with Stripe's Billing Meters API: how meters work, idempotency gotchas, late-arriving events, and when Lago is the smarter choice.

Usage-based billing is the pricing model that almost makes sense until you actually build it.

Flat subscriptions are easy to reason about. Usage is not. You're metering something that doesn't exist until runtime, attributing it to the right customer, surviving API rate limits without dropping events, and then explaining a $340 invoice to someone who budgeted $60. The engineering surface is larger than it looks, and the failure modes are mostly invisible until billing day.

Stripe's Billing Meters API — the current one, not the legacy usage records endpoint — is the production starting point for most SaaS and AI products. This post covers what actually matters: how the API works, what it gets right, where it will surprise you, and when to reach for Lago or OpenMeter instead.

One thing to settle up front: if you're reading a tutorial that mentions usage_type: 'metered' on a Price without a backing Meter object, that API is gone. Since version 2025-03-31.basil, every metered price requires a Meter. Old tutorials won't work in production. Migrate forward.

What Does the Stripe Billing Meters API Actually Do?

A Stripe Billing Meter is a named aggregation rule that accumulates usage events into a billable quantity per subscription period.

You define a Meter once: give it an event name (api_call, tokens_processed, whatever maps to your billable unit), specify an aggregation method (sum or count), and optionally attach dimensions for routing events to different rate tiers. Then you report Meter Events — lightweight POST requests to /v1/billing/meter_events — as usage happens in your system. Stripe accumulates them. At invoice time, the aggregated quantity drives the line item.

Three objects in practice: Meter, Price (metered, backed by the Meter), Subscription. Events flow in independently of subscription state. Stripe closes the period, finalizes usage, generates the invoice.
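
To make the object shapes concrete, here is a minimal sketch of the two request bodies involved: one to create the Meter, one to report a Meter Event. The field names follow Stripe's Meter and Meter Event objects as I understand them. The `tokens_processed` event name, the `stripe_customer_id` and `value` payload keys, and both helper function names are assumptions for illustration. The builders are pure functions so you can inspect their output before anything touches the API.

```typescript
// Sketch only. With stripe-node you would pass these bodies to
// stripe.billing.meters.create(...) and stripe.billing.meterEvents.create(...).

type Aggregation = "sum" | "count";

interface MeterParams {
  display_name: string;
  event_name: string;
  default_aggregation: { formula: Aggregation };
  customer_mapping: { type: "by_id"; event_payload_key: string };
  value_settings: { event_payload_key: string };
}

interface MeterEventParams {
  event_name: string;
  identifier: string;
  timestamp: number; // Unix seconds
  payload: Record<string, string>; // payload values are sent as strings
}

// Hypothetical helper: everything except display_name is fixed once created.
function buildMeterParams(eventName: string, formula: Aggregation): MeterParams {
  return {
    display_name: `Usage: ${eventName}`,
    event_name: eventName,
    default_aggregation: { formula },
    customer_mapping: { type: "by_id", event_payload_key: "stripe_customer_id" },
    value_settings: { event_payload_key: "value" },
  };
}

// Hypothetical helper: one usage report for one customer.
function buildMeterEvent(
  eventName: string,
  customerId: string,
  quantity: number,
  identifier: string,
  occurredAt: Date,
): MeterEventParams {
  return {
    event_name: eventName,
    identifier,
    timestamp: Math.floor(occurredAt.getTime() / 1000),
    payload: { stripe_customer_id: customerId, value: String(quantity) },
  };
}
```

Note that the event carries only the customer ID, never a subscription ID. That is what lets usage flow in independently of subscription state.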

One real improvement over the old usage_records model: Meters exist outside the subscription object. You can report events for a customer before they have an active subscription. A single Meter tracks usage across multiple customers without per-subscription wiring. This matters for trial metering or products where billing cadence doesn't align neatly with subscription activation.

The advanced billing API (currently at 2026-05-27.preview) introduces Rate Cards — separate service intervals from billing intervals, up to 500 rates per card — for products with complex tiered pricing or annual billing with monthly usage accumulation. Unless you're building enterprise pricing tiers from day one, start with the standard Meters API and migrate to Rate Cards when you actually need them.

The Meter You Create Is Mostly Permanent

This is the detail that creates the most pain in production. Once you create a Meter, you cannot change the event name, aggregation method, or customer-mapping key. Those fields are immutable. If you later decide api_call should aggregate as count rather than sum, or the event name needs to change because your data pipeline evolved, you create a new Meter and migrate active customers.

The value_settings.event_payload_key — the field on the event payload that Stripe reads the numeric value from — is also fixed at creation. If your event payload structure changes, you're either migrating or working around it with a thin adapter layer.

Run the Meter in test mode first. Fire realistic event payloads against it, preview the resulting invoices, and verify the aggregation handles edge cases: zero-value events, very large values, events with missing payload keys. I've seen teams skip this step and ship a Meter that sums a binary presence flag rather than actual token counts. Every invoice is wrong from day one, and the fix is a migration script touching active subscribers during a live billing period.
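
The presence-flag mistake is easy to reproduce locally before you even open the test dashboard. The sketch below is a stand-in for the aggregation step, not Stripe's implementation: the `aggregate` function, the `tokens` and `used` payload keys, and the choice to throw on a missing key are all assumptions. How Stripe itself treats a missing key is exactly what the test-mode run should tell you.

```typescript
type Formula = "sum" | "count";
type Payload = Record<string, string | undefined>;

// Local simulation of a meter: count events, or sum one payload key.
function aggregate(events: Payload[], formula: Formula, valueKey: string): number {
  if (formula === "count") return events.length;
  let total = 0;
  for (const event of events) {
    const raw = event[valueKey];
    const parsed = raw === undefined || raw.trim() === "" ? NaN : Number(raw);
    if (!Number.isFinite(parsed)) {
      throw new Error(`missing or non-numeric payload key "${valueKey}"`);
    }
    total += parsed;
  }
  return total;
}

// Three requests: one normal, one zero-value, one very large.
const sampleEvents: Payload[] = [
  { tokens: "1200", used: "1" },
  { tokens: "0", used: "1" },
  { tokens: "90000", used: "1" },
];

const billedCorrectly = aggregate(sampleEvents, "sum", "tokens"); // 91200
const billedOnFlag = aggregate(sampleEvents, "sum", "used"); // 3
```

Same events, same formula, a different payload key: the invoice quantity drops from 91,200 to 3. Nothing errors, which is why this survives until someone reads an invoice.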

Think carefully before you ship the Meter config. It's cheap to change in test mode, expensive after.

How Do You Report Events at Scale Without Losing Revenue?

Use idempotent identifiers, pre-aggregate where volume allows, and switch to the v2 Meter Event Stream above a thousand events per second.

The standard /v1/billing/meter_events endpoint handles 1,000 API calls per second per Stripe account. That covers most SaaS products. For AI products metering individual tokens — where 46% of IT leaders cite unpredictable pricing as a barrier to AI adoption — you'll want the v2 Meter Event Stream. It uses short-lived authentication tokens (15-minute validity, refresh before timeout) and supports up to 10,000 events per second, with enterprise volumes up to 200,000 available on request.
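
The 15-minute token lifetime means your sender needs a small cache that refreshes before expiry rather than after a failure. A minimal sketch follows. The `CachedSession` shape is my own, not Stripe's response object, and `fetchSession` is a hypothetical function you would implement by creating a meter event session and parsing its expiry. The one-minute safety margin is also an assumption.

```typescript
// Our own cache entry, not the Stripe response shape.
interface CachedSession {
  token: string;
  expiresAtMs: number; // expiry as epoch milliseconds
}

const REFRESH_MARGIN_MS = 60 * 1000; // assumption: refresh one minute early

// True when there is no session yet or it is inside the safety margin.
function needsRefresh(session: CachedSession | null, nowMs: number): boolean {
  if (session === null) return true;
  return nowMs >= session.expiresAtMs - REFRESH_MARGIN_MS;
}

// fetchSession is hypothetical: it should request a new short-lived token.
function createTokenCache(fetchSession: () => Promise<CachedSession>) {
  let current: CachedSession | null = null;
  return async function getToken(nowMs: number): Promise<string> {
    let session = current;
    if (session === null || needsRefresh(session, nowMs)) {
      session = await fetchSession();
      current = session;
    }
    return session.token;
  };
}
```

Passing the clock in as `nowMs` keeps the refresh rule testable without waiting fifteen minutes.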

The more pressing issue for most teams isn't throughput. It's idempotency.

Each Meter Event accepts an identifier field. Stripe deduplicates on this within a rolling ~24-hour window. If you don't supply one, Stripe auto-generates it per request — meaning a network timeout that triggers a client retry creates a duplicate billable event. Two events, one action.

Have you seen this in production? A POST to Stripe times out after 29 seconds. Your client retries. Now you have two api_call events for one request. The customer's invoice is wrong. The fix is deterministic identifiers: use the underlying request ID, transaction ID, or a hash of the event payload. Something that's the same on retry.
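
A deterministic identifier can be as plain as the event name joined to the ID of the action that caused it. This is a sketch under assumptions: `meterEventIdentifier` is a hypothetical helper, and `requestId` is whatever stable ID your system already assigns to the underlying action. Check the identifier length limit in Stripe's reference before using long IDs.

```typescript
// Same inputs always produce the same identifier, so a retry of the same
// action reuses it and the duplicate can be dropped on Stripe's side.
function meterEventIdentifier(eventName: string, requestId: string): string {
  if (requestId.length === 0) {
    throw new Error("a stable requestId is required");
  }
  return `${eventName}:${requestId}`;
}

// First attempt and retry of the same action build the same identifier.
const firstAttempt = meterEventIdentifier("api_call", "req_8f3a");
const retryAttempt = meterEventIdentifier("api_call", "req_8f3a");
// A different action gets a different identifier.
const otherAction = meterEventIdentifier("api_call", "req_9b1c");
```

The thing to avoid is anything generated at send time: a fresh UUID, a timestamp, an attempt counter. Those differ between the first attempt and the retry, which is the bug.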

On pre-aggregation: for high-frequency atomic events, don't send one Stripe event per action. Write raw events to a durable queue — Upstash Redis, Inngest, a Supabase table — and flush them as a single aggregated event per customer every 30–60 seconds. This reduces API call volume dramatically without meaningfully affecting billing precision. The timestamp on the flush event is the flush timestamp, not the original action timestamp. That's usually fine. If you need sub-minute precision for billing, you're building something atypical.
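
Here is one way the flush step could look: collapse a window of raw events into a single event per customer, and derive the identifier from the window so a retried flush deduplicates. The `RawUsage` and `FlushEvent` shapes and the `aggregateWindow` name are assumptions, and the queue itself is left out.

```typescript
interface RawUsage {
  customerId: string;
  quantity: number;
}

interface FlushEvent {
  customerId: string;
  total: number;
  identifier: string;
  timestamp: number; // flush time in Unix seconds, not the action time
}

// Collapse one window of raw events into one event per customer.
function aggregateWindow(
  raw: RawUsage[],
  eventName: string,
  windowStart: number,
  flushTime: number,
): FlushEvent[] {
  const totals = new Map<string, number>();
  for (const item of raw) {
    totals.set(item.customerId, (totals.get(item.customerId) ?? 0) + item.quantity);
  }
  const out: FlushEvent[] = [];
  totals.forEach((total, customerId) => {
    out.push({
      customerId,
      total,
      // Stable per customer and window, so a retried flush is a duplicate.
      identifier: `${eventName}:${customerId}:${windowStart}`,
      timestamp: flushTime,
    });
  });
  return out;
}
```

The identifier only protects you if the window's contents are frozen before the first send. If a retry re-reads the queue and picks up extra events, the second attempt carries a different total under the same identifier, and the difference is lost.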

Late-arriving events. The event timestamp must be within the past 35 calendar days and no more than 5 minutes in the future (to accommodate clock drift). Events outside these bounds return timestamp_too_far_in_past or timestamp_in_future errors. For events that arrive after their billing period closes — a background job that completed at 11:58 PM on the last day of the month but whose Stripe POST reached the API at 12:03 AM — Stripe processes asynchronously. That event may land in the next period. Define a policy: accept late events up to N minutes after period close, then reroute or discard. Don't leave it undefined and wonder why customers see charges in unexpected periods.
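
Both checks are cheap to run in your own flush job before Stripe ever sees the event. The sketch below mirrors the timestamp bounds quoted above and adds a grace-period policy. Treating "35 calendar days" as 35 × 24 hours is an approximation, and the function names, the decision labels, and the `graceSeconds` parameter are assumptions.

```typescript
const MAX_PAST_SECONDS = 35 * 24 * 60 * 60; // approximates "35 calendar days"
const MAX_FUTURE_SECONDS = 5 * 60;

type TimestampCheck = "ok" | "timestamp_too_far_in_past" | "timestamp_in_future";

// Pre-flight check using the bounds described in the text.
function checkTimestamp(eventTs: number, now: number): TimestampCheck {
  if (eventTs < now - MAX_PAST_SECONDS) return "timestamp_too_far_in_past";
  if (eventTs > now + MAX_FUTURE_SECONDS) return "timestamp_in_future";
  return "ok";
}

type LateDecision = "send" | "send_late" | "reroute";

// Policy for an event whose period may already have closed.
function lateEventPolicy(
  eventTs: number,
  periodEnd: number,
  now: number,
  graceSeconds: number,
): LateDecision {
  // Period still open, or the event belongs to the new period anyway.
  if (now < periodEnd || eventTs >= periodEnd) return "send";
  // Event belongs to a closed period: accept only inside the grace window.
  return now - periodEnd <= graceSeconds ? "send_late" : "reroute";
}
```

What "reroute" means is a product decision: stamp the event into the new period, hold it for manual review, or write it off. The point is that the code makes the choice, not the arrival time.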

One more gotcha: many Meter Event errors are reported asynchronously via webhooks, not in the synchronous API response. A 200 from the Stripe API doesn't mean the event was successfully metered. Wire up the v1.billing.meter.error_report_triggered webhook event early, and v1.billing.meter.no_meter_found alongside it. Silently dropped events are missing revenue.
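
The routing side of that webhook handler is small. This sketch deliberately looks only at the event's `type` and matches by suffix, because I don't want to assert the exact prefix or payload shape your account delivers. The `routeMeterWebhook` name and the action labels are assumptions, and signature verification is left out.

```typescript
interface WebhookEvent {
  type: string;
}

type MeterWebhookAction = "alert_and_investigate" | "ignore";

// Suffixes of the meter error event types; matched by suffix on purpose.
const METER_ERROR_SUFFIXES = [
  "billing.meter.error_report_triggered",
  "billing.meter.no_meter_found",
];

// Decide what to do with an incoming, already-verified webhook event.
function routeMeterWebhook(event: WebhookEvent): MeterWebhookAction {
  for (const suffix of METER_ERROR_SUFFIXES) {
    if (event.type.endsWith(suffix)) return "alert_and_investigate";
  }
  return "ignore";
}
```

Whatever "alert" means in your stack, make it loud. An error report is the only signal you get that accepted events never reached an invoice.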

When Should You Look Beyond Stripe for Usage Billing?

Look beyond Stripe when your model requires prepaid credit wallets, enterprise draw-down contracts, or genuinely complex multi-meter hybrid invoicing.

The Supabase + Stripe integration pattern covers the full subscription lifecycle for a standard per-seat-plus-usage B2B SaaS product. For the majority of products, that's enough.

When it isn't:

| Scenario | Why Stripe falls short | Consider |
|---|---|---|
| Prepaid credit wallets | No native credit balance that decrements on metered usage | Lago, Flexprice |
| Enterprise commit + overage | Draw-down contracts require significant custom invoice logic | Lago, Metronome |
| Multi-meter hybrid invoices | Mixing seat + storage + API usage on one invoice is possible but brittle | Lago, OpenMeter |
| Audit-grade usage dispute resolution | Stripe's event log isn't designed for per-event customer disputes | OpenMeter + Stripe |

Lago is the most widely deployed open-source option. It processes up to 15,000 events per second, has prepaid credit systems built in, and is payment-agnostic — you keep Stripe as the processor, Lago manages billing logic. That's meaningful extra infrastructure. For a product whose pricing structurally requires credit burndown or multi-tier commit pricing, it's the right call.

A useful counter-example: BookBed runs on Flutter, Firebase, and Stripe, with standard flat subscriptions at €9 per month per tenant and no metering layer. The flat per-tenant model matched how customers thought about value. Adding metering complexity there would have been engineering cost with no pricing benefit. The question isn't "should I use Stripe Meters?" The question is "does usage align better with customer value than a flat fee does?"

If the answer is yes, use Meters. If the answer is "maybe", start flat and add metering later when you have evidence.

Building the Customer-Facing Usage Dashboard

The correct data source for a usage dashboard is Meter Event Summaries, not invoice previews.

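```typescript
// billingPeriodStart and billingPeriodEnd are Unix timestamps in seconds.
const summary = await stripe.billing.meters.listEventSummaries(
  meterId,
  {
    customer: customerId,
    start_time: billingPeriodStart,
    end_time: billingPeriodEnd,
  }
);
// No summary for the window means no recorded usage, so fall back to 0.
const used = summary.data[0]?.aggregated_value ?? 0;
```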

The aggregated_value reflects Stripe's aggregation with the expected eventual-consistency lag. Polling every few minutes in a background job is sufficient for billing display. Don't hammer this endpoint on every page load.

Actually — let me back up. If your customers need real-time usage visibility, which is common for token-based AI products, Stripe summaries alone won't work. The lag between event receipt and summary update can be minutes. You need a parallel live counter in your own system — an Upstash Redis INCR per customer per period, or a running total in a Supabase row — and use Stripe summaries for billing verification, not real-time display.

What you show the customer in the moment and what Stripe invoices can diverge by minutes. That's acceptable. Showing 0 when the invoice says 10,000 units is not.
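
A sketch of the parallel counter, with an in-memory `Map` standing in for Redis so it runs anywhere. The class name, the key format, and the rule of displaying the larger of the two numbers are all assumptions. The display rule is one possible choice, picked so the dashboard never shows less than what Stripe has already aggregated.

```typescript
// In-memory stand-in for a Redis counter keyed per customer and period.
class LiveUsageCounter {
  private counts = new Map<string, number>();

  private key(customerId: string, periodStart: number): string {
    return `usage:${customerId}:${periodStart}`;
  }

  // Mirrors the semantics of INCRBY: add and return the new total.
  increment(customerId: string, periodStart: number, by: number): number {
    const k = this.key(customerId, periodStart);
    const next = (this.counts.get(k) ?? 0) + by;
    this.counts.set(k, next);
    return next;
  }

  get(customerId: string, periodStart: number): number {
    return this.counts.get(this.key(customerId, periodStart)) ?? 0;
  }
}

// What the dashboard shows: never less than Stripe's aggregated value.
function displayUsage(liveCount: number, stripeAggregated: number): number {
  return Math.max(liveCount, stripeAggregated);
}

// Verification: flag drift beyond a tolerance for investigation.
function driftExceeds(liveCount: number, stripeAggregated: number, tolerance: number): boolean {
  return Math.abs(liveCount - stripeAggregated) > tolerance;
}
```

Run the drift check from the same background job that polls the summaries. A small gap is expected lag. A gap that keeps growing means events are being dropped on one side.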

Dimension Cardinality and the Limits Nobody Reads

Dimensions let you route events to different pricing tiers based on payload properties — billing GPT-4 calls at one rate, GPT-3.5 calls at another, using a model dimension. Useful for AI products. The limits are strict: 10,000 unique dimension value combinations per meter per hour, and 100 unique combinations per customer per meter across all time.

For most products these limits are invisible. For a platform with many distinct model variants and high traffic, they bite. Design dimension schemas conservatively — use model tier groupings rather than exact version strings, and verify your expected cardinality against these limits before launch. Hitting the limit causes events to fail with meter_event_dimension_count_too_high. That error comes back asynchronously. See the webhook note above.
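
You can check cardinality against a sample of real traffic before launch. The sketch below groups exact model strings into tiers and counts unique dimension combinations per customer. The tier names, the prefix rules in `modelTier`, and the event shape are assumptions. The limit constant is the per-customer figure quoted above.

```typescript
const MAX_COMBINATIONS_PER_CUSTOMER = 100; // per-customer limit quoted in the text

interface DimensionedEvent {
  customerId: string;
  dimensions: Record<string, string>;
}

// Hypothetical grouping: collapse exact version strings into a few tiers.
function modelTier(model: string): string {
  if (model.startsWith("gpt-4")) return "premium";
  if (model.startsWith("gpt-3.5")) return "standard";
  return "other";
}

// Count unique dimension combinations seen for each customer.
function uniqueCombinationsPerCustomer(events: DimensionedEvent[]): Map<string, number> {
  const seen = new Map<string, Set<string>>();
  for (const event of events) {
    const combo = Object.keys(event.dimensions)
      .sort()
      .map((k) => `${k}=${event.dimensions[k]}`)
      .join("&");
    let combos = seen.get(event.customerId);
    if (combos === undefined) {
      combos = new Set<string>();
      seen.set(event.customerId, combos);
    }
    combos.add(combo);
  }
  const counts = new Map<string, number>();
  seen.forEach((combos, customerId) => counts.set(customerId, combos.size));
  return counts;
}

// Customers whose sample already exceeds the limit.
function customersOverLimit(counts: Map<string, number>): string[] {
  const over: string[] = [];
  counts.forEach((n, customerId) => {
    if (n > MAX_COMBINATIONS_PER_CUSTOMER) over.push(customerId);
  });
  return over;
}
```

Run it twice on the same sample, once with raw version strings and once through `modelTier`, and compare the counts. The second number is the one that has to stay under the limit for the lifetime of the customer.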

The Architecture That Holds Under Load

Whether you use Stripe Meters directly or layer Lago on top, the durable architecture follows the same pattern:

  1. Application writes raw events to an append-only durable queue on the hot path — never send meter events synchronously inside a user request.
  2. Background job deduplicates, pre-aggregates, and fires to Stripe with deterministic identifier values.
  3. Stripe closes the period, generates the invoice.
  4. Customer dashboard polls Meter Event Summaries; real-time UI reads your own counter.
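
Step 2 is where retries and identifiers meet, so here is a sketch of that interaction. `FakeMeter` is a stand-in that deduplicates on identifier the way the text describes Stripe doing. It is not Stripe's implementation. The `send` function is synchronous to keep the example short, while a real sender would be async. All names are assumptions.

```typescript
interface OutboundEvent {
  identifier: string;
  customerId: string;
  value: number;
}

type Send = (event: OutboundEvent) => void; // throws on timeout

// Stand-in for the receiving side: drops events it has already seen.
class FakeMeter {
  private seen = new Set<string>();
  total = 0;

  receive(event: OutboundEvent): void {
    if (this.seen.has(event.identifier)) return; // duplicate, ignored
    this.seen.add(event.identifier);
    this.total += event.value;
  }
}

// Retry the same event, with the same identifier, up to maxAttempts times.
// Returns the attempt number that succeeded.
function sendWithRetry(event: OutboundEvent, send: Send, maxAttempts: number): number {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      send(event);
      return attempt;
    } catch (err) {
      if (attempt === maxAttempts) throw err;
    }
  }
  throw new Error("maxAttempts must be at least 1");
}
```

The case worth testing is the nasty one: the receiver records the event, then the response times out. The client retries, the identifier matches, and the total stays correct. With a per-attempt identifier, the same sequence bills the customer twice.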

The queue is load-bearing. A full Stripe integration — subscriptions, webhooks, meter events, invoice handling — is one surface worth building carefully once rather than patching under production pressure.

Saturday at midnight is when this matters. Traffic spike. The flush job starts timing out. If your identifiers are deterministic, Stripe silently deduplicates on retry. If they're not, you're debugging a billing discrepancy Monday morning with a customer who's noticed.

Configure the Meter in test mode. Fire production-shaped events against it. Preview the invoice output until it's right. The billing system should be invisible once running. Correct, invisible, boring. That's the goal.

If you haven't done this yet on your current project: spend the thirty minutes in Stripe's test dashboard before writing a line of production billing code. The Meter config is the decision you can't easily reverse.

Dusko Licanin

Full-Stack Developer · Banja Luka, Bosnia

Full-stack developer shipping SaaS MVPs, web apps, and mobile apps 2× faster than agencies using AI-augmented workflows. Live portfolio: BookBed, Callidus, Pizzeria Bestek.

Frequently Asked Questions

How does Stripe handle idempotency for meter events?

Every meter event accepts an `identifier` field; Stripe deduplicates events sharing the same identifier within a rolling 24-hour window, so always supply a deterministic value derived from your underlying action. If you omit it, Stripe auto-generates one per request — meaning a network timeout that causes a client retry creates a duplicate billable event. Use the underlying request ID, transaction ID, or a hash of the event payload. After the 24-hour window expires, you can no longer cancel an event; the only correction is sending a new event with a negative value to offset it, and if net usage goes below zero, Stripe reports the line item as 0.

What happens to late-arriving Stripe meter events?

Late-arriving events that miss their billing period typically land in the next period's invoice, since Stripe closes periods before invoice finalization and processes events asynchronously. The timestamp must be within the past 35 calendar days and no more than 5 minutes in the future — events outside those bounds return explicit error codes. Define a cutoff policy: after the billing period closes, accept events up to N minutes late, then reroute or discard, rather than leaving the behavior undefined and wondering why customers see charges in unexpected periods.

Can I change a Stripe billing meter after creating it?

No — a Stripe billing meter's event name, aggregation method, and customer-mapping key are immutable once set, and changing them requires creating a new meter and migrating all active customers. This makes thorough pre-production testing essential: run the meter in Stripe's test mode, fire realistic event payloads matching your production schema, and verify the invoice output before touching live subscriptions. The 30 minutes spent verifying meter behavior in test mode is far cheaper than a migration script running during an active billing period.

What's the difference between Stripe Billing and Lago for usage-based billing?

Stripe Billing Meters handles ingestion, aggregation, and invoicing in one system with no extra infrastructure; Lago is an open-source billing layer that adds prepaid credit wallets, enterprise draw-down contracts, and complex pricing logic not natively available in Stripe. Use Stripe alone for pay-as-you-go or simple tiered pricing. [Lago](https://www.getlago.com) is the right call when your model structurally requires credit burndown, commit-based enterprise pricing, or invoice logic that Stripe can't express without significant custom code — at that point the infrastructure overhead is justified.

What rate limits apply to Stripe meter event reporting?

The standard `/v1/billing/meter_events` endpoint allows 1,000 API calls per second per Stripe account; the Meter Event Stream API (v2) supports up to 10,000 events per second using short-lived authentication tokens that expire every 15 minutes, with enterprise volumes up to 200,000 events per second available on request. For most SaaS products, the 1,000/second limit is never a constraint. Pre-aggregating usage — flushing batched counts every 30–60 seconds rather than one event per atomic action — keeps you within limits and reduces API call volume regardless of which endpoint you use.