CASE 02 / Growth-stage B2B SaaS / 6-week engagement

// ORBITAL (ANON)

TTI from 4.6s to 80ms at p95. 76% server CPU cut at 5k concurrent.

Client type
Growth-stage B2B analytics SaaS
Engagement
2026-02 - 2026-03
Tier
Spot Audit + follow-on, 6 weeks

Context

Orbital (anonymized) is a growth-stage B2B analytics platform. Their product is a dashboard: tenants connect a warehouse, the app materializes a subset of their data, and the UI renders a tree of nested metrics with real-time deltas streamed over WebSocket. Most of the UI and a good chunk of the query layer had been scaffolded with LLM assistance over the prior year. It worked beautifully in the demo environment - a synthetic org with three teams and fifteen users. In production, for their largest tenant, the dashboard took four seconds to become interactive and the WebSocket stream occasionally stalled for up to ninety seconds.

They'd already thrown a sprint at it. They added a Redis cache in front of the heaviest endpoint, upgraded their Postgres instance twice, and doubled the Node workers. TTI moved from 4.2s to 3.8s. Server cost doubled. Their engineering lead, to his credit, stopped adding hardware and called us.

The brief was blunt: either the dashboard becomes interactive in under 150ms at p95 for their top-ten tenants, or they are going to lose two named accounts by end of quarter. We had six weeks.

Scope

Three surfaces, all performance-focused: the client render path (dashboard shell and metric tiles), the query layer behind the dashboard endpoints, and the WebSocket streaming pipeline.

Method

The performance dimension gets heavy weight in our nine-dimension protocol, but a real perf engagement also pulls hard on the observability, concurrency, and data-integrity dimensions - slow queries usually have an index bug or a query-plan bug behind them, not just raw volume.

Tooling

Client scenarios ran under the React DevTools profiler with 4x CPU throttling and injected network latency; the concurrency and WebSocket scenarios used scripted load generation shaped to the largest tenant.

Concrete scenarios

Scenario A - First render of the top-ten tenant dashboard. Cold cache, simulated mid-tier laptop with 4x CPU throttle, 100ms added network latency. We recorded the time from navigation start to the moment the primary metric tiles became scrollable without layout shift. Baseline: 3.4s median, 4.6s p95.

Scenario B - Concurrent tenant switches. Ten simulated users per tenant across five tenants, each switching tenant context every 8 seconds for ten minutes. Measured server CPU, p95 query time, and WebSocket message latency.

Scenario C - WebSocket storm. Forced the backend into a state where 40 metric cards updated every 250ms for a five-minute window. Measured message queue depth, client tab memory, and main-thread blocking time.

Findings

Critical: N+1 query on org → teams → users

The endpoint powering the dashboard header ran an outer query for the org, an inner per-team query for member counts, and a per-user query for the avatar metadata. For small tenants this was six queries. For their largest tenant it was 1,400. Total warehouse time for a single page load: 2.1 seconds on the server alone. The LLM-generated ORM scaffolding had written a loop that looked like eager loading but wasn't - it called the association accessor inside a map, which triggered lazy resolution per iteration.
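
The shape of the bug and the fix can be sketched in plain TypeScript. The model below is hypothetical (the entity names, the in-memory "queries", and the counter are illustrative, not Orbital's ORM); what it shows is why an accessor inside a map issues one query per team while the batched version stays at two queries regardless of tenant size.

```typescript
// Minimal model of the N+1 pattern. A counter stands in for database
// round-trips so the cost of each strategy is directly visible.

let queryCount = 0;

interface Team { id: number; orgId: number }
interface User { id: number; teamId: number }

// Hypothetical tenant: 50 teams x 20 users.
const teams: Team[] = Array.from({ length: 50 }, (_, i) => ({ id: i, orgId: 1 }));
const users: User[] = teams.flatMap((t) =>
  Array.from({ length: 20 }, (_, j) => ({ id: t.id * 100 + j, teamId: t.id })),
);

// Lazy association accessor: looks innocent, fires one query per call.
function membersOf(teamId: number): User[] {
  queryCount++; // one round-trip per invocation
  return users.filter((u) => u.teamId === teamId);
}

// What the generated code effectively did: the accessor inside a map.
function headerNPlusOne(): number[] {
  queryCount++; // outer org/teams query
  return teams.map((t) => membersOf(t.id).length); // +1 query per team
}

// The fix: fetch all members for the team set in one query, group in memory.
function headerBatched(): number[] {
  queryCount++; // teams
  queryCount++; // all members via a single WHERE team_id IN (...) query
  const counts = new Map<number, number>();
  for (const u of users) counts.set(u.teamId, (counts.get(u.teamId) ?? 0) + 1);
  return teams.map((t) => counts.get(t.id) ?? 0);
}

queryCount = 0;
headerNPlusOne();
const lazyQueries = queryCount; // 51 queries for 50 teams

queryCount = 0;
headerBatched();
const batchedQueries = queryCount; // 2 queries at any tenant size
```

Most ORMs express the batched version as an explicit eager-load option on the outer query; the point is that it has to be explicit - the accessor form compiles, runs, and looks fine on a three-team demo org.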

Critical: React context re-render cascade

A single root-level context provider held the entire tenant payload: org, teams, users, metrics, filter state, and a timestamp that updated every 250ms from the WebSocket stream. Every subscriber to that context - more than 200 components - re-rendered four times a second. The React DevTools profiler showed commits taking 180ms on mid-tier hardware. Splitting the context into four smaller providers, with the timestamp isolated to only the components that used it, cut commit time to 12ms.

High: WebSocket fan-out with no backpressure

The backend broadcast every metric update to every subscribed client regardless of which metrics that client was actually rendering. The client-side filter came after deserialization, meaning a tab sitting on a dashboard with four cards still processed deltas for every card in the tenant. With 800 cards on the largest tenant and updates every 250ms, each tab was decoding and discarding roughly 3,200 messages per second during a storm. Server-side topic filtering with per-client subscription manifests reduced that by 94%.
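
A minimal sketch of the manifest approach, with illustrative names (`Client`, `broadcast`; the `sent` array stands in for the actual socket send): the broadcaster checks each client's subscription manifest before serializing or sending, so a tab rendering 4 of 800 cards only ever sees deltas for those 4.

```typescript
// Server-side topic filtering with per-client subscription manifests.

interface Client {
  id: string;
  manifest: Set<string>; // metric IDs this tab is currently rendering
  sent: string[];        // stand-in for ws.send(...)
}

function broadcast(clients: Client[], metricId: string, payload: unknown): number {
  let delivered = 0;
  let message: string | null = null;
  for (const c of clients) {
    if (!c.manifest.has(metricId)) continue; // dropped before serialize/send
    message ??= JSON.stringify({ metricId, payload }); // serialize only if wanted
    c.sent.push(message);
    delivered++;
  }
  return delivered;
}

// A tab showing 4 of 800 cards.
const tab: Client = { id: "t1", manifest: new Set(["m1", "m2", "m3", "m4"]), sent: [] };
const allCards = Array.from({ length: 800 }, (_, i) => `m${i + 1}`);

let received = 0;
for (const id of allCards) received += broadcast([tab], id, { delta: 1 });
// received === 4: the other 796 updates never reach the wire for this tab.
```

The client side of this is a small protocol addition - the tab sends its manifest on mount and on every card add/remove - but the payoff is that filtering cost moves from every client's main thread to one map lookup on the server.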

High: Materialized view refresh blocking the read path

The top-level KPI view was refreshed on a five-minute cron with a REFRESH MATERIALIZED VIEW (non-concurrent). During the refresh - which took 7 to 11 seconds for the largest tenant - the read path blocked. Switching to REFRESH MATERIALIZED VIEW CONCURRENTLY required adding a unique index, which the original generation had silently omitted.
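
The fix in sketch form, with hypothetical view and column names - Postgres requires a unique index covering all rows before `REFRESH MATERIALIZED VIEW CONCURRENTLY` will run, which is exactly the index the generated migration omitted:

```sql
-- Hypothetical names. CONCURRENTLY diffs old and new contents via a
-- unique index, so one must exist on the materialized view.
CREATE UNIQUE INDEX IF NOT EXISTS kpi_rollup_tenant_metric_uq
  ON kpi_rollup (tenant_id, metric_id);

-- Readers no longer block during the 7-11s refresh window.
REFRESH MATERIALIZED VIEW CONCURRENTLY kpi_rollup;
```

The concurrent refresh is slower than the plain one, but it takes only a brief lock at swap time instead of locking out reads for the duration.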

Medium: Bundle bloat from a redundant date library

The bundle shipped both moment and date-fns. Only three call sites used moment. Replacing them with date-fns equivalents and tree-shaking removed 78 KB gzipped from the critical JS bundle.
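
For illustration, the call sites were of the shape `moment(ts).format("YYYY-MM-DD")`. The replacement below is a dependency-free stand-in with a hypothetical name (`formatISODate`; date-fns's `format(date, "yyyy-MM-dd")` covers the same job) - the point is that three call sites never justified shipping all of moment.

```typescript
// Dependency-free equivalent of the three moment("YYYY-MM-DD") call sites.
// Note: this sketch normalizes to UTC, whereas moment formats in local time;
// pick one deliberately when making the swap.
function formatISODate(ts: number | Date): string {
  const d = new Date(ts);
  const pad = (n: number) => String(n).padStart(2, "0");
  return `${d.getUTCFullYear()}-${pad(d.getUTCMonth() + 1)}-${pad(d.getUTCDate())}`;
}

// formatISODate(Date.UTC(2026, 1, 3)) === "2026-02-03"
```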

Outcome

Headline: TTI dropped from 4.6s to 80ms at p95 for the top-ten tenant profile, measured under the Scenario A load profile after all fixes were deployed. Secondary metrics:

- Server CPU down 76% at 5k concurrent (Scenario B)
- Client-side WebSocket message volume down 94% during storms (Scenario C)
- React commit time down from 180ms to 12ms on mid-tier hardware
- Critical JS bundle 78 KB gzipped smaller
- KPI view refresh no longer blocks the read path

Takeaway

Performance problems in AI-generated SaaS cluster around two patterns: ORM scaffolding that hides expensive work behind innocuous-looking accessors, and React/state scaffolding that over-broadcasts updates because the LLM doesn't model the cost of a re-render. Both are invisible in the demo environment and catastrophic at scale. The fix is almost never "more hardware". It's a query-by-query, commit-by-commit profile with real tenant shapes under real CPU throttle. If your dashboard is slow for big customers and fine for small ones, you already know which direction to look.

Want the same?

Book a 20-min call. You explain the surface. I explain which tier fits.

Book a 20-min call →
Next case
Toniq → 0 critical in SOC2
All cases
Back to index →