LOOV Data Platform: near real-time without real-time

Building a near-real-time DWH on ClickHouse + Debezium + dbt for a 120-person retail+medtech operation. Why I pushed back on real-time, and where late ownership cost us a five-week rebuild.

"Real-time" is almost never the right ask. The right ask is "fresh enough that the decision is taken here, not at 9am tomorrow." Most of the time those two are five minutes apart in latency and a kilometre apart in cost.

A data platform earns its keep when it stops being a reporting layer and becomes a continuous business signal. At LOOV that flip happened in roughly month seven. The architecture got us there, but the architecture was less interesting than the conversation that produced it.

Why I pushed back on real-time

The first ask from operations came on a Tuesday standup in month two. Anna, our head of analytics, said: "we need real-time dashboards." I asked her to walk me through one.

We picked the obvious one: live order volume on the retail floor. I asked who looks at it, on what cadence, with what action. The answer: the store manager, every 30 minutes, with the option of pulling a second cashier from another role. We drew the loop on the whiteboard. The action takes about 8 minutes to execute; the looking happens every 30 minutes. The decision is gated by manager bandwidth, not data latency.
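The whiteboard reasoning fits in a few lines. The numbers are the ones from the text; the "one tenth of the cadence" margin is an illustration of how we reasoned, not a law.

```python
# Back-of-the-envelope latency budget for the order-volume dashboard.
LOOK_CADENCE_MIN = 30   # the manager checks the dashboard every 30 minutes
ACTION_MIN = 8          # pulling a second cashier takes about 8 minutes

# Any data latency well under one look cadence cannot change the outcome:
# the manager acts on the next look either way. A tenth of the cadence is
# a comfortable (and admittedly arbitrary) margin.
latency_budget_min = LOOK_CADENCE_MIN / 10

print(latency_budget_min)  # 3.0 -- minutes, not milliseconds
```

Three minutes of freshness, not three hundred milliseconds. That gap is the whole argument against the streaming stack.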

We landed on a near-real-time DWH stitched out of operational sources. Updates land within a small number of minutes for the things that need it; everything else runs on hourly or daily windows. The stack is ClickHouse as the analytical store, Debezium for change data capture from Postgres and the CRM, dbt for transforms, and Metabase as the dashboard surface. Running it costs roughly an order of magnitude less than a push-based pipeline would have on the same volume, with about half the operational complexity. The team built it without ever standing up a stream-processing on-call rotation.
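The tiered policy above can be sketched as a freshness SLO table plus one check. Table names and thresholds here are illustrative, not the production config:

```python
from datetime import timedelta

# Each table gets an explicit freshness SLO instead of a blanket
# "real-time" promise. CDC-fed tables get minutes; batch gets hours.
FRESHNESS_SLO = {
    "orders_live":     timedelta(minutes=5),   # Debezium CDC from Postgres
    "crm_contacts":    timedelta(minutes=5),   # Debezium CDC from the CRM
    "finance_rollups": timedelta(hours=1),     # hourly dbt run
    "inventory_daily": timedelta(hours=24),    # daily dbt run
}

def is_stale(table: str, age: timedelta) -> bool:
    """True when a table's last successful load is older than its SLO."""
    return age > FRESHNESS_SLO[table]

print(is_stale("orders_live", timedelta(minutes=7)))    # True
print(is_stale("inventory_daily", timedelta(hours=3)))  # False
```

The useful property is that "how fresh is this?" becomes a lookup, not an argument.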

The second-order benefit was that the data team did not become the bottleneck. With near-real-time you can iterate on a model without coordinating around a streaming job that someone else owns. We added six new tables in week eight without anyone losing sleep.

Customer 360 as a product, not a table

The other big call was treating customer profiles and segmentation as a first-class internal product, not a side table in the warehouse.

A "customer 360 view" written as a SQL view is a snapshot. It will be useful for two months, then drift, then quietly mislead. We built it instead as a domain object with versioned attributes, owned by a small team, with a written contract, a changelog, and dashboards that monitored the entity's health (null rates, attribution coverage, segmentation drift). Schema changes followed a stripped-down semver: minor for additive attributes, major for renames or semantic shifts. Major bumps required a written sign-off from each consumer team.
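The shape-level half of that semver rule is mechanical enough to sketch. Field names are hypothetical, and a real semantic shift (same name, new meaning) still needs a human to flag it:

```python
# Stripped-down semver for the customer 360 contract: additive attributes
# bump minor; removals (including the removal half of a rename) bump major.

def classify_bump(old_fields: set[str], new_fields: set[str]) -> str:
    removed = old_fields - new_fields
    added = new_fields - old_fields
    if removed:        # a rename shows up as a removal plus an addition
        return "major"
    if added:
        return "minor"
    return "patch"     # no shape change: docs, descriptions, defaults

v1 = {"customer_id", "email", "segment"}
print(classify_bump(v1, v1 | {"loyalty_tier"}))                # minor
print(classify_bump(v1, (v1 - {"segment"}) | {"segment_v2"}))  # major
```

A check like this runs in CI against the contract document; the "major" path is what triggers the written sign-off round.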

The cost of that discipline showed up immediately, and so did the payoff. The mobile app team wanted segmentation for personalised loyalty triggers. Instead of writing their own join, they consumed the contract. The internal CRM tools wanted the same segmentation. Same contract, no fork. Within a quarter, "customer" stopped being something different in three places and became one thing in one place.

This is the bit I think is underrated. Most discussions about data platforms are about pipelines and stacks. The structural decision that ends up mattering is whether the canonical entities (customer, order, location, employee) are products or are leftovers from queries.

Where I paid for late ownership

I want to admit one expensive mistake.

Customer 360 we owned and contracted from week one. Order events we did not. We let the order pipeline grow organically: every consumer that wanted order data wrote their own join over Bitrix and the legacy system, and over four months we accumulated five dashboards built on top of slightly different definitions of "order completed." When the company changed how a refund affected order status, three of those dashboards reported the right number, two reported the old number, and the leadership team spent a week not believing any of them.
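The failure mode is easy to reproduce in miniature. These records and definitions are invented for illustration, but the mechanism is exactly what bit us:

```python
# Two hypothetical definitions of "order completed", written before and
# after refunds changed the status model.
orders = [
    {"id": 1, "status": "completed", "refunded": False},
    {"id": 2, "status": "completed", "refunded": True},   # refunded later
    {"id": 3, "status": "refunded",  "refunded": True},   # new status model
]

# Dashboard A, written before the change: status string only.
completed_a = [o for o in orders if o["status"] == "completed"]

# Dashboard B, written after: completed and not refunded.
completed_b = [o for o in orders
               if o["status"] == "completed" and not o["refunded"]]

print(len(completed_a), len(completed_b))  # 2 1
```

Both numbers are "correct" under their own definition, which is precisely why leadership stopped believing either.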

We froze the order pipeline in month nine, sat with the teams that owned it implicitly, and rebuilt the contract from scratch. Five weeks of engineering time, real cost, and the sales team operated with a handwritten weekly extract during the freeze.

The lesson is not "do contracts for everything." It is: at any given moment a small number of entities are about to become canonical whether you plan it or not. Order was already canonical by month four; I noticed in month nine. Four months of drift, paid in five weeks of rebuild. If I were running the same project today I would budget two days a quarter to ask "what entity has secretly become canonical that I am not treating as canonical."

Where I overcorrected

I held the "no real-time" line a little too hard. There were five to ten specific decisions in operations where 30-minute latency was genuinely too slow, and we forced them into the same pattern as the rest. In year two we built a small streaming side-channel for those, and it was the right call. The lesson is not "use streaming" or "do not use streaming". It is "draw the decision loop before you pick the latency budget", and I should have left more room in year one for the few cases where the loop was actually tight.

What this turned into in practice

Today the platform is a near-real-time DWH for the things that need it, batch for the rest. Domain entities (customer, order, location) maintained as products with contracts. Dashboards owned by their consumers and pruned aggressively. Anna is now the data product owner; the engineer-side analyst from the CRM piece moved into her team in year two and started writing the contract document for the location entity in week one.

That is the only metric I trust for this kind of platform: did the next hire start by writing a contract, or by writing a query? If the answer is contract, the platform is doing its job.