OLTP vs OLAP data modeling: two jobs, two shapes
Transactional systems keep the product correct and fast at the row level. Warehouses explain the business across huge scans. Mixing the two jobs in one head—or one undifferentiated schema—hurts both.
Think of two different systems with two different bosses:
- OLTP answers “can this user check out right now?”
- OLAP answers “how did revenue move last quarter?”
Same company, different latency, different access patterns, different modeling trade-offs. Pretending they are the same database personality is how you get neither fast writes nor cheap scans—like running a finance close-out query on the same Aurora writer your checkout uses because “it is all SQL.”
1. OLTP — the system that keeps the product alive
OLTP (online transaction processing) is what sits behind live traffic: signups, carts, payments, inventory reservations.
Typical engines people mean: PostgreSQL, MySQL, Cockroach-style distributed SQL, Spanner-class systems.
Characteristics that should drive modeling:
- Low latency targets—often single-digit milliseconds for the hot path.
- High concurrency—many small sessions doing short work at once.
- Small result sets—“give me this user,” not “aggregate five years of clicks.”
Modeling habits:
- Normalize to reduce update anomalies; enforce constraints.
- Index for the queries you actually run in production—not every hypothetical join.
- Row-oriented storage (conceptually): fetching all columns of one row is the happy path.
Where it hurts: giant ad-hoc aggregations across billions of rows. You can force it, but you are using the wrong hammer and everyone queues behind you.
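Those habits can be sketched in a few lines. This is a hypothetical checkout-ish schema using Python's stdlib sqlite3 as a stand-in engine: normalized tables, engine-enforced constraints, and one index shaped for the query the hot path actually runs.

```python
import sqlite3

# Hypothetical OLTP schema: normalized, with constraints the engine enforces.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE      -- a constraint, not app-side hope
    )""")
conn.execute("""
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        user_id     INTEGER NOT NULL REFERENCES users(id),
        status      TEXT NOT NULL CHECK (status IN ('open', 'paid', 'shipped')),
        total_cents INTEGER NOT NULL
    )""")
# Index the production query ("open orders for this user"), not every
# hypothetical join. SQLite supports partial indexes like Postgres does.
conn.execute(
    "CREATE INDEX idx_orders_user_open ON orders(user_id) WHERE status = 'open'"
)

conn.execute("INSERT INTO users VALUES (1, 'a@example.com')")
conn.execute("INSERT INTO orders VALUES (10, 1, 'open', 1999)")

# Hot path: point lookup, tiny result set, all columns of one row.
row = conn.execute(
    "SELECT id, total_cents FROM orders WHERE user_id = ? AND status = 'open'",
    (1,),
).fetchone()
print(row)  # (10, 1999)
```

Note the shape of the win: the index matches one real filter, and the query returns one row, not an aggregate.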
Most teams' first relief valve is not a warehouse—it is read replicas with guardrails so BI does not steal the buffer cache from UPDATE inventory. Same idea as back pressure, but for who gets to touch rows:
Separate reads & writes
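The routing rule behind "separate reads & writes" is small enough to sketch. This is an illustrative stand-in, not a real driver feature: in production the decision usually lives in a proxy/pooler or the app's DB layer, with replica-lag checks layered on top.

```python
# Illustrative only: route read-only statements to a replica, everything
# else to the primary. Real setups must also handle replica lag (stale
# reads) and read-your-writes requirements.
def pick_backend(sql: str) -> str:
    """Return which backend a statement should hit."""
    first_word = sql.lstrip().split(None, 1)[0].upper()
    return "replica" if first_word == "SELECT" else "primary"

print(pick_backend("SELECT * FROM users WHERE id = 1"))    # replica
print(pick_backend("UPDATE inventory SET qty = qty - 1"))  # primary
```

The guardrail part is what matters: the BI team's five-year scan lands on a connection pool that cannot evict the checkout path's cache.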
2. OLAP — the system that explains what is happening
OLAP (online analytical processing) powers reporting, dashboards, experimentation summaries, finance close-outs.
Examples: Snowflake, BigQuery, ClickHouse, Redshift-class warehouses.
Characteristics:
- Large scans and aggregations across millions or billions of rows.
- Latency tolerance measured in seconds (sometimes more) if the answer is trustworthy and cheap.
- Column-oriented storage and heavy compression: read only the columns the query touches.
Modeling habits:
- Denormalized star or snowflake schemas—facts plus dimensions on purpose.
- Partition by time (and cluster/sort keys that match real filters).
- Prefer batch or micro-batch loads over trickling row-by-row updates into the fact table.
Where it hurts: pretending it is a general-purpose OLTP replacement—mutable row churn, many small point updates, and interactive session locking are not what warehouses optimize for.
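Why column orientation pays off for scans can be shown with a toy. In this sketch (a dict of lists standing in for a real column store), a GROUP BY aggregate touches only the two columns it needs; the other columns are never read.

```python
# Toy column store: each column is its own array. This is illustrative
# only—real engines add compression, encoding, and vectorized execution.
orders = {
    "order_id":    [1, 2, 3, 4],
    "region":      ["EU", "US", "EU", "US"],
    "total_cents": [1999, 500, 2500, 1000],
}

# Conceptually: SELECT region, SUM(total_cents) GROUP BY region.
# Only two columns are scanned; order_id stays untouched on disk.
revenue = {}
for region, cents in zip(orders["region"], orders["total_cents"]):
    revenue[region] = revenue.get(region, 0) + cents

print(revenue)  # {'EU': 4499, 'US': 1500}
```

In a row store the same query would drag every column of every row through memory, which is exactly the cost columnar layout and compression avoid.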
3. The time dimension
OLTP usually cares about now: current cart, current balance, current feature flag.
OLAP cares about history and drift: how cohorts moved week over week, how experiments aged, how revenue was recognized.
That difference shows up in keys, slowly changing dimensions, late-arriving facts, and backfills—topics that belong in warehouse modeling, not bolted onto every OLTP table “just in case BI asks.”
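One of those warehouse-side topics, a Type 2 slowly changing dimension, can be sketched briefly. Instead of updating an attribute in place (the OLTP instinct), the dimension keeps every version with a validity window so old facts join to the attributes that were true at the time. The customer/tier data here is hypothetical.

```python
from datetime import date

# Type 2 SCD sketch: versioned dimension rows with validity windows,
# rather than a single mutable row per customer.
dim_customer = [
    # (customer_id, tier, valid_from, valid_to)
    (42, "free", date(2024, 1, 1), date(2024, 6, 1)),
    (42, "pro",  date(2024, 6, 1), date.max),  # current version
]

def tier_as_of(customer_id, as_of):
    """Find the attribute value that was true when a fact occurred."""
    for cid, tier, start, end in dim_customer:
        if cid == customer_id and start <= as_of < end:
            return tier
    return None

print(tier_as_of(42, date(2024, 3, 15)))  # free
print(tier_as_of(42, date(2024, 7, 1)))   # pro
```

This is precisely the machinery you do not want bolted onto the transactional schema "just in case BI asks."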
4. Why one database rarely does both jobs at scale
You can run small reports on a replica or spin up an analytical extension in a pinch.
At scale, one undifferentiated schema optimized for neither side tends to produce:
- long-running analytics queries stealing buffer cache from checkout,
- indexing strategies that help point reads but hurt wide scans (or the reverse),
- schema change fear because “the dashboard team” and “the payments team” share one fragile table.
Splitting workloads is not always two vendors—it can be primary + read replica with guardrails, or primary + warehouse. The point is separate SLOs and separate modeling headspace.
5. Bridging OLTP and OLAP in practice
Common bridges (pick based on how fresh the warehouse must be):
| Pattern | Shape | Latency vibe |
|---|---|---|
| Batch ETL | extract → transform → load on a schedule | hours common |
| ELT | land raw or lightly typed fast; transform in SQL inside the warehouse | flexible |
| CDC | stream inserts/updates/deletes; rebuild analytic tables incrementally | minutes to near–real-time |
Modeling implication: keep normalized source-of-truth in OLTP, rebuild analytic projections in OLAP. Trying to make one physical table serve both audiences usually ends in compromise nobody enjoys.
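The CDC row of the table can be sketched as an apply loop. The event shape here is hypothetical (real pipelines emit Debezium/Kafka-style change records), but the core move is the same: inserts and updates become upserts on the analytic copy, deletes become removals, and projections rebuild from that copy.

```python
# Hedged CDC sketch: a change stream of (op, key, row) events from the
# OLTP source, applied incrementally to an analytic copy. Event format
# is made up for illustration.
events = [
    ("insert", 1, {"id": 1, "status": "open", "total_cents": 1999}),
    ("insert", 2, {"id": 2, "status": "open", "total_cents": 500}),
    ("update", 1, {"id": 1, "status": "paid", "total_cents": 1999}),
    ("delete", 2, None),
]

analytic = {}  # key -> latest row; downstream tables rebuild from this
for op, key, row in events:
    if op == "delete":
        analytic.pop(key, None)
    else:  # insert and update are both upserts on the analytic side
        analytic[key] = row

print(analytic)  # {1: {'id': 1, 'status': 'paid', 'total_cents': 1999}}
```

Batch ETL/ELT would replace this loop with scheduled full or incremental loads; CDC buys freshness at the cost of operating the stream.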
6. Interview-summary version
OLTP
- Small, frequent reads/writes.
- Normalized schema + constraints.
- Row stores; index hot paths.
- Optimize for correctness + predictable latency.
OLAP
- Big scans and aggregates.
- Denormalized star/snowflake.
- Column stores; partition by time.
- Optimize for scan cost + analyst iteration speed.
Bridge
- ETL / ELT / CDC move data; freshness vs complexity is the trade-off.
Closing
Two systems, two schemas, two SLOs. Let OLTP be boring and fast for users; let OLAP be wide and cheap for questions. Sync the expectations between teams—not by cramming every dimension into the transactional schema, but by owning the pipeline between them.