Writing

Engineering Notes.

I write about systems and how to build them the right way for your needs. There is no one-size-fits-all design; the best solutions are shaped by constraints, scale, and what you actually need to optimize.

Choosing a shard key: user, tenant, region, or time?

Databases, Distributed Systems, Scalability

Shard on the unit most requests already own. Hash to spread load inside that unit; layer region or time when residency, lifecycle, or skew demand it—never pick a key because it fit a slide.

May 10, 2026·6 min read
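
As a rough illustration of the "hash to spread load inside that unit, layer region when residency demands it" idea in the teaser above, here is a minimal sketch. The names `shard_for`, `stable_hash`, and `NUM_SHARDS`, the shard count, and the `region-shard-NN` naming scheme are all hypothetical and are not taken from the article.

```python
import hashlib
from typing import Optional

NUM_SHARDS = 16  # hypothetical shard count, chosen only for illustration


def stable_hash(value: str) -> int:
    """Hash a string to an integer that is stable across processes and restarts.

    Python's built-in hash() is salted per process, so it is unsuitable for routing.
    """
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_for(tenant_id: str, region: Optional[str] = None,
              num_shards: int = NUM_SHARDS) -> str:
    """Route a tenant to a shard; an optional region prefix layers residency on top."""
    bucket = stable_hash(tenant_id) % num_shards
    prefix = region if region is not None else "global"
    return f"{prefix}-shard-{bucket:02d}"
```

Plain modulo hashing like this reshuffles most keys when `num_shards` changes; a real deployment would likely need a directory or consistent hashing to reshard gracefully.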

Choosing SQL, NoSQL, and search engines (without bingo cards)

Databases, Distributed Systems, System Design

Start from invariants and access patterns: relational SQL for systems of record, NewSQL when scale meets SQL semantics, NoSQL when the workload is partition-shaped, and search as a rebuildable projection—never the ledger.

May 9, 2026·5 min read

OLTP vs OLAP data modeling: two jobs, two shapes

Databases, Data Engineering, Storage

Transactional systems keep the product correct and fast at the row. Warehouses explain the business across huge scans. Mixing the two jobs in one head—or one undifferentiated schema—hurts both.

May 8, 2026·4 min read

Idempotency for write APIs: surviving retries without duplicate harm

Reliability, Distributed Systems, API Design

Treat duplicate delivery as normal: idempotency keys for money paths, resource-oriented verbs where they fit, and idempotent consumers when messages are at-least-once.

May 7, 2026·5 min read
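
The idempotency-key idea in the teaser above can be sketched in a few lines. This is an in-memory illustration only: `IdempotentExecutor`, `charge`, and `ledger` are hypothetical names, and a real service would presumably persist keys and results in durable storage with an expiry.

```python
import threading
from typing import Any, Callable, Dict, List

ledger: List[int] = []  # hypothetical stand-in for a payments table


def charge(amount_cents: int) -> str:
    """Pretend side effect: record a charge and return a receipt id."""
    ledger.append(amount_cents)
    return f"receipt-{len(ledger)}"


class IdempotentExecutor:
    """Runs an operation at most once per idempotency key and replays the stored result."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def run(self, key: str, operation: Callable[[], Any]) -> Any:
        # Holding the lock while the operation runs keeps the sketch simple;
        # it serializes all callers, which a production system would avoid.
        with self._lock:
            if key in self._results:
                return self._results[key]  # retry: replay, do not re-execute
            result = operation()
            self._results[key] = result
            return result
```

A client retrying with the same key gets the original receipt back, and the ledger records only one charge.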

API versioning and evolution: change without breaking clients

API Design, Distributed Systems, Scalability

Prefer additive contracts and tolerant parsers. When you must break, version explicitly, expand-then-contract, and run deprecations like a product launch—not a surprise outage.

May 6, 2026·4 min read

Protecting downstream services from traffic spikes

Reliability, Scalability, Load Balancing

Rate limits, circuit breakers, load shedding, and bulkheads are how you keep a partner’s bad day from becoming your outage—without pretending dependencies are infinite.

May 5, 2026·5 min read

Backpressure in high-traffic systems: fail fast, do not hang

Reliability, Scalability, Distributed Systems

When load spikes, the winning move is to slow the right things deliberately—pull models, bounded queues, shedding, worker caps, outbound isolation, and clients that back off—instead of letting memory and threads die together.

May 4, 2026·5 min read
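
One of the techniques named in the teaser above, a bounded queue that sheds work instead of blocking, might look roughly like this. `BoundedInbox` and its method names are hypothetical; the sketch assumes rejecting immediately is preferable to making the producer wait.

```python
import queue
from typing import Any


class BoundedInbox:
    """A fixed-capacity inbox that rejects new work when full rather than hanging."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            # queue.Queue treats maxsize <= 0 as unbounded, which defeats the purpose.
            raise ValueError("capacity must be positive")
        self._queue: "queue.Queue[Any]" = queue.Queue(maxsize=capacity)
        self.rejected = 0  # count of shed items, useful as a metric

    def offer(self, item: Any) -> bool:
        """Try to enqueue without blocking; return False if the item was shed."""
        try:
            self._queue.put_nowait(item)
            return True
        except queue.Full:
            self.rejected += 1
            return False

    def take(self) -> Any:
        """Remove and return the oldest item; raises queue.Empty if none is waiting."""
        return self._queue.get_nowait()
```

A caller that receives `False` can return an overload response right away, so clients back off instead of piling up behind a growing queue.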

Queue vs direct RPC: when to use which

Distributed Systems, Message Queues, Architecture

RPC is for answers you need right now. Queues buffer load, decouple producers from consumers, and carry retries—when eventual processing is the real requirement.

May 2, 2026·5 min read

Synchronous vs asynchronous communication: how to choose

Distributed Systems, Message Queues, Architecture

Request–response keeps things simple when the caller must wait. Queues and events buy decoupling and resilience when work can happen later—at the cost of operational complexity.

May 1, 2026·5 min read

How a request actually moves through a production system

Distributed Systems, Networking, Backend

A grounded tour from DNS and the edge through services, data, async work, and back to the browser—with observability and failure modes you can recognize in the wild.

April 30, 2026·8 min read