
Choosing a shard key: user, tenant, region, or time?

Shard on the unit most requests already own. Hash to spread load inside that unit; layer in region or time when residency, lifecycle, or skew demands it. Never pick a key because it looked good on a slide.

Databases · Distributed Systems · Scalability

The shard key is not a clever hash function contest. It is the answer to: “If I could only co-locate one thing so that most reads and writes stay local, what would that thing be?”

Get that wrong and you win cross-shard joins, hot spots, and pager rotations where nobody remembers why email was the partition key. If you have ever seen Vitess or a sharded Dynamo story, you know the key is half the product contract.

Shard key dimensions

  • By user_id (the B2C default): hash(user_id) → shard A | B | C; co-locates profile, cart, and prefs per user. Pro: simple mental model, easy to debug. Con: global feeds and friend graphs can fan out. Fits consumer apps: feed, inbox, settings.
  • By tenant_id (the B2B SaaS default): hash(tenant_id) isolates noisy customers; an "elephant" tenant gets a dedicated pool or cell. Pro: blast radius per contract, tenants can be moved. Con: skew, and cross-tenant analytics get harder. Fits multi-tenant APIs and per-org data planes.
  • By region (latency + residency): EU users → EU cluster, US users → US cluster; an inner hash(user_id) inside each region is common. Pro: clean GDPR story, low RTT for locals. Con: travelers and cross-region reporting. Fits finance, health, and real-time games with locality.
  • By time + hash (logs, metrics, events): partition(date_bucket, hash(service_id)); drop cold months. Pro: lifecycle and retention are first-class. Con: pure time skews ("today" stays hottest) unless you stir with a hash. Fits audit, telemetry, and append-mostly streams.

Pick the dimension most requests already scope to—then spread load inside it.

1. First principle: start from ownership and access patterns

Ask:

  • Who owns the row in product language? A consumer, a company account, a region, a device?
  • What scope does a typical request carry? user_id, tenant_id, country, session_id?
  • What must never fan out on the hot path? Checkout is different from “global search across all tenants.”

A good shard choice:

  • keeps the majority of transactions on one shard (or a small, predictable set),
  • spreads write load instead of funneling it,
  • matches compliance and latency stories you can defend in an audit.

2. Sharding by user_id (consumer products)

Good fit when most traffic is scoped to one user: their cart, their settings, their slice of the feed.

Benefits

  • Simple mental model for engineers and support.
  • Co-locates data you usually fetch together.
  • Easy to reason about backups per cohort (with care).

Trade-offs

  • Global features (social graph, cross-user search) need fan-out or secondary indices.
  • Power users can still skew a shard if popularity is wildly uneven—mitigate with sub-sharding or virtual shards.

Heuristic: default for B2C when the product is user-centric and you are not primarily a multi-tenant control plane.
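
A minimal sketch of the user-hash routing, including the virtual-shard layer mentioned above. The shard counts and the SHA-256 choice are illustrative assumptions, not a prescription:

    import hashlib

    NUM_VSHARDS = 4096          # fixed virtual-shard count, chosen once at design time
    NUM_PHYSICAL = 8            # physical shards today; grows over time
    # Virtual → physical map; rebalancing edits this table, never the user mapping.
    VSHARD_TO_PHYSICAL = {v: v % NUM_PHYSICAL for v in range(NUM_VSHARDS)}

    def vshard_for_user(user_id: str) -> int:
        # Stable hash (not Python's per-process hash()) so routing survives restarts.
        digest = hashlib.sha256(user_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % NUM_VSHARDS

    def shard_for_user(user_id: str) -> int:
        return VSHARD_TO_PHYSICAL[vshard_for_user(user_id)]

Moving load then means reassigning virtual shards to physical nodes; the user_id → vshard mapping never changes.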


3. Sharding by tenant_id or account_id (B2B SaaS)

Good fit when the paying customer is a company, isolation is a selling point, and you bill by seat or usage per org.

Benefits

  • Noisy neighbor containment—move or throttle one tenant without guessing which user rows bled together.
  • Operational stories: export, delete-for-GDPR, per-tenant upgrades.

Trade-offs

  • Elephant tenants—one Fortune 10 can dwarf the rest; you need dedicated cells, higher quotas, or sub-partitioning inside the tenant.
  • Cross-tenant reporting belongs in the warehouse, not ad-hoc SQL across every shard at request time.

Patterns: hash tenant → shard; promote hot tenants to isolated pools; keep a catalog of where each tenant lives.
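
A sketch of that catalog, with hypothetical pool names. The default path is a hash; elephants are explicit overrides recorded in one place:

    import hashlib

    POOLS = ["pool-0", "pool-1", "pool-2", "pool-3"]
    # Catalog of promoted tenants. In production this is a small, strongly
    # consistent lookup service, not an in-process dict.
    DEDICATED = {"acme-corp": "pool-elephant-0"}

    def pool_for_tenant(tenant_id: str) -> str:
        if tenant_id in DEDICATED:
            return DEDICATED[tenant_id]      # elephants live in isolated pools
        digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
        return POOLS[int.from_bytes(digest[:8], "big") % len(POOLS)]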


4. Sharding by region or geography

Good fit when data residency, latency to humans, or regulatory boundaries are first-class requirements.

Benefits

  • Clear story for lawyers and customers (“EU data stays in EU”).
  • Users near the region see predictable RTT.

Trade-offs

  • Humans and devices travel; define what “home region” means.
  • Cross-region aggregates and DR drills are expensive—plan pipelines, not midnight SQL.

Common pattern: region as the outer boundary, then hash(user_id) inside the region for even spread.
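
A sketch of that hierarchy, with made-up region names and shard lists. Note that home_region must come from an explicit attribute on the account, never from the caller's current IP:

    import hashlib

    REGION_SHARDS = {
        "eu": ["eu-0", "eu-1"],
        "us": ["us-0", "us-1", "us-2"],
    }

    def shard_for(user_id: str, home_region: str) -> str:
        shards = REGION_SHARDS[home_region]   # residency decides the candidate set
        digest = hashlib.sha256(user_id.encode("utf-8")).digest()
        return shards[int.from_bytes(digest[:8], "big") % len(shards)]  # hash spreads inside it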


5. Sharding by time

Good fit when data is append-mostly and cold data ages out: logs, metrics, audit, click streams.

Benefits

  • TTL and archival become mechanical (drop partitions).
  • Queries usually filter on recent windows.

Trade-offs

  • “Today” is always hot unless you add a second dimension (hash) to stir writes.
  • Backfills and late events need explicit handling.

Typical pattern: date_bucket + hash(high_cardinality_id) so retention and load both behave.
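
A sketch of that composite partition name, with an illustrative bucket count. Retention drops whole days by prefix, while the hash suffix keeps "today" from being one hot partition:

    import hashlib
    from datetime import datetime, timezone

    HASH_BUCKETS = 16   # second dimension that stirs today's writes

    def partition_for(service_id: str, ts: datetime) -> str:
        day = ts.astimezone(timezone.utc).strftime("%Y%m%d")
        digest = hashlib.sha256(service_id.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") % HASH_BUCKETS
        return f"events_{day}_{bucket:02d}"   # drop events_YYYYMMDD_* to expire a day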


6. Other dimensions: feature or domain slices

Sometimes the natural seam is bounded context, not user: orders vs billing vs inventory subsystems that do not need to share a shard map.

Use this when teams and failure domains already align with domain lines—still document cross-domain reads so nobody assumes a magical join across cells.


7. Hash, range, composite, and hierarchical keys

7.1 Hash sharding

Spreads keys pseudo-randomly across nodes—great for even write pressure when the key has high cardinality.

Weak for range scans unless the product query is always equality-shaped.

7.2 Range sharding

Keeps contiguous keys together—great for time-ordered or sorted access.

Risk: hot ranges (popular prefixes, monotonic IDs). Combine with salting or secondary sharding when needed.
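
A minimal salting sketch for a monotonic sequence. The salt count is an assumption, and the price is that readers fan out over every salt and merge:

    NUM_SALTS = 8   # spreads "now" across 8 ranges instead of one hot tail

    def salted_key(seq: int) -> str:
        salt = seq % NUM_SALTS               # deterministic, so point lookups stay cheap
        return f"{salt}:{seq:020d}"          # zero-padding keeps range order within a salt

    def scan_prefixes() -> list[str]:
        # A "recent rows" query issues one range scan per salt and merges results.
        return [f"{s}:" for s in range(NUM_SALTS)]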

7.3 Composite shard keys

Example: (region, user_id) or (tenant_id, hash(user_id)) when one dimension alone would skew or violate boundaries.

7.4 Hierarchical sharding

Outer dimension (region, tenant class) then inner hash—matches how organizations actually grow.

7.5 Practical rules

  • OLTP user-ish workloads: start with hash on the primary ownership key.
  • Logs / events / metrics: time bucket + hash of a high-cardinality dimension.
  • Compliance / locality: hierarchy first, hash inside.
  • Avoid shard keys that are volatile (email), low cardinality alone (country), or monotonic without mixing (raw created_at only).

Sharding is not the same thing as read replicas, but you often end up with both: partitioned primaries for write scale, replicas for read-heavy paths that still respect tenant boundaries. Sketch of that split:

Separate reads & writes

Read/write split: the service sends writes to a single primary, which replicates out to several replicas; reads scale out across the replicas. Reads scale horizontally. Writes scale… carefully. Replication lag is the trade-off (read-after-write may need the primary).
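
A sketch of that routing rule, with hypothetical connection handles and an assumed lag budget. The point is that recent writers keep reading from the primary until the replicas have plausibly caught up:

    import hashlib
    import time

    PRIMARY = "primary"
    REPLICAS = ["replica-0", "replica-1", "replica-2"]
    LAG_BUDGET_S = 2.0                        # assumed replication-lag budget
    _last_write_at: dict[str, float] = {}     # per-tenant last write timestamp

    def route(tenant_id: str, is_write: bool) -> str:
        now = time.monotonic()
        if is_write:
            _last_write_at[tenant_id] = now
            return PRIMARY
        if now - _last_write_at.get(tenant_id, 0.0) < LAG_BUDGET_S:
            return PRIMARY                    # read-after-write: stick to the primary
        digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
        return REPLICAS[int.from_bytes(digest[:8], "big") % len(REPLICAS)]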

8. How to choose between user, tenant, region, or something else

Work through:

  1. What is the most common request scope? That is your first-class shard dimension.
  2. What invariants cross scopes? If “always global,” you need async projections—not bigger scatter-gather in the request path.
  3. What fails if one shard melts? Pick boundaries that match a blast radius you can explain to finance.
  4. What does ops need? Tenant moves, regional failovers, per-tenant export—make those operations possible without rewriting physics.

Rules of thumb

Product shape        First hop
Consumer app         hash(user_id)
B2B SaaS             hash(tenant_id) + plan for elephants
Logs / events        time_bucket + hash(dimension)
Regulated locality   region → inner hash

9. Anti-patterns to avoid

  • Sharding on emails or usernames that users can change—your mapping becomes a migration treadmill.
  • Sharding on metrics you do not control (current QPS per key) instead of stable ownership.
  • Monotonic-only keys for writes (naive time) without a stir—everyone piles onto the last node.
  • Pretending cross-shard joins are free because the ORM still lets you type them.

10. Interview-style summary

  1. Name the unit of ownership the business already uses.
  2. Pick hash vs range vs composite from access paths, not vibes.
  3. Add region or time when law, latency, or lifecycle demands it.
  4. Call out hot spots (elephants, “today,” monotonic IDs) and how you spread them.
  5. Admit fan-out features and solve them with async indexes, warehouses, or product scope changes—not bigger SQL hope.

Summary table

Shard key     Best for               Main risk                    Pattern
user_id       B2C product surfaces   global graphs / search       hash(user_id)
tenant_id     multi-tenant SaaS      elephant tenants             hash(tenant_id) + isolate big
region        residency, RTT         travelers, cross-region BI   region → inner hash
time + hash   telemetry, audit       hot current bucket           date_bucket + hash(id)

Closing

The shard key is a product decision wearing a database costume.

Pick the dimension your requests already believe in, spread load inside it on purpose, and document the two queries that will hurt—because those are the ones that will page you at 3 a.m. anyway.