Kafka Partitioning Strategy: How to Design Topics That Scale

Partition count is the most consequential — and most irreversible — decision you make when designing a Kafka topic.

Get it right and your system scales gracefully for years. Get it wrong and you're either stuck with a rebalancing disaster at 3 AM or watching idle consumers sit on empty partitions while one thread drowns.

Designing Kafka topologies for a real-time event distribution platform processing 10M+ messages/day across multiple tenants and hotel properties globally, this is the framework I'd use again.

Why Partitions Matter More Than You Think

A partition is the unit of parallelism in Kafka. Each partition is consumed by exactly one consumer within a consumer group at any point in time.

More partitions give you:

Higher consumer parallelism ceiling
Better throughput distribution across brokers
Finer-grained rebalancing granularity

More partitions cost you:

More file handles on each broker (each partition = a directory of segment files)
More metadata for the controller to track
Longer, more disruptive consumer group rebalances
Higher end-to-end latency if over-partitioned (broker memory is spread thin)

Partition count is a one-way door. You can increase it, but increasing causes a rebalance and breaks key-based routing for messages in-flight. You can never decrease without deleting the topic.

The Four Decisions You Actually Need to Make

Decision 1 — The Partition Key

This is the single most important choice. The key determines:

Which partition a message lands in
What ordering guarantees you get
Whether you'll have hotspots

The Selection Framework

Requirement	Key to choose	Risk to watch
Per-entity ordering (e.g., all events for entity X in order)	Entity ID	Skew if some entities are far more active
Global ordering	Single constant key	All messages → 1 partition — kills parallelism
Max throughput, ordering not needed	Null (round-robin)	No ordering guarantee whatsoever
Multi-tenant, per-entity ordering	Entity ID (with per-tenant topics)	Topic proliferation
Multi-tenant, shared topics	Composite: TenantID + EntityID	Custom partitioner needed

The Hotspot Check

Before committing to any key, sample 24 hours of production traffic and ask:

Does the top 1% of key values account for > 20% of total message volume?

If yes — you have a hotspot risk. Mitigation options:

Add a bucket suffix to the key: entityId + "-" + (random % N) — trades ordering granularity for even distribution
Pre-partition at the producer level
Choose a higher-cardinality dimension of the same entity

In our platform, we partitioned by property ID (a high-cardinality entity identifier). Traffic distribution analysis showed no single property drove more than 0.1% of total volume — safe from hotspots.

Decision 2 — Partition Count

The formula everyone uses:

Partitions = max(throughput-based count, consumer-parallelism-based count)

Throughput-Based Count

Partitions = Target throughput (MB/s)
             ─────────────────────────────────────────────────────
             min(Producer throughput per partition,
                 Consumer throughput per partition)

A single Kafka partition typically handles 10–50 MB/s depending on message size, compression, and consumer processing speed. For most event-driven workloads processing KB-sized messages, throughput alone rarely drives you above 10–20 partitions.

Consumer-Parallelism-Based Count

This is usually the binding constraint:

Partitions ≥ Max consumers you will ever want to run simultaneously

Take your target pod count at peak load, add a growth multiplier, and round up to a clean number. A 10:1 partition-to-consumer ratio is a reasonable starting point — it gives you headroom to scale consumers up to 10x without repartitioning.

The Rule of Thumb Table

Daily message volume	Typical peak TPS	Partition range
< 1M / day	< 50	10–30
1M–10M / day	50–500	30–100
10M–100M / day	500–5,000	100–500
> 100M / day	> 5,000	500+ (measure carefully)

At 10M+ messages/day with peaks hitting ~1,000 messages/second per processing pipeline, our high-volume topics sat comfortably in the 500-partition range with a 10:1 partition-to-consumer ratio.

Match Partition Count to Topic Role

Not every topic in your pipeline carries the same volume or needs the same parallelism ceiling. A common mistake is assigning the same partition count everywhere.

Topic Role	Relative Volume	Partition guidance
High-volume fanout	High	Match consumer pool
Pre-processing	Low-medium	4–48, don't over-shard
Downstream push	Matches fanout	Same as fanout topic
Burst/resync	Spiky	High — headroom first

In our platform, a pre-processing topic deliberately ran at a fraction of our main topic's partition count — over-partitioning it would have created unnecessary rebalancing overhead for a step that didn't need parallelism.

Decision 3 — Replication Factor

For production: always 3.

Config	Value	Why
`replication.factor`	3	Tolerates 1 broker failure
`min.insync.replicas`	2	Prevents silent data loss when 1 broker is behind
`acks` (producer)	all	Waits for all ISRs to acknowledge

Watch Broker Leader Balance

With 3 brokers and many partitions, leaders distribute across brokers. Over time — especially after broker restarts — they drift.

Monitor replication bytes in per broker. If one broker is handling 5x the replication traffic of another, you have a leader imbalance. Use kafka-leader-election.sh --election-type PREFERRED to rebalance.

In our cluster, post-incident broker restarts regularly caused temporary leader imbalance. Making preferred leader election part of our post-incident runbook solved it.

Decision 4 — Consumer Group Mapping

Partition count means nothing without a clean consumer group design.

Rules that hold at scale:

One consumer group per topic — shared consumer groups across teams/services create invisible coupling. When one consumer falls behind, the group's lag aggregation hides where the problem is.
Consumer count ≤ partition count — consumers beyond the partition count sit idle. The partition is the ceiling, not the consumer count.
Consistent naming convention — {ENV}.{SERVICE}.consume-{topic-short-name}. Makes Grafana queries, Splunk searches, and KEDA triggers consistent across environments.
Monitor lag per partition, not just aggregate — a single hot partition drowning in lag is invisible in an aggregate view.

Common Mistakes and Their Symptoms

Mistake	Symptom	Fix
Too few partitions	Consumers max out CPU, lag grows despite healthy consumers	Increase partitions (causes rebalance)
Too many partitions on low-volume topic	Slow controller failover, excessive file handles, slow rebalances	Match partitions to actual need
Wrong partition key (e.g., timestamp)	Random distribution, ordering lost, unpredictable consumer load	Redesign key — unavoidable topic recreation
Hotspot key	One partition at 90% lag, others empty	Compound key with bucket suffix
Leader imbalance	One broker at 5x replication bytes vs others	Run preferred leader election
Shared topics in multi-tenant system	Can't isolate tenant incidents	Migrate to per-tenant topic model

The Pre-Launch Checklist

□ Partition key chosen — ordering requirements documented
□ Key distribution analyzed — top 1% of keys < 20% of volume
□ Partition count ≥ max concurrent consumers planned
□ 2–3x growth headroom added
□ Replication factor = 3, min.insync.replicas = 2
□ Consumer group naming convention defined and documented
□ Per-partition lag monitoring configured before go-live
□ Broker leader distribution verified post-topic creation
□ Producer acks=all configured
□ Topic retention policy set (time + size-based)

Summary

Decision	Recommendation	Key risk to avoid
Partition key	Entity ID for per-entity ordering	Hotspot from low-cardinality key
Partition count	max(throughput-based, consumer pool size) × 2–3x headroom	Under-partitioning (harder to fix than over)
Topic design	Match partitions to topic role — not one-size-fits-all	Over-partitioning low-volume topics
Multi-tenancy	Per-tenant topics over shared topics	Shared topics hide blast radius
Replication	RF=3, min.insync.replicas=2, acks=all	Under-replication tolerating silent data loss
Consumer groups	1 per topic, monitor per-partition lag	Aggregate lag metrics hiding hot partitions

Partition design is upstream of every other Kafka optimization. KEDA autoscaling, consumer lag tuning, disaster recovery — all of them build on top of the foundation you set here. Get this right before you tune anything else.

Kafka Partitioning Strategy in Production

Why Partitions Matter More Than You Think

The Four Decisions You Actually Need to Make

Decision 1 — The Partition Key

The Selection Framework

The Hotspot Check

Decision 2 — Partition Count

Throughput-Based Count

Consumer-Parallelism-Based Count

The Rule of Thumb Table

Match Partition Count to Topic Role

Decision 3 — Replication Factor

Watch Broker Leader Balance

Decision 4 — Consumer Group Mapping

Common Mistakes and Their Symptoms

The Pre-Launch Checklist

Summary

Comments

More from this blog

Dynamic Kafka Consumer Orchestration

Beyond CPU Autoscaling

Event-Driven Systems Beyond the Happy Path

Command Palette

Why Partitions Matter More Than You Think

The Four Decisions You Actually Need to Make

Decision 1 — The Partition Key

The Selection Framework

The Hotspot Check

Decision 2 — Partition Count

Throughput-Based Count

Consumer-Parallelism-Based Count

The Rule of Thumb Table

Match Partition Count to Topic Role

Decision 3 — Replication Factor

Watch Broker Leader Balance

Decision 4 — Consumer Group Mapping

Common Mistakes and Their Symptoms

The Pre-Launch Checklist

Summary

Comments

More from this blog