Kafka Partitioning Strategy in Production
A Framework That Actually Scales

Partition count is the most consequential — and most irreversible — decision you make when designing a Kafka topic.
Get it right and your system scales gracefully for years. Get it wrong and you're either stuck with a rebalancing disaster at 3 AM or watching idle consumers sit on empty partitions while one thread drowns.
Designing Kafka topologies for a real-time event distribution platform processing 10M+ messages/day across multiple tenants and hotel properties globally, this is the framework I'd use again.
Why Partitions Matter More Than You Think
A partition is the unit of parallelism in Kafka. Each partition is consumed by exactly one consumer within a consumer group at any point in time.
More partitions give you:
Higher consumer parallelism ceiling
Better throughput distribution across brokers
Finer-grained rebalancing granularity
More partitions cost you:
More file handles on each broker (each partition = a directory of segment files)
More metadata for the controller to track
Longer, more disruptive consumer group rebalances
Higher end-to-end latency if over-partitioned (broker memory is spread thin)
Partition count is a one-way door. You can increase it, but increasing causes a rebalance and breaks key-based routing for messages in-flight. You can never decrease without deleting the topic.
The Four Decisions You Actually Need to Make
Decision 1 — The Partition Key
This is the single most important choice. The key determines:
Which partition a message lands in
What ordering guarantees you get
Whether you'll have hotspots
The Selection Framework
| Requirement | Key to choose | Risk to watch |
|---|---|---|
| Per-entity ordering (e.g., all events for entity X in order) | Entity ID | Skew if some entities are far more active |
| Global ordering | Single constant key | All messages → 1 partition — kills parallelism |
| Max throughput, ordering not needed | Null (round-robin) | No ordering guarantee whatsoever |
| Multi-tenant, per-entity ordering | Entity ID (with per-tenant topics) | Topic proliferation |
| Multi-tenant, shared topics | Composite: TenantID + EntityID | Custom partitioner needed |
The Hotspot Check
Before committing to any key, sample 24 hours of production traffic and ask:
Does the top 1% of key values account for > 20% of total message volume?
If yes — you have a hotspot risk. Mitigation options:
Add a bucket suffix to the key:
entityId + "-" + (random % N)— trades ordering granularity for even distributionPre-partition at the producer level
Choose a higher-cardinality dimension of the same entity
In our platform, we partitioned by property ID (a high-cardinality entity identifier). Traffic distribution analysis showed no single property drove more than 0.1% of total volume — safe from hotspots.
Decision 2 — Partition Count
The formula everyone uses:
Partitions = max(throughput-based count, consumer-parallelism-based count)
Throughput-Based Count
Partitions = Target throughput (MB/s)
─────────────────────────────────────────────────────
min(Producer throughput per partition,
Consumer throughput per partition)
A single Kafka partition typically handles 10–50 MB/s depending on message size, compression, and consumer processing speed. For most event-driven workloads processing KB-sized messages, throughput alone rarely drives you above 10–20 partitions.
Consumer-Parallelism-Based Count
This is usually the binding constraint:
Partitions ≥ Max consumers you will ever want to run simultaneously
Take your target pod count at peak load, add a growth multiplier, and round up to a clean number. A 10:1 partition-to-consumer ratio is a reasonable starting point — it gives you headroom to scale consumers up to 10x without repartitioning.
The Rule of Thumb Table
| Daily message volume | Typical peak TPS | Partition range |
|---|---|---|
| < 1M / day | < 50 | 10–30 |
| 1M–10M / day | 50–500 | 30–100 |
| 10M–100M / day | 500–5,000 | 100–500 |
| > 100M / day | > 5,000 | 500+ (measure carefully) |
At 10M+ messages/day with peaks hitting ~1,000 messages/second per processing pipeline, our high-volume topics sat comfortably in the 500-partition range with a 10:1 partition-to-consumer ratio.
Match Partition Count to Topic Role
Not every topic in your pipeline carries the same volume or needs the same parallelism ceiling. A common mistake is assigning the same partition count everywhere.
| Topic Role | Relative Volume | Partition guidance |
|---|---|---|
| High-volume fanout | High | Match consumer pool |
| Pre-processing | Low-medium | 4–48, don't over-shard |
| Downstream push | Matches fanout | Same as fanout topic |
| Burst/resync | Spiky | High — headroom first |
In our platform, a pre-processing topic deliberately ran at a fraction of our main topic's partition count — over-partitioning it would have created unnecessary rebalancing overhead for a step that didn't need parallelism.
Decision 3 — Replication Factor
For production: always 3.
| Config | Value | Why |
|---|---|---|
replication.factor |
3 | Tolerates 1 broker failure |
min.insync.replicas |
2 | Prevents silent data loss when 1 broker is behind |
acks (producer) |
all | Waits for all ISRs to acknowledge |
Watch Broker Leader Balance
With 3 brokers and many partitions, leaders distribute across brokers. Over time — especially after broker restarts — they drift.
Monitor replication bytes in per broker. If one broker is handling 5x the replication traffic of another, you have a leader imbalance. Use kafka-leader-election.sh --election-type PREFERRED to rebalance.
In our cluster, post-incident broker restarts regularly caused temporary leader imbalance. Making preferred leader election part of our post-incident runbook solved it.
Decision 4 — Consumer Group Mapping
Partition count means nothing without a clean consumer group design.
Rules that hold at scale:
One consumer group per topic — shared consumer groups across teams/services create invisible coupling. When one consumer falls behind, the group's lag aggregation hides where the problem is.
Consumer count ≤ partition count — consumers beyond the partition count sit idle. The partition is the ceiling, not the consumer count.
Consistent naming convention —
{ENV}.{SERVICE}.consume-{topic-short-name}. Makes Grafana queries, Splunk searches, and KEDA triggers consistent across environments.Monitor lag per partition, not just aggregate — a single hot partition drowning in lag is invisible in an aggregate view.
Common Mistakes and Their Symptoms
| Mistake | Symptom | Fix |
|---|---|---|
| Too few partitions | Consumers max out CPU, lag grows despite healthy consumers | Increase partitions (causes rebalance) |
| Too many partitions on low-volume topic | Slow controller failover, excessive file handles, slow rebalances | Match partitions to actual need |
| Wrong partition key (e.g., timestamp) | Random distribution, ordering lost, unpredictable consumer load | Redesign key — unavoidable topic recreation |
| Hotspot key | One partition at 90% lag, others empty | Compound key with bucket suffix |
| Leader imbalance | One broker at 5x replication bytes vs others | Run preferred leader election |
| Shared topics in multi-tenant system | Can't isolate tenant incidents | Migrate to per-tenant topic model |
The Pre-Launch Checklist
□ Partition key chosen — ordering requirements documented
□ Key distribution analyzed — top 1% of keys < 20% of volume
□ Partition count ≥ max concurrent consumers planned
□ 2–3x growth headroom added
□ Replication factor = 3, min.insync.replicas = 2
□ Consumer group naming convention defined and documented
□ Per-partition lag monitoring configured before go-live
□ Broker leader distribution verified post-topic creation
□ Producer acks=all configured
□ Topic retention policy set (time + size-based)
Summary
| Decision | Recommendation | Key risk to avoid |
|---|---|---|
| Partition key | Entity ID for per-entity ordering | Hotspot from low-cardinality key |
| Partition count | max(throughput-based, consumer pool size) × 2–3x headroom | Under-partitioning (harder to fix than over) |
| Topic design | Match partitions to topic role — not one-size-fits-all | Over-partitioning low-volume topics |
| Multi-tenancy | Per-tenant topics over shared topics | Shared topics hide blast radius |
| Replication | RF=3, min.insync.replicas=2, acks=all | Under-replication tolerating silent data loss |
| Consumer groups | 1 per topic, monitor per-partition lag | Aggregate lag metrics hiding hot partitions |
Partition design is upstream of every other Kafka optimization. KEDA autoscaling, consumer lag tuning, disaster recovery — all of them build on top of the foundation you set here. Get this right before you tune anything else.


