
Rationale

KafScale exists because the original assumptions behind Kafka brokers no longer hold for a large class of modern workloads. This page explains those assumptions, what changed, and why KafScale is designed the way it is.

This is not a comparison page and not a feature list. It documents the architectural reasoning behind the system.


The original Kafka assumptions

Kafka was designed in a world where durability lived on local disks attached to long-running servers. Brokers owned their data. Replication, leader election, and partition rebalancing were necessary because broker state was the source of truth.

That model worked well when:

  • Disks were the primary durable medium
  • Brokers were expected to be long-lived
  • Scaling events were rare and manual
  • Recovery time could be measured in minutes or hours

Many Kafka deployments today still operate under these assumptions, even when the workload does not require them.


Object storage changes the durability model

Object storage fundamentally changes where durability lives.

Modern object stores provide extremely high durability, elastic capacity, and simple lifecycle management. Once log segments are durable and immutable in object storage, keeping the same data replicated across broker-local disks stops adding resilience and starts adding operational cost.

With object storage:

  • Data durability is decoupled from broker lifecycle
  • Storage scales independently from compute
  • Recovery no longer depends on copying large volumes of data between brokers

This opens a design space in which brokers no longer need to be stateful.
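To make the decoupling concrete, here is a minimal sketch of a log whose segments live in an object store, keyed by topic, partition, and base offset. All names here are hypothetical illustrations, not KafScale's actual API; the in-memory store stands in for S3's immutable puts and sorted key listing:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ObjectStore is a stand-in for S3: immutable puts, prefix listing.
// (Hypothetical interface for illustration only.)
type ObjectStore struct {
	objects map[string][]byte
}

func NewObjectStore() *ObjectStore {
	return &ObjectStore{objects: map[string][]byte{}}
}

// Put writes an immutable object; segments are never rewritten in place.
func (s *ObjectStore) Put(key string, data []byte) {
	s.objects[key] = data
}

// List returns all keys under a prefix in sorted order, as S3 does.
func (s *ObjectStore) List(prefix string) []string {
	var keys []string
	for k := range s.objects {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys
}

// segmentKey encodes topic, partition, and base offset in the key,
// so both durability and addressing live entirely in the store.
func segmentKey(topic string, partition int, baseOffset int64) string {
	return fmt.Sprintf("%s/%d/%020d.log", topic, partition, baseOffset)
}

func main() {
	store := NewObjectStore()
	// Any broker can flush a segment...
	store.Put(segmentKey("orders", 0, 0), []byte("records 0..99"))
	store.Put(segmentKey("orders", 0, 100), []byte("records 100..199"))
	// ...and any other broker can reconstruct the partition's log
	// just by listing keys: no replica to sync, no disk to recover.
	for _, k := range store.List("orders/0/") {
		fmt.Println(k)
	}
}
```

Because the key alone identifies a segment, a replacement broker needs nothing from its predecessor: listing the prefix recovers the full log.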

What changed

Traditional Kafka:

  • Brokers own durable data
  • Replication required for durability
  • Scaling moves data
  • Failures require repair
  • Disk management is critical

Stateful brokers = operational complexity.

Stateless brokers (KafScale):

  • Object storage owns durability
  • Replication is implicit in S3
  • Scaling adds compute only
  • Failures handled by replacement
  • Disk management disappears

Stateless brokers = simpler operations.

When durability moves out of the broker, the operational model changes with it.

Why brokers should be ephemeral

In KafScale, brokers are treated as ephemeral compute.

They serve the Kafka protocol, buffer and batch data, and flush immutable segments to object storage. They do not own durable state. Any broker can serve any partition.

This has several consequences:

  • Scaling is a scheduling problem, not a data movement problem
  • Broker restarts are cheap and predictable
  • Failures are handled by replacement, not repair
  • Kubernetes can manage brokers like any other stateless workload

This model matches how modern infrastructure platforms already operate.
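The buffer-and-flush path described above can be sketched as follows. The types and thresholds are hypothetical, not KafScale's real implementation; a production broker would also add size and time limits, an index, and error handling:

```go
package main

import "fmt"

// Record is an appended message; Segment is an immutable batch of them.
type Record struct{ Value string }
type Segment struct {
	BaseOffset int64
	Records    []Record
}

// Broker holds only an in-flight buffer - no durable state of its own.
type Broker struct {
	buffer     []Record
	nextOffset int64
	maxBatch   int
	flush      func(Segment) // uploads the sealed segment to object storage
}

// Append buffers a record and flushes once the batch is full.
func (b *Broker) Append(r Record) {
	b.buffer = append(b.buffer, r)
	if len(b.buffer) >= b.maxBatch {
		b.Flush()
	}
}

// Flush seals the buffer into an immutable segment and hands it off;
// afterwards the broker retains nothing but an offset counter.
func (b *Broker) Flush() {
	if len(b.buffer) == 0 {
		return
	}
	seg := Segment{BaseOffset: b.nextOffset, Records: b.buffer}
	b.flush(seg)
	b.nextOffset += int64(len(b.buffer))
	b.buffer = nil
}

func main() {
	b := &Broker{maxBatch: 2, flush: func(s Segment) {
		fmt.Printf("flushed segment base=%d count=%d\n", s.BaseOffset, len(s.Records))
	}}
	b.Append(Record{"a"})
	b.Append(Record{"b"}) // batch full: triggers a flush
	b.Append(Record{"c"})
	b.Flush() // flush the remainder
}
```

Note that nothing in the broker survives a restart, and nothing needs to: once a segment is flushed, a replacement broker resumes from object storage alone.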

Design flow

Durable storage (S3) → ephemeral compute → simpler operations

Why self-hosted control planes still matter

Some systems that adopt stateless brokers rely on vendor-managed control planes for metadata and coordination. That can be a good tradeoff for teams that want a fully managed service.

KafScale makes a different choice.

By keeping metadata, offsets, and consumer group state in a self-hosted store, KafScale can run entirely within your own infrastructure boundary. This matters for:

  • Regulated and sovereign environments
  • Private and air-gapped deployments
  • Teams that require open licensing and forkability
  • Platforms that want to avoid external control plane dependencies

The goal is not to reject managed services, but to make the architecture usable under stricter constraints.


What KafScale deliberately does not do

KafScale is not trying to replace every Kafka deployment.

It deliberately does not target:

  • Sub-10ms end-to-end latency workloads
  • Exactly-once transactions
  • Compacted topics
  • Embedded stream processing inside the broker

Those features increase broker statefulness and operational complexity. For many workloads, they are unnecessary.

KafScale focuses on the common case: durable message transport, replayability, predictable retention, and low operational overhead.


Summary

Stateless brokers backed by object storage are not a trend. They are a correction.

Once durability moves out of the broker, the system can be simpler, cheaper to operate, and easier to scale. KafScale is built on that assumption, while preserving Kafka protocol compatibility and self-hosted operation.

The architecture is inevitable. The design choices are deliberate.