
Architecture

KafScale brokers are stateless pods on Kubernetes. Metadata lives in etcd, while immutable log segments live in S3. Clients speak the Kafka protocol to a proxy that abstracts broker topology. Brokers flush segments to S3 and serve reads with caching.


Platform overview

[Architecture overview diagram: Kafka clients (producers and consumers) connect through the proxy, which rewrites metadata behind a single IP. Inside the Kubernetes cluster, stateless brokers 0 to N (scaled by HPA) flush segments to the S3 data bucket and fetch + cache on reads; a 3-node etcd cluster holds topics, offsets, and group assignments, with etcd snapshots backed up to a separate S3 bucket. One endpoint for clients. S3 is the source of truth. Brokers are stateless.]

How the proxy works

The Kafka protocol requires clients to discover broker topology. When a client connects, the broker returns a list of all brokers and their partition assignments. Clients then connect directly to each broker they need.

This creates a problem for ephemeral infrastructure. Every broker restart breaks client connections. Scaling events require clients to rediscover the cluster.

KafScale’s proxy solves this by intercepting two types of responses:

Request           What the proxy does
Metadata          Returns the proxy’s own address instead of individual broker addresses
FindCoordinator   Returns the proxy’s address for consumer group coordination

Clients believe they are talking to a single broker. The proxy routes requests to the actual brokers internally.
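
A minimal sketch of that rewrite step, using simplified stand-in types rather than KafScale’s real internals (the actual proxy operates on wire-format Kafka responses, and the broker and proxy addresses below are placeholders):

```go
package main

import "fmt"

// Simplified stand-ins for decoded Kafka responses; the real proxy works on
// the Kafka wire protocol, not these structs.
type Broker struct {
	NodeID int32
	Host   string
	Port   int32
}

type MetadataResponse struct {
	Brokers []Broker
}

type FindCoordinatorResponse struct {
	NodeID int32
	Host   string
	Port   int32
}

// rewriteMetadata replaces every advertised broker address with the proxy
// address, so clients only ever dial the proxy.
func rewriteMetadata(resp *MetadataResponse, proxyHost string, proxyPort int32) {
	for i := range resp.Brokers {
		resp.Brokers[i].Host = proxyHost
		resp.Brokers[i].Port = proxyPort
	}
}

// rewriteFindCoordinator points consumer group coordination at the proxy too.
func rewriteFindCoordinator(resp *FindCoordinatorResponse, proxyHost string, proxyPort int32) {
	resp.Host = proxyHost
	resp.Port = proxyPort
}

func main() {
	md := &MetadataResponse{Brokers: []Broker{
		{NodeID: 0, Host: "broker-0.kafscale.svc", Port: 9092}, // placeholder internal addresses
		{NodeID: 1, Host: "broker-1.kafscale.svc", Port: 9092},
	}}
	rewriteMetadata(md, "kafka.example.com", 9092) // placeholder proxy address
	fmt.Println(md.Brokers)                        // both brokers now advertise the proxy address
}
```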

This enables:

  • Infinite horizontal scaling: Add brokers without client awareness
  • Zero-downtime deployments: Rotate broker pods behind the proxy
  • Standard networking: One LoadBalancer, one DNS name, standard TLS termination
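
As a consequence of the last point, a client needs only the proxy endpoint as its bootstrap address. A minimal sketch with the franz-go client; the hostname is a placeholder for your proxy’s LoadBalancer DNS name:

```go
package main

import (
	"log"

	"github.com/twmb/franz-go/pkg/kgo"
)

func main() {
	// One bootstrap address: the proxy. Clients never learn about individual brokers.
	client, err := kgo.NewClient(
		kgo.SeedBrokers("kafscale-proxy.example.com:9092"), // placeholder endpoint
	)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()
}
```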

For configuration details, see Operations: External Broker Access.


Decoupled processing (addons)

KafScale keeps brokers focused on Kafka protocol and storage. Add-on processors handle downstream tasks by reading completed segments directly from S3, bypassing brokers entirely. Processors are stateless: offsets and leases live in etcd, input lives in S3, output goes to external catalogs.

Data processor architecture
[Diagram: KafScale brokers speak the Kafka protocol and write .kfs segments to S3, with metadata and offsets in etcd; the processor runs as stateless pods with topic-scoped leases and HPA, reading segments from S3 and tracking offsets and leases in etcd; output is Parquet in an S3 warehouse registered in a metadata catalog, queried by analytics, AI agents, and query engines. Processors bypass brokers entirely. State lives in etcd. Data lives in S3.]

The processor reads .kfs segments from S3, tracks progress in etcd, and writes Parquet files to an Iceberg warehouse. Any Iceberg-compatible catalog can serve the tables to downstream consumers.
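
A rough sketch of that loop; every method below is a hypothetical stand-in for the real S3, etcd, and Iceberg/Parquet clients, not KafScale’s actual API:

```go
package processor

// Segment describes one sealed .kfs object in S3 (simplified for this sketch).
type Segment struct {
	Key        string
	LastOffset int64
}

// Stores abstracts the external systems the processor talks to; all of these
// methods are assumptions made for illustration.
type Stores interface {
	AcquireLease(topic string) (release func(), err error) // topic-scoped lease in etcd
	LoadOffset(topic string) int64                          // last committed offset, also in etcd
	ListSegments(topic string, after int64) []Segment       // completed .kfs segments in S3
	WriteParquet(topic string, seg Segment) error           // Parquet file into the Iceberg warehouse
	CommitOffset(topic string, offset int64)                 // progress recorded in etcd
}

// ProcessTopic drains new segments for one topic. Brokers are never contacted:
// input comes from S3, state lives in etcd, output goes to the warehouse.
func ProcessTopic(s Stores, topic string) error {
	release, err := s.AcquireLease(topic)
	if err != nil {
		return err
	}
	defer release()

	offset := s.LoadOffset(topic)
	for _, seg := range s.ListSegments(topic, offset) {
		if err := s.WriteParquet(topic, seg); err != nil {
			return err
		}
		offset = seg.LastOffset
		s.CommitOffset(topic, offset)
	}
	return nil
}
```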

For deployment and configuration, see the Iceberg Processor docs.


Key design decisions

Decision                         Rationale
Proxy for topology abstraction   Clients see one endpoint. Brokers scale without client awareness.
S3 as source of truth            11 nines durability, unlimited capacity, ~$0.023/GB/month
Stateless brokers                Any pod serves any partition. HPA scales 0→N instantly.
etcd for metadata                Leverages existing K8s patterns. Strong consistency.
~500ms latency                   Acceptable trade-off for ETL, logs, async events
No transactions                  Simplifies architecture. Covers 80% of Kafka use cases.
4MB segment size                 Balances S3 PUT costs (~$0.005/1000) vs flush latency
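
For example, at 4MB per segment, writing 1GB of data generates roughly 250 S3 PUT requests, about $0.001 in request charges at ~$0.005 per 1,000; halving the segment size would double that request count.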

Produce flow

Write path
[Diagram: (1) the producer’s Kafka client sends records to a broker, which validates, batches, and assigns offsets; (2) batches accumulate in an in-memory buffer; (3) the sealed segment is flushed to S3.]
  1. Produce: Client sends records to any broker via Kafka protocol
  2. Batch: Broker validates, batches records, assigns offsets
  3. Flush: When the buffer reaches 4MB or 500ms have elapsed, the segment is sealed and uploaded to S3

Data is not acknowledged until S3 upload completes. This guarantees 11 nines durability on ACK.
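
A simplified sketch of that buffering and flush logic; the uploader interface and key naming are assumptions, and the real broker buffers per partition while speaking the Kafka wire protocol:

```go
package broker

import (
	"bytes"
	"time"
)

const (
	maxSegmentBytes = 4 << 20                // seal the segment at 4MB...
	maxFlushDelay   = 500 * time.Millisecond // ...or after 500ms, whichever comes first
)

// uploader is a stand-in for the S3 client; PutSegment is hypothetical.
type uploader interface {
	PutSegment(key string, data []byte) error
}

type segmentBuffer struct {
	s3         uploader
	buf        bytes.Buffer
	firstWrite time.Time
	nextOffset int64
	pendingAck []chan error // producers waiting for a durable acknowledgement
}

// Append assigns the next offset, buffers the record, and returns a channel
// that resolves only once the segment is durably stored in S3.
func (b *segmentBuffer) Append(record []byte) (offset int64, ack <-chan error) {
	if b.buf.Len() == 0 {
		b.firstWrite = time.Now()
	}
	offset = b.nextOffset
	b.nextOffset++
	b.buf.Write(record)

	ch := make(chan error, 1)
	b.pendingAck = append(b.pendingAck, ch)

	if b.buf.Len() >= maxSegmentBytes || time.Since(b.firstWrite) >= maxFlushDelay {
		b.flush()
	}
	return offset, ch
}

// flush seals the segment, uploads it, and only then acknowledges producers,
// so an ACK always means the data is already in S3.
func (b *segmentBuffer) flush() {
	err := b.s3.PutSegment("segment.kfs", b.buf.Bytes()) // key layout is simplified here
	for _, ch := range b.pendingAck {
		ch <- err
	}
	b.pendingAck = nil
	b.buf.Reset()
}
```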


Fetch flow

Read path
[Diagram: (1) the consumer fetches from a broker; (2) the broker locates the segment and checks its LRU cache; (3) a hit takes the fast path, a miss fetches from S3; (4) the cache is populated; (5) data is returned.]
  1. Fetch: Consumer requests data from broker
  2. Cache check: Broker looks up segment in LRU cache
  3. S3 fetch: On cache miss, broker fetches from S3
  4. Populate: Fetched segment is cached for future requests
  5. Return: Data returned to consumer
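
A sketch of this cache-aside read path, with hypothetical cache and store interfaces rather than KafScale’s real types:

```go
package broker

// objectStore is a stand-in for the S3 client; GetSegment is hypothetical.
type objectStore interface {
	GetSegment(key string) ([]byte, error)
}

// lruCache is a stand-in for the broker in-memory LRU segment cache.
type lruCache interface {
	Get(key string) ([]byte, bool)
	Put(key string, data []byte)
}

type reader struct {
	s3    objectStore
	cache lruCache
}

// ReadSegment serves a fetch: a cache hit is the fast path; on a miss the
// broker pulls the segment from S3 and caches it for future consumers.
func (r *reader) ReadSegment(key string) ([]byte, error) {
	if data, ok := r.cache.Get(key); ok {
		return data, nil // hit: serve from memory
	}
	data, err := r.s3.GetSegment(key) // miss: S3 is the source of truth
	if err != nil {
		return nil, err
	}
	r.cache.Put(key, data)
	return data, nil
}
```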

Component responsibilities

Component   Responsibilities
Proxy       Rewrites Metadata/FindCoordinator responses, routes requests to brokers, enables topology abstraction
Broker      Kafka protocol, batching, offset assignment, S3 read/write, caching
etcd        Topic metadata, consumer offsets, group assignments, leader election
S3          Durable segment storage, source of truth, lifecycle-based retention
Operator    CRD reconciliation, etcd snapshots, broker lifecycle management

Segment format summary

Segments are self-contained files with a header, Kafka-compatible record batches, and a footer.

Field           Size       Description
Magic           4 bytes    0x4B414653 (“KAFS”)
Version         2 bytes    Format version (1)
Flags           2 bytes    Compression codec
Base Offset     8 bytes    First offset in segment
Message Count   4 bytes    Number of messages
Timestamp       8 bytes    Created (Unix ms)
Batches         variable   Kafka RecordBatch format
CRC32           4 bytes    Checksum
Footer Magic    4 bytes    0x454E4421 (“END!”)

See Storage Format for complete details on segment structure, indexes, and S3 key layout.
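
As an illustration only, a sketch that encodes the fixed-size header fields in the order the table lists them; byte order and exact packing are assumptions here, so the Storage Format docs remain authoritative:

```go
package segment

import (
	"encoding/binary"
	"time"
)

const (
	headerMagic = 0x4B414653 // "KAFS"
	footerMagic = 0x454E4421 // "END!"
	version     = 1
)

// encodeHeader packs magic, version, flags, base offset, message count, and
// the creation timestamp (Unix ms) into a 28-byte header. Big-endian byte
// order is an assumption of this sketch.
func encodeHeader(flags uint16, baseOffset int64, messageCount uint32, created time.Time) []byte {
	buf := make([]byte, 0, 28)
	buf = binary.BigEndian.AppendUint32(buf, headerMagic)
	buf = binary.BigEndian.AppendUint16(buf, version)
	buf = binary.BigEndian.AppendUint16(buf, flags) // compression codec
	buf = binary.BigEndian.AppendUint64(buf, uint64(baseOffset))
	buf = binary.BigEndian.AppendUint32(buf, messageCount)
	buf = binary.BigEndian.AppendUint64(buf, uint64(created.UnixMilli()))
	return buf
}
```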


Next steps