Apache 2.0 licensed. No vendor lock-in. Self-hosted.
One endpoint. Infinite scale.
Kafka-compatible streaming platform.
Scale streaming and analytics cloud-natively on S3. Fully automated.
What teams are saying
"After WarpStream got acquired, KafScale became our go-to. Better S3 integration, lower latency than we expected, fully scalable, and minimal ops burden."
— Platform team, Series B fintech
"We moved 50 topics off Kafka in a weekend. No more disk alerts, no more partition rebalancing. Our on-call rotation got a lot quieter."
— SRE lead, e-commerce platform
"The Apache 2.0 license was the deciding factor. We can't build on BSL projects, and we won't depend on a vendor's control plane."
— CTO, healthcare data startup
Why teams adopt KafScale
One endpoint, infinite producers
Kafka clients normally discover partition leaders and connect to each broker directly. KafScale's proxy rewrites metadata responses instead. One DNS name. Brokers scale behind it. Clients never break.
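The rewrite the proxy performs can be sketched as a pure function over a simplified Metadata response. This is an illustration, not KafScale's actual implementation: the dict shape, field names, and the `kafka.example.com` endpoint are all hypothetical, and real Kafka wire-protocol parsing is omitted.

```python
# Sketch of the proxy's metadata rewrite over a dict-shaped
# Metadata response. Names and shapes are illustrative only.

PROXY_HOST = "kafka.example.com"  # hypothetical single DNS name clients see
PROXY_PORT = 9092

def rewrite_metadata(response: dict) -> dict:
    """Point every broker entry at the proxy so clients never
    learn the real broker topology."""
    rewritten = dict(response)
    rewritten["brokers"] = [
        {**broker, "host": PROXY_HOST, "port": PROXY_PORT}
        for broker in response["brokers"]
    ]
    return rewritten

# Two internal brokers collapse to one advertised endpoint; node IDs
# are preserved so partition-to-leader mappings still resolve.
resp = {
    "brokers": [
        {"node_id": 0, "host": "broker-0.internal", "port": 9092},
        {"node_id": 1, "host": "broker-1.internal", "port": 9092},
    ],
    "topics": [{"name": "orders", "partitions": [0, 1]}],
}
out = rewrite_metadata(resp)
assert all(b["host"] == PROXY_HOST for b in out["brokers"])
```

Because every leader now resolves to the same address, brokers can be added or removed behind the proxy without clients ever re-bootstrapping.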
Stateless brokers
Spin brokers up and down without disk shuffles. S3 is the source of truth. No partition rebalancing, ever.
S3-native durability
S3's 11 nines (99.999999999%) of durability. Immutable segments, lifecycle-based retention, predictable costs.
Storage-native processing
Processors read segments directly from S3, bypassing brokers entirely. Streaming and analytics never compete for the same resources.
Open segment format
The .kfs format is documented. Build custom processors without waiting for vendors to ship features.
Apache 2.0 license
No BSL restrictions. No usage fees. No control plane dependency. Fork it, sell it, run it however you want.
What you should consider
KafScale is not a drop-in replacement for every Kafka workload. Here's when it fits and when it doesn't.
KafScale is for you if
- Latency of 200-500ms is acceptable
- You run ETL, logs, or async events
- You want processors that bypass brokers (Iceberg, analytics, AI agents)
- You want minimal ops and no disk management
- Apache 2.0 licensing matters to you
- You prefer self-hosted over managed services
KafScale is not for you if
- You need sub-10ms latency
- You require Kafka transactions (exactly-once across topics)
- You rely on compacted topics
- You want a fully managed service
How KafScale works
Clients connect to a single proxy endpoint. The proxy rewrites Kafka metadata responses so clients never see broker topology. Brokers flush segments to S3. Processors read directly from S3 without touching brokers.
Streaming and analytics share data but never compete for resources.
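One property that makes direct-from-S3 reads practical is how segment objects are keyed. The scheme below is a made-up example to show the idea; the real `.kfs` key paths are defined in the storage format docs.

```python
# Illustrative sketch of deriving an S3 object key for a flushed
# segment. The actual .kfs key scheme lives in KafScale's storage
# docs; this layout is a hypothetical example of the pattern.

def segment_key(topic: str, partition: int, base_offset: int) -> str:
    # Zero-pad the base offset so keys sort lexicographically in
    # log order, letting a processor list a partition's segments
    # with a plain S3 prefix listing and read them sequentially.
    return f"topics/{topic}/{partition:04d}/{base_offset:020d}.kfs"

key = segment_key("orders", 3, 1_500_000)
# -> "topics/orders/0003/00000000000001500000.kfs"
```

With keys like these, a processor needs nothing from a broker to find data: a prefix listing under `topics/orders/0003/` yields every segment for that partition, already ordered.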
Built for AI agent infrastructure
AI agents making decisions need context. That context comes from historical events: what happened, in what order, and why the current state exists. Traditional stream processing optimizes for milliseconds. Agents need something different: completeness, replay capability, and the ability to reconcile current state with historical actions.
Storage-native streaming makes this practical. The immutable log in S3 becomes the source of truth that agents query, replay, and reason over. The Iceberg Processor converts that log to tables that analytical tools understand. Agents get complete historical context without competing with streaming workloads for broker resources.
Two-second latency for analytical access is acceptable when the alternative is incomplete context or degraded streaming performance. AI agents do not need sub-millisecond reads. They need the full picture.
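The replay pattern described above is just a fold over the immutable log: the agent rebuilds current state while keeping the ordered history that explains it. A minimal sketch, with illustrative event shapes that are not part of any KafScale API:

```python
# Minimal sketch of agent-style replay: fold over an ordered,
# immutable event log to reconstruct state plus its history.
# Event fields here are hypothetical examples.

def replay(events):
    """Rebuild account balances and keep the full ordered history."""
    balances, history = {}, []
    for e in events:
        delta = e["amount"] if e["type"] == "credit" else -e["amount"]
        balances[e["account"]] = balances.get(e["account"], 0) + delta
        history.append((e["account"], e["type"], e["amount"]))
    return balances, history

log = [
    {"account": "a1", "type": "credit", "amount": 100},
    {"account": "a1", "type": "debit", "amount": 30},
    {"account": "a2", "type": "credit", "amount": 50},
]
balances, history = replay(log)
# balances == {"a1": 70, "a2": 50}; history preserves order and cause
```

The state (`balances`) answers "what is true now"; the history answers "why", which is the part traditional low-latency stream processing discards.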
Processors
Processors read completed segments directly from S3, enabling independent scaling and custom implementations. The .kfs segment format is open and documented.
Iceberg Processor
Reads .kfs segments from S3. Writes Parquet to Iceberg tables. Works with Unity Catalog, Polaris, AWS Glue. Zero broker load.
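Conceptually, the core of that job is pivoting row-oriented records decoded from a segment into the columnar layout Parquet stores. The sketch below shows only that pivot, with illustrative record fields; the real processor also handles schema evolution, Parquet encoding, and Iceberg catalog commits.

```python
# Conceptual sketch of the row-to-columnar step a processor
# performs before Parquet encoding. Record fields are illustrative;
# this assumes every record shares the same schema.

def to_columnar(records):
    columns = {}
    for rec in records:
        for field, value in rec.items():
            columns.setdefault(field, []).append(value)
    return columns

segment_records = [
    {"offset": 0, "key": "u1", "value": "signup"},
    {"offset": 1, "key": "u2", "value": "login"},
]
cols = to_columnar(segment_records)
# cols == {"offset": [0, 1], "key": ["u1", "u2"], "value": ["signup", "login"]}
```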
Documentation
SQL Processor (KAFSQL)
Query KafScale segments with Postgres-compatible SQL. No Flink, no Spark, no complex pipelines. Just SQL against your Kafka data in S3.
Documentation
Build your own
The .kfs segment format is documented. Build processors for your use case without waiting for vendors or negotiating enterprise contracts.
Storage format spec
Documentation
Protocol compatibility
21 Kafka APIs supported. Produce, Fetch, Metadata, consumer groups, and more.
View API docs
Storage format
Segment layout, index structure, S3 key paths, and cache architecture.
Explore storage
Get started
Install the operator, define a topic, produce with existing Kafka tools. If you already run Kubernetes and Kafka clients, you can deploy a cluster and start producing data in minutes.
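Defining a topic typically means applying a small manifest. Treat the one below as a hypothetical illustration: the operator's actual CRD group, version, and field names come from the KafScale docs and may differ.

```yaml
# Hypothetical Topic manifest; the real CRD group, version, and
# fields are defined by the KafScale operator - check the docs.
apiVersion: kafscale.io/v1alpha1
kind: Topic
metadata:
  name: orders
spec:
  partitions: 6
  retention: 7d
```

Once the topic exists, any standard Kafka client can produce to it through the single proxy endpoint.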
Backed by
KafScale is developed and maintained with support from Scalytics, Inc. and NovaTechflow.
Apache 2.0 licensed. No CLA required. Contributions welcome.