# Onboarding Guide
A progressive tutorial collection that takes you from basic Kafka client operations to full stream-processing pipelines on KafScale.
## Who this is for
- Developers new to KafScale who already know the basics of Apache Kafka
- Teams evaluating KafScale as a drop-in Kafka replacement
- Anyone wanting hands-on experience with KafScale’s stateless-broker architecture
## Prerequisites
- Docker 20.10+, Java 11+, Maven 3.6+
- kubectl 1.28+, kind 0.20+, helm 3.12+
- make, curl, git
- Basic understanding of Kafka concepts (topics, producers, consumers, consumer groups)
## Learning path
Work through the examples in order. Each builds on concepts from the previous one.
| Example | Framework | What you learn | Skill level | Time |
|---|---|---|---|---|
| E10 | Pure Java | Produce/consume basics, KafScale-specific config | Beginner | 15 min |
| E20 | Spring Boot | REST API integration, profiles, Docker deployment | Beginner | 20 min |
| E30 | Apache Flink | Stateful streaming, 3 deployment modes | Intermediate | 30 min |
| E40 | Apache Spark | Structured streaming, DataFrame API, checkpoints | Intermediate | 30 min |
| E50 | JavaScript/Node.js | Web UI, agent integration, KafkaJS client | Intermediate | 20 min |
Total time: ~2 hours for the full path, or ~35 minutes for the minimal path (E10 + E20).
## E10: Java Kafka Client
Directory: `examples/E10_java-kafka-client-demo/`
The starting point. Uses the standard kafka-clients library to list topics, create a topic, produce 25 messages, and consume 5 messages.
Key KafScale configuration:
- `acks=0` — KafScale brokers are stateless; acknowledgement semantics differ from traditional Kafka
- `enable.idempotence=false` — KafScale does not support idempotent producers
- Bootstrap server: `127.0.0.1:39092` (default local dev port)
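The settings above can be collected into plain producer properties. The sketch below uses literal string keys (equivalent to the `ProducerConfig` constants in `kafka-clients`); the string serializers are an assumption for a text-message demo, not taken from the example's source.

```java
import java.util.Properties;

public class KafScaleProducerConfig {
    // Producer settings matching the list above; string keys are
    // equivalent to the ProducerConfig constants in kafka-clients.
    public static Properties producerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "127.0.0.1:39092"); // default local dev port
        props.put("acks", "0");                            // stateless brokers: no ack wait
        props.put("enable.idempotence", "false");          // idempotence unsupported
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        return props;
    }

    public static void main(String[] args) {
        // Pass these to new KafkaProducer<>(producerProps()) in the demo.
        System.out.println(producerProps().getProperty("acks")); // prints "0"
    }
}
```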
```shell
cd examples/E10_java-kafka-client-demo
mvn clean package
java -jar target/kafka-client-demo-*.jar
```
## E20: Spring Boot
Directory: `examples/E20_spring-boot-kafscale-demo/`
A REST API that produces and consumes orders through KafScale, with a Bootstrap 5 web UI.
REST endpoints:
| Method | Path | Description |
|---|---|---|
| POST | `/api/orders` | Create and send an order to Kafka |
| GET | `/api/orders` | List consumed orders (in-memory) |
| GET | `/api/orders/cluster-info` | View cluster metadata and topics |
| POST | `/api/orders/test-connection` | Test Kafka connectivity |
| GET | `/actuator/prometheus` | Prometheus metrics |
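A minimal client-side sketch for the create-order endpoint, using the JDK 11 `HttpRequest` builder. The JSON field names (`product`, `quantity`) are illustrative, not the demo's actual order schema.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class OrderClient {
    // Build a POST /api/orders request against a running E20 instance.
    public static HttpRequest createOrderRequest(String baseUrl, String json) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/orders"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = createOrderRequest("http://localhost:8080",
                "{\"product\":\"widget\",\"quantity\":2}");
        System.out.println(req.method() + " " + req.uri());
        // Send with HttpClient.newHttpClient().send(req,
        // HttpResponse.BodyHandlers.ofString()) once the E20 app is up.
    }
}
```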
Deployment profiles: see Common configuration below.
```shell
cd examples/E20_spring-boot-kafscale-demo
mvn clean package
java -jar target/kafscale-demo-*.jar
# Or with Docker:
docker build -t kafscale-demo-e20 .
docker run -p 8080:8080 kafscale-demo-e20
```
## E30: Apache Flink
Directory: `examples/E30_flink-kafscale-demo/`
A Flink streaming job that performs word counts across message headers, keys, and values separately, tracking statistics for missing fields.
Three deployment modes:
- Standalone Java — runs locally with embedded Flink mini-cluster (Web UI at `localhost:8091`)
- Docker standalone cluster — Flink JobManager + TaskManager containers (Web UI at `localhost:8081`)
- Kubernetes/kind cluster — full K8s deployment
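The job's core idea can be sketched in plain Java. This shows only the per-field counting logic, not Flink's DataStream API, and the `<missing>` bucket name is an illustrative convention, not the demo's actual output format.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FieldWordCount {
    // Count words separately per record field (header, key, value),
    // tracking how often each field is absent.
    public static Map<String, Map<String, Integer>> count(List<Map<String, String>> records) {
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (String field : List.of("header", "key", "value")) {
            Map<String, Integer> perField = new HashMap<>();
            for (Map<String, String> rec : records) {
                String text = rec.get(field);
                if (text == null) {
                    perField.merge("<missing>", 1, Integer::sum); // missing-field statistic
                } else {
                    for (String word : text.split("\\s+")) {
                        if (!word.isEmpty()) perField.merge(word, 1, Integer::sum);
                    }
                }
            }
            counts.put(field, perField);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map<String, String>> records = List.of(
                Map.of("key", "a", "value", "hello hello"),           // no header
                Map.of("header", "h1", "key", "a", "value", "hello"));
        System.out.println(count(records));
    }
}
```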
```shell
cd examples/E30_flink-kafscale-demo
mvn clean package
# Standalone:
make run-standalone
# Docker:
make run-docker
# Kubernetes:
make run-k8s
```
## E40: Apache Spark
Directory: `examples/E40_spark-kafscale-demo/`
Structured streaming with micro-batch execution. Groups word counts by field type (header, key, value) using the DataFrame API.
Checkpointing:
- Default location: `/tmp/kafscale-spark-checkpoints`
- Supports durable storage (NFS, S3) for production
- The `failOnDataLoss` flag controls behavior on offset conflicts
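In the Structured Streaming API these options plug in roughly as follows. This is a configuration fragment, not the demo's literal source: it assumes a `SparkSession` named `spark`, an illustrative topic name, and the `spark-sql-kafka` dependency on the classpath.

```java
// Read from KafScale; failOnDataLoss governs offset-conflict behavior.
Dataset<Row> stream = spark.readStream()
        .format("kafka")
        .option("kafka.bootstrap.servers", "127.0.0.1:39092")
        .option("subscribe", "words")       // illustrative topic name
        .option("failOnDataLoss", "false")
        .load();

// Checkpoint to the default local path; point this at NFS/S3 in production.
StreamingQuery query = stream.writeStream()
        .format("console")
        .option("checkpointLocation", "/tmp/kafscale-spark-checkpoints")
        .start();
```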
```shell
cd examples/E40_spark-kafscale-demo
mvn clean package
make run
```
## E50: JavaScript / Node.js
Directory: `examples/E50_JS-kafscale-demo/`
A Node.js application using KafkaJS with an interactive Kanban board UI (drag-and-drop task management) and real-time WebSocket monitoring.
Agent architecture: queue-driven, with Kafka topics for orchestration — requests flow through `agent.requests` to the agent service, and responses return on `agent.responses`.
```shell
cd examples/E50_JS-kafscale-demo
npm install
npm start
# Open http://localhost:3000
```
## Common configuration
All Java examples (E10–E40) use Spring-style deployment profiles:
| Profile | Bootstrap server | Use case |
|---|---|---|
| `default` | `localhost:39092` | Local development |
| `cluster` | `kafscale-broker:9092` | In-cluster Kubernetes |
| `local-lb` | `localhost:59092` | Remote via load balancer |
Activate a profile with `--spring.profiles.active=cluster` or the `SPRING_PROFILES_ACTIVE` environment variable.
Environment variables common across examples:
| Variable | Default | Description |
|---|---|---|
| `KAFKA_BOOTSTRAP_SERVERS` | `127.0.0.1:39092` | Kafka/KafScale broker address |
| `KAFKA_TOPIC` | (varies by example) | Target topic name |
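The usual resolution order (environment variable first, then the documented default) can be sketched as follows; the helper names are ours, not taken from the examples' source.

```java
public class EnvConfig {
    // Prefer the environment variable; fall back to the documented default.
    static String resolve(String envValue) {
        return (envValue == null || envValue.isEmpty()) ? "127.0.0.1:39092" : envValue;
    }

    public static String bootstrapServers() {
        return resolve(System.getenv("KAFKA_BOOTSTRAP_SERVERS"));
    }

    public static void main(String[] args) {
        System.out.println(bootstrapServers());
    }
}
```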
## Developer guide
The `101_kafscale-dev-guide/` directory contains an 8-chapter written guide covering:
1. Introduction — what KafScale is, architecture overview, limitations
2. Quick Start — run the local demo with `make demo`
3. Spring Boot Configuration — configure your application for KafScale
4. Running Your Application — platform demo walkthrough
5. Troubleshooting — common issues and fixes
6. Next Steps — production deployment guidance
7. Flink Word Count Demo — stream processing with Flink (E30)
8. Spark Word Count Demo — stream processing with Spark (E40)
Core path (chapters 1–6): ~45–60 minutes. Each stream processing chapter adds 20–30 minutes.
## Next steps
- Quickstart — install the KafScale operator and create your first topic
- User Guide — full reference for operating KafScale
- Architecture — how stateless brokers and S3 storage work together