Storage Format
KafScale stores all message data in S3 as immutable segment files. This page covers the binary formats, caching strategy, and retention configuration.
S3 key layout
s3://{bucket}/{namespace}/{topic}/{partition}/segment-{base_offset}.kfs
s3://{bucket}/{namespace}/{topic}/{partition}/segment-{base_offset}.index
Example:
s3://kafscale-data/production/orders/0/segment-00000000000000000000.kfs
s3://kafscale-data/production/orders/0/segment-00000000000000000000.index
The 20-digit zero-padded offset ensures lexicographic sorting matches offset order.
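For illustration, the object key for a sealed segment can be derived from the namespace, topic, partition, and base offset. The helper below is a hypothetical sketch, not part of KafScale:

```go
package layout

import "fmt"

// segmentKey builds the S3 object key (relative to the bucket) for a sealed
// segment. The base offset is zero-padded to 20 digits so that lexicographic
// key ordering matches numeric offset ordering.
func segmentKey(namespace, topic string, partition int32, baseOffset int64) string {
	return fmt.Sprintf("%s/%s/%d/segment-%020d.kfs", namespace, topic, partition, baseOffset)
}
```

For example, segmentKey("production", "orders", 0, 0) yields production/orders/0/segment-00000000000000000000.kfs, matching the example above.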
Segment file format
Each .kfs segment is a self-contained file with header, message batches, and footer.
Segment header (32 bytes)
The header begins with the magic number 0x4B414653 ("KAFS").
Segment body (variable)
The body is a sequence of message batches in the format described below.
Segment footer (16 bytes)
The footer is identified by the magic number 0x454E4421 ("END!").
Message batch format
Batches are Kafka-compatible (magic byte 2) for client interoperability.
Batch header (49 bytes)
Notable defaults: the magic byte is 2 (Kafka v2 format), and the producer ID is -1 (no idempotence), with the remaining producer fields (epoch and base sequence) likewise set to -1.
Individual record format
Each record within a batch uses varint encoding for compactness.
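Go's encoding/binary package implements the same zigzag signed-varint scheme, so it can illustrate why small values stay compact. This is a standalone sketch, assuming KafScale follows Kafka's varint convention for record fields:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

func main() {
	buf := make([]byte, binary.MaxVarintLen64)

	// -1 (e.g. a null key length) zigzags to 1 and fits in a single byte.
	n := binary.PutVarint(buf, -1)
	fmt.Printf("-1  -> % x (%d byte)\n", buf[:n], n)

	// A 300-byte value length still needs only two bytes.
	n = binary.PutVarint(buf, 300)
	fmt.Printf("300 -> % x (%d bytes)\n", buf[:n], n)
}
```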
In particular, key and value lengths are varint-encoded as -1 for null, otherwise the byte count.
Index file format
Sparse index for fast offset-to-position lookups. One entry per N messages.
Index header (16 bytes)
The header contains the magic number 0x494458 ("IDX") and format version 1.
Index entries (12 bytes each)
To locate offset N: binary-search the index entries for the largest indexed offset at or below N, then scan forward from that entry's position.
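A minimal sketch of that lookup follows. The indexEntry fields (a relative offset plus a byte position) are assumptions about what the 12-byte entries contain, not the documented wire layout:

```go
package index

import "sort"

// indexEntry is an in-memory view of one sparse index entry (hypothetical
// field widths; the on-disk entry is 12 bytes).
type indexEntry struct {
	RelativeOffset int64 // offset minus the segment's base offset
	Position       int64 // byte position of the containing batch in the .kfs file
}

// findPosition returns the byte position to start scanning from when looking
// for target (an absolute offset) in a segment with the given base offset.
// entries must be sorted by RelativeOffset.
func findPosition(entries []indexEntry, baseOffset, target int64) int64 {
	rel := target - baseOffset
	// Find the first entry above rel; the entry before it holds the largest
	// indexed offset at or below rel.
	i := sort.Search(len(entries), func(i int) bool {
		return entries[i].RelativeOffset > rel
	})
	if i == 0 {
		return 0 // target precedes the first indexed entry; scan from the start
	}
	return entries[i-1].Position
}
```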
Cache architecture
KafScale serves reads through a two-tier in-memory cache in front of S3: an L1 cache for hot segment data and an L2 cache for index files, plus optional read-ahead of upcoming segments. The sizes and read-ahead depth are controlled by the variables below.
Cache configuration
| Variable | Default | Description |
|---|---|---|
| KAFSCALE_CACHE_SIZE | 1GB | L1 hot segment cache size |
| KAFSCALE_INDEX_CACHE_SIZE | 256MB | L2 index cache size |
| KAFSCALE_READAHEAD_SEGMENTS | 2 | Segments to prefetch |
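The sketch below shows one way these variables and their documented defaults could be read at startup; the helper and type names are hypothetical, not KafScale code:

```go
package config

import (
	"os"
	"strconv"
	"strings"
)

// parseSize converts values such as "1GB" or "256MB" into bytes, falling back
// to def when the variable is unset or malformed.
func parseSize(s string, def int64) int64 {
	units := []struct {
		suffix string
		mult   int64
	}{{"GB", 1 << 30}, {"MB", 1 << 20}, {"KB", 1 << 10}}
	s = strings.ToUpper(strings.TrimSpace(s))
	for _, u := range units {
		if strings.HasSuffix(s, u.suffix) {
			if n, err := strconv.ParseInt(strings.TrimSuffix(s, u.suffix), 10, 64); err == nil {
				return n * u.mult
			}
		}
	}
	return def
}

type cacheConfig struct {
	SegmentCacheBytes int64 // KAFSCALE_CACHE_SIZE, default 1GB
	IndexCacheBytes   int64 // KAFSCALE_INDEX_CACHE_SIZE, default 256MB
	ReadaheadSegments int   // KAFSCALE_READAHEAD_SEGMENTS, default 2
}

func loadCacheConfig() cacheConfig {
	cfg := cacheConfig{
		SegmentCacheBytes: parseSize(os.Getenv("KAFSCALE_CACHE_SIZE"), 1<<30),
		IndexCacheBytes:   parseSize(os.Getenv("KAFSCALE_INDEX_CACHE_SIZE"), 256<<20),
		ReadaheadSegments: 2,
	}
	if n, err := strconv.Atoi(os.Getenv("KAFSCALE_READAHEAD_SEGMENTS")); err == nil && n >= 0 {
		cfg.ReadaheadSegments = n
	}
	return cfg
}
```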
Flush triggers
Segments are sealed and flushed to S3 when any condition is met:
- The active segment buffer reaches KAFSCALE_SEGMENT_BYTES
- KAFSCALE_FLUSH_INTERVAL_MS elapses since the last flush
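Conceptually the check reduces to a size-or-age test, as in this illustrative sketch (names are not KafScale APIs):

```go
package flush

import "time"

// shouldFlush reports whether the active buffer must be sealed and uploaded,
// given limits derived from KAFSCALE_SEGMENT_BYTES and KAFSCALE_FLUSH_INTERVAL_MS.
func shouldFlush(bufferedBytes, maxSegmentBytes int64, lastFlush time.Time, flushInterval time.Duration) bool {
	return bufferedBytes >= maxSegmentBytes || time.Since(lastFlush) >= flushInterval
}
```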
Flush sequence
- Seal current buffer (no more writes accepted)
- Compress batches (Snappy by default)
- Build sparse index file
- Upload segment + index to S3 (both must succeed)
- Update etcd with new segment metadata
- Ack waiting producers (if acks=all)
- Clear flushed data from buffer
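The sketch below strings these steps together. The segmentBuffer, objectStore, and metaStore interfaces are hypothetical stand-ins rather than KafScale APIs, and details such as segment header/footer framing are omitted:

```go
package flush

import (
	"context"
	"fmt"

	"github.com/golang/snappy"
)

type segmentBuffer interface {
	Seal()                    // stop accepting writes
	Batches() [][]byte        // sealed message batches, uncompressed
	BuildSparseIndex() []byte // encoded contents of the .index file
	Namespace() string
	Topic() string
	Partition() int32
	BaseOffset() int64
	AckProducers() // complete pending acks=all produce requests
	Clear()        // drop flushed data from memory
}

type objectStore interface {
	Put(ctx context.Context, key string, data []byte) error
}

type metaStore interface {
	RegisterSegment(ctx context.Context, topic string, partition int32, baseOffset int64) error
}

func flushSegment(ctx context.Context, buf segmentBuffer, store objectStore, meta metaStore) error {
	buf.Seal() // 1. seal the buffer

	// 2. Compress each batch (Snappy by default). Simplified: a real segment
	// keeps batch headers uncompressed and compresses only the records.
	var body []byte
	for _, batch := range buf.Batches() {
		body = append(body, snappy.Encode(nil, batch)...)
	}

	index := buf.BuildSparseIndex() // 3. build the sparse index

	base := fmt.Sprintf("%s/%s/%d/segment-%020d",
		buf.Namespace(), buf.Topic(), buf.Partition(), buf.BaseOffset())

	// 4. Upload segment and index; both must succeed before the segment is durable.
	if err := store.Put(ctx, base+".kfs", body); err != nil {
		return err
	}
	if err := store.Put(ctx, base+".index", index); err != nil {
		return err
	}

	// 5. Record the new segment's metadata in etcd.
	if err := meta.RegisterSegment(ctx, buf.Topic(), buf.Partition(), buf.BaseOffset()); err != nil {
		return err
	}

	buf.AckProducers() // 6. ack waiting producers (acks=all)
	buf.Clear()        // 7. clear the flushed data from the buffer
	return nil
}
```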
S3 lifecycle configuration
Use bucket lifecycle rules to automatically expire old segments. Align with your topic retention settings.
Example: 7-day retention
{
"Rules": [
{
"ID": "kafscale-retention-7d",
"Filter": {
"Prefix": "production/"
},
"Status": "Enabled",
"Expiration": {
"Days": 7
}
}
]
}
AWS CLI setup
aws s3api put-bucket-lifecycle-configuration \
--bucket kafscale-data \
--lifecycle-configuration file://lifecycle.json
Terraform example
resource "aws_s3_bucket_lifecycle_configuration" "kafscale" {
bucket = aws_s3_bucket.kafscale_data.id
rule {
id = "kafscale-retention"
status = "Enabled"
filter {
prefix = "production/"
}
expiration {
days = 7
}
}
}
Per-topic retention
For different retention per topic, use prefix-based rules:
{
"Rules": [
{
"ID": "logs-1d",
"Filter": { "Prefix": "production/logs/" },
"Status": "Enabled",
"Expiration": { "Days": 1 }
},
{
"ID": "events-30d",
"Filter": { "Prefix": "production/events/" },
"Status": "Enabled",
"Expiration": { "Days": 30 }
},
{
"ID": "default-7d",
"Filter": { "Prefix": "production/" },
"Status": "Enabled",
"Expiration": { "Days": 7 }
}
]
}
Note that S3 applies every rule whose filter matches an object, regardless of rule order or prefix specificity; when expiration rules conflict, the shorter expiration wins. With the configuration above, production/events/ objects also match the catch-all production/ rule and would expire after 7 days, so keep retention classes on non-overlapping prefixes (or use tag-based filters) when a topic needs longer retention than the default.
Compression
KafScale supports batch-level compression using Kafka-compatible codecs.
| Codec | ID | Notes |
|---|---|---|
| None | 0 | No compression |
| Snappy | 1 | Default — fast, moderate ratio |
| LZ4 | 3 | Faster decompression |
| ZSTD | 4 | Best ratio, slower |
Set via the KAFSCALE_COMPRESSION_CODEC environment variable, or per topic in the KafscaleTopic CRD:
apiVersion: kafscale.io/v1alpha1
kind: KafscaleTopic
metadata:
name: logs
spec:
partitions: 6
retention: 24h
compression: zstd # Better ratio for logs