Storage Backend Compatibility
KafScale uses S3-compatible object storage as its source of truth. While designed for AWS S3, it works with any storage backend that implements the S3 API.
Compatibility Matrix
| Provider | Compatibility | Notes |
|---|---|---|
| AWS S3 | ✅ Native | Full support, including IRSA |
| DigitalOcean Spaces | ✅ Native | Drop-in replacement |
| Cloudflare R2 | ✅ Native | Zero egress fees |
| Backblaze B2 | ✅ Native | S3-compatible API |
| Wasabi | ✅ Native | Flat pricing model |
| Linode Object Storage | ✅ Native | S3-compatible |
| Vultr Object Storage | ✅ Native | S3-compatible |
| MinIO | ✅ Native | Self-hosted, any infrastructure |
| Google Cloud Storage | ⚠️ Interop | Requires HMAC keys, XML API |
| Oracle Cloud | ⚠️ Interop | S3 Compatibility API |
| IBM Cloud Object Storage | ⚠️ Interop | S3-compatible API |
| Azure Blob Storage | ❌ Proxy | Requires MinIO Gateway |
Legend:
- ✅ Native: Standard S3 SDK works with endpoint change
- ⚠️ Interop: Works via compatibility layer, minor config differences
- ❌ Proxy: Requires additional infrastructure
AWS S3
Native support. No endpoint configuration needed.
KafScaleCluster
apiVersion: kafscale.io/v1alpha1
kind: KafScaleCluster
metadata:
  name: production
  namespace: kafscale
spec:
  brokers:
    replicas: 3
  s3:
    bucket: kafscale-production
    region: us-east-1
    credentialsSecretRef: kafscale-s3
  etcd:
    endpoints: []
Credentials Secret
kubectl -n kafscale create secret generic kafscale-s3 \
--from-literal=AWS_ACCESS_KEY_ID=AKIA... \
--from-literal=AWS_SECRET_ACCESS_KEY=...
IAM Role (EKS with IRSA)
For production on EKS, use IAM Roles for Service Accounts instead of static credentials:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kafscale-broker
  namespace: kafscale
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/kafscale-s3-role
Required IAM permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::kafscale-production",
        "arn:aws:s3:::kafscale-production/*"
      ]
    }
  ]
}
DigitalOcean Spaces
Drop-in S3 replacement. Change endpoint and region.
KafScaleCluster
apiVersion: kafscale.io/v1alpha1
kind: KafScaleCluster
metadata:
  name: production
  namespace: kafscale
spec:
  brokers:
    replicas: 3
  s3:
    bucket: kafscale-production
    region: nyc3
    endpoint: https://nyc3.digitaloceanspaces.com
    credentialsSecretRef: kafscale-s3
  etcd:
    endpoints: []
Credentials Secret
Generate Spaces access keys in the DigitalOcean console under API → Spaces Keys.
kubectl -n kafscale create secret generic kafscale-s3 \
--from-literal=AWS_ACCESS_KEY_ID=DO00... \
--from-literal=AWS_SECRET_ACCESS_KEY=...
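Available regions include nyc3, sfo3, ams3, sgp1, and fra1 (not an exhaustive list). The region in the endpoint hostname must match the region field.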
Cloudflare R2
S3-compatible with zero egress fees. Ideal for high-read workloads.
KafScaleCluster
apiVersion: kafscale.io/v1alpha1
kind: KafScaleCluster
metadata:
  name: production
  namespace: kafscale
spec:
  brokers:
    replicas: 3
  s3:
    bucket: kafscale-production
    region: auto
    endpoint: https://<ACCOUNT_ID>.r2.cloudflarestorage.com
    credentialsSecretRef: kafscale-s3
  etcd:
    endpoints: []
Credentials Secret
Generate R2 API tokens in the Cloudflare dashboard under R2 → Manage R2 API Tokens.
kubectl -n kafscale create secret generic kafscale-s3 \
--from-literal=AWS_ACCESS_KEY_ID=... \
--from-literal=AWS_SECRET_ACCESS_KEY=...
Note: Replace <ACCOUNT_ID> with your Cloudflare account ID.
Backblaze B2
Cost-effective S3-compatible storage.
KafScaleCluster
apiVersion: kafscale.io/v1alpha1
kind: KafScaleCluster
metadata:
  name: production
  namespace: kafscale
spec:
  brokers:
    replicas: 3
  s3:
    bucket: kafscale-production
    region: us-west-004
    endpoint: https://s3.us-west-004.backblazeb2.com
    credentialsSecretRef: kafscale-s3
  etcd:
    endpoints: []
Credentials Secret
Create an application key in the B2 console with read/write access to your bucket.
kubectl -n kafscale create secret generic kafscale-s3 \
--from-literal=AWS_ACCESS_KEY_ID=<keyID> \
--from-literal=AWS_SECRET_ACCESS_KEY=<applicationKey>
Endpoint format: https://s3.<region>.backblazeb2.com
Wasabi
Hot cloud storage with flat pricing and no egress fees.
KafScaleCluster
apiVersion: kafscale.io/v1alpha1
kind: KafScaleCluster
metadata:
  name: production
  namespace: kafscale
spec:
  brokers:
    replicas: 3
  s3:
    bucket: kafscale-production
    region: us-east-1
    endpoint: https://s3.us-east-1.wasabisys.com
    credentialsSecretRef: kafscale-s3
  etcd:
    endpoints: []
Credentials Secret
kubectl -n kafscale create secret generic kafscale-s3 \
--from-literal=AWS_ACCESS_KEY_ID=... \
--from-literal=AWS_SECRET_ACCESS_KEY=...
Available regions include us-east-1, us-east-2, us-west-1, eu-central-1, eu-west-1, ap-northeast-1, and ap-northeast-2 (not an exhaustive list).
Google Cloud Storage
GCS provides S3 interoperability via its XML API. Requires HMAC keys.
Setup
- Enable interoperability in the GCS console: Storage → Settings → Interoperability
- Create HMAC keys for a service account
KafScaleCluster
apiVersion: kafscale.io/v1alpha1
kind: KafScaleCluster
metadata:
  name: production
  namespace: kafscale
spec:
  brokers:
    replicas: 3
  s3:
    bucket: kafscale-production
    region: auto
    endpoint: https://storage.googleapis.com
    credentialsSecretRef: kafscale-s3
  etcd:
    endpoints: []
Credentials Secret
Use HMAC keys (not JSON service account keys):
kubectl -n kafscale create secret generic kafscale-s3 \
--from-literal=AWS_ACCESS_KEY_ID=GOOG... \
--from-literal=AWS_SECRET_ACCESS_KEY=...
Limitations
- Some S3 features may behave differently (e.g., versioning, lifecycle policies)
- Path-style URLs are the safer choice with this endpoint; virtual-hosted-style addressing may behave differently under interop
- Multipart upload semantics may vary slightly
Azure Blob Storage
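Azure Blob Storage is not S3-compatible. Use MinIO as a gateway proxy. Note that MinIO deprecated gateway mode in 2022 and current releases no longer include it, so this setup depends on an older MinIO image that still ships the gateway subcommand.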
Architecture
KafScale Brokers → MinIO Gateway → Azure Blob Storage
Deploy MinIO Gateway
apiVersion: apps/v1
kind: Deployment
metadata:
  name: minio-azure-gateway
  namespace: kafscale
spec:
  replicas: 2
  selector:
    matchLabels:
      app: minio-gateway
  template:
    metadata:
      labels:
        app: minio-gateway
    spec:
      containers:
        - name: minio
          # Gateway mode was removed from MinIO in 2022, so the :latest tag
          # rejects the args below. Pin an older release that still includes it.
          image: minio/minio:<gateway-capable-release>
          args:
            - gateway
            - azure
          env:
            # Credentials that KafScale presents to the gateway
            - name: MINIO_ROOT_USER
              valueFrom:
                secretKeyRef:
                  name: minio-gateway-creds
                  key: accessKey
            - name: MINIO_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: minio-gateway-creds
                  key: secretKey
            # Credentials that the gateway presents to Azure
            - name: AZURE_STORAGE_ACCOUNT
              valueFrom:
                secretKeyRef:
                  name: azure-storage-creds
                  key: accountName
            - name: AZURE_STORAGE_KEY
              valueFrom:
                secretKeyRef:
                  name: azure-storage-creds
                  key: accountKey
          ports:
            - containerPort: 9000
---
apiVersion: v1
kind: Service
metadata:
  name: minio-gateway
  namespace: kafscale
spec:
  selector:
    app: minio-gateway
  ports:
    - port: 9000
      targetPort: 9000
KafScaleCluster
Point KafScale at the MinIO gateway:
apiVersion: kafscale.io/v1alpha1
kind: KafScaleCluster
metadata:
  name: production
  namespace: kafscale
spec:
  brokers:
    replicas: 3
  s3:
    bucket: kafscale-production
    endpoint: http://minio-gateway.kafscale.svc:9000
    credentialsSecretRef: minio-gateway-creds
  etcd:
    endpoints: []
Secrets
# MinIO gateway credentials (what KafScale uses)
kubectl -n kafscale create secret generic minio-gateway-creds \
--from-literal=accessKey=kafscale-access \
--from-literal=secretKey=kafscale-secret-key
# Azure storage credentials (what MinIO uses)
kubectl -n kafscale create secret generic azure-storage-creds \
--from-literal=accountName=<storage-account-name> \
--from-literal=accountKey=<storage-account-key>
Trade-offs
- Added latency: Extra network hop through gateway
- Operational overhead: Another component to manage and monitor
- Single point of failure: Gateway needs HA configuration
- Cost: Compute for gateway instances
Consider native S3-compatible providers if Azure isn’t a hard requirement.
MinIO (Self-Hosted)
Run your own S3-compatible storage on any infrastructure—on-prem, edge, or any cloud.
Docker Compose (Development)
The default docker-compose.yml includes MinIO:
docker-compose up -d
MinIO runs on port 9000 with default credentials minioadmin:minioadmin.
KafScaleCluster (Kubernetes)
apiVersion: kafscale.io/v1alpha1
kind: KafScaleCluster
metadata:
  name: demo
  namespace: kafscale
spec:
  brokers:
    replicas: 3
  s3:
    bucket: kafscale-data
    endpoint: http://minio.minio-system.svc:9000
    credentialsSecretRef: kafscale-s3
  etcd:
    endpoints: []
Production MinIO
For production, deploy MinIO in distributed mode with erasure coding:
helm repo add minio https://charts.min.io/
helm install minio minio/minio \
--namespace minio-system --create-namespace \
--set mode=distributed \
--set replicas=4 \
--set persistence.size=100Gi
Common Considerations
Path Style vs Virtual Hosted Style
Some S3-compatible backends only support path-style URLs. If you encounter bucket resolution issues, ensure your SDK is configured for path-style access.
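To make the difference concrete, the sketch below prints both URL forms for the same object. The helper names are illustrative only; they are not part of KafScale or any SDK. The endpoint, bucket, and key are reused from examples elsewhere on this page.

```shell
# Illustrative helpers (hypothetical names): build both S3 URL styles.

# Path-style: the bucket is the first path segment.
path_style_url() {
  endpoint="$1"; bucket="$2"; key="$3"
  printf '%s/%s/%s\n' "${endpoint%/}" "$bucket" "$key"
}

# Virtual-hosted-style: the bucket becomes a subdomain of the endpoint host.
virtual_hosted_url() {
  endpoint="$1"; bucket="$2"; key="$3"
  scheme="${endpoint%%://*}"   # e.g. "http"
  host="${endpoint#*://}"      # e.g. "minio.minio-system.svc:9000"
  host="${host%/}"             # drop a trailing slash, if any
  printf '%s://%s.%s/%s\n' "$scheme" "$bucket" "$host" "$key"
}

path_style_url     http://minio.minio-system.svc:9000 kafscale-test test.txt
# http://minio.minio-system.svc:9000/kafscale-test/test.txt
virtual_hosted_url http://minio.minio-system.svc:9000 kafscale-test test.txt
# http://kafscale-test.minio.minio-system.svc:9000/test.txt
```

The second form only works if the bucket subdomain resolves, which an in-cluster Service name typically does not. When testing with the AWS CLI, path-style access can be forced with `aws configure set default.s3.addressing_style path`.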
TLS/SSL
For production, always use HTTPS endpoints. Self-signed certificates may require additional CA configuration in the broker pods.
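One way to catch plain-HTTP endpoints before they reach a production cluster is to scan manifests for them. The following is a minimal sketch; the function name and the `cluster.yaml` file name are assumptions for illustration.

```shell
# Print every "endpoint:" line in a manifest that uses plain http://,
# prefixed with its line number. Follows grep's exit status:
# 0 when at least one insecure endpoint is found, 1 when there are none.
find_insecure_endpoints() {
  grep -nE '^[[:space:]]*endpoint:[[:space:]]*http://' "$1"
}

# Example (cluster.yaml is a hypothetical manifest file):
# if find_insecure_endpoints cluster.yaml; then
#   echo "plain-HTTP endpoint found; use HTTPS for production" >&2
# fi
```

In-cluster endpoints such as the MinIO examples on this page will be flagged too, which is the intended behavior for a production check.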
Regional Latency
Place your storage in the same region as your Kubernetes cluster. KafScale’s ~500ms latency target assumes low-latency storage access.
Bucket Lifecycle
KafScale manages segment files in .kfs format. Configure bucket lifecycle policies carefully: avoid expiration rules that could delete active segments.
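Before enabling KafScale on an existing bucket, it can help to review which lifecycle rules are already in place. The sketch below wraps the AWS CLI's `s3api get-bucket-lifecycle-configuration` call; the function name is illustrative. Not every S3-compatible backend implements the lifecycle API, and AWS S3 returns a `NoSuchLifecycleConfiguration` error when a bucket has no rules at all.

```shell
# Print one line per lifecycle rule: rule ID, status, and expiration in days
# ("None" when the rule has no day-based expiration).
# Arguments: bucket name, endpoint URL.
list_lifecycle_rules() {
  aws s3api get-bucket-lifecycle-configuration \
    --bucket "$1" \
    --endpoint-url "$2" \
    --query 'Rules[].[ID, Status, Expiration.Days]' \
    --output text
}

# Example, using the placeholder endpoint from the testing section:
# list_lifecycle_rules kafscale-production https://your-endpoint.com
```

Any enabled rule with an expiration that covers the segment prefix deserves a closer look before brokers start writing to the bucket.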
Testing Your Backend
Before deploying KafScale, verify S3 compatibility:
# Using AWS CLI with custom endpoint
aws s3 ls --endpoint-url https://your-endpoint.com
# Create test bucket
aws s3 mb s3://kafscale-test --endpoint-url https://your-endpoint.com
# Upload test object
echo "test" | aws s3 cp - s3://kafscale-test/test.txt --endpoint-url https://your-endpoint.com
# Verify
aws s3 ls s3://kafscale-test/ --endpoint-url https://your-endpoint.com
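The commands above confirm that listing and uploading work, but not that the backend returns the same bytes or supports deletes. A minimal round-trip sketch follows, assuming the AWS CLI is installed and the bucket already exists; the function and object names are hypothetical.

```shell
# Round-trip check: upload a payload, read it back, compare, then delete it.
# Arguments: endpoint URL, existing bucket name.
# Returns 0 only if every step succeeds and the bytes match.
s3_roundtrip() {
  rt_endpoint="$1"
  rt_object="s3://$2/kafscale-roundtrip-$$.txt"   # hypothetical object name
  rt_payload="kafscale-roundtrip"

  # Upload from stdin
  printf '%s' "$rt_payload" |
    aws s3 cp - "$rt_object" --endpoint-url "$rt_endpoint" >/dev/null || return 1
  # Download to stdout and capture
  rt_fetched="$(aws s3 cp "$rt_object" - --endpoint-url "$rt_endpoint")" || return 1
  # Delete the object again (exercises the DeleteObject permission)
  aws s3 rm "$rt_object" --endpoint-url "$rt_endpoint" >/dev/null || return 1

  [ "$rt_fetched" = "$rt_payload" ]
}

# Example, reusing the test bucket created above:
# s3_roundtrip https://your-endpoint.com kafscale-test && echo "round trip OK"
# Remove the test bucket and its contents afterwards:
# aws s3 rb s3://kafscale-test --force --endpoint-url https://your-endpoint.com
```

This exercises the same put, get, and delete operations that the IAM policy earlier on this page grants, so a failure here usually points to a credentials or permissions problem rather than a KafScale issue.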
Tested Configurations
| Backend | Version Tested | Status |
|---|---|---|
| AWS S3 | - | ✅ Production |
| MinIO | 2024-01-xx | ✅ Production |
| DigitalOcean Spaces | - | ✅ Tested |
| Cloudflare R2 | - | ✅ Tested |
The other backends listed should work but are not continuously tested. Community reports are welcome.