## Recommended defaults

Based on extensive benchmarking (e.g., on an AWS c7i.8xlarge instance), a good starting point for most production deployments is:
```toml
[server]
num_shards = 2048
batch_size = 256
buffer_size = 16384
buffer_pool_size = 2048
max_connections = 10000

[performance]
tcp_nodelay = true
tcp_keepalive = 60

[memory]
max_memory = 0
eviction_policy = "allkeys-lru"
```
This configuration achieved ≈6.87M GET ops/s and 2.74M SET ops/s in upstream benchmarks with sub-millisecond p50 latency.
## Shard count (`num_shards`)
Goal: balance parallelism and memory overhead. Each shard adds overhead but reduces contention.
- 256 shards – lower memory, more contention; small datasets & moderate concurrency.
- 2048 shards – recommended balance for most workloads.
- 4096 shards – maximum GET throughput; higher memory use.
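How keys map to shards is implementation-specific, but the trade-off is easiest to see with a concrete mapping. A minimal sketch, assuming a hash-modulo scheme (the server's actual hash function and mapping may differ):

```python
import hashlib

NUM_SHARDS = 2048  # matches num_shards in the config above

def shard_for_key(key: bytes, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard index with a stable hash (hypothetical scheme)."""
    digest = hashlib.sha1(key).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Keys spread roughly evenly across shards, so more shards means fewer keys
# (and less contention) per shard, at the cost of per-shard overhead.
print(shard_for_key(b"user:1001"))
```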
## Batch size (`batch_size`)
Goal: match your client pipeline depth to minimize syscalls without adding latency.
| Pipeline depth (`-P`) | Recommended `batch_size` |
|---|---|
| 1–16 | 16 |
| 17–64 | 64–128 |
| 65–128 | 256 |
| >128 | 512 |
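Pipeline depth on the client side is just the number of commands queued before a flush. A sketch using redis-py (assuming the server speaks the Redis protocol on localhost:6379):

```python
import redis

# Assumes a Redis-protocol-compatible server on localhost:6379.
r = redis.Redis(host="localhost", port=6379)

PIPELINE_DEPTH = 128  # pair with batch_size = 256 per the table above

pipe = r.pipeline(transaction=False)  # plain pipelining, no MULTI/EXEC
for i in range(PIPELINE_DEPTH):
    pipe.set(f"key:{i}", f"value:{i}")
results = pipe.execute()  # one network round trip for all queued commands
print(sum(1 for ok in results if ok), "SETs acknowledged")
```

Deeper pipelines amortize per-round-trip overhead, which is why the recommended `batch_size` grows with `-P`.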
## Buffer pool size & TCP
`buffer_pool_size` controls the number of reusable response buffers. Too small increases allocations; too large wastes memory.

- Start with 2048 buffers and increase for very high connection counts (>1000).
- Enable `tcp_nodelay` for low-latency interactive workloads.
- Use a reasonable `tcp_keepalive` (60–300 s) to keep long-lived connections healthy.
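These two options correspond to standard socket options, which you can also set on the client side. A sketch with plain Python sockets (not this server's API; the 60-second keepalive idle time mirrors the config above):

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# tcp_nodelay = true  -> disable Nagle's algorithm so small writes are sent immediately
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# tcp_keepalive = 60  -> probe idle connections so dead peers are detected
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
if hasattr(socket, "TCP_KEEPIDLE"):  # Linux: seconds of idle time before the first probe
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)

sock.connect(("localhost", 6379))
```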
## Workload-specific configurations

### Read-heavy caching (>90% GETs)
```toml
[server]
num_shards = 4096
batch_size = 256
buffer_pool_size = 2048

[memory]
max_memory = 8589934592  # 8 GB
eviction_policy = "allkeys-lru"
```
### Balanced 50/50 read-write
```toml
[server]
num_shards = 2048
batch_size = 256
buffer_pool_size = 2048

[memory]
max_memory = 4294967296  # 4 GB
eviction_policy = "allkeys-lru"
```
### Low-latency interactive
```toml
[server]
num_shards = 2048
batch_size = 16

[performance]
tcp_nodelay = true
```
## Memory planning & eviction
An approximate formula for total memory:

```
Total ≈ num_keys × (key_size + value_size + 100 B) + shard_overhead + buffer_pool
```
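As a worked example, 10 million keys with 32-byte keys and 256-byte values (the shard and buffer-pool overhead figures below are rough assumptions, derived from the recommended defaults):

```python
# Worked example of the formula above; overhead figures are assumptions.
num_keys = 10_000_000
key_size = 32           # bytes
value_size = 256        # bytes
per_key_overhead = 100  # bytes, from the formula

shard_overhead = 2048 * 4096  # assumed ~4 KB per shard at num_shards = 2048
buffer_pool = 2048 * 16384    # buffer_pool_size x buffer_size

total = num_keys * (key_size + value_size + per_key_overhead) \
        + shard_overhead + buffer_pool
print(f"{total / 2**30:.2f} GiB")  # ~3.65 GiB
```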
Guidelines:

- Set `max_memory` to ≈70–80% of available RAM.
- Use `allkeys-lru` for most caches; use `noeviction` when data loss is unacceptable and handle errors in your app.
- Monitor `used_memory` and `evicted_keys` via `INFO` or the HTTP health endpoint.
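If your client is redis-py, a quick check might look like this (field names follow Redis's standard `INFO` output; confirm they match this server's):

```python
import redis

r = redis.Redis(host="localhost", port=6379)
info = r.info()  # parsed INFO output as a dict

used = info.get("used_memory", 0)
evicted = info.get("evicted_keys", 0)
print(f"used_memory: {used / 2**20:.1f} MiB, evicted_keys: {evicted}")

# A steadily climbing evicted_keys count under allkeys-lru usually means
# max_memory is too low for the working set.
```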
## Benchmarking & troubleshooting

### Quick benchmarking
```bash
redis-benchmark -h localhost -p 6379 \
  -t set,get \
  -n 2000000 \
  -c 500 \
  -P 128 \
  --csv
```
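To find where throughput plateaus for your deployment, it can help to sweep the pipeline depth. A small Python wrapper around the same command (assumes `redis-benchmark` is on your PATH and the server is on localhost:6379):

```python
import subprocess

# Sweep pipeline depths to see where GET throughput plateaus.
for depth in (1, 16, 64, 128, 256):
    print(f"--- pipeline depth {depth} ---")
    subprocess.run(
        ["redis-benchmark", "-h", "localhost", "-p", "6379",
         "-t", "get", "-n", "1000000", "-c", "100",
         "-P", str(depth), "--csv"],
        check=True,
    )
```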
### Common issues
- High latency: ensure `tcp_nodelay = true`, reduce `batch_size`, and increase `num_shards`.
- Low throughput: increase `batch_size` and `num_shards`, and ensure clients are pipelining.
- Memory pressure: set `max_memory` and an eviction policy; add TTLs.
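Before changing server config for a latency problem, it helps to measure from the client side. A rough single-connection probe using redis-py (unpipelined GETs, so it reflects per-request round-trip time):

```python
import time
import redis

r = redis.Redis(host="localhost", port=6379)
r.set("probe", "x")

samples = []
for _ in range(10_000):
    t0 = time.perf_counter()
    r.get("probe")
    samples.append(time.perf_counter() - t0)

samples.sort()
p50 = samples[len(samples) // 2]
p99 = samples[int(len(samples) * 0.99)]
print(f"p50 {p50 * 1e6:.0f} us, p99 {p99 * 1e6:.0f} us")
```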
For deeper system-level tuning (Linux kernel parameters, NUMA pinning, huge pages) and more detailed recipes, see the full Performance Tuning Guide in the upstream docs.