The right caching strategy depends on your data freshness requirements, deployment topology, and access patterns. Here's how to choose and implement the right approach.
JOptimize Team
Caching is the single most impactful optimization for most Spring Boot applications. A database query that runs 100 times per second becomes a cache hit at microsecond latency. But caching done wrong introduces stale data bugs, inconsistency across instances, and cache stampedes under load.
This guide covers the right caching approach for different data types and access patterns.
The first decision is where to cache:
In-process cache (Caffeine) — stores data in the JVM heap. Sub-millisecond access time. No network overhead. Data is local to the instance — not shared across pods. Cache is lost on restart. Suitable for reference data, configuration, and read-heavy stable data.
Distributed cache (Redis) — stores data in a shared Redis instance. ~1ms access time. Shared across all application instances. Data persists across restarts. Suitable for user sessions, shared state, and data that must be consistent across instances.
Multi-layer (L1 + L2) — Caffeine as L1 (fast, local), Redis as L2 (shared, persistent). Best performance, most complexity. Suitable for high-read stable data in multi-instance deployments.
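As a rough illustration, the three options above can be written down as a small decision helper. This is only a sketch: `CacheTier`, `CacheTierChooser`, and the three boolean inputs are hypothetical names invented here, not part of Spring or Caffeine, and a real decision would also weigh memory budget and operational cost.

```java
// Hypothetical helper: the names and rules are illustrative, not a Spring API.
enum CacheTier { IN_PROCESS, DISTRIBUTED, MULTI_LAYER }

final class CacheTierChooser {
    private CacheTierChooser() {}

    /**
     * @param multiInstance more than one application instance serves traffic
     * @param mustBeShared  every instance must see the same cached value
     * @param hotAndStable  the data is read very often and changes rarely
     */
    static CacheTier choose(boolean multiInstance, boolean mustBeShared, boolean hotAndStable) {
        if (!multiInstance) {
            return CacheTier.IN_PROCESS;   // nothing to share, so the JVM heap is enough
        }
        if (mustBeShared) {
            return CacheTier.DISTRIBUTED;  // e.g. user sessions, shared state
        }
        if (hotAndStable) {
            return CacheTier.MULTI_LAYER;  // local L1 in front of a shared L2
        }
        return CacheTier.IN_PROCESS;       // e.g. reference data that tolerates per-pod copies
    }
}
```

For example, `CacheTierChooser.choose(true, true, false)` selects the distributed option, which matches the user-session case described above.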
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-cache</artifactId>
</dependency>
<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
</dependency>
@Configuration
@EnableCaching
public class CacheConfig {

    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager manager = new CaffeineCacheManager();

        // Default spec for all caches that are not registered explicitly
        manager.setCaffeine(defaultSpec());

        // Per-cache configuration
        manager.registerCustomCache("products", Caffeine.newBuilder()
                .maximumSize(10_000)
                .expireAfterWrite(10, TimeUnit.MINUTES)
                .recordStats()
                .build());

        manager.registerCustomCache("configurations", Caffeine.newBuilder()
                .maximumSize(500)
                .expireAfterWrite(1, TimeUnit.HOURS) // Config changes rarely
                .build());

        return manager;
    }

    private Caffeine<Object, Object> defaultSpec() {
        return Caffeine.newBuilder()
                .maximumSize(1000)
                .expireAfterWrite(5, TimeUnit.MINUTES)
                .recordStats();
    }
}

@Service
@RequiredArgsConstructor
public class ProductService {

    // Injected through the Lombok-generated constructor
    private final ProductRepository productRepo;

    @Cacheable(value = "products", key = "#id")
    @Transactional(readOnly = true)
    public ProductDto getProduct(Long id) {
        return productRepo.findById(id).map(ProductMapper::toDto).orElseThrow();
    }

    @CachePut(value = "products", key = "#result.id") // Update cache on write
    @Transactional
    public ProductDto updateProduct(Long id, UpdateProductRequest req) {
        Product product = productRepo.findById(id).orElseThrow();
        product.update(req);
        return ProductMapper.toDto(productRepo.save(product));
    }

    @CacheEvict(value = "products", key = "#id") // Remove from cache on delete
    @Transactional
    public void deleteProduct(Long id) {
        productRepo.deleteById(id);
    }
}
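To make the three annotations concrete, here is a plain-JDK sketch of the behavior they roughly correspond to. It is not how Spring implements them internally; `AnnotationSemanticsSketch`, its two maps, and the `databaseReads` counter are stand-ins invented for this illustration, and the class is meant for single-threaded demonstration only.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for one named cache plus a data store; all names here are hypothetical.
final class AnnotationSemanticsSketch {
    private final Map<Long, String> cache = new HashMap<>();
    private final Map<Long, String> database = new HashMap<>();
    int databaseReads = 0; // counts how often a read had to go past the cache

    // Roughly @Cacheable: return the cached value, otherwise load it and remember it.
    String get(long id) {
        String cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        databaseReads++;
        String loaded = database.get(id);
        if (loaded != null) {
            cache.put(id, loaded); // null results are not cached in this sketch
        }
        return loaded;
    }

    // Roughly @CachePut: always perform the write, then overwrite the cached value.
    String update(long id, String value) {
        database.put(id, value);
        cache.put(id, value);
        return value;
    }

    // Roughly @CacheEvict: perform the delete and drop the cached value.
    void delete(long id) {
        database.remove(id);
        cache.remove(id);
    }
}
```

After `update(1L, "v1")`, a call to `get(1L)` is served from the cache without touching the data store; after `delete(1L)`, the next `get(1L)` misses and falls through to the data store again.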
// Requires @EnableScheduling on a configuration class for @Scheduled to fire
@Component
@RequiredArgsConstructor
@Slf4j
public class CacheStatsMonitor {

    private final CacheManager cacheManager;

    @Scheduled(fixedDelay = 60_000)
    public void logStats() {
        cacheManager.getCacheNames().forEach(name -> {
            Cache cache = cacheManager.getCache(name); // org.springframework.cache.Cache
            if (cache instanceof CaffeineCache caffeineCache) {
                CacheStats stats = caffeineCache.getNativeCache().stats();
                // SLF4J placeholders do not support format specifiers, so format the rate first
                log.info("Cache '{}': hitRate={}%, size={}, evictions={}",
                        name,
                        String.format("%.1f", stats.hitRate() * 100),
                        caffeineCache.getNativeCache().estimatedSize(),
                        stats.evictionCount());
            }
        });
    }
}
// If hit rate < 80%, your cache TTL may be too short or the cache is too small
// Stats are only collected for caches built with recordStats()
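The hit rate logged above is just hits divided by total requests. A minimal sketch of that arithmetic, assuming a hypothetical `HitRateCounter` class (not a Caffeine or Spring type) and the 80% threshold from the comment above:

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical counter; Caffeine tracks these numbers itself when recordStats() is enabled.
final class HitRateCounter {
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();

    void recordHit() { hits.increment(); }
    void recordMiss() { misses.increment(); }

    /** Hit rate in [0, 1]. This sketch reports 1.0 when nothing has been requested yet. */
    double hitRate() {
        long hitCount = hits.sum();
        long total = hitCount + misses.sum();
        return total == 0 ? 1.0 : (double) hitCount / total;
    }

    /** True when the hit rate has fallen below the given threshold, e.g. 0.80. */
    boolean belowThreshold(double threshold) {
        return hitRate() < threshold;
    }
}
```

With 8 hits and 2 misses the rate is 0.8, which sits exactly on the threshold; one more miss brings it to 8/11, roughly 0.73, which would be worth investigating.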
For multi-instance deployments, Caffeine doesn't share state between pods. Redis solves this:
spring:
  cache:
    type: redis
    redis:
      time-to-live: 300000      # 5 minutes default
      cache-null-values: false  # Don't cache null returns
  data:
    redis:
      host: redis
      port: 6379
@Configuration
@EnableCaching
public class RedisCacheConfig {

    @Bean
    public RedisCacheConfiguration defaultCacheConfig() {
        return RedisCacheConfiguration.defaultCacheConfig()
                .entryTtl(Duration.ofMinutes(5))
                .disableCachingNullValues()
                .serializeValuesWith(
                        RedisSerializationContext.SerializationPair
                                .fromSerializer(new GenericJackson2JsonRedisSerializer()));
    }

    @Bean
    public RedisCacheManagerBuilderCustomizer cacheManagerCustomizer(
            RedisCacheConfiguration defaultCacheConfig) {
        // Derive per-cache configs from the shared default so they keep the JSON
        // serializer and null handling; only the TTL differs per cache.
        // RedisCacheConfiguration is immutable, so entryTtl() returns a new instance.
        return builder -> builder
                .withCacheConfiguration("products",
                        defaultCacheConfig.entryTtl(Duration.ofMinutes(10)))
                .withCacheConfiguration("user-sessions",
                        defaultCacheConfig.entryTtl(Duration.ofHours(1)));
    }
}
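The time-to-live settings above mean that an entry stops being served once its TTL has elapsed since it was written. The following sketch shows that rule with plain JDK types. `TtlCacheSketch` and its injected clock are hypothetical, introduced only so the expiry can be demonstrated without waiting; Redis and Caffeine handle expiry internally, and this class is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical expire-after-write cache driven by an injectable millisecond clock.
final class TtlCacheSketch<K, V> {
    private record Entry<T>(T value, long expiresAtMillis) {}

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;

    TtlCacheSketch(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(K key, V value) {
        entries.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    /** Returns the value, or null if the key is absent or its TTL has elapsed. */
    V get(K key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() >= entry.expiresAtMillis()) {
            entries.remove(key); // expired: drop it lazily on read
            return null;
        }
        return entry.value();
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`; a settable clock is used here only to make the expiry boundary easy to see.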
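L1 (Caffeine) + L2 (Redis) gives sub-millisecond reads for hot data, backed by a cache shared across instances. The trade-off is that an L1 entry on another instance can stay stale for up to the L1 TTL, so keep that TTL short: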
@Service
@RequiredArgsConstructor
public class TwoLevelCacheService {

    // com.github.benmanes.caffeine.cache.Cache, not Spring's Cache abstraction
    private final Cache<Long, ProductDto> l1Cache = Caffeine.newBuilder()
            .maximumSize(1000)
            .expireAfterWrite(1, TimeUnit.MINUTES) // Short L1 TTL
            .build();

    private final RedisTemplate<String, ProductDto> redis;
    private final ProductRepository productRepo;

    public ProductDto getProduct(Long id) {
        // L1: check in-process cache first
        ProductDto dto = l1Cache.getIfPresent(id);
        if (dto != null) return dto;

        // L2: check Redis
        dto = redis.opsForValue().get("product:" + id);
        if (dto != null) {
            l1Cache.put(id, dto); // Warm L1
            return dto;
        }

        // DB: load from database
        dto = productRepo.findById(id).map(ProductMapper::toDto).orElseThrow();
        redis.opsForValue().set("product:" + id, dto, Duration.ofMinutes(10)); // Warm L2
        l1Cache.put(id, dto); // Warm L1
        return dto;
    }

    public void invalidate(Long id) {
        l1Cache.invalidate(id);          // Clear L1 on this instance
        redis.delete("product:" + id);   // Clear L2 (shared)
        // L1 on other instances expires via TTL; acceptable for a short TTL
    }
}
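The last comment in `invalidate` deserves a closer look: other instances keep serving their own L1 copy until it expires. The sketch below reproduces that effect with plain maps standing in for Caffeine, Redis, and the database. `TwoLevelSketch` is a hypothetical class for illustration only, and it omits TTLs entirely so the stale read is easy to observe.

```java
import java.util.HashMap;
import java.util.Map;

// One object per "application instance"; the L2 and database maps are shared between them.
final class TwoLevelSketch {
    private final Map<Long, String> l1 = new HashMap<>(); // stand-in for per-instance Caffeine
    private final Map<Long, String> sharedL2;             // stand-in for Redis
    private final Map<Long, String> database;             // stand-in for the repository

    TwoLevelSketch(Map<Long, String> sharedL2, Map<Long, String> database) {
        this.sharedL2 = sharedL2;
        this.database = database;
    }

    String get(long id) {
        String value = l1.get(id);
        if (value != null) {
            return value;                 // L1 hit
        }
        value = sharedL2.get(id);
        if (value == null) {
            value = database.get(id);     // both levels missed
            if (value == null) {
                return null;
            }
            sharedL2.put(id, value);      // warm L2
        }
        l1.put(id, value);                // warm L1
        return value;
    }

    void invalidate(long id) {
        l1.remove(id);       // only this instance's L1
        sharedL2.remove(id); // shared L2
    }
}
```

If two instances both read product 1, the database row then changes, and only the first instance calls `invalidate(1)`, the first instance reloads the new value while the second keeps returning the old one from its L1. This is why the L1 TTL in the service above is kept short.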
A cache stampede happens when a cache entry expires and many concurrent requests (say, 1000) all miss at once and hammer the database simultaneously:
@Service
@RequiredArgsConstructor
public class StampedeProtectedService {

    private final ProductRepository productRepo;

    private final Cache<Long, CompletableFuture<ProductDto>> inflightRequests =
            Caffeine.newBuilder().maximumSize(500).build();

    public ProductDto getProduct(Long id) throws ExecutionException, InterruptedException {
        // Only one request per key flies to the DB simultaneously
        CompletableFuture<ProductDto> future = inflightRequests.get(id,
                k -> CompletableFuture.supplyAsync(() -> loadFromDb(k)));
        // Drop the entry once the load finishes; otherwise the completed future
        // would be served forever and never refreshed
        future.whenComplete((result, error) -> inflightRequests.asMap().remove(id, future));
        return future.get(); // All concurrent requests for the same id wait on the same Future
    }

    private ProductDto loadFromDb(Long id) {
        return productRepo.findById(id).map(ProductMapper::toDto).orElseThrow();
    }
}

// Caffeine's built-in solution:
Cache<Long, ProductDto> cache = Caffeine.newBuilder()
        .maximumSize(1000)
        .expireAfterWrite(5, TimeUnit.MINUTES)
        .build();

// cache.get() is atomic: only one loader runs per key at a time
ProductDto dto = cache.get(id, k -> productRepo.findById(k)
        .map(ProductMapper::toDto).orElseThrow());
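The same request-coalescing idea can be demonstrated with JDK classes alone, which makes it easy to verify that only one load runs. This is a sketch under stated assumptions: `SingleFlightSketch`, `SingleFlightDemo`, the `coalescedCalls` metric, and the key `"product:42"` are all hypothetical names, and the demo loader deliberately waits until every other caller has joined so the outcome does not depend on timing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Coalesces concurrent loads of the same key; it does not cache completed results.
final class SingleFlightSketch<K, V> {
    private final ConcurrentMap<K, CompletableFuture<V>> inflight = new ConcurrentHashMap<>();
    private final AtomicInteger coalescedCalls = new AtomicInteger();

    /** Number of calls that waited on another caller's load instead of loading themselves. */
    int coalescedCalls() { return coalescedCalls.get(); }

    V get(K key, Function<K, V> loader) {
        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> existing = inflight.putIfAbsent(key, mine);
        if (existing != null) {
            coalescedCalls.incrementAndGet();
            return existing.join();          // follower: wait for the leader's result
        }
        try {
            V value = loader.apply(key);     // leader: the only caller that runs the loader
            mine.complete(value);
            return value;
        } catch (RuntimeException | Error e) {
            mine.completeExceptionally(e);   // followers see the same failure
            throw e;
        } finally {
            inflight.remove(key, mine);      // later calls trigger a fresh load
        }
    }
}

final class SingleFlightDemo {
    /** Runs `callers` concurrent requests for one key and returns how often the loader ran. */
    static int loadsFor(int callers) {
        SingleFlightSketch<String, String> flights = new SingleFlightSketch<>();
        AtomicInteger loads = new AtomicInteger();
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        Function<String, String> slowLoader = key -> {
            loads.incrementAndGet();
            // Simulate a slow query: hold the load open until all other callers have joined
            while (flights.coalescedCalls() < callers - 1 && System.nanoTime() < deadline) {
                Thread.onSpinWait();
            }
            return "value-for-" + key;
        };
        ExecutorService pool = Executors.newFixedThreadPool(callers);
        try {
            List<Future<String>> results = new ArrayList<>();
            for (int i = 0; i < callers; i++) {
                results.add(pool.submit(() -> flights.get("product:42", slowLoader)));
            }
            for (Future<String> result : results) {
                result.get(10, TimeUnit.SECONDS);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return loads.get();
    }
}
```

Calling `SingleFlightDemo.loadsFor(8)` sends eight concurrent requests for the same key through one loader invocation. In a real service the result would then be stored in the cache, which is what Caffeine's `cache.get(key, loader)` does in one step.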
@CachePut or @CacheEvict are required on write operations.
Without maximumSize, the cache grows until it causes an OOM; always set an upper bound.
Spring Boot caching strategy: Caffeine for in-process caching (reference data, configuration), Redis for distributed caching (shared state across instances), two-level for maximum performance in multi-instance deployments. Monitor hit rates; below 80% means the TTL is too short or the cache is too small. Always set maximum size limits and TTLs. Always add @CachePut or @CacheEvict on write operations.
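To see what an upper bound buys you, here is a minimal size-bounded cache built on the JDK's `LinkedHashMap`. It is a sketch, not a replacement for Caffeine: `BoundedLruCache` is a hypothetical class, it evicts in plain least-recently-used order (Caffeine uses a more sophisticated policy), and it is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Evicts the least recently accessed entry once the size limit is exceeded.
final class BoundedLruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedLruCache(int maxEntries) {
        super(16, 0.75f, true); // accessOrder = true: iteration order follows recency of access
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true drops the eldest entry
        return size() > maxEntries;
    }
}
```

With a limit of 2, inserting `a` and `b`, reading `a`, and then inserting `c` evicts `b`, because `b` is the least recently used entry at that point. The memory footprint stays bounded no matter how many distinct keys are requested.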
Caching helps with repeated reads. JOptimize helps with the underlying queries — N+1 patterns and missing indexes that cause slowness even before caching.
Cache smart. Query smart.