Spring Boot Caching Strategies: From Simple @Cacheable to Multi-Layer Cache (2026)

Caching is the single most impactful optimization for most Spring Boot applications. A database query that runs 100 times per second becomes a cache hit at microsecond latency. But caching done wrong introduces stale data bugs, inconsistency across instances, and cache stampedes under load.

This guide covers the right caching approach for different data types and access patterns.

Choosing the Right Cache Level

The first decision is where to cache:

In-process cache (Caffeine): stores data in the JVM heap. Sub-millisecond access time with no network overhead. Data is local to the instance and not shared across pods, and the cache is lost on restart. Suitable for reference data, configuration, and read-heavy stable data.

Distributed cache (Redis): stores data in a shared Redis instance. Roughly 1 ms access time on a local network. Shared across all application instances. Cached data survives application restarts; whether it survives a Redis restart depends on how Redis persistence is configured. Suitable for user sessions, shared state, and data that must be consistent across instances.

Multi-layer (L1 + L2): Caffeine as L1 (fast, local), Redis as L2 (shared). Best read performance, most complexity. Suitable for high-read stable data in multi-instance deployments.

Layer 1: Caffeine In-Process Cache

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-cache</artifactId>
</dependency>
<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
</dependency>

@Configuration
@EnableCaching
public class CacheConfig {

    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager manager = new CaffeineCacheManager();

        // Default spec for all caches
        manager.setCaffeine(defaultSpec());

        // Per-cache configuration
        manager.registerCustomCache("products",
            Caffeine.newBuilder()
                .maximumSize(10_000)
                .expireAfterWrite(10, TimeUnit.MINUTES)
                .recordStats()
                .build());

        manager.registerCustomCache("configurations",
            Caffeine.newBuilder()
                .maximumSize(500)
                .expireAfterWrite(1, TimeUnit.HOURS)  // Config changes rarely
                .recordStats()  // Without this, hit-rate and eviction stats stay at zero
                .build());

        return manager;
    }

    private Caffeine<Object, Object> defaultSpec() {
        return Caffeine.newBuilder()
            .maximumSize(1000)
            .expireAfterWrite(5, TimeUnit.MINUTES)
            .recordStats();
    }
}

@Service
@RequiredArgsConstructor
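public class ProductService {

    // Injected through the constructor that @RequiredArgsConstructor generates
    private final ProductRepository productRepo;
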
    @Cacheable(value = "products", key = "#id")
    @Transactional(readOnly = true)
    public ProductDto getProduct(Long id) {
        return productRepo.findById(id).map(ProductMapper::toDto).orElseThrow();
    }

    @CachePut(value = "products", key = "#result.id")  // Update cache on write
    @Transactional
    public ProductDto updateProduct(Long id, UpdateProductRequest req) {
        Product product = productRepo.findById(id).orElseThrow();
        product.update(req);
        return ProductMapper.toDto(productRepo.save(product));
    }

    @CacheEvict(value = "products", key = "#id")  // Remove from cache on delete
    @Transactional
    public void deleteProduct(Long id) {
        productRepo.deleteById(id);
    }
}
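
To make the behaviour of the three annotations concrete, here is a minimal plain-Java sketch of what they roughly do around a method call. It is not Spring's implementation: `CacheAsideSketch`, its two maps (stand-ins for the "products" cache and the database), and the `seed` helper are hypothetical names used only for illustration, and the sketch is single-threaded.

```java
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentHashMap;

/** Hand-rolled stand-in for what the caching annotations do around a method call. */
class CacheAsideSketch {
    // Stand-in for the "products" cache; a real cache would be bounded and have a TTL.
    private final Map<Long, String> cache = new ConcurrentHashMap<>();
    // Stand-in for the database table.
    private final Map<Long, String> database = new ConcurrentHashMap<>();
    int databaseReads = 0;  // counts how often the "database" was queried

    // Test helper: writes to the database only, bypassing the cache.
    void seed(Long id, String value) {
        database.put(id, value);
    }

    // Like @Cacheable: return the cached value, or load and store it on a miss.
    String get(Long id) {
        String cached = cache.get(id);
        if (cached != null) return cached;
        databaseReads++;
        String loaded = database.get(id);
        if (loaded == null) throw new NoSuchElementException("No product " + id);
        cache.put(id, loaded);
        return loaded;
    }

    // Like @CachePut: always run the method, then overwrite the cache entry with the result.
    String update(Long id, String newValue) {
        database.put(id, newValue);
        cache.put(id, newValue);
        return newValue;
    }

    // Like @CacheEvict: run the method, then drop the cache entry.
    void delete(Long id) {
        database.remove(id);
        cache.remove(id);
    }
}
```

Calling `get` twice for the same id reads the database once; `update` refreshes the entry without a further read, and `delete` forces the next `get` back to the database.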

Monitoring Cache Effectiveness

@Component
@RequiredArgsConstructor
@Slf4j
public class CacheStatsMonitor {

    private final CacheManager cacheManager;

    @Scheduled(fixedDelay = 60_000)
    public void logStats() {
        cacheManager.getCacheNames().forEach(name -> {
            Cache cache = cacheManager.getCache(name);
            if (cache instanceof CaffeineCache caffeineCache) {
                CacheStats stats = caffeineCache.getNativeCache().stats();
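                // SLF4J placeholders do not support format specifiers such as {:.1f},
                // so the hit rate is formatted before it is passed in
                log.info("Cache '{}': hitRate={}%, size={}, evictions={}",
                    name,
                    String.format("%.1f", stats.hitRate() * 100),
                    caffeineCache.getNativeCache().estimatedSize(),
                    stats.evictionCount());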
            }
        });
    }
}
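// @Scheduled only fires if @EnableScheduling is present on a configuration class.
// As a rule of thumb, a hit rate below 80% suggests the TTL is too short or the cache is too small.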
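
The hit rate in that log line is simply hits divided by total requests. The sketch below works through the arithmetic in plain Java; `HitRateSketch` and its methods are hypothetical names, and returning 1.0 for an empty cache is a choice made here to mirror how Caffeine's `CacheStats.hitRate()` is documented to behave when there have been no requests.

```java
import java.util.Locale;
import java.util.concurrent.atomic.LongAdder;

/** Minimal hit/miss counter showing how a hit rate is derived. */
class HitRateSketch {
    private final LongAdder hits = new LongAdder();
    private final LongAdder misses = new LongAdder();

    void recordHit() { hits.increment(); }
    void recordMiss() { misses.increment(); }

    // hits / (hits + misses); 1.0 when nothing has been requested yet.
    double hitRate() {
        long h = hits.sum();
        long total = h + misses.sum();
        return total == 0 ? 1.0 : (double) h / total;
    }

    // Formats the rate as a percentage with one decimal place.
    static String format(String cacheName, double hitRate) {
        return String.format(Locale.ROOT, "Cache '%s': hitRate=%.1f%%", cacheName, hitRate * 100);
    }
}
```

With 900 hits and 100 misses the rate is 0.9, which sits above the 80% rule of thumb.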

Layer 2: Redis Distributed Cache

For multi-instance deployments, Caffeine doesn't share state between pods. Redis solves this (it requires the spring-boot-starter-data-redis dependency):

spring:
  cache:
    type: redis
    redis:
      time-to-live: 300000  # 5 minutes default
      cache-null-values: false  # Don't cache null returns
  data:
    redis:
      host: redis
      port: 6379

@Configuration
@EnableCaching
public class RedisCacheConfig {

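    // Note: once this bean is defined, Spring Boot uses it instead of the defaults
    // it would otherwise build from the spring.cache.redis.* properties
    @Bean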
    public RedisCacheConfiguration defaultCacheConfig() {
        return RedisCacheConfiguration.defaultCacheConfig()
            .entryTtl(Duration.ofMinutes(5))
            .disableCachingNullValues()
            .serializeValuesWith(
                RedisSerializationContext.SerializationPair
                    .fromSerializer(new GenericJackson2JsonRedisSerializer())
            );
    }

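    @Bean
    public RedisCacheManagerBuilderCustomizer cacheManagerCustomizer(
            RedisCacheConfiguration defaultCacheConfig) {
        // Derive per-cache settings from the shared config so they keep the JSON
        // serializer and null-value policy. RedisCacheConfiguration is immutable,
        // so entryTtl(...) returns a modified copy.
        return builder -> builder
            .withCacheConfiguration("products",
                defaultCacheConfig.entryTtl(Duration.ofMinutes(10)))
            .withCacheConfiguration("user-sessions",
                defaultCacheConfig.entryTtl(Duration.ofHours(1)));
    }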
}

Two-Level Cache: Best of Both Worlds

L1 (Caffeine) + L2 (Redis) gives sub-millisecond reads for hot data, with a shared L2 that keeps instances in sync apart from the short window in which an L1 entry can be stale:

@Service
@RequiredArgsConstructor
public class TwoLevelCacheService {

    private final Cache<Long, ProductDto> l1Cache = Caffeine.newBuilder()
        .maximumSize(1000)
        .expireAfterWrite(1, TimeUnit.MINUTES)  // Short L1 TTL
        .build();

    private final RedisTemplate<String, ProductDto> redis;
    private final ProductRepository productRepo;

    public ProductDto getProduct(Long id) {
        // L1: check in-process cache first
        ProductDto dto = l1Cache.getIfPresent(id);
        if (dto != null) return dto;

        // L2: check Redis
        dto = redis.opsForValue().get("product:" + id);
        if (dto != null) {
            l1Cache.put(id, dto);  // Warm L1
            return dto;
        }

        // DB: load from database
        dto = productRepo.findById(id).map(ProductMapper::toDto).orElseThrow();
        redis.opsForValue().set("product:" + id, dto, Duration.ofMinutes(10));  // Warm L2
        l1Cache.put(id, dto);  // Warm L1
        return dto;
    }

    public void invalidate(Long id) {
        l1Cache.invalidate(id);              // Clear L1 on this instance
        redis.delete("product:" + id);       // Clear L2 (shared)
        // L1 on other instances expires via TTL — acceptable for short TTL
    }
}
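
If waiting for the L1 TTL on other instances is not acceptable, each instance can be told to drop its local copy when a write happens. The sketch below simulates that idea in a single JVM. `InvalidationBus` is a hypothetical in-memory stand-in for a pub/sub channel such as a Redis topic, a shared map stands in for Redis as L2, and none of the names come from Spring or Redis APIs.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** In-memory stand-in for a pub/sub channel (hypothetical; not a Redis client). */
class InvalidationBus {
    private final List<Consumer<Long>> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<Long> listener) { subscribers.add(listener); }

    // Delivers the invalidated id to every subscribed instance.
    void publish(Long id) { subscribers.forEach(s -> s.accept(id)); }
}

/** One application instance: its own L1 map plus a reference to the shared L2 map. */
class InstanceCache {
    private final Map<Long, String> l1 = new ConcurrentHashMap<>();
    private final Map<Long, String> sharedL2;
    private final InvalidationBus bus;

    InstanceCache(Map<Long, String> sharedL2, InvalidationBus bus) {
        this.sharedL2 = sharedL2;
        this.bus = bus;
        // Every instance drops its local copy when an invalidation arrives.
        bus.subscribe(id -> l1.remove(id));
    }

    // L1 first, then L2; a null result means the caller would load from the database.
    String get(Long id) {
        String value = l1.get(id);
        if (value != null) return value;
        value = sharedL2.get(id);
        if (value != null) l1.put(id, value);  // warm L1
        return value;
    }

    // Write to the shared layer, then tell all instances (including this one) to evict.
    void write(Long id, String value) {
        sharedL2.put(id, value);
        bus.publish(id);
    }
}
```

In a real deployment message delivery is asynchronous and can fail, so a short L1 TTL is still worth keeping as a backstop.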

Cache Stampede Prevention

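A cache stampede occurs when a hot entry expires and many concurrent requests (say, 1000) all miss and hit the database at once. Collapsing concurrent loads for the same key into a single database call prevents this: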

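@Service
@RequiredArgsConstructor
public class StampedeProtectedService {

    private final ProductRepository productRepo;

    // Tracks loads that are currently in flight; this is not the result cache itself
    private final Cache<Long, CompletableFuture<ProductDto>> inflightRequests =
        Caffeine.newBuilder().maximumSize(500).build();

    public ProductDto getProduct(Long id) throws ExecutionException, InterruptedException {
        // Only one request per key flies to the DB simultaneously
        CompletableFuture<ProductDto> future = inflightRequests.get(id,
            k -> CompletableFuture.supplyAsync(() -> loadFromDb(k)));
        // Drop the entry once the load finishes; otherwise completed (or failed)
        // futures would keep being served like a cache with no TTL
        future.whenComplete((result, error) -> inflightRequests.asMap().remove(id, future));
        // All concurrent requests for the same id wait on the same Future
        return future.get();
    }

    private ProductDto loadFromDb(Long id) {
        return productRepo.findById(id).map(ProductMapper::toDto).orElseThrow();
    }
}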

// Caffeine's built-in solution:
Cache<Long, ProductDto> cache = Caffeine.newBuilder()
    .maximumSize(1000)
    .expireAfterWrite(5, TimeUnit.MINUTES)
    .build();

// cache.get() is atomic: only one loader runs per key at a time within this JVM
ProductDto dto = cache.get(id, k -> productRepo.findById(k)
    .map(ProductMapper::toDto).orElseThrow());
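
The effect of an atomic per-key load can be checked with nothing but the JDK. The sketch below uses `ConcurrentHashMap.computeIfAbsent`, which gives a comparable at-most-once-per-key guarantee, as a stand-in for the Caffeine cache above. `SingleFlightSketch`, `demo`, and `slowLoad` are hypothetical names, and the 50 ms sleep simulates a slow query. Spring's `@Cacheable(sync = true)` is a related option for annotated methods.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class SingleFlightSketch {
    private final ConcurrentMap<Long, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger loads = new AtomicInteger();  // how many times the loader ran

    // computeIfAbsent applies the loader at most once per absent key;
    // other callers for the same key block until the value is available.
    String get(Long id, Function<Long, String> loader) {
        return cache.computeIfAbsent(id, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // Simulated slow database query.
    static String slowLoad(Long id) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "product-" + id;
    }

    // Fires `threads` concurrent requests for one key and returns the loader call count.
    static int demo(int threads) {
        SingleFlightSketch sketch = new SingleFlightSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        List<Future<String>> results = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            results.add(pool.submit(() -> {
                start.await();  // line all threads up so they miss together
                return sketch.get(42L, SingleFlightSketch::slowLoad);
            }));
        }
        start.countDown();
        try {
            for (Future<String> f : results) f.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return sketch.loads.get();
    }
}
```

`SingleFlightSketch.demo(20)` runs the loader once even though twenty threads ask for the same key at the same moment. Like the Caffeine version, this protects one JVM only; each instance in a cluster can still issue its own load.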

Common Mistakes to Avoid

No TTL: cached data must expire; an entry with no TTL keeps serving stale data until the process restarts or the entry is explicitly evicted
Caching mutable data without eviction: if a product price changes, you must evict or update the cache; @CachePut or @CacheEvict is required on write operations
Caffeine in a multi-instance deployment without a plan: each pod has its own L1 cache; after a product is updated, other pods' caches are stale until the TTL expires; use a short TTL or Redis pub/sub for invalidation
No cache size limit: without maximumSize, the cache can grow until it causes an OutOfMemoryError; always set an upper bound
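
The first and last points in the list above come down to two bounds: one on entry age and one on entry count. The plain-Java sketch below shows both in a few lines. It illustrates the idea and is not a replacement for Caffeine; `BoundedTtlCache`, `Timed`, and the injectable clock are hypothetical names chosen to keep the example deterministic.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Tiny LRU cache with a size bound and a per-entry TTL (illustration only). */
class BoundedTtlCache<K, V> {
    // Value plus the time at which it stops being valid.
    private static final class Timed<T> {
        final T value;
        final long expiresAtMillis;

        Timed(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final long ttlMillis;
    private final LongSupplier clock;  // injectable so tests can control time
    private final LinkedHashMap<K, Timed<V>> entries;

    BoundedTtlCache(final int maxSize, long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        // accessOrder = true keeps least-recently-used entries first;
        // removeEldestEntry drops one whenever the bound is exceeded.
        this.entries = new LinkedHashMap<K, Timed<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Timed<V>> eldest) {
                return size() > maxSize;
            }
        };
    }

    synchronized void put(K key, V value) {
        entries.put(key, new Timed<>(value, clock.getAsLong() + ttlMillis));
    }

    // Returns null for unknown keys and for entries whose TTL has elapsed.
    synchronized V get(K key) {
        Timed<V> entry = entries.get(key);
        if (entry == null) return null;
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(key);
            return null;
        }
        return entry.value;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

Without the size check the map would grow with every distinct key, and without the expiry check a value written once would be returned for as long as the process runs.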

Summary

Spring Boot caching strategy in brief: use Caffeine for in-process caching (reference data, configuration), Redis for distributed caching (shared state across instances), and a two-level cache for maximum read performance in multi-instance deployments. Monitor hit rates; as a rule of thumb, a rate below 80% suggests the TTL is too short or the cache is too small. Always set maximum size limits and TTLs. Always add @CachePut or @CacheEvict on write operations.
