Resilience4j with Spring Boot: Circuit Breakers, Retry, and Rate Limiting (2026)

A single slow downstream service can cascade and take down your entire Spring Boot app. Learn how to implement circuit breakers, retry with backoff, and rate limiting with Resilience4j.

JOptimize Team

May 25, 2026· 9 min read

Every external service a microservice calls is another way for it to fail: a service that depends on 3 external services has 3 more failure sources than one that calls none. Without resilience patterns, one slow dependency cascades into a full outage: threads pile up waiting for a timeout, connection pools are exhausted, and your healthy service becomes unhealthy. Resilience4j gives you the tools to break this cascade.


Setup

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-aop</artifactId>
</dependency>
<dependency>
    <groupId>io.github.resilience4j</groupId>
    <artifactId>resilience4j-spring-boot3</artifactId>
    <version>2.2.0</version>
</dependency>

Circuit Breaker — Stop Calling a Failing Service

A circuit breaker monitors calls to an external service. After a threshold of failures, it "opens" and returns a fallback immediately without calling the service — giving it time to recover.

CLOSED (normal) → failure rate ≥ 50% → OPEN (fast-fail) → wait 30s → HALF-OPEN (trial calls) → CLOSED if the trial calls succeed, otherwise back to OPEN

@Slf4j
@Service
@RequiredArgsConstructor
public class PaymentService {

    private final PaymentClient paymentClient;
    private final PaymentQueue paymentQueue; // Queue type name is assumed; the original snippet omits the field

    @CircuitBreaker(name = "payment-service", fallbackMethod = "paymentFallback")
    public PaymentResult processPayment(PaymentRequest request) {
        return paymentClient.process(request);
    }

    // Fallback: called whenever the call throws, including the
    // CallNotPermittedException raised while the circuit is OPEN
    private PaymentResult paymentFallback(PaymentRequest request, Exception ex) {
        log.warn("Payment service unavailable, queuing request: {}", ex.getMessage());
        paymentQueue.enqueue(request);
        return PaymentResult.queued(request.getOrderId());
    }
}

Configuration:

# application.yml
resilience4j:
  circuitbreaker:
    instances:
      payment-service:
        sliding-window-type: COUNT_BASED
        sliding-window-size: 10                  # Evaluate last 10 calls
        failure-rate-threshold: 50               # Open if 50% or more fail
        wait-duration-in-open-state: 30s         # Wait 30s before trying again
        permitted-number-of-calls-in-half-open-state: 3
        minimum-number-of-calls: 5               # Need at least 5 calls before evaluating
        record-exceptions:
          - java.io.IOException
          - java.util.concurrent.TimeoutException
          - feign.FeignException
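To make the state transitions concrete, here is a minimal plain-Java sketch of a count-based circuit breaker driven by the same five settings. It is a teaching aid, not Resilience4j's implementation: the class name `SketchCircuitBreaker` and the injectable clock are hypothetical, it is not thread-safe, and it simplifies the half-open phase by reopening on the first failed trial call.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical teaching sketch; not thread-safe and not Resilience4j's code.
class SketchCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int windowSize;          // sliding-window-size
    private final int minimumCalls;        // minimum-number-of-calls
    private final double thresholdPercent; // failure-rate-threshold
    private final long waitInOpenMillis;   // wait-duration-in-open-state
    private final int trialCalls;          // permitted-number-of-calls-in-half-open-state
    private final LongSupplier clock;      // injectable time source, in milliseconds

    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAt;
    private int trialSuccesses;

    SketchCircuitBreaker(int windowSize, int minimumCalls, double thresholdPercent,
                         long waitInOpenMillis, int trialCalls, LongSupplier clock) {
        this.windowSize = windowSize;
        this.minimumCalls = minimumCalls;
        this.thresholdPercent = thresholdPercent;
        this.waitInOpenMillis = waitInOpenMillis;
        this.trialCalls = trialCalls;
        this.clock = clock;
    }

    State state() {
        return state;
    }

    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (state == State.OPEN) {
            if (clock.getAsLong() - openedAt < waitInOpenMillis) {
                return fallback.get();       // fast-fail without calling the service
            }
            state = State.HALF_OPEN;         // wait elapsed: let trial calls through
            trialSuccesses = 0;
        }
        try {
            T result = action.get();
            record(false);
            return result;
        } catch (RuntimeException ex) {
            record(true);
            return fallback.get();
        }
    }

    private void record(boolean failed) {
        if (state == State.HALF_OPEN) {
            if (failed) {
                open();                      // simplification: one failed trial reopens
            } else if (++trialSuccesses >= trialCalls) {
                state = State.CLOSED;
            }
            return;
        }
        window.addLast(failed);
        if (window.size() > windowSize) {
            window.removeFirst();            // keep only the last windowSize outcomes
        }
        if (window.size() >= minimumCalls) {
            long failures = window.stream().filter(f -> f).count();
            if (100.0 * failures / window.size() >= thresholdPercent) {
                open();
            }
        }
    }

    private void open() {
        state = State.OPEN;
        openedAt = clock.getAsLong();
        window.clear();
    }
}
```

With the values from the configuration above (window 10, minimum 5, threshold 50, wait 30s, 3 trial calls), four consecutive failures leave the breaker CLOSED because the minimum has not been reached; the fifth opens it.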

Retry with Exponential Backoff

For transient failures (network blip, brief overload), retry is the right tool. Without backoff, retries can amplify load on an already-struggling service.

@Slf4j
@Service
@RequiredArgsConstructor
public class InventoryService {

    private final InventoryClient inventoryClient; // Client type name is assumed; the original snippet omits the field

    @Retry(name = "inventory-service", fallbackMethod = "inventoryFallback")
    public InventoryStatus checkStock(Long productId) {
        return inventoryClient.getStatus(productId);
    }

    // Called once all retry attempts have failed
    private InventoryStatus inventoryFallback(Long productId, Exception ex) {
        log.warn("Inventory check failed after retries: {}", ex.getMessage());
        return InventoryStatus.unknown(); // Degrade gracefully
    }
}

resilience4j:
  retry:
    instances:
      inventory-service:
        max-attempts: 3                          # Includes the initial call
        wait-duration: 500ms
        enable-exponential-backoff: true
        exponential-backoff-multiplier: 2.0      # Waits 500ms, then 1s
        retry-exceptions:
          - java.io.IOException
          - java.net.SocketTimeoutException
        ignore-exceptions:
          - com.myapp.exception.ValidationException   # Don't retry validation errors
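The arithmetic behind the backoff settings is easy to get wrong, so here is a small plain-Java sketch that computes the wait before each retry. It assumes, as Resilience4j documents, that max-attempts counts the initial call; the class and method names are hypothetical and the code does not call the library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that only illustrates the backoff arithmetic.
class BackoffSchedule {

    /**
     * Returns the waits (in milliseconds) between attempts. Because maxAttempts
     * includes the initial call, there are maxAttempts - 1 waits.
     */
    static List<Long> waits(int maxAttempts, long initialWaitMillis, double multiplier) {
        List<Long> waits = new ArrayList<>();
        double wait = initialWaitMillis;
        for (int attempt = 1; attempt < maxAttempts; attempt++) {
            waits.add(Math.round(wait));
            wait *= multiplier; // each wait grows by the multiplier
        }
        return waits;
    }
}
```

For max-attempts 3, a 500ms initial wait and a multiplier of 2.0, this yields two waits, 500ms and 1000ms. A 2s wait only appears with a fourth attempt.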

Combining Circuit Breaker + Retry

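The recommended order is Retry wrapping CircuitBreaker, which is also Resilience4j's default aspect order. Every attempt then passes through the circuit breaker and counts toward its failure rate. While the circuit is open, each attempt fails immediately with a CallNotPermittedException, so retrying is wasted; add io.github.resilience4j.circuitbreaker.CallNotPermittedException to the retry's ignore-exceptions. Put the fallback on the outer @Retry: a fallback on the inner @CircuitBreaker would handle the exception before the retry ever saw it, and nothing would be retried.

@Retry(name = "payment-service", fallbackMethod = "paymentFallback") // Outer: retry transient failures, then fall back
@CircuitBreaker(name = "payment-service")                            // Inner: open on sustained failure
public PaymentResult processPayment(PaymentRequest request) {
    return paymentClient.process(request);
}

The order in which the annotations appear on the method does not matter. Resilience4j applies its aspects in a fixed default order with Retry outermost, and you change that order through the aspect-order properties (for example resilience4j.retry.retry-aspect-order), not by rearranging annotations.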
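Nesting order also decides what a fallback does to retries, which is easy to miss. The plain-Java sketch below uses hypothetical `Supplier` wrappers (none of them Resilience4j APIs) to show that a fallback placed inside the retry turns every failure into a normal return value, so the retry never fires, while a fallback placed outside only runs after all attempts are exhausted.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical wrappers that only illustrate decorator nesting.
class DecoratorOrder {

    /** Retries the supplier up to maxAttempts times (no waiting, to keep the sketch short). */
    static <T> Supplier<T> withRetry(Supplier<T> inner, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        return () -> {
            RuntimeException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return inner.get();
                } catch (RuntimeException ex) {
                    last = ex; // remember the failure and try again
                }
            }
            throw last;
        };
    }

    /** Converts any exception from the supplier into a fallback value. */
    static <T> Supplier<T> withFallback(Supplier<T> inner, T fallbackValue) {
        return () -> {
            try {
                return inner.get();
            } catch (RuntimeException ex) {
                return fallbackValue;
            }
        };
    }

    /** Counts invocations, standing in for the calls an inner circuit breaker would see. */
    static <T> Supplier<T> counting(Supplier<T> inner, AtomicInteger calls) {
        return () -> {
            calls.incrementAndGet();
            return inner.get();
        };
    }
}
```

Wrapping an always-failing call as `withFallback(withRetry(counting(call, n), 3), value)` invokes it three times before falling back, whereas `withRetry(withFallback(counting(call, n), value), 3)` invokes it once.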


Bulkhead — Limit Concurrent Calls

A bulkhead isolates calls to a service so one slow dependency can't exhaust all available threads:

@Bulkhead(
    name = "slow-external-api",
    type = Bulkhead.Type.SEMAPHORE,
    fallbackMethod = "externalApiFallback"
)
public ExternalData callExternalApi(String query) {
    return externalApiClient.query(query);
}

resilience4j:
  bulkhead:
    instances:
      slow-external-api:
        max-concurrent-calls: 10   # Max 10 concurrent calls to this service
        max-wait-duration: 0       # Fail immediately if at capacity

With 10 max concurrent calls and 200 threads trying to call the slow API, 190 threads get an immediate fallback response instead of queuing and eventually timing out.
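The semaphore idea can be sketched in a few lines of plain Java. This is an illustration under the assumption of a zero wait duration (callers are rejected immediately when no permit is free); `SketchBulkhead` is a hypothetical name and the code does not use the Resilience4j API.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical semaphore bulkhead; illustrates the idea only.
class SketchBulkhead {
    private final Semaphore permits;

    SketchBulkhead(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        // tryAcquire() never blocks, which mirrors max-wait-duration: 0
        if (!permits.tryAcquire()) {
            return fallback.get();
        }
        try {
            return action.get();
        } finally {
            permits.release(); // always hand the permit back
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

Because rejected callers return at once, the threads that would otherwise wait on the slow dependency stay free to serve other requests.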


Rate Limiter — Protect Yourself and Others

@RateLimiter(
    name = "external-api",
    fallbackMethod = "rateLimitFallback"
)
public SearchResult search(String query) {
    return searchApiClient.search(query);
}

private SearchResult rateLimitFallback(String query, RequestNotPermitted ex) {
    return SearchResult.cached(query);
}

resilience4j:
  ratelimiter:
    instances:
      external-api:
        limit-for-period: 100        # 100 calls
        limit-refresh-period: 1s     # per second
        timeout-duration: 0          # Fail immediately if rate exceeded

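Use a rate limiter when calling external APIs with quotas (Google Maps, Stripe, etc.) or to protect your own endpoints from overload. Resilience4j enforces the limit in memory, per application instance, so divide a shared quota across the instances you run.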
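The interplay of limit-for-period and limit-refresh-period can be pictured as a counter that resets at the start of each refresh cycle. The plain-Java sketch below assumes that simplified fixed-cycle model with a zero timeout; the class name and the injectable clock are hypothetical, and Resilience4j's own limiter is more elaborate.

```java
import java.util.function.LongSupplier;

// Hypothetical fixed-cycle rate limiter; illustrates the two settings only.
class SketchRateLimiter {
    private final int limitForPeriod;       // limit-for-period
    private final long refreshPeriodMillis; // limit-refresh-period
    private final LongSupplier clock;       // injectable time source, in milliseconds

    private long currentCycle = Long.MIN_VALUE;
    private int usedInCycle;

    SketchRateLimiter(int limitForPeriod, long refreshPeriodMillis, LongSupplier clock) {
        if (refreshPeriodMillis <= 0) {
            throw new IllegalArgumentException("refreshPeriodMillis must be positive");
        }
        this.limitForPeriod = limitForPeriod;
        this.refreshPeriodMillis = refreshPeriodMillis;
        this.clock = clock;
    }

    /** Returns true if the call is permitted; never waits (like timeout-duration: 0). */
    synchronized boolean tryAcquire() {
        long cycle = clock.getAsLong() / refreshPeriodMillis;
        if (cycle != currentCycle) {
            currentCycle = cycle; // a new cycle started: permissions are refreshed
            usedInCycle = 0;
        }
        if (usedInCycle < limitForPeriod) {
            usedInCycle++;
            return true;
        }
        return false;
    }
}
```

A caller that receives `false` here corresponds to the annotated method receiving RequestNotPermitted and being routed to its fallback.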


Monitoring with Actuator

management:
  endpoints:
    web:
      exposure:
        include: health,circuitbreakers,retries
  health:
    circuitbreakers:
      enabled: true

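With spring-boot-starter-actuator on the classpath and register-health-indicator: true set on the circuit breaker instance, /actuator/health reports circuit breaker state (the details appear when management.endpoint.health.show-details permits them):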

{
  "status": "UP",
  "components": {
    "circuitBreakers": {
      "status": "UP",
      "details": {
        "payment-service": {
          "status": "UP",
          "details": {
            "failureRate": "12.5%",
            "bufferedCalls": 8,
            "state": "CLOSED"
          }
        }
      }
    }
  }
}

Micrometer exposes Resilience4j metrics to Prometheus: resilience4j_circuitbreaker_state, resilience4j_retry_calls_total, resilience4j_bulkhead_available_concurrent_calls.


Common Mistakes to Avoid

  • Using @Retry without ignore-exceptions: retrying a 400 Bad Request (which will never succeed) wastes time and adds load; always specify which exceptions to retry and which to ignore
  • Circuit breaker with a too-small sliding window: a window of 3 calls means 2 failures open the circuit; keep sliding-window-size and minimum-number-of-calls large enough for your traffic (≥ 10 is a reasonable starting point for stable behavior)
  • No fallback method: without a fallback, an open circuit throws CallNotPermittedException, which surfaces as an HTTP 500 unless you handle it; always provide a meaningful fallback
  • Timeout not configured on HTTP client: without timeouts, a hung call blocks its thread indefinitely and the circuit breaker never gets a failure to record; set connectTimeout and readTimeout on your HTTP client
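As an illustration of the last point, here is how timeouts might be set with the JDK's built-in java.net.http client. This is a sketch under assumptions: the article does not say which HTTP client is in use, the 2s and 3s values are placeholders, and the class name and URL are hypothetical. Feign, RestTemplate and WebClient each have their own equivalent settings.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical factory showing explicit timeouts on the JDK HTTP client.
class TimeoutConfig {

    /** Client that gives up if a connection cannot be established within 2 seconds. */
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
    }

    /** Request that fails with HttpTimeoutException if no response arrives within 3 seconds. */
    static HttpRequest newRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(3))
                .GET()
                .build();
    }
}
```

With both limits in place, a hung dependency produces an exception within a few seconds, which the circuit breaker can record as a failure if that exception type matches its record-exceptions list.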

Summary

Resilience4j provides four tools for microservice resilience: circuit breaker (stop calling failing services), retry with exponential backoff (handle transient failures), bulkhead (limit concurrent calls to isolate load), and rate limiter (protect quotas). Layer them in order — retry wraps circuit breaker — and monitor state via Actuator + Micrometer to catch cascading failures before they propagate.


Detect Missing Resilience Patterns

JOptimize flags HTTP client calls without circuit breaker or retry configuration, uncapped thread pools calling external services, and missing timeout configurations in Spring Boot microservices.

Prevent cascading failures before they reach production.
