Most Spring Boot Kafka consumers work fine in dev and fall apart under production load. Learn partition strategy, batch consuming, error handling, and the consumer lag traps.
JOptimize Team
Spring Boot's Kafka integration via spring-kafka makes it trivially easy to write a consumer. It's also trivially easy to write one that looks correct but fails under load, accumulates consumer lag, or silently loses messages on restart.
This guide covers the patterns that separate a toy Kafka consumer from a production-grade one.
@Component public class OrderEventConsumer { @KafkaListener(topics = "orders", groupId = "order-service") public void consume(OrderEvent event) { orderService.process(event); } }
This works. It processes one message at a time, auto-commits offsets, and handles deserialization automatically. For low-volume topics it's fine. For anything real, you'll hit problems.
By default, @KafkaListener uses a single thread per consumer. If your topic has 12 partitions and you deploy 1 instance, 11 partitions are idle.
Rule: consumers ≤ partitions. More consumers than partitions = wasted instances.
// application.properties spring.kafka.consumer.group-id=order-service # Concurrency: one thread per partition assignment spring.kafka.listener.concurrency=3 // 3 threads, each handling up to 4 partitions (12-partition topic)
Or via @KafkaListener:
@KafkaListener( topics = "orders", groupId = "order-service", concurrency = "3" // 3 consumer threads ) public void consume(OrderEvent event) { ... }
Each thread maintains its own partition assignment and offset. Increasing concurrency above partition count gains nothing — the extra threads sit idle.
The default enable.auto.commit=true periodically commits the last fetched offset, regardless of whether processing succeeded.
Fetch batch → process message 1 ✅ → process message 2 💥 (exception) → auto-commit fires → offset advanced past message 2 → message 2 is LOST
Switch to manual offset commits:
// application.properties spring.kafka.consumer.enable-auto-commit=false spring.kafka.listener.ack-mode=MANUAL_IMMEDIATE
@KafkaListener(topics = "orders", groupId = "order-service") public void consume(OrderEvent event, Acknowledgment ack) { try { orderService.process(event); ack.acknowledge(); // Only commit after successful processing } catch (Exception e) { // Don't ack — message will be redelivered log.error("Failed to process order event: {}", event.getOrderId(), e); throw e; // Triggers retry/error handler } }
Processing messages individually means one DB call per message, one HTTP call per message. For high-throughput topics this doesn't scale.
Use batch listening:
// application.properties spring.kafka.listener.type=BATCH spring.kafka.consumer.max-poll-records=500 // Up to 500 messages per poll
@KafkaListener(topics = "orders", groupId = "order-service") public void consumeBatch( List<OrderEvent> events, Acknowledgment ack) { // Process all in one DB call instead of N calls orderService.processBatch(events); ack.acknowledge(); }
With saveAll() instead of N individual saves, you get a 10-50x throughput improvement for DB-backed consumers.
A poison message (malformed JSON, null field, unexpected data) that always throws will block the partition indefinitely if you keep retrying it.
@Configuration public class KafkaConfig { @Bean public DefaultErrorHandler errorHandler( KafkaTemplate<String, Object> kafkaTemplate) { // Send to DLT after 3 retries DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(kafkaTemplate, (record, ex) -> new TopicPartition( record.topic() + ".DLT", record.partition())); // Exponential backoff: 1s, 2s, 4s, then DLT ExponentialBackOffWithMaxRetries backOff = new ExponentialBackOffWithMaxRetries(3); backOff.setInitialInterval(1000L); backOff.setMultiplier(2.0); return new DefaultErrorHandler(recoverer, backOff); } }
After 3 retries with exponential backoff, the message goes to orders.DLT (dead letter topic) and processing continues. A separate consumer monitors the DLT for alerting and manual replay.
Consumer lag = messages in topic - messages processed. A lag growing to 500K means your consumers can't keep up with the producer rate.
Expose lag as a metric:
@Component @RequiredArgsConstructor public class KafkaLagMonitor { private final MeterRegistry meterRegistry; private final KafkaAdmin kafkaAdmin; @Scheduled(fixedDelay = 30_000) public void recordLag() { // Spring Boot Actuator exposes kafka.consumer.lag automatically // with spring-kafka 3.x + Micrometer } }
With Spring Boot Actuator + Micrometer, add:
# application.properties management.metrics.enable.kafka=true
This exposes kafka.consumer.fetch-latency-avg, kafka.consumer.records-lag-max, and kafka.consumer.records-consumed-rate to Prometheus/Grafana automatically.
# application.properties # Consumer spring.kafka.consumer.bootstrap-servers=kafka:9092 spring.kafka.consumer.group-id=order-service spring.kafka.consumer.auto-offset-reset=earliest spring.kafka.consumer.enable-auto-commit=false spring.kafka.consumer.max-poll-records=100 spring.kafka.consumer.fetch-min-size=1024 # Wait for at least 1KB spring.kafka.consumer.fetch-max-wait=500ms # Max 500ms wait spring.kafka.consumer.heartbeat-interval=3000ms spring.kafka.consumer.session-timeout-ms=30000ms # Listener spring.kafka.listener.ack-mode=MANUAL_IMMEDIATE spring.kafka.listener.concurrency=3 spring.kafka.listener.type=BATCH spring.kafka.listener.poll-timeout=3000ms # Deserialization spring.kafka.consumer.key-deserializer=org.apache.kafka.common.serialization.StringDeserializer spring.kafka.consumer.value-deserializer=org.springframework.kafka.support.serializer.JsonDeserializer spring.kafka.consumer.properties.spring.json.trusted.packages=com.myapp.events
Kafka guarantees at-least-once delivery in most configurations. Your consumer must be idempotent:
@Service @RequiredArgsConstructor public class OrderEventService { private final OrderRepository orderRepository; private final ProcessedEventRepository processedEventRepository; @Transactional public void process(OrderEvent event) { // Idempotency check — skip if already processed if (processedEventRepository.existsById(event.getEventId())) { log.debug("Skipping duplicate event: {}", event.getEventId()); return; } orderRepository.save(event.toOrder()); processedEventRepository.save(new ProcessedEvent(event.getEventId())); } }
The processed_events table with a unique constraint on event_id guarantees idempotency even under redelivery.
enable.auto.commit=true in production — any exception causes message loss; always use MANUAL_IMMEDIATEDefaultErrorHandler with backoffProduction Kafka consumers require: matching concurrency to partition count, manual offset commits to prevent message loss, batch processing for throughput, a dead letter topic for poison messages, and consumer lag monitoring. These five changes take a toy consumer to a production-grade one.
JOptimize flags blocking calls inside Kafka listener methods, missing error handler configurations, and auto-commit settings that risk message loss in Spring Boot projects.
Catch messaging bugs before they cause silent message loss in production.
Master Spring Boot, security, and Java performance with hands-on courses.
JOptimize finds N+1 queries, EAGER collections, and 70+ other issues in your Java codebase — in under 30 seconds.