Kafka and Event-Driven Architecture in Spring Boot: Beyond the Basics (2026)

Most teams start using Kafka as a fast message queue — a replacement for REST calls between services. That's a reasonable starting point, but it misses the deeper architectural shift that event-driven systems enable. When you design around events rather than commands and queries, you gain loose coupling, a built-in audit log, the ability to add new consumers without touching producers, and the option to replay history to build new read models.

This article goes beyond the basic producer/consumer setup and covers the architectural thinking behind event-driven systems built with Kafka and Spring Boot.

Events vs Commands: The Critical Distinction

The most important decision in an event-driven system is what you put in your messages. There are two fundamentally different approaches, and mixing them causes confusion:

Commands tell another service what to do: SendEmailCommand, ReserveStockCommand, ChargePaymentCommand. They're directed at a specific recipient and imply that the recipient is responsible for doing the work. Commands look like RPC over a message broker — which is fine, but it's not event-driven architecture.

Events announce that something happened: OrderPlaced, PaymentProcessed, StockReserved. They're factual records of the past. The producer doesn't know or care who's listening. Any number of consumers can react to the event independently.

The event approach gives you loose coupling: when you add a new feature (say, an analytics service that tracks order patterns), you add a new consumer of OrderPlaced without touching the order service at all. With commands, you'd need to add a new command type and update the sender.
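
To make the coupling argument concrete, here is a minimal in-memory sketch in plain Java, with no Kafka involved. The `EventBus`, `OrderPlaced`, and `LooseCouplingDemo` names are hypothetical stand-ins for a topic and its consumers. It only illustrates that a second subscriber can be added without touching the publishing code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for a topic; real Kafka adds persistence,
// partitions, and consumer groups on top of this idea.
final class EventBus<E> {
    private final List<Consumer<E>> subscribers = new ArrayList<>();

    void subscribe(Consumer<E> subscriber) {
        subscribers.add(subscriber);
    }

    // The producer publishes a fact and never learns who reacted to it.
    void publish(E event) {
        for (Consumer<E> subscriber : subscribers) {
            subscriber.accept(event);
        }
    }
}

record OrderPlaced(long orderId, String customerId) {}

final class LooseCouplingDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        EventBus<OrderPlaced> bus = new EventBus<>();
        bus.subscribe(e -> log.add("inventory reserved stock for order " + e.orderId()));
        // New feature: analytics subscribes; the publish call below is unchanged.
        bus.subscribe(e -> log.add("analytics recorded order " + e.orderId()));
        bus.publish(new OrderPlaced(42L, "customer-7"));
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```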

Domain Events vs Integration Events

A useful distinction is between domain events and integration events:

Domain events are internal to a bounded context. They carry rich domain objects and are consumed within the same service or module. Spring's ApplicationEventPublisher is ideal for these.

Integration events cross service boundaries. They should be slim — carrying only IDs and essential data, not entire domain objects. They need to be versioned because multiple independent services consume them and can't all be updated simultaneously.

// Domain event — internal, rich, not versioned
public record OrderPlacedDomainEvent(
    Order order,          // Full entity — fine for internal use
    Customer customer,
    List<OrderItem> items
) {}

// Integration event: external, slim, versioned through its type name
@JsonTypeInfo(use = JsonTypeInfo.Id.NAME)
@JsonTypeName("OrderPlaced.v1")   // bump the suffix on breaking changes
public record OrderPlacedEvent(
    Long orderId,         // Just the ID: consumers fetch what they need
    String customerId,
    BigDecimal totalAmount,
    Instant occurredAt,
    String eventId        // UUID: consumers use it for deduplication
) {}
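
The step between the two shapes is a small mapping at the service boundary. The sketch below uses simplified, hypothetical types (`Customer`, `OrderItem`, `Order`, `OrderPlacedSlim`) rather than the article's real entities, and it leaves out metadata fields for brevity. The point is only that the rich object graph stays inside the service while IDs and a total leave it.

```java
import java.math.BigDecimal;
import java.time.Instant;
import java.util.List;

// Hypothetical, simplified domain types for illustration only.
record Customer(String id, String email) {}

record OrderItem(String sku, int quantity, BigDecimal unitPrice) {}

record Order(long id, Customer customer, List<OrderItem> items, Instant placedAt) {
    // Sum of quantity * unit price over all line items.
    BigDecimal total() {
        return items.stream()
            .map(item -> item.unitPrice().multiply(BigDecimal.valueOf(item.quantity())))
            .reduce(BigDecimal.ZERO, BigDecimal::add);
    }
}

// Slim shape that crosses the service boundary: IDs and essentials only.
record OrderPlacedSlim(long orderId, String customerId, BigDecimal totalAmount, Instant occurredAt) {

    // Maps the rich internal object to the external contract.
    static OrderPlacedSlim from(Order order) {
        return new OrderPlacedSlim(
            order.id(),
            order.customer().id(),   // the ID only, not the Customer object
            order.total(),
            order.placedAt());       // when it happened, not when it was published
    }
}

final class BoundaryMappingDemo {
    public static void main(String[] args) {
        Order order = new Order(
            7L,
            new Customer("c-1", "a@example.com"),
            List.of(new OrderItem("sku-1", 2, new BigDecimal("9.50"))),
            Instant.parse("2026-01-01T00:00:00Z"));
        System.out.println(OrderPlacedSlim.from(order));
    }
}
```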

Event Schema Design for Longevity

Kafka topics can retain events for years. Your schemas need to be designed for evolution:

// Always include these fields in every integration event:
public record BaseEvent(
    String eventId,       // UUID — for idempotency and deduplication
    String eventType,     // "OrderPlaced" — for event routing
    String eventVersion,  // "1" — increment on breaking changes
    Instant occurredAt,   // When the event happened (not when it was published)
    String sourceService  // "order-service" — for debugging
) {}

// Add fields with defaults — backward compatible
public record OrderPlacedEventV2(
    Long orderId,
    String customerId,
    BigDecimal totalAmount,
    String currency,      // NEW field: null when reading events written before V2
    String region,        // NEW field: same
    Instant occurredAt,
    String eventId
) {}

The rule for schema evolution: adding optional fields is backward compatible. Removing or renaming fields is breaking. Change the event version and run old and new consumers in parallel during the transition.
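
One way to picture the "optional field with a default" rule is a tolerant reader. The sketch below uses a plain `Map` in place of a parsed JSON payload. The `fromPayload` helper and the default values `"USD"` and `"UNKNOWN"` are assumptions made for illustration, not part of any library or of the schema above.

```java
import java.math.BigDecimal;
import java.util.Map;

// Slimmed-down V2 event; the defaults below are illustrative assumptions.
record OrderPlacedV2(long orderId, BigDecimal totalAmount, String currency, String region) {

    // Tolerant reader: accepts V1 payloads (no currency/region) and V2 payloads alike.
    static OrderPlacedV2 fromPayload(Map<String, Object> payload) {
        return new OrderPlacedV2(
            ((Number) payload.get("orderId")).longValue(),
            new BigDecimal(payload.get("totalAmount").toString()),
            (String) payload.getOrDefault("currency", "USD"),      // default for old events
            (String) payload.getOrDefault("region", "UNKNOWN"));   // default for old events
    }
}

final class SchemaEvolutionDemo {
    public static void main(String[] args) {
        Map<String, Object> v1 = Map.of("orderId", 1, "totalAmount", "19.99");
        Map<String, Object> v2 = Map.of("orderId", 2, "totalAmount", "5.00",
                                        "currency", "EUR", "region", "eu-west");
        System.out.println(OrderPlacedV2.fromPayload(v1));
        System.out.println(OrderPlacedV2.fromPayload(v2));
    }
}
```

A reader written this way keeps working when it replays events that were produced years before the new fields existed.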

Producing Events with Spring Kafka

@Slf4j
@Service
@RequiredArgsConstructor
public class OrderEventProducer {

    private final KafkaTemplate<String, OrderPlacedEvent> kafkaTemplate;

    // No @Async here: KafkaTemplate.send() is already asynchronous and returns a CompletableFuture
    public void publishOrderPlaced(Order order) {
        OrderPlacedEvent event = new OrderPlacedEvent(
            order.getId(),
            order.getCustomerId(),
            order.getTotal(),
            Instant.now(),   // publish time; use the order's own timestamp if it records one
            UUID.randomUUID().toString()
        );

        // Use orderId as the partition key: all events for the same order land on
        // the same partition, which preserves their relative order
        kafkaTemplate.send("orders.placed", order.getId().toString(), event)
            .whenComplete((result, ex) -> {
                if (ex != null) {
                    // Pass the exception itself so the stack trace is logged
                    log.error("Failed to publish OrderPlacedEvent for order {}", order.getId(), ex);
                    // Consider fallback: Transactional Outbox pattern
                } else {
                    log.debug("Published OrderPlacedEvent, offset: {}",
                        result.getRecordMetadata().offset());
                }
            });
    }
}

The partition key is critical for ordering. If you need all events for a given order to be processed in order, use the order ID as the key — Kafka guarantees ordering within a partition.
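
The key-to-partition mapping is easy to picture with a few lines of plain Java. This is a simplified stand-in: Kafka's default partitioner hashes the serialized key bytes with murmur2, whereas the hypothetical `partitionFor` helper below uses `String.hashCode()` purely for illustration. The property it demonstrates is the same: a fixed key and a fixed partition count always yield the same partition.

```java
final class PartitionKeyDemo {

    // Simplified stand-in for the default partitioner (which uses murmur2 on the
    // serialized key); String.hashCode() is used here only for illustration.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        String orderId = "1001";
        // Same key and same partition count: always the same partition.
        System.out.println(partitionFor(orderId, 6) == partitionFor(orderId, 6));
        // A different partition count can move the key to a different partition.
        System.out.println(partitionFor(orderId, 6) + " vs " + partitionFor(orderId, 8));
    }
}
```

The second line of output is the reason to be careful when adding partitions to a topic whose consumers depend on per-key ordering.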

Consuming Events Correctly

Consuming Kafka events correctly requires handling several concerns that don't exist with REST calls:

@Slf4j
@Service
@RequiredArgsConstructor
public class InventoryEventConsumer {

    private final InventoryService inventoryService;
    private final ProcessedEventRepository processedEvents;  // For idempotency

    @KafkaListener(
        topics = "orders.placed",
        groupId = "inventory-service",
        containerFactory = "kafkaListenerContainerFactory"
    )
    public void onOrderPlaced(OrderPlacedEvent event,
                              @Header(KafkaHeaders.RECEIVED_TOPIC) String topic,
                              @Header(KafkaHeaders.RECEIVED_PARTITION) int partition,
                              @Header(KafkaHeaders.OFFSET) long offset) {

        // Idempotency: skip if already processed.
        // Offsets are committed after processing, so delivery is at-least-once and duplicates are possible
        if (processedEvents.existsByEventId(event.eventId())) {
            log.debug("Skipping already processed event {} ({}-{}@{})",
                event.eventId(), topic, partition, offset);
            return;
        }

        try {
            // The slim event carries no line items; the service looks them up by order ID
            inventoryService.reserveStock(event.orderId());
            // Ideally written in the same transaction as the reservation
            processedEvents.save(new ProcessedEvent(event.eventId(), Instant.now()));
        } catch (InsufficientStockException e) {
            // Business exception: don't retry, publish a compensating event
            log.warn("Insufficient stock for order {}: {}", event.orderId(), e.getMessage());
            publishStockReservationFailed(event.orderId());
        }
        // Unexpected exceptions propagate to the container's error handler, which retries the record
    }
}

Idempotent consumers are essential. Kafka's at-least-once delivery guarantee means the same event can arrive twice — on consumer restart, after rebalancing, or after network issues. Design your event handlers to be safe when called multiple times with the same event.
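
The core of an idempotent handler is an atomic "check and mark" on the event ID. The sketch below keeps the IDs in memory, which is an assumption made to keep it runnable. A real service would typically use a database table with a unique constraint on the event ID, written in the same transaction as the side effect. The `IdempotentHandler` class and its counter are hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory sketch only; state is lost on restart, unlike a database-backed version.
final class IdempotentHandler {
    private final Set<String> processedEventIds = ConcurrentHashMap.newKeySet();
    private final AtomicInteger reservations = new AtomicInteger();

    // Returns true if the event was applied, false if it was a duplicate.
    boolean handle(String eventId) {
        // add() is an atomic check-and-mark: only the first caller for an ID gets true.
        if (!processedEventIds.add(eventId)) {
            return false;
        }
        reservations.incrementAndGet(); // stands in for reserving stock
        return true;
    }

    int reservationCount() {
        return reservations.get();
    }

    public static void main(String[] args) {
        IdempotentHandler handler = new IdempotentHandler();
        System.out.println(handler.handle("evt-1")); // first delivery
        System.out.println(handler.handle("evt-1")); // redelivery of the same event
        System.out.println(handler.reservationCount());
    }
}
```

Using a single atomic operation avoids the race in a separate "exists, then save" pair, where two concurrent deliveries of the same event could both pass the check.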

Consumer Groups and Scaling

Kafka's consumer group model is one of its most powerful features. Multiple instances of the same service share work automatically:

orders.placed topic — 6 partitions:

Partition 0 → inventory-service instance A
Partition 1 → inventory-service instance A
Partition 2 → inventory-service instance B
Partition 3 → inventory-service instance B
Partition 4 → inventory-service instance C
Partition 5 → inventory-service instance C

Add a 4th instance and Kafka automatically rebalances the partition assignments. Remove an instance and the remaining ones absorb its partitions. This is horizontal scaling with no code changes.

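Important constraint: within a consumer group, each partition is assigned to at most one consumer, so instances beyond the partition count sit idle. With 6 partitions and 8 instances, 2 instances receive nothing. Set your partition count generously when creating topics, because adding partitions later changes which partition a key maps to.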
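
The arithmetic behind the diagram and the idle-instance constraint can be simulated in a few lines. This is a sketch of a range-style split under the assumption of a single topic. Kafka's real assignors (range, round-robin, cooperative sticky) differ in detail, and the `AssignmentDemo` class is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AssignmentDemo {

    // Splits partitions 0..n-1 into contiguous blocks, one block per instance.
    // The first (n % instances) instances receive one extra partition.
    static Map<String, List<Integer>> assign(int partitions, List<String> instances) {
        if (instances.isEmpty()) {
            throw new IllegalArgumentException("at least one instance is required");
        }
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        int base = partitions / instances.size();
        int extra = partitions % instances.size();
        int next = 0;
        for (int i = 0; i < instances.size(); i++) {
            int count = base + (i < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int j = 0; j < count; j++) {
                owned.add(next++);
            }
            assignment.put(instances.get(i), owned); // may be empty: an idle instance
        }
        return assignment;
    }

    public static void main(String[] args) {
        System.out.println(assign(6, List.of("A", "B", "C")));
        System.out.println(assign(6, List.of("A", "B", "C", "D")));
        System.out.println(assign(6, List.of("A", "B", "C", "D", "E", "F", "G", "H")));
    }
}
```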

Error Handling and Dead Letter Topics

Not every event failure should cause indefinite retries. Malformed events, data that can never be processed correctly, and bugs in consumer code should go to a dead letter topic for human inspection:

@Configuration
public class KafkaErrorConfig {

    @Bean
    public DefaultErrorHandler errorHandler(KafkaTemplate<String, Object> template) {
        // Exponential backoff: 1s, 2s, 4s, 8s, 16s — then dead letter
        ExponentialBackOffWithMaxRetries backoff = new ExponentialBackOffWithMaxRetries(5);
        backoff.setInitialInterval(1000);
        backoff.setMultiplier(2.0);

        DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(template,
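            // Publishing to the same partition number means the DLT needs
            // at least as many partitions as the source topic
            (record, exception) -> new TopicPartition(
                record.topic() + ".DLT",  // orders.placed.DLT
                record.partition()
            ));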

        DefaultErrorHandler handler = new DefaultErrorHandler(recoverer, backoff);

        // Skip retries for errors that can never succeed: send them straight to the DLT
        handler.addNotRetryableExceptions(
            JsonParseException.class,
            IllegalArgumentException.class
        );

        return handler;
    }
}

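Dead letter topics are essential for operational health. Without them, a poison message either stalls its partition while it is retried or, once retries are exhausted, is logged and skipped, leaving no copy to inspect or replay.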
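
The retry-then-dead-letter flow can be simulated without a broker. The sketch below is not Spring Kafka's implementation. The `RetryThenDeadLetter` class, the in-memory `deadLetters` list, and the omission of real sleeping are all simplifications chosen for illustration. It reproduces the 1s, 2s, 4s, 8s, 16s schedule from the configuration above and shows where a record ends up once retries run out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class RetryThenDeadLetter {

    // Delay before each retry: initial, initial * multiplier, initial * multiplier^2, ...
    static List<Long> backoffSchedule(long initialMillis, double multiplier, int maxRetries) {
        List<Long> delays = new ArrayList<>();
        double delay = initialMillis;
        for (int i = 0; i < maxRetries; i++) {
            delays.add((long) delay);
            delay *= multiplier;
        }
        return delays;
    }

    // One initial attempt plus up to maxRetries retries; on exhaustion the record
    // is appended to deadLetters (a stand-in for publishing to a DLT).
    static <T> boolean process(T record, Consumer<T> handler, int maxRetries, List<T> deadLetters) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                handler.accept(record);
                return true;
            } catch (RuntimeException e) {
                // A real consumer would wait for the next backoff delay here.
            }
        }
        deadLetters.add(record);
        return false;
    }

    public static void main(String[] args) {
        System.out.println(backoffSchedule(1000, 2.0, 5));
        List<String> deadLetters = new ArrayList<>();
        process("good", r -> { }, 5, deadLetters);
        process("poison", r -> { throw new IllegalStateException("cannot parse"); }, 5, deadLetters);
        System.out.println(deadLetters);
    }
}
```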

Common Mistakes to Avoid

Putting too much data in events: events should contain IDs and minimal context; consumers fetch the rest from APIs or their own data stores
Ignoring consumer lag: if consumers fall behind producers, events accumulate; monitor a lag metric such as kafka_consumergroup_lag (exposed by the Prometheus Kafka exporter) in Grafana and alert when it keeps growing
Shared consumer group across different services: each service should have its own consumer group; within a shared group each event is delivered to only one consumer, so every service sees only part of the stream
Not planning for schema evolution: a new required field breaks any consumer that reads older events without it; always add fields as optional with defaults
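
For the consumer lag item above, the number being monitored is simple arithmetic: per partition, the log end offset minus the group's committed offset. The sketch below computes it from two plain maps. The `ConsumerLagDemo` class is hypothetical, and treating a partition with no committed offset as offset 0 is an assumption (in practice that depends on the consumer's offset reset policy).

```java
import java.util.Map;

final class ConsumerLagDemo {

    // Total lag across partitions: sum of (end offset - committed offset), floored at 0.
    // Assumption: a partition with no committed offset is treated as committed at 0.
    static long totalLag(Map<Integer, Long> endOffsets, Map<Integer, Long> committedOffsets) {
        long total = 0;
        for (Map.Entry<Integer, Long> entry : endOffsets.entrySet()) {
            long committed = committedOffsets.getOrDefault(entry.getKey(), 0L);
            total += Math.max(0, entry.getValue() - committed);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Integer, Long> endOffsets = Map.of(0, 120L, 1, 80L);
        Map<Integer, Long> committed = Map.of(0, 100L, 1, 80L);
        System.out.println("total lag: " + totalLag(endOffsets, committed));
    }
}
```

A lag that stays flat under load is usually fine. A lag that grows steadily means consumers are slower than producers and need more instances, more partitions, or faster handlers.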

Summary

Event-driven architecture with Kafka is about more than pub/sub. It's about designing systems where producers and consumers are completely decoupled, where events are factual records of what happened, and where new features are added by subscribing to existing events rather than modifying existing services. Idempotent consumers, dead letter topics, proper partition keys, and schema versioning are the operational foundations that make EDA work reliably in production.
