Database Indexing for Spring Boot Developers: What Every JPA User Must Know (2026)

JPA's abstraction is powerful — until you realize it silently generates queries with no indexes. A findByEmail() on a 10M-row users table does a full sequential scan taking 2 seconds. findByCustomerIdAndStatus() might ignore your index entirely because the column order is wrong. And every foreign key without an index causes a sequential scan on every JOIN.

Rule 1: Every Foreign Key Needs an Index

-- Common JPA entity relationship
@ManyToOne
private Customer customer;  -- maps to customer_id column

-- PostgreSQL does NOT automatically index foreign keys
-- This query does a sequential scan on orders:
SELECT * FROM orders WHERE customer_id = 42;

-- Fix:
CREATE INDEX idx_orders_customer_id ON orders(customer_id);

In Hibernate/JPA, every @ManyToOne or @OneToMany that maps to a column in the child table needs an explicit index. Without it, every findByCustomerId() is a full table scan.

Rule 2: Composite Index Column Order Matters

-- Query:
SELECT * FROM orders WHERE customer_id = 42 AND status = 'PENDING';

-- Index A: (customer_id, status) — USED for the above query
CREATE INDEX idx_orders_customer_status ON orders(customer_id, status);

-- Index B: (status, customer_id) — NOT used efficiently for the above query
-- (Can only use the first column for range scans unless both are equality predicates)

Rule: put the most selective column first (the one that eliminates the most rows). For equality + range predicates, put equality columns first:

-- Query: WHERE customer_id = 42 AND created_at >= '2026-01-01'
-- Correct order: equality column first
CREATE INDEX idx_orders_customer_created ON orders(customer_id, created_at);

Rule 3: Partial Indexes for Sparse Queries

-- 98% of orders are 'COMPLETED' — only 2% are 'PENDING'
-- A full index on status is mostly wasted on COMPLETED rows

-- Full index — large, 100% of rows
CREATE INDEX idx_orders_status ON orders(status);

-- Partial index — tiny, only indexes PENDING rows
CREATE INDEX idx_orders_pending ON orders(created_at)
    WHERE status = 'PENDING';

-- Query that uses partial index:
SELECT * FROM orders WHERE status = 'PENDING' ORDER BY created_at;
-- PostgreSQL uses idx_orders_pending — 50x smaller than full index

Partial indexes are dramatically smaller and faster for sparse conditions.

Adding Indexes via JPA/Flyway

// JPA — @Index annotation on entity
@Entity
@Table(name = "orders", indexes = {
    @Index(name = "idx_orders_customer_id",    columnList = "customer_id"),
    @Index(name = "idx_orders_status_created", columnList = "status, created_at"),
    @Index(name = "idx_orders_email",          columnList = "customer_email", unique = true)
})
public class Order { ... }

-- Flyway migration — preferred for production control
-- V10__add_orders_indexes.sql
CREATE INDEX CONCURRENTLY idx_orders_customer_id
    ON orders(customer_id);

CREATE INDEX CONCURRENTLY idx_orders_status_created
    ON orders(status, created_at);

-- CONCURRENTLY builds index without locking the table
-- (takes longer but doesn't block production traffic)

Always use CREATE INDEX CONCURRENTLY in production — regular CREATE INDEX locks the table for the entire build duration.

Detecting Missing Indexes

-- PostgreSQL: find sequential scans on large tables
SELECT schemaname, tablename, seq_scan, seq_tup_read,
       idx_scan, idx_tup_fetch
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC;

-- High seq_tup_read with low idx_scan = missing index

-- EXPLAIN ANALYZE: see what the query planner actually does
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT * FROM orders WHERE customer_id = 42 AND status = 'PENDING';

-- Look for:
-- Seq Scan → full table scan (bad)
-- Index Scan → uses index (good)
-- Index Only Scan → no heap access needed (best)
-- Buffers: shared hit=X → data from cache (good)
-- Buffers: shared read=X → data from disk (expensive)

Covering Indexes (Index Only Scan)

-- Query that reads 3 columns from a large table:
SELECT id, status, total FROM orders WHERE customer_id = 42;

-- Regular index — fetches row from heap for each match
CREATE INDEX idx_orders_customer ON orders(customer_id);

-- Covering index — all needed columns IN the index
CREATE INDEX idx_orders_customer_covering
    ON orders(customer_id) INCLUDE (status, total);

-- EXPLAIN shows: Index Only Scan (no heap access)
-- For very frequent queries, covering indexes eliminate disk I/O entirely

Indexes That Hurt Performance

-- Too many indexes on a write-heavy table
-- Every INSERT/UPDATE must update all indexes
CREATE TABLE events (
    id BIGSERIAL PRIMARY KEY,
    user_id BIGINT,
    event_type VARCHAR(50),
    created_at TIMESTAMP,
    payload JSONB
);

-- DANGEROUS for insert-heavy table:
CREATE INDEX ON events(user_id);
CREATE INDEX ON events(event_type);
CREATE INDEX ON events(created_at);
CREATE INDEX ON events(user_id, event_type, created_at);  -- Plus composite
-- 4 indexes = 4 writes per INSERT → 4x write amplification

For time-series or event tables with 10K+ inserts/second, fewer indexes = better write throughput. Use PostgreSQL table partitioning + one index per partition instead.

Common Mistakes to Avoid

No index on columns used in ORDER BY — sorting without an index causes a filesort; for paginated queries, the ORDER BY column must be indexed
Indexing low-cardinality columns alone — CREATE INDEX ON orders(status) where status has 3 values covers 33% of rows per value; PostgreSQL often ignores it and does a seq scan
Not monitoring index usage — pg_stat_user_indexes shows how often each index is used; zero-use indexes waste write performance and should be dropped
CREATE INDEX without CONCURRENTLY in prod — locks the entire table for minutes; always use CONCURRENTLY

Summary

For Spring Boot JPA apps: index every foreign key column, get composite index column order right (equality columns first), use partial indexes for sparse conditions, and always create production indexes with CONCURRENTLY. Detect missing indexes with pg_stat_user_tables sequential scan analysis and EXPLAIN ANALYZE. Covering indexes eliminate heap access for hot read paths. The difference between a missing index and the correct one can be the difference between 2 seconds and 2 milliseconds.

Detect Missing Index Opportunities

JOptimize analyzes your JPA entities and repository queries to identify @ManyToOne relationships without indexes, unindexed ORDER BY columns in paginated queries, and missing composite index definitions.

IntelliJ Plugin — JPA index analysis inline: Install JOptimize for IntelliJ
Web Dashboard — full database performance audit: Analyze your project free →

Find missing indexes before your queries time out in production.