Elasticsearch Segment Merging FAQ & Answers

4 expert Elasticsearch Segment Merging answers researched from official documentation. Every answer cites authoritative sources you can verify.

Elasticsearch segment merging is an automatic background process that combines many small, immutable Lucene segments into fewer, larger segments within a shard. It serves three purposes:

1. Reclaim disk space by physically expunging soft-deleted documents (deletions are only recorded in .liv bitmap files; the data is not removed until a merge rewrites the segment).
2. Improve query performance by reducing segment count (searching 10 segments is faster than searching 100: fewer file handles, less per-segment overhead).
3. Lower memory pressure from per-segment state (each segment carries its own metadata and index structures that consume memory).

How it works: TieredMergePolicy, the default merge policy, monitors the shard's segment sizes and count, computes a logarithmic staircase budget of allowed segments based on index size, and selects merge candidates whenever the segment count exceeds that budget. The merge process reads the source segments, combines their inverted indexes while skipping deleted documents, writes a single merged segment to disk, and atomically switches the new segment in (dropping the old ones) once the fsync completes.
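A quick way to see this in action is to inspect a shard's segments directly. A minimal sketch (the index name my-index is illustrative; run the requests in Kibana Dev Tools or via curl):

# List every Lucene segment: size, live doc count, and soft-deleted docs
GET /_cat/segments/my-index?v&h=index,shard,segment,size,docs.count,docs.deleted

# Shard-level view with per-segment detail (committed/search flags, version)
GET /my-index/_segments

Watching docs.deleted shrink and segment counts drop after heavy indexing is the easiest way to observe automatic merging at work.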
TieredMergePolicy key parameters (all configurable per index):

1. index.merge.policy.max_merged_segment (default 5GB): caps the size of merged segments; segments already at this size are excluded from further automatic merging.
2. index.merge.policy.segments_per_tier (default 10): target segment count per tier; lower values mean fewer segments (better search performance, more merge overhead), higher values mean more segments (faster indexing, slower search).
3. index.merge.policy.floor_segment (default 2MB): segments below this size are treated as equally small for budgeting, so very small segments are aggressively merged away.
4. index.merge.policy.max_merge_at_once (default 10): maximum number of segments merged in a single operation.
5. index.merge.policy.expunge_deletes_allowed (default 10%): when an expunge-deletes merge runs, only segments with more than this percentage of deleted documents are rewritten to reclaim space.

Throttling: index.merge.scheduler.auto_throttle (default true) adaptively rate-limits merge I/O (starting around 20MB/s and adjusting) to prevent merges from degrading query performance. Threads: index.merge.scheduler.max_thread_count (default Math.max(1, Math.min(4, processors / 2))); higher values allow parallel merges but increase I/O contention. Examples of adjusting these settings follow this list.
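These are dynamic index settings, so they can be changed on a live index. A minimal sketch (the index name my-index and the values shown are illustrative, not recommendations):

PUT /my-index/_settings
{
  "index.merge.policy.segments_per_tier": 5,
  "index.merge.policy.max_merged_segment": "5gb",
  "index.merge.scheduler.max_thread_count": 1
}

# Verify only the merge-related settings
GET /my-index/_settings/index.merge.*

Lowering max_thread_count to 1 is a common choice for spinning disks, where concurrent merges thrash the drive; SSDs generally do fine with the default.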
Force merge: POST /my-index/_forcemerge?max_num_segments=1 rewrites the shard down to a single segment, which optimizes searches but produces a segment too large to participate in future automatic merges. When to use:

1. Read-only indices: completed time-series data (last month's logs, archived data) and historical indices no longer receiving writes.
2. Snapshot optimization: reducing segment count before a snapshot means fewer files and faster restores.
3. Delete expunging: forcibly reclaim disk space from soft-deleted documents.

When NOT to use:

1. Actively written indices: force merge creates 5GB+ segments that cannot participate in automatic merges (TieredMergePolicy excludes segments larger than max_merged_segment), so they accumulate deletions over time and degrade performance.
2. Frequent force merges: these cause write amplification (bytes processed far exceed the final index size) and sustained high I/O and CPU.

Critical warning: force merging actively written indices causes long-term performance problems. Production best practice (2025): let the automatic TieredMergePolicy handle routine merging, and use force merge only on read-only indices (see the request sketch below).
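A minimal sketch of both force-merge variants (the index name logs-2024.12 is illustrative; _forcemerge and its parameters are the real API):

# Block writes first so no new segments appear mid-merge
PUT /logs-2024.12/_settings
{
  "index.blocks.write": true
}

# Collapse each shard to a single segment (read-only indices only)
POST /logs-2024.12/_forcemerge?max_num_segments=1

# Alternative: rewrite only segments whose deleted-doc ratio exceeds
# the expunge_deletes_allowed threshold, to reclaim space
POST /logs-2024.12/_forcemerge?only_expunge_deletes=true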
Performance impact: merging is write-amplification intensive; the ratio of bytes processed to final index size is commonly 3-5x (index 1GB of data, and merging processes 3-5GB in total over time). Merge storms occur under heavy indexing (10M+ docs/hour) when many small segments are created faster than they can be merged, causing sustained high I/O and CPU. Tradeoff: a higher refresh_interval (default 1s; increase to 5s-30s) creates fewer, larger segments and so reduces merge frequency, but new documents become visible to search more slowly. Typical merge cycle: indexing 1M docs creates ~100 small segments; automatic merges combine them into ~10 segments over roughly 5 minutes, and query performance can improve as much as 10x. Monitor merge activity: GET /_stats reports merges.current (active merges) and merges.total (lifetime merges); GET /_cat/indices?v&h=index,merges.current,merges.total gives a per-index view. For write-heavy workloads: increase index.refresh_interval toward 30s to reduce the segment creation rate, raise max_thread_count for parallel merges if there is I/O headroom, and spread shards across nodes and data tiers so merge I/O is distributed. A monitoring sketch follows.
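A minimal monitoring-and-tuning sketch (the index name my-index and the 30s interval are illustrative):

# Cluster-wide merge counters (current, total, total_time, throttled time)
GET /_stats/merge?filter_path=_all.total.merges

# Per-index merge activity at a glance
GET /_cat/indices?v&h=index,merges.current,merges.total,merges.total_time

# Reduce segment churn on a write-heavy index
PUT /my-index/_settings
{
  "index.refresh_interval": "30s"
}

A sustained nonzero merges.current alongside rising indexing latency is the usual signature of a merge storm; raising refresh_interval is typically the first lever to pull.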