
Distributed Locks Microservices FAQ & Answers


A

Distributed locks coordinate access to a shared resource across services or instances so that only one holder acts on it at a time, which prevents race conditions. Typical use cases: leader election in clustered apps, coordination of scheduled (cron) jobs so that each job runs once, and inventory allocation (to prevent overselling). Common implementations: Redis Redlock, etcd lease-based locks, ZooKeeper ephemeral nodes, and PostgreSQL advisory locks. Avoid a distributed lock when a simpler mechanism is enough: database transactions, queue-based coordination (SQS, Kafka), or CRDTs where eventual consistency is acceptable.

A

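Redis Redlock: acquire the lock on a majority of N independent Redis nodes (N is typically 3 or 5; the algorithm description uses 5). Algorithm: generate a unique lock_id (for example a UUID) and record the start time; attempt SET lock_key lock_id NX PX 30000 on every node, using a short per-node timeout; then compute validity = lock_ttl - elapsed - drift, where drift allows for clock drift (reference implementations use lock_ttl * 0.01 + 2 ms, about 300 ms for a 30 s TTL). The lock is acquired only if a majority of nodes accepted the SET and validity is still positive; otherwise release on all nodes and retry after a random delay. Release with a Lua script: if redis.call('GET',KEYS[1]) == ARGV[1] then return redis.call('DEL',KEYS[1]) else return 0 end (this prevents releasing someone else's lock). Libraries: node-redlock, redlock-py, Redisson. Pros: low latency (typically a few milliseconds on a local network), and no single Redis node is a point of failure. Cons: correctness is disputed when clocks jump, processes pause, or the network partitions (see Martin Kleppmann's criticism), so do not rely on it alone where a second holder would corrupt data.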
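The acquisition rule above (majority plus remaining validity) can be sketched as follows. This is a minimal illustration, not a production client: FakeNode is a hypothetical in-memory stand-in for a Redis node so the example runs without a server, and redlock_acquire and redlock_release are names invented here. A real deployment would use one of the libraries listed above.

```python
import time
import uuid


class FakeNode:
    """Hypothetical in-memory stand-in for one independent Redis node."""

    def __init__(self):
        self.store = {}  # key -> (value, expiry on the monotonic clock)

    def set_nx_px(self, key, value, ttl_ms):
        # Mimics SET key value NX PX ttl_ms: succeed only if no live key exists.
        now = time.monotonic()
        current = self.store.get(key)
        if current is not None and current[1] > now:
            return False
        self.store[key] = (value, now + ttl_ms / 1000.0)
        return True

    def delete_if_equal(self, key, value):
        # Mimics the compare-and-delete Lua script used for release.
        current = self.store.get(key)
        if current is not None and current[0] == value:
            del self.store[key]
            return True
        return False


def redlock_release(nodes, key, lock_id):
    """Release on every node, including nodes where acquisition may have failed."""
    for node in nodes:
        node.delete_if_equal(key, lock_id)


def redlock_acquire(nodes, key, ttl_ms, drift_factor=0.01):
    """Return (lock_id, validity_ms) on success, or None on failure."""
    lock_id = str(uuid.uuid4())
    start = time.monotonic()
    acquired = sum(1 for node in nodes if node.set_nx_px(key, lock_id, ttl_ms))
    elapsed_ms = (time.monotonic() - start) * 1000.0
    drift_ms = ttl_ms * drift_factor + 2
    validity_ms = ttl_ms - elapsed_ms - drift_ms
    quorum = len(nodes) // 2 + 1
    if acquired >= quorum and validity_ms > 0:
        return lock_id, validity_ms
    # Not enough nodes, or too much time spent: undo any partial acquisition.
    redlock_release(nodes, key, lock_id)
    return None
```

The caller should finish its work within validity_ms, not within the full TTL, because that is the time the lock is still guaranteed to be held on a majority of nodes.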

A

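etcd: lease-based locking built on a compare-and-swap transaction. Create a lease (for example TTL 30 s), then acquire the lock in a transaction: if the key does not exist (its create revision is 0), put the key attached to the lease; otherwise fail or wait. A keep-alive extends the lease while the holder is alive. If the client crashes or disconnects, keep-alives stop and the key is deleted when the lease expires, so release is automatic but can take up to one TTL. The official Go client ships a lock recipe (the concurrency package), and clients also exist for Python and Java. Pros: strong consistency (Raft consensus), and the key's revision can serve as a fencing token. Cons: higher latency than Redis (often in the 10-20 ms range, depending on the deployment) and an etcd cluster to operate. Recommended where strong consistency matters (financial, inventory). Production config: lock TTL 10-30 seconds, retry with exponential backoff (100ms, 200ms, 400ms), acquisition timeout 5-10 seconds.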
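The retry policy in the production config (exponential backoff starting at 100 ms, bounded by an overall timeout) can be sketched independently of any client library. In this sketch try_acquire is a hypothetical callable that wraps whatever single-attempt lock transaction your etcd client provides and returns True on success; the function name, parameters, and the 1.6 s delay cap are assumptions made for illustration.

```python
import time


def acquire_with_backoff(try_acquire, timeout_s=5.0, base_delay_s=0.1,
                         max_delay_s=1.6, sleep=time.sleep,
                         clock=time.monotonic):
    """Call try_acquire() until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    delay = base_delay_s
    while True:
        if try_acquire():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline; double the delay up to the cap.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay_s)
```

Passing sleep and clock as parameters keeps the loop testable without real waiting. Adding random jitter to each delay is a common refinement when many clients contend for the same lock.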

A

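ZooKeeper: ephemeral sequential nodes pattern. Create an ephemeral sequential node under /locks (for example /locks/resource-0000000001), then list the children of /locks. If your node has the lowest sequence number, you hold the lock; otherwise set a watch on the node immediately preceding yours in sequence order and re-check when it is deleted. Watching only the predecessor avoids waking every waiter on each release. The node is removed automatically when the session expires. The Apache Curator library provides ready-made recipes. Pros: battle-tested (used by Kafka and HBase), automatic cleanup. Cons: complex setup, Java-centric ecosystem. Use for: leader election (contenders compete for a lock under a path such as /leader; the holder is the leader and runs the cron jobs or stream processing).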
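The decision each contender makes after listing the children can be sketched as a pure function. This is only the ordering logic, not a ZooKeeper client: lock_decision is a name invented here, and it assumes node names end in a hyphen followed by the zero-padded sequence number that ZooKeeper appends.

```python
def lock_decision(children, my_node):
    """Return (True, None) if my_node holds the lock, else (False, node_to_watch)."""

    def seq(name):
        # The sequence number is the zero-padded suffix after the last '-'.
        return int(name.rsplit("-", 1)[1])

    ordered = sorted(children, key=seq)
    index = ordered.index(my_node)
    if index == 0:
        return True, None
    # Watch only the immediate predecessor, not the current lock holder.
    return False, ordered[index - 1]
```

For example, with children resource-0000000001 through resource-0000000003, the owner of the third node watches the second one and re-runs the check when that node disappears.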

A

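PostgreSQL advisory locks: SELECT pg_try_advisory_lock(12345) returns true if the lock was acquired and returns false immediately if another session holds it. Locks are either session-level (held until pg_advisory_unlock is called or the session ends) or transaction-level (pg_try_advisory_xact_lock, released automatically at commit or rollback). Pros: simple if you already run PostgreSQL, no extra infrastructure, and transaction-level locks are released together with the transaction. Cons: the database becomes the coordination bottleneck, each held lock ties up a connection, and heavy use adds contention on the shared lock table. Use for: low-throughput coordination when the database is already present. Avoid for: high-throughput distributed systems where contention on the database would become the bottleneck.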
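Advisory lock functions take an integer key, so named resources need a mapping to a bigint. The sketch below shows one possible convention, not a PostgreSQL feature: advisory_key hashes the name to a signed 64-bit integer (collisions are unlikely but possible), and try_advisory_lock assumes a DB-API cursor from a driver that uses the %s parameter style, such as psycopg. Both function names are invented for this example.

```python
import hashlib


def advisory_key(name):
    """Map a resource name to a signed 64-bit integer (the bigint key type)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], byteorder="big", signed=True)


def try_advisory_lock(cursor, name):
    """Run pg_try_advisory_lock on a DB-API cursor; True if the lock was acquired."""
    cursor.execute("SELECT pg_try_advisory_lock(%s)", (advisory_key(name),))
    return bool(cursor.fetchone()[0])
```

Because this is the session-level variant, the connection that acquired the lock has to stay open for as long as the lock is needed, and the same connection has to call pg_advisory_unlock to release it.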

A

Challenges and mitigations:
(1) Lock expiry (the holder crashes or stalls while holding the lock): extend the TTL with a heartbeat (etcd or ZooKeeper keep-alive) and set TTL > worst-case processing time + buffer.
(2) Split-brain (a network partition or a long pause leaves two processes believing they hold the lock): use fencing tokens, a monotonically increasing number issued with each acquisition (ZooKeeper zxid or etcd revision); the protected resource checks the token and rejects operations that carry an older one.
(3) Deadlocks: use acquisition timeouts (try_lock with a 5-10 second timeout) and ordered locking (always acquire in a fixed order, such as alphabetical resource names).
(4) Performance: locks serialize operations, so throughput on a contended lock can fall by orders of magnitude (for example from 10K req/sec to 100 req/sec). Mitigations: minimize lock scope (lock per user_id, not globally), use optimistic locking (version fields, retry on conflict), or use queue-based coordination.
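The fencing-token check from item (2) happens on the resource side, not in the lock client. A minimal sketch, assuming the token is an integer that grows with each lock acquisition; FencedResource is a hypothetical in-memory stand-in for the storage system being protected.

```python
class FencedResource:
    """Storage that rejects writes carrying a token older than the newest seen."""

    def __init__(self):
        self.highest_token = -1
        self.value = None

    def write(self, token, value):
        # A stale holder (paused or partitioned) still carries its old token.
        if token < self.highest_token:
            return False
        self.highest_token = token
        self.value = value
        return True
```

If a client holding token 33 pauses, its lock expires, and another client writes with token 34, the late write with token 33 is rejected instead of overwriting newer data.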

A

2025 recommendation: etcd for strong consistency requirements (financial, inventory); Redis Redlock for low-latency, best-effort locks (cache invalidation, non-critical coordination); PostgreSQL advisory locks for simplicity when the database is already present. Monitoring: track P95 lock acquisition latency (<50ms is healthy), hold duration (alert if it exceeds TTL * 0.8), and contention rate. Production patterns: leader election (etcd election, Curator LeaderLatch for ZooKeeper), distributed cron (acquire the lock before running the job), resource allocation (lock before assigning limited resources such as IP addresses or license seats).
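The hold-duration alert (hold time above TTL * 0.8) can be sketched as a small context manager wrapped around the critical section. The name track_hold and the on_alert callback are assumptions for illustration; in practice the callback would emit a metric or log line to whatever monitoring system is in use.

```python
import time
from contextlib import contextmanager


@contextmanager
def track_hold(ttl_s, on_alert, threshold=0.8, clock=time.monotonic):
    """Measure how long the wrapped block runs; alert if it nears the lock TTL."""
    start = clock()
    try:
        yield
    finally:
        held = clock() - start
        if held > ttl_s * threshold:
            on_alert(held)
```

Usage: with track_hold(30.0, report): do_work(), where report and do_work are placeholders for your own alert function and critical section. With a 30 s TTL the callback fires when the work took longer than 24 s.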
