@periodic/
arsenic
redis_wait
⚠️ Warning

Blocks until replicas acknowledge writes — latency determined by replication lag

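WAIT blocks the client until the specified number of replicas have acknowledged the write commands previously sent on the same connection, or until the timeout expires. It returns the number of replicas that acknowledged, which can be lower than the number requested if the timeout is hit. Compared with fire-and-forget writes, it narrows the window for data loss on failover by confirming replication before returning, but an acknowledgement means the replica received the write, not that it persisted it to disk. The latency cost is the round-trip time from master to replica plus network jitter: typically 1–10ms on a healthy cluster, but limited only by the timeout when replication lag is high.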
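To make the acknowledgement rule concrete, here is a simplified model of how the reply is counted. It is a sketch of the documented behavior, not Redis source code: `countAcks` and its parameters are hypothetical names, and offsets stand for replication offsets in bytes.

```typescript
// Simplified model (assumption): a replica counts as "acknowledged" once the
// offset it has confirmed is at or past the offset of the calling
// connection's most recent write.
function countAcks(lastWriteOffset: number, replicaAckOffsets: number[]): number {
  return replicaAckOffsets.filter((acked) => acked >= lastWriteOffset).length;
}

// WAIT can return without blocking when enough replicas are already caught up.
function alreadySatisfied(
  requiredReplicas: number,
  lastWriteOffset: number,
  replicaAckOffsets: number[]
): boolean {
  return countAcks(lastWriteOffset, replicaAckOffsets) >= requiredReplicas;
}
```

In this model a connection that has sent no writes has nothing to wait for, so every connected replica counts immediately. That is why WAIT is only meaningful on the connection that issued the writes.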

Common Causes

  • Strong consistency requirements where data loss is unacceptable on failover
  • Financial, audit, or compliance data written to Redis before the primary database
  • Leader-election patterns that require confirmed propagation before proceeding
  • Multi-region deployments with high replication lag calling WAIT synchronously

How to Fix

  1. Set a finite timeout. A timeout of 0 blocks indefinitely, so never use it on a hot path
  2. Check the return value: WAIT returns the actual number of replicas that acknowledged, so handle partial acknowledgement
  3. Move WAIT to background confirmation jobs for durability checks that are not latency-critical
  4. For critical data, use your primary database (PostgreSQL etc.) for durability instead of Redis
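One way to keep the timeout finite is to derive it from the time left in the request. The helper below is a minimal sketch under that assumption; `waitTimeoutFor`, `remainingBudgetMs`, and the 100ms default cap are hypothetical and not part of any Redis client.

```typescript
// Returns a WAIT timeout in whole milliseconds, or null when WAIT should be
// skipped. It never returns 0, because WAIT treats 0 as "block forever".
function waitTimeoutFor(remainingBudgetMs: number, capMs = 100): number | null {
  const timeout = Math.floor(Math.min(remainingBudgetMs, capMs));
  return timeout >= 1 ? timeout : null;
}
```

A caller would skip WAIT (and perhaps log the skip) when the result is `null`, rather than passing an exhausted budget through as 0.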

WAIT adds replication latency to your request

WAIT latency is determined by your replication setup, not your Redis instance. High replication lag — caused by network issues, a slow replica, or heavy write traffic — directly adds to request latency. Always set a timeout and handle the case where fewer replicas acknowledged than requested.
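Since lag drives WAIT latency, it can help to look at lag directly. The sketch below parses the text a master returns for `INFO replication` and reports how far each replica is behind. The function and type names are hypothetical, and it assumes the usual `slaveN:ip=...,state=...,offset=...,lag=...` and `master_repl_offset:` lines.

```typescript
interface ReplicaLag {
  name: string;        // e.g. "slave0"
  state: string;       // e.g. "online"
  bytesBehind: number; // master_repl_offset minus the replica's offset
  lagSeconds: number;  // seconds since the replica's last acknowledgement
}

function parseReplicaLag(info: string): ReplicaLag[] {
  let masterOffset = NaN;
  const replicaLines: Array<{ name: string; value: string }> = [];

  for (const rawLine of info.split(/\r?\n/)) {
    const sep = rawLine.indexOf(':');
    // Skip blank lines and section headers such as "# Replication".
    if (sep <= 0 || rawLine.charAt(0) === '#') continue;
    const key = rawLine.slice(0, sep);
    const value = rawLine.slice(sep + 1).trim();
    if (key === 'master_repl_offset') masterOffset = Number(value);
    else if (/^slave\d+$/.test(key)) replicaLines.push({ name: key, value });
  }

  return replicaLines.map(({ name, value }) => {
    // Each replica line is a comma-separated list of key=value pairs.
    const kv: Record<string, string> = {};
    for (const pair of value.split(',')) {
      const eq = pair.indexOf('=');
      if (eq > 0) kv[pair.slice(0, eq)] = pair.slice(eq + 1);
    }
    return {
      name,
      state: kv['state'] || 'unknown',
      bytesBehind: masterOffset - Number(kv['offset']),
      lagSeconds: Number(kv['lag']),
    };
  });
}
```

A replica with a large `bytesBehind` or `lagSeconds` is a likely reason for WAIT calls that run up to their timeout.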

Example

typescript
// BAD — infinite wait, replication lag becomes request latency
await redis.set('account:123:balance', newBalance);
await redis.wait(1, 0); // blocks until 1 replica acks, no timeout

// GOOD — bounded timeout with acknowledgement check
async function writeWithReplicationCheck(
  redis: Redis,
  key: string,
  value: string,
  requiredReplicas = 1
) {
  await redis.set(key, value);
  const ackedReplicas = await redis.wait(requiredReplicas, 100); // 100ms timeout

  if (ackedReplicas < requiredReplicas) {
    // Log the partial acknowledgement but do not fail the request
    logger.warn('redis.replication.partial', {
      key,
      required: requiredReplicas,
      acked: ackedReplicas,
    });
  }
}

// GOOD — durability-critical writes backed by primary DB
// For financial data, write to PostgreSQL first (durable), then Redis (cache)
await db.transaction(async (tx) => {
  await tx.accounts.update({ where: { id }, data: { balance: newBalance } });
});
await redis.set(`account:${id}:balance`, newBalance, 'EX', 300); // cache, not source of truth
// No WAIT needed — PostgreSQL is the durable store

// GOOD — background replication confirmation for audit
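// WAIT only covers writes sent earlier on this same connection,
// so pass the client that issued the writes being audited.
async function backgroundReplicationAudit(redis: Redis) {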
  const ackedCount = await redis.wait(2, 5000); // 2 replicas, 5 second window
  if (ackedCount < 2) {
    alertOps('Redis replication behind', { ackedCount });
  }
}