
Avoiding Thundering Herd in Memcached

This article might also be interpreted as “locking shared resources using memcache”.

A common problem with sites that use memcached is the worry that when memcached is restarted the cache is empty; a highly-scaled application will then hit the cold cache, find nothing, and many threads will all simultaneously make the same heavy computed lookups (typically in a database), bringing the whole site to a halt. This is called the “Thundering Herd” problem.

There are two worries about “Thundering Herd”. The first, and the one considered by this article, is where a complex and expensive database query is used to calculate a cached value; you don’t want 100 users simultaneously requesting this value soon after you start your application, resulting in that complex database query being executed 100 times when a single execution will do the job. Better to wait for that single query to complete and populate the cache; all the other users then get the response near instantly afterwards. The second worry is that an empty cache will need filling from many different types of queries – this article does not help with that problem – but pre-filling a cache can be difficult (the needed cached values may not be determinable ahead of time) and will take time in any case.

A common recommendation to avoid this is to create “warm-up” scripts that pre-fill memcached before letting the application loose on it.

I think there’s another way: access control via memcached itself. The locks are cheap and access is fast (compared to doing a database lookup), so you can build these techniques into your application and forget about Thundering Herd.

This works by locking a memcached key using another memcached key, and putting your application to sleep (polling) until the lock is released if another process holds it. This way, if two clients attempt to access the same key, one can discover it is missing/empty, do the expensive computation once, and then release the lock; the other client will wake up and use the pre-computed value.

If You Can Get The Result From The ADD Operation

We can use the fact that the ADD operation is atomic: either it succeeds because the key didn’t exist before, or it fails because the key already exists. We set an expiry time of 10 seconds on the lock in case the process crashes before releasing it, ensuring that the lock eventually times out and disappears.
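
To see this atomicity in action, here is a quick Python illustration (a minimal sketch assuming the pymemcache client and a memcached server on 127.0.0.1:11211 – any client that reports the result of ADD will do):

from pymemcache.client.base import Client

client = Client(("127.0.0.1", 11211))

# the first ADD succeeds because the key does not exist yet
print(client.add("report_lock", b"1", expire=10, noreply=False))  # True

# the second ADD fails atomically because the key now exists
print(client.add("report_lock", b"1", expire=10, noreply=False))  # False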

function get_cache_or_calculate( memcache_key, compute_callback, lifetime ) {
  // wait to acquire the lock key: ADD it if it doesn't exist
  // (value 1, 10 second lifetime in case we crash while holding it)
  while ( ADD( memcache_key + "_lock", 1, 10 ) == false ) {
    sleep( 100ms ); // sleep a little bit then check again
  }

  // we hold the lock
  var value = GET( memcache_key );
  if ( value != null ) {
    // release lock
    DELETE( memcache_key + "_lock" );
    return( value );
  }

  // key did not exist in memcache, compute and store it
  value = compute_callback();
  SET( memcache_key, value, lifetime );

  // release lock
  DELETE( memcache_key + "_lock" );
  return( value );
}

function set_cache( memcache_key, value, lifetime ) {
  // wait to acquire the lock key: ADD it if it doesn't exist
  // (value 1, 10 second lifetime)
  while ( ADD( memcache_key + "_lock", 1, 10 ) == false ) {
    sleep( 100ms ); // sleep a little bit then check again
  }

  // we hold the lock
  SET( memcache_key, value, lifetime );

  // release lock
  DELETE( memcache_key + "_lock" );
}
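
For reference, here is a minimal runnable sketch of the ADD-based approach in Python. I am assuming the pymemcache client and a memcached server on 127.0.0.1:11211; the function name is mine and the cached value is assumed to be bytes or a string:

import time

from pymemcache.client.base import Client

client = Client(("127.0.0.1", 11211))

def get_cache_or_calculate(key, compute_callback, lifetime):
    lock_key = key + "_lock"
    # spin until our ADD wins; noreply=False makes add() report success/failure
    while not client.add(lock_key, b"1", expire=10, noreply=False):
        time.sleep(0.1)  # sleep a little bit then check again
    try:
        value = client.get(key)
        if value is not None:
            return value
        # key did not exist in memcache, compute and store it
        value = compute_callback()
        client.set(key, value, expire=lifetime)
        return value
    finally:
        client.delete(lock_key)  # release lock even if the callback raises

Usage might look like value = get_cache_or_calculate( "report", build_report, 300 ), where build_report is your expensive database query.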

If You Cannot Get The Result From The ADD Operation

function get_cache_or_calculate( memcache_key, compute_callback, lifetime ) {
  // wait to acquire the lock key
  while ( true ) {
    // (re)create the lock counter if absent or expired
    // (value 0, 10 second lifetime)
    ADD( memcache_key + "_lock", 0, 10 );

    var lock_count = INCR( memcache_key + "_lock" );
    if ( lock_count == 1 )
      break; // we have acquired the lock!

    // somebody else has the lock
    DECR( memcache_key + "_lock" );
    sleep( 100ms );
  }

  // we hold the lock
  var value = GET( memcache_key );
  if ( value != null ) {
    // release lock
    DELETE( memcache_key + "_lock" );
    return( value );
  }

  // key did not exist in memcache, compute and store it
  value = compute_callback();
  SET( memcache_key, value, lifetime );

  // release lock
  DELETE( memcache_key + "_lock" );
  return( value );
}

function set_cache( memcache_key, value, lifetime ) {
  // wait to acquire the lock key
  while ( true ) {
    // (re)create the lock counter if absent or expired
    // (value 0, 10 second lifetime)
    ADD( memcache_key + "_lock", 0, 10 );

    var lock_count = INCR( memcache_key + "_lock" );
    if ( lock_count == 1 )
      break; // we have acquired the lock!

    // somebody else has the lock
    DECR( memcache_key + "_lock" );
    sleep( 100ms );
  }

  // we hold the lock
  SET( memcache_key, value, lifetime );

  // release lock
  DELETE( memcache_key + "_lock" );
}
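
Here is the equivalent Python sketch of the INCR-based variant, again assuming pymemcache (whose incr() returns the new counter value, or None if the key is missing); pymemcache can in fact report the ADD result, so this is purely to illustrate the counter technique:

import time

from pymemcache.client.base import Client

client = Client(("127.0.0.1", 11211))

def acquire_lock(lock_key):
    while True:
        # (re)create the counter if absent or expired: value 0, 10 second lifetime
        client.add(lock_key, b"0", expire=10, noreply=False)
        count = client.incr(lock_key, 1)
        if count == 1:
            return  # we have acquired the lock!
        if count is not None:
            client.decr(lock_key, 1)  # somebody else has the lock; undo our INCR
        time.sleep(0.1)

def get_cache_or_calculate(key, compute_callback, lifetime):
    lock_key = key + "_lock"
    acquire_lock(lock_key)
    try:
        value = client.get(key)
        if value is not None:
            return value
        value = compute_callback()
        client.set(key, value, expire=lifetime)
        return value
    finally:
        client.delete(lock_key)  # release lock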

What if the process holding the key dies?

To protect against this, a timeout/expiry of a reasonable value – say 10 seconds – should be set on every lock key. It would be painful for the particular customer who has to wait out this period, but it should happen only once.

This involves 3 memcached calls instead of 1 for a read! Isn’t that inefficient?

Yes, it is inefficient. However, it is still a lot faster than making a single SELECT call to a database.

Why do you DELETE instead of DECR the lock after you’re finished with it?

Because we want the next acquirer of the lock to set a fresh timeout on it. One of the waiting processes will ADD the key (it doesn’t matter which) just before the same process, or another one, acquires the lock (gets a value of 1 from INCR). Any others will get a result greater than 1 from the INCR operation and go to sleep before re-testing.

If we merely DECRed the lock upon finishing with it, the original timeout would still be ticking – leaving us with the possibility that another process acquires the old lock, the key expires while that lock is in use, and a further process then creates a new lock, resulting in a collision.

A sleep time of 100ms? Won’t that cause problems?

Potentially a variation of “live lock” can occur. Say you have 100 threads all wanting to get a common cached value; they all try to ADD a lock only to find it already exists. So they all go to sleep for 100ms. Then they all wake up at the same time and only one gets to acquire the lock. Now you have 99 threads that go to sleep for 100ms. If this continues in this fashion then the last thread to acquire the lock will have waited 10 seconds (100 × 0.1s = 10s).

Such a situation is fairly unlikely, as slight variations in execution time will mean that, in all probability, the wake-up times will vary by a few milliseconds. The sleep time could be reduced to, say, 25ms – but this may put additional CPU load (and network traffic to memcached, though slight) on your system.

A remedial action could be to specify a pseudo-random number as the sleep time – greatly increasing the probability that different threads will wake up at different times and reducing the possibility of contention with one another – if this worries you enough.
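
For example (illustrative Python using the standard random module):

import random
import time

# sleep somewhere between 50ms and 150ms so waiting threads wake at different times
time.sleep(random.uniform(0.05, 0.15))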
