JavaScript Race Conditions & Function Throttling with Promises

In this blog post I outline a method to avoid race conditions and to throttle a function in JavaScript.

# Race conditions in JavaScript?

Wikipedia has a good definition of a race condition:

> A race condition or race hazard is the condition of an electronics, software, or other system where the system's substantive behavior is dependent on the sequence or timing of other uncontrollable events. It becomes a bug when one or more of the possible behaviors is undesirable.

When I first started learning Node, I thought race conditions weren't possible. My previous experience with race conditions had always involved multiple threads accessing a common resource, where the order and timing of the operations determined the outcome (for better or worse). In short, I associated race conditions with multi-threaded environments.

However, this is clearly not the case. Node uses a single thread for the event loop, but has other threads for asynchronous operations such as file system I/O and network requests.

This is where I encountered a race condition with Node. In one of my sites I have an HTTP service running on Node to generate a thumbnail on demand. The basic algorithm is:

- download the original photo from a remote location,
- save the photo to disk to cache a local copy,
- render a thumbnail from the local copy and save it to disk, and
- serve the thumbnail to the client.

Each operation is asynchronous and runs outside the event loop.

The race condition occurred when two requests for the same photo were made in rapid succession. Both requests started a download to save the photo, but one request completed before the other. The first request began generating the thumbnail while the second request was still writing to the same file. This caused the thumbnail generation to fail.

Here's how the code looked (simplified for brevity):

```js
async function downloadAndCreateThumbnail(remotepath) {
  const localpath = await downloadAndSaveToDisk(remotepath);
  const thumbpath = await createThumbnail(localpath);
  return thumbpath;
}
```
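To make the race concrete, here's roughly how two rapid requests for the same photo collide (the calls below are illustrative; in the real service they come from separate HTTP requests):

```js
const remotepath = "https://example.com/photos/IMG_0042.jpg"; // hypothetical path

// Request 1 and request 2 arrive back to back. Neither call waits for the
// other, so both start downloading and writing to the same local file.
downloadAndCreateThumbnail(remotepath);
downloadAndCreateThumbnail(remotepath);

// Whichever download finishes first starts rendering the thumbnail while the
// other is still writing to the same file, and the render fails.
```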

In other languages you might have encountered the synchronized keyword, or the concept of a mutex, lock, or semaphore to prevent separate threads from accessing a shared resource at the same time. What's important to understand is that most languages enforce this by blocking execution of the thread until the lock can be obtained. For example, the synchronized keyword can be used in Java:

```java
public class MyObject {
	public static synchronized void doSomething() {
		// do something that takes a long time
	}
}

// thread 1 - called first, runs immediately
MyObject.doSomething();

// thread 2 - called shortly after thread 1
// thread is blocked, continues after thread 1 completes
MyObject.doSomething();
```

Blocking is not an option in Node, since the event loop runs on a single thread. But because everything is asynchronous, we can use promises to achieve something similar.
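To illustrate the idea, here's a minimal sketch of a promise-based mutex (not the implementation of the package below): each caller waits on the tail of a promise chain and receives a release function that unblocks the next caller.

```js
// A minimal promise-chain mutex sketch.
class Mutex {
  constructor() {
    this._last = Promise.resolve();
  }

  acquire() {
    let release;
    const next = new Promise((resolve) => (release = resolve));
    // The caller may proceed once every earlier acquire() has been released.
    const ready = this._last.then(() => release);
    this._last = next;
    return ready;
  }
}

// Usage: only one caller at a time runs the critical section.
const mutex = new Mutex();

async function critical() {
  const release = await mutex.acquire();
  try {
    // ... work that must not overlap ...
  } finally {
    release();
  }
}
```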

A few semaphore, lock, and mutex packages exist on npm, but I wanted to write my own. The result is @chriscdn/promise-semaphore, which can be found on npm and GitHub.

It can be used to create an exclusive lock on a block of code:

```js
const Semaphore = require("@chriscdn/promise-semaphore");
const semaphore = new Semaphore();

async function downloadAndCreateThumbnail(remotepath) {
  try {
    // semaphore.acquire() returns a promise, which resolves once the lock is acquired
    await semaphore.acquire();

    const localpath = await downloadAndSaveToDisk(remotepath);
    const thumbpath = await createThumbnail(localpath);
    return thumbpath;
  } finally {
    // release the lock and let the next asynchronous call to this function continue
    semaphore.release();
  }
}
```

This works, but the issue is that it only permits one operation at a time, even if different images are being requested.

This can be resolved by passing an optional key parameter to the semaphore.acquire() and semaphore.release() methods. This ensures only one operation for that key can run at a time. For example:

```js
async function downloadAndCreateThumbnail(remotepath) {
  try {
    // resolve once the lock on remotepath is acquired
    await semaphore.acquire(remotepath);
    // ...
  } finally {
    // release the lock for remotepath
    semaphore.release(remotepath);
  }
}
```
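With a key, calls for the same photo still queue behind each other, but requests for different photos proceed in parallel. For example (the paths here are illustrative):

```js
downloadAndCreateThumbnail("photos/alps.jpg");    // runs immediately
downloadAndCreateThumbnail("photos/alps.jpg");    // waits for the first alps.jpg call
downloadAndCreateThumbnail("photos/zermatt.jpg"); // different key, runs immediately
```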

No more errors generating thumbnails!

# Function Throttling

After developing the mechanism for locking, I realised the same concept could be used to throttle (or rate limit) a function.

I have a personal project for viewing photos on a website. Part of the setup process is the extraction of the Exif data from each photo (e.g., width, height, caption, latitude, longitude, etc.) and saving the values to a database.

My initial implementation had an asynchronous function for reading a file and saving the values. It looked something like this:

```js
const fsp = require('fs').promises
const files = [/* ... */] // potentially 10'000+ photo file paths

files.forEach(async file => {
	const buf = await fsp.readFile(file)

	// extract the exif from buf and save it to the database
})
```

See the problem? This effectively calls readFile() 10'000+ times in rapid succession, which causes the memory consumption of the process to skyrocket.

My initial fix was to lock the process (as before) to limit the execution to one operation at a time.

```js
const fsp = require('fs').promises
const Semaphore = require('@chriscdn/promise-semaphore')
const semaphore = new Semaphore()

const files = [/* ... */] // array containing potentially 10'000+ photo file paths

files.forEach(file => {
	semaphore.acquire()
		.then(() => fsp.readFile(file))
		.then(buf => {
			// extract the exif from buf and save it to the database
		})
		.finally(() => semaphore.release())
})
```

This brought things under control, but I soon realised it would be better if the function could be throttled to permit a limited number of concurrent operations. It was a simple change to the Semaphore constructor to accept a parameter to allow this:

```js
const semaphore = new Semaphore(8); // allow up to 8 concurrent operations
```

With everything else being the same, the Exif extraction runs faster without any memory consumption spikes. Problem solved.
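As a closing note, if you also need to know when every file has been processed, the same pattern fits naturally inside Promise.all. This is a sketch of how I'd wrap it (the function name is made up for illustration):

```js
async function extractAllExif(files) {
  await Promise.all(
    files.map(async (file) => {
      // at most 8 readFile/extract operations run at any moment
      await semaphore.acquire();
      try {
        const buf = await fsp.readFile(file);
        // extract the exif from buf and save it to the database
      } finally {
        semaphore.release();
      }
    })
  );
}
```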

You can install it with npm or find the source on GitHub. I hope you find it useful.