JavaScript Race Conditions & Function Throttling with Promises

In this blog post I outline a method to avoid race conditions and to throttle a function in JavaScript.

# Race conditions in JavaScript?

Wikipedia has a good definition of a race condition:

A race condition or race hazard is the condition of an electronics, software, or other system where the system's substantive behavior is dependent on the sequence or timing of other uncontrollable events. It becomes a bug when one or more of the possible behaviors is undesirable.

When I first started learning Node I thought race conditions weren't possible. My previous experience with race conditions had always been with multiple threads accessing a common resource. It was the order and timing of the operations that determined the outcome (for better or worse). I associated race conditions with multi-threaded environments.

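However, this is not the case. Node runs JavaScript on a single thread (the event loop), but asynchronous operations such as file system I/O and network requests are carried out outside that thread, by a thread pool or by the operating system. Their completion order is not under your control.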

This is where I encountered a race condition with Node. In one of my sites I have an HTTP service running on Node to generate a thumbnail on demand. The basic algorithm is:

  • download the original photo from a remote location and save it to disk;
  • render a thumbnail from the local copy and save it to disk; and
  • serve the thumbnail to the client.

Each operation is asynchronous and runs outside the event loop.

The race condition occurred when two requests for the same photo were made in rapid succession. Both requests started to download and save the photo, but one request completed before the other. The first request began generating the thumbnail while the second request was still writing the file. This caused the thumbnail generation to fail.

Here's how the code looked (simplified for brevity):

async function downloadAndCreateThumbnail(remotepath) {
	const localpath = await downloadAndSaveToDisk(remotepath)
	const thumbpath = await createThumbnail(localpath)
	return thumbpath
}
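To make the failure concrete, here is a minimal simulation of the race. It is only a sketch: the two helpers are hypothetical in-memory stand-ins for the real download and thumbnail functions, the extra `ms` parameter and the `writers` map are assumptions added for illustration, and timers take the place of real I/O.

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))

// Hypothetical stand-in for the disk: local path -> number of requests still writing it
const writers = new Map()

// Hypothetical stand-in: "writing" the file takes `ms` milliseconds
async function downloadAndSaveToDisk(remotepath, ms) {
	const localpath = `/tmp/${remotepath}`
	writers.set(localpath, (writers.get(localpath) || 0) + 1)
	await sleep(ms)
	writers.set(localpath, writers.get(localpath) - 1)
	return localpath
}

// Hypothetical stand-in: fails if any request is still writing the file
async function createThumbnail(localpath) {
	if (writers.get(localpath) > 0) {
		throw new Error(`cannot read ${localpath}: file is still being written`)
	}
	return `${localpath}.thumb`
}

async function downloadAndCreateThumbnail(remotepath, ms) {
	const localpath = await downloadAndSaveToDisk(remotepath, ms)
	return createThumbnail(localpath)
}

// Two overlapping requests for the same photo; the first download finishes sooner
function demo() {
	return Promise.allSettled([
		downloadAndCreateThumbnail('photo.jpg', 10),
		downloadAndCreateThumbnail('photo.jpg', 50),
	])
}

demo().then(results => console.log(results.map(r => r.status)))
// prints [ 'rejected', 'fulfilled' ]
```

The first request fails because it reaches the thumbnail step while the second request is still writing the same file.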

In other languages you might have encountered the synchronized keyword, or the concept of a mutex, lock, or semaphore to prevent separate threads from accessing a shared resource at the same time. What's important to understand is that most languages enforce this by blocking execution of the thread until the lock can be obtained. For example, in Java the synchronized keyword can be used:

public class MyObject {
	public synchronized void doSomething() {
		// do something that takes a long time
	}
}

// a single instance shared by both threads
MyObject obj = new MyObject();

// thread 1 - called first, runs immediately
obj.doSomething();

// thread 2 - called shortly after thread 1
// blocks until thread 1 completes, then continues
obj.doSomething();

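Blocking is not an option in Node, since all JavaScript runs on a single thread and blocking it would stall every other request. But because the operations are asynchronous, we can use promises to achieve something similar: a caller waits on a promise that resolves once the lock is free.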
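The idea can be sketched in a few lines. This is a minimal illustration of a promise-based lock, not the source of any particular npm package; the `SimpleLock` class and the `runTasks` helper are hypothetical names used only for this example.

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))

// Hypothetical minimal lock: callers wait on a promise instead of blocking a thread
class SimpleLock {
	constructor() {
		this.locked = false
		this.waiting = [] // resolve functions of callers waiting for the lock
	}

	acquire() {
		if (!this.locked) {
			this.locked = true
			return Promise.resolve()
		}
		// park the caller; its promise resolves when release() hands over the lock
		return new Promise(resolve => this.waiting.push(resolve))
	}

	release() {
		const next = this.waiting.shift()
		if (next) {
			next() // hand the lock directly to the next waiter
		} else {
			this.locked = false
		}
	}
}

// Three tasks start together, but their critical sections run one after another
async function runTasks() {
	const lock = new SimpleLock()
	const events = []
	const task = async name => {
		await lock.acquire()
		try {
			events.push(`${name} start`)
			await sleep(5)
			events.push(`${name} end`)
		} finally {
			lock.release()
		}
	}
	await Promise.all([task('a'), task('b'), task('c')])
	return events
}

runTasks().then(events => console.log(events.join(', ')))
// prints a start, a end, b start, b end, c start, c end
```

Nothing blocks here: a waiting caller is simply an unresolved promise, so the event loop stays free to do other work.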

A few semaphore, lock, and mutex packages exist on npm, but I decided to write my own. The result is @chriscdn/promise-semaphore, which can be found on npm and GitHub.

It can be used to create an exclusive lock on a block of code:

const Semaphore = require('@chriscdn/promise-semaphore')
const lock = new Semaphore()

async function downloadAndCreateThumbnail(remotepath) {
	// lock.acquire() returns a promise, which resolves once the lock is acquired
	await lock.acquire()

	try {
		const localpath = await downloadAndSaveToDisk(remotepath)
		const thumbpath = await createThumbnail(localpath)
		return thumbpath
	} finally {
		// release the lock and let the next asynchronous call to this function continue
		lock.release()
	}
}

This works, but the issue is that it only permits one operation at a time - even if different thumbnails are being requested.

This can be resolved by passing an optional key parameter to the lock.acquire() and lock.release() methods. This ensures only one operation for that key can run at a time. For example:

async function downloadAndCreateThumbnail(remotepath) {
	// resolve once the lock on remotepath is acquired
	await lock.acquire(remotepath)

	try {
		// ...
	} finally {
		// release the lock for remotepath
		lock.release(remotepath)
	}
}

No more errors generating thumbnails!
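For the curious, per-key locking can be sketched with a map of wait queues. This is an illustrative sketch under my own assumptions, not the package's actual implementation; `KeyedLock` and `demo` are hypothetical names, and timers stand in for the real download and thumbnail work.

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))

// Hypothetical keyed lock: one wait queue per key
class KeyedLock {
	constructor() {
		this.queues = new Map() // key -> resolve functions waiting on that key
	}

	acquire(key) {
		const queue = this.queues.get(key)
		if (!queue) {
			// nobody holds this key: take it immediately
			this.queues.set(key, [])
			return Promise.resolve()
		}
		return new Promise(resolve => queue.push(resolve))
	}

	release(key) {
		const queue = this.queues.get(key)
		if (!queue) return // key is not held: nothing to do
		const next = queue.shift()
		if (next) {
			next() // pass the key to the next waiter
		} else {
			this.queues.delete(key) // no waiters: forget the key so the map does not grow
		}
	}
}

// Two requests for a.jpg are serialised; the request for b.jpg is not held up
async function demo() {
	const lock = new KeyedLock()
	const events = []
	const task = async (key, label, ms) => {
		await lock.acquire(key)
		try {
			events.push(`${label} start`)
			await sleep(ms)
			events.push(`${label} end`)
		} finally {
			lock.release(key)
		}
	}
	await Promise.all([
		task('a.jpg', 'a1', 20),
		task('a.jpg', 'a2', 20),
		task('b.jpg', 'b1', 5),
	])
	return events
}

demo().then(events => console.log(events))
```

In the logged events, `b1` starts before `a1` ends, while `a2` starts only after `a1` has ended.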

# Function Throttling

After developing the mechanism for locking, I realised the same concept could be used to throttle (or rate limit) a function.

I have a personal project for viewing photos on a website. Part of the setup process is the extraction of the Exif data from each photo (e.g., width, height, caption, latitude, longitude, etc.) and saving the values in a database.

My initial implementation had an asynchronous function for reading a file and saving the values. It looked something like this:

const fsp = require('fs').promises
const files = [...] // potentially 10'000+ photo file paths 

files.forEach(async file => {
	let buf = await fsp.readFile(file)

	// extract the exif from buf and save it to the database 
})

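See the problem? forEach() does not wait for the async callback to finish, so this effectively calls readFile() 10'000+ times in rapid succession. Every file is read into memory at roughly the same time, which causes the memory consumption of the process to skyrocket.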
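A small experiment shows the effect. This is a sketch with a hypothetical `fakeReadFile` standing in for `fsp.readFile`, counting how many reads are in flight at once; it uses `map()` rather than `forEach()` only so the demo can wait for everything to finish, since both start the callbacks in the same way.

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))

// Returns the highest number of simultaneous "reads" seen while processing `files`
async function peakConcurrency(files) {
	let inFlight = 0
	let peak = 0

	// Hypothetical stand-in for fsp.readFile that tracks concurrent calls
	const fakeReadFile = async file => {
		inFlight++
		peak = Math.max(peak, inFlight)
		await sleep(5)
		inFlight--
		return Buffer.alloc(0)
	}

	// Each callback starts immediately; nothing waits for the previous one
	const pending = files.map(async file => {
		await fakeReadFile(file)
	})

	await Promise.all(pending)
	return peak
}

const files = Array.from({ length: 1000 }, (_, i) => `photo-${i}.jpg`)
peakConcurrency(files).then(peak => console.log(peak))
// prints 1000
```

All 1000 reads are in flight before the first one completes.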

My initial fix was to lock the process (as before) to limit the execution to one operation at a time.

const fsp = require('fs').promises
const Semaphore = require('@chriscdn/promise-semaphore')
const lock = new Semaphore()

const files =  ... // array containing potentially 10'000+ photo file paths 

files.forEach(async file => {
	// wait for the lock, so only one file is read at a time
	await lock.acquire()

	try {
		const buf = await fsp.readFile(file)
		// extract the exif from buf and save it to the database
	} finally {
		lock.release()
	}
})

This brought things under control, but I soon realised it would be better if the function could be throttled (or rate limited) to permit a limited number of concurrent operations. It was a simple change to the Semaphore constructor to accept a parameter to allow this:

const lock = new Semaphore(8) // allow up to 8 concurrent operations

With everything else being the same, the Exif extraction runs faster without any memory consumption spikes. Problem solved.
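The counting behaviour can be sketched as follows. Again, this is an illustration under my own assumptions rather than the package's source: `CountingSemaphore` and `peakWithLimit` are hypothetical names, and a short timer stands in for reading a file.

```javascript
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))

// Hypothetical counting semaphore: up to `max` holders at a time
class CountingSemaphore {
	constructor(max = 1) {
		this.available = max
		this.waiting = [] // resolve functions of callers waiting for a permit
	}

	acquire() {
		if (this.available > 0) {
			this.available--
			return Promise.resolve()
		}
		return new Promise(resolve => this.waiting.push(resolve))
	}

	release() {
		const next = this.waiting.shift()
		if (next) {
			next() // pass the permit straight to the next waiter
		} else {
			this.available++
		}
	}
}

// Starts `count` tasks at once and returns the highest number that ran concurrently
async function peakWithLimit(count, limit) {
	const semaphore = new CountingSemaphore(limit)
	let inFlight = 0
	let peak = 0

	const tasks = Array.from({ length: count }, async () => {
		await semaphore.acquire()
		inFlight++
		peak = Math.max(peak, inFlight)
		try {
			await sleep(1) // stand-in for reading and processing one file
		} finally {
			inFlight--
			semaphore.release()
		}
	})

	await Promise.all(tasks)
	return peak
}

peakWithLimit(100, 8).then(peak => console.log(peak))
// prints 8
```

All 100 tasks are created up front, but no more than 8 are ever past `acquire()` at the same time.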

You can install it with npm or find the source on GitHub. I hope you find it useful.