Dutch national flag sorting problem

I was asked this question a few years ago in an interview for one of Cisco subsidiaries. To be honest, I did not make it, but I loved the question, and as soon as I got back from the interview, I sat down until I came with an acceptable solution. Here is the question, and of course, solution explained in details.

For this problem, your goal is to sort an array of 0, 1 and 2’s but you must do this in place, in linear time (O(n)) and without any extra space (such as creating an extra array). This is called the Dutch national flag sorting problem (indeed when I was asked the question, they replaced the numbers by the Dutch flag colors). For example, if the input array is [2,0,0,1,2,1] then output should be [0,0,1,1,2,2].

This questions has a simpler version, where you need to sort an array containing only two different numbers, 0s and 1s for example. Let’s start with it.

One very simplistic approach is two pass a first time and holding a counter for every number, then passing again and set as many 0s as counted, and as many 1s as counted, like this:

void simpleSort(int *arr, unsigned int size)
{
	int i = 0;
	unsigned int zeros = 0, ones = 0;

	if (arr == nullptr || size == 0) {
		cout << "invalid arguments" << endl;
		return;
	}

	/* Counting the 1s and 0s */
	for (i = 0; i < size; ++i) {
		if (arr[i] == 0){
			++zeros;
		}
		else if (arr[i] == 1) {
			++ones;
		}
	}

	/* Setting 0s and 1s at their right place,
           according to counters */
	for (i = 0; i < zeros; ++i) {
		arr[i] = 0;
	}
	for (int j = 0; j < ones; ++j) {
		arr[i + j] = 1;
	}
}

Though this solution works, and of course it will also work with 3 or more different numbers, it is far from being satisfactory. Why? Let’s imagine we have to sort an array containing different instantiations of a Student class. One of the attribute of this class is a Boolean exam_status, indicating whether the student passed or failed his exam.
Sorting this kind of array, using the exam_status as sorting criteria, means that real objects need to be copied to their new place, not only insignificant numbers.

Let’s start with a tip: swapping

This tip says one very simple thing: whenever you need to sort a collection in-place, start thinking how you can solve it with swapping. Swapping simply means switching between two object memory locations (remember the swap function the CS professor taught us when first time talking about pointers??).

So here, the idea is pretty simple: we define two pointers, one pointing on the start of the array – left_ptr, and the other at its end – right_ptr. Those two pointers will move one to the other until they finally meet. At each step, left_ptr will move forward until it reaches a 1, and right_ptr will move backward until it reaches a 0. At this step, all we have to do is to switch the content of left_ptr and right_ptr, see:

/* Sort an int array containing only 1s and 0s */
void sortOnesZeros(int intArray[], unsigned int size)
{
	int *left_ptr = &intArray[0], *right_ptr = &intArray[size - 1];

	while(left_ptr < right_ptr) {
		while(*left_ptr == 0 && left_ptr < right_ptr) {
			++ left_ptr;
		}
		while(*right_ptr == 1 && left_ptr < right_ptr){
			--right_ptr;
		}
		swap(left_ptr, right_ptr);
	}
}

Now we understood the idea for sorting an array with 2 numbers (0 and 1), let’s move on to our initial question. How can we do the same when the array contains 3 different numbers: 0, 1 and 2 ? Here again, swapping is the key. It will allow us to solve the problem in place.

First way to do it, is simply by passing over the array two times, with two pointers. First time, let’s say you can move the 2s only to the right side of the array.
When it’s done, right pointer is one index less than the most left 2.
Now we reinitialize the left pointer at the start of the array again, and restart the process descrived above for sorting 0s and 1s. Here’s the situation:

Pointers after 2s were moved to the right

See the code:

void swap (int *left, colors_t *right)
{
	int temp;
	temp = *left;
	*left = *right;
	*right = temp;
}

void sortDutchFlag(int arr[], unsigned int size)
{
	int* left = arr, *right = arr + size - 1;

	/* Take all the 2s to the right */
	while (left < right){
		/* Find next 2 */
		while(*left != 2 && left < right) {
			++left;
		}
		/* Skips already placed 2 */
		while(*right == 2  && left < right){
			--right;
		}
		/* Swaps for placing the 2 at its place */
		swap (left, right);
	}
	--right;

	/* Reinitialize  left pointer to the start */
	left = arr;

	/* Order 0s and 1s */
	while (left < right){
		/* Find next 1 */
		while(*left == 0 && left < right) {
			++ left;
		}
		/* Find next 0 */
		while(*right == 1  && left < right){
			--right;
		}
		/* Swaps between 0 and 1 */
		swap (left, right);
	}
} 

Pay attention to following code line (line 23 above):

--right;

This decrement operation makes sure left pointer is correctly placed after we sorted the 2s to their place, meaning one element down the first sorted 2 (see figure above, green pointer).

Alright! This technique works well and it is even efficient and very simple to implement and understand. Still, the optimal and elegant solution is one step further. Be reassured, the idea is still the same, with some improvement, but it is still simple enough…that’s the beauty!!

Let’s try with the hands…

Imagine you have balls in an array, of 3 different colors, let’s say Red, Green and Blue (RGB…) and you want to sort them (red first, then green, then blue) in the most efficient way. How would you do? Probably you would grab the first one (index = 0) and try to figure out what to do with it.
1) Let’s say it is a blue ball. What should we do with it?
Maybe we should look at the right side and look for the biggest index (right side) not already containing a blue ball!! Then just permute them.
2) If it is it red or green, it seems we should not do anything. Because if it is red, it means it is already placed. And if it is green…well if it is green, its place is not left and not right. So we would just let it at its current place.

Now you move on to the next and second ball, and it is now green. As we stated above, when it is green, we have nothing to do with it, just move on until you grab a red, or a blue. If for example, the next non-green ball is a red ball, you know that its place is at the lowest position (left side) not already containing a red ball.
So what you are basically doing, is to passing the balls one after the other, to the left or to the right according to the color. Green balls are automatically sorted as we move the red and blue balls to their appropriate place.

The algorithm

For this algorithm, we need not only 2 but 3 pointers: let’s call them red, green and blue pointers. Red and green are initialized to point to the first element of the array, and will move forward, while blue pointer, is initialized to point to the last element and will move backward.
Green pointer is the one which will be incremented in our loop (it is the hand grabbing the balls we talked about above, picking the balls one after the other, figuring out where to place them). According to the green pointer value, swaps will be done towards the start or the end of the array. Red and blue pointers hold the current places where 0s or 2s should be placed next.

Here is pseudo-code for:

for green_ptr in array and green_ptr < blue_ptr:
  if value at green_ptr == 1
    ++green_ptr
  else if value at green_ptr == 0
    swap(green_ptr, red_ptr)
    ++green_ptr
    ++red_ptr
  else // 2
    swap(green_ptr, blue_ptr)
    ++blue_ptr

Note that green pointer is incremented only after it swapped to move a red ball to its place. Because in this case, we know for sure, only a green ball could replace it, then we can move forward. We know that because if a blue ball was still placed before the green pointer, we should have met it before and placed it to the right. And it cannot be a red neither, because red pointer would have been incremented to the next position. To sum up:
1. Before the green pointer, there are no blue balls!
2. Red pointer cannot points on red balls. It points on next location for a possible red ball to come
However, when swapping and moving blue ball to the right, green ball should not be automatically incremented, because maybe a red ball took its place, and should be moved now to the left (see figures 6, 7, 8 below).

Let’s see it in pictures, with an example, step by step. Here is our array, with its 3 pointers at their initial values.

Figure 1 – Initial array: red and green point to first element, blue to the last
Figure 2 – Swap the 2 at index #0, with the 1 at index #8, then decrement the blue pointer
Figure 3 – Now a 1 is at index #0, so we increment the green pointer until we have a value different from 1
Figure 4 – Swap the 0 at index #3, with the 1 at index 0, then increment the green and the red pointers
Figure 5 – The green pointer moves to the next position not containing a 1.
Figure 6 – Swap the 0 and the 2, then decrement the blue pointer to its new position.
Figure 7 – Pay attention, after a right swap, the green pointer is not incremented, because the new element is not necessarily to its place, as it can be seen here. The 0 needs to be swapped again to the left. See next step.
Figure 8 – Now we can swap the 0 to its right place. Remark how this 0, initially placed passed from the right to the middle, then to the left at its final position. Also see how the green elements (the 1s) are automatically sorted, just by sorting red and blue elements.
Figure 9 – Last 0 here at the green pointer, waiting for its last swap…
Figure 10 – Last swap. We now reached stop condition for this algorithm, when green pointer passes the blue pointer
Figure 11

The code

void sortDutchFlag(int arr[], unsigned int size)
{
	int *left = &arr[0], *mid = &arr[0],*right = &arr[size - 1];

	while(mid <= right) {
		if (*mid == 0) { // 0
			swap(mid, left);
			++left;
			++mid; // after 
		}
		else if(*mid == 1){ // 1
			++mid;
		}
		else { // 2
			swap(mid, right);
			--right;
		}
	}
}

One of this algorithm drawback is it is not stable. Stability in a sorting algorithm means that equivalent elements (with same sorting criteria value) retain their relative positions, after sorting. And here we can clearly see that 2 blue balls will be in inverted order, after the sorting. My question for you. How to improve the algorithm and make it stable? Let me know in the comments…

That’s all for now!! Thanks for reading…See you in next post.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s