Parallel programming in Java (2/4): Ordering the ingredients (Data Race - Mutual Exclusion)

Intro

This article is depending on this course on LinkedIn Learning

This is part 2 out of 4, to scratch the surface of parallel programming in Java, and how we can make use of threads using Java 11+

Part 1: Parallel programming in Java (1/4): In the kitchen (Threading)

In each article, we will have a real-life example to align with the concepts, then we will elaborate on the code.

Ordering the ingredients

Today, we will check the ingredients that we have, and order what we need, we have Chef(A) and Chef(B) in our example, and we have the fridge, and the items list that we need to bring from the market

Data Race

Chef(A) and Chef(B) are checking the list, they need 2 kilos of tomatoes as a stock. To prepare the ketchup, Chef(A) needs to add 1 more kilo, but then he went to answer the phone before replacing the 2 kilos with 3 kilos. Chef(B) needs 2 more kilos to make a red sauce and keep it in the fridge. Chef(B) checks the current value (2 Kilo), then update it (4 Kilo)

Chef(A) is back after the call, removes the current value, and replaces it with (3 Kilos) as it's the last saved value in the memory.

The stock needed is 2, Chef(A) needs 1, and Chef(B) needs 2, so they shall order 5. However, the final result is 3, due to the conflict that happened.

Mutual Exclusion (MutEx)

There has been a huge issue back then, because of the missing 2 kilos, therefore, Chef(A) and Chef(B) decided to have only one pen next to the list, and only one can have access to read and modify at the time. So, if the previous scenario ever happens again, Chef(A) shall take the pen while answering the call, and after the modification, the pen is released for Chef(B) to read and modify.

Since it might be a bit boring for one Chef to wait until the other one releases the pen, they both agreed that they will only hold the pen after deciding what they want, so they hold the pen and change whatever they want, then release the pen.

In the code

This is a very tricky part as it's very hard to detect, because it occurs frequently, not always, so you have to be very focused on this while developing.

Data Race

This occurs when two or more threads are accessing a shared resource. While a specific thread has already read and modified the value, another thread has read the old value and modified it as per its changes, overwriting the changes made by the first thread.

Let's say, we have two threads that are reading and modifying the same value

class ShoppingList extends Thread{
    static int tomatoCount = 0;

    public void run() {
        int loopCount = 10_000_000;

        for (int i = 0; i < loopCount; i++) {
            // This line is doing three things
            // 1- Reading the current tomatoCount
            // 2- Adding 1 to the read value
            // 3- Updating the current tomatoCount
            tomatoCount++;
        }
    }
}

The more the loopCount is, the higher chances of a data race to occur.

public class DataRace {
    public static void main(String[] args) throws InterruptedException {
        // We will make two threads to change the class variable (static variable)
        Thread chefA = new ShoppingList();
        Thread chefB = new ShoppingList();

        chefA.start();
        chefB.start();

        chefA.join();
        chefB.join();
        // Data race occurs here at high values
        System.out.println("The final needed tomato count is: " + ShoppingList.tomatoCount);
    }
}

Mutual Exclusion (MutEx)

This here means that we are protecting the critical section, by applying a lock on it, to prevent two or more threads from accessing and modifying at the same time.

there are multiple ways to achieve this in Java as locks, synchronization and atomic

I. Lock

We will talk about locks more in the next article, but here we are using the (reentrant) lock.

The lock requires using the (lock) method before the critical section, and then using the (unlock) method after we finish

class ShoppingListMutEx extends Thread{
    static int tomatoCount = 0;
    // We define a new variable called pencil to lock the critical section
    static Lock pencil = new ReentrantLock();

    public void run() {
        int loop_count = 5;
        pencil.lock();
        for (int i = 0; i < loop_count; i++) {
            tomatoCount++;
            System.out.println(Thread.currentThread().getName() + " is now thinking.");
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
        pencil.unlock();
    }
}

Now we are protecting the critical section. However, the protceted part is big, and not all of it is critical, as we only need to protect the (read-write) part, so we will modify it as the following

import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class ShoppingListMutEx extends Thread{
    static int tomatoCount = 0;
    // We define a new variable called pencil to lock the critical section
    static Lock pencil = new ReentrantLock();

    public void run() {
        int loop_count = 5;
        for (int i = 0; i < loop_count; i++) {
            pencil.lock();
            tomatoCount++; // Critical section
            pencil.unlock();
            System.out.println(Thread.currentThread().getName() + " is now thinking.");
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}

II. Atomic Variables

For simple operations, we can use atomic variables provided by Java instead of locks, as we have AtomicInteger, AtomicBoolean and others.

import java.util.concurrent.atomic.AtomicInteger;


class ShoppingListAtomic extends Thread{
    static AtomicInteger tomatoCount = new AtomicInteger(0);

    public void run() {

        for (int i = 0; i < 10_000_000; i++) {
            tomatoCount.incrementAndGet(); // Equivalent to tomatoCount++
        }
    }
}

III. Synchronized

Every object has its intrinsic lock associated with it (Related to the class, not the instance)

This keyword is protecting its part by keeping other threads from accessing it when it's used by another thread by locking this intrinsic lock before the method starts and unlocking it after it finishes.

There are two usages of synchronized: using it with a method or a block.

1. Synchronized method

Note that we are adding (static) keyword to the method, to force both threads to use the class intrinsic lock, other than using their own instance intrinsic lock (Removing the static keyword will lead to data race)

class ShoppingListSyncMethod extends Thread{
    static int tomatoCount = 0;

    private static synchronized void addTomato() {
        tomatoCount++;
    }

    public void run() {

        for (int i = 0; i < 10_000_000; i++) {
            addTomato();
        }
    }
}

2. Synchronized block

Blocks are a bit easier to read and use as it requires only wrapping your critical section with synchronized block as follows.

class ShoppingListSyncBlockClass extends Thread{
    static int tomatoCount = 0;

    public void run() {
        for (int i = 0; i < 10_000_000; i++) {
            synchronized (ShoppingListSyncBlockClass.class) {
                tomatoCount++;
            }
        }
    }
}

As a best practice, you shall pass the class in the synchronized args, so the block here works as a static synchronized method. However, you can also pass the object that needs to be protected (It shall be a mutable object and not a primitive type)

The following case results in a data race (Same as the non-static synchronized method)

class ShoppingListSyncBlockClass extends Thread{
    static int tomatoCount = 0;

    public void run() {
        for (int i = 0; i < 10_000_000; i++) {
            synchronized (this) {
                tomatoCount++;
            }
        }
    }
}

The following case results in a data race (As Integer object is immutable, so it always changes)

class ShoppingListSyncBlockClass extends Thread{
    static Integer tomatoCount = 0;

    public void run() {
        for (int i = 0; i < 10_000_000; i++) {
            synchronized (tomatoCount) {
                tomatoCount++;
            }
        }
    }
}

In this case, we are passing the object that needs to be protected, without getting a data race as the object is mutable

class Tomato {
    int count = 0;
}

class ShoppingListSyncBlockObject extends Thread{
    static final Tomato tomato = new Tomato();

    public void run() {
        for (int i = 0; i < 10_000_000; i++) {
            synchronized (tomato) {
                tomato.count++;
            }
        }
    }
}


// Or by using encapsulation 

class Tomato {
    private int count = 0;

    public int getCount() {
        return count;
    }

    public void setCount(int count) {
        this.count = count;
    }
}

class ShoppingListSyncBlockObject extends Thread{
    static final Tomato tomato = new Tomato();

    public void run() {
        for (int i = 0; i < 10_000_000; i++) {
            synchronized (tomato) {
                tomato.setCount(tomato.getCount() + 1);
            }
        }
    }
}

Conclusion

Although working with multiple threads in parallel may seem a bit disturbing or not safe, knowing the basics and how to correctly manipulate them will give you a huge advantage while using them.

You can check the code here on GitHub in the package (data_race):