Embracing functional programming in C# - Part 5

In this concluding post, we will delve into how functional programming adeptly addresses the challenges of concurrency.

Why is concurrency crucial?

  • Concurrency refers to the ability of a system to have multiple tasks in progress during overlapping periods of time. The tasks do not have to run at the same instant; they may simply be interleaved.
  • Parallelism refers to the truly simultaneous execution of multiple tasks, typically by breaking a piece of work into subtasks that run at the same time on different processors or cores.

Parallelism requires multiple processors or cores, while concurrency can exist on a single core and simply benefits from having more of them. In both cases the aim is to improve overall performance and responsiveness. These concepts are not new, and their value has been recognized for a long time. In recent years, however, they have gained prominence, largely driven by changes in hardware.

Hardware evolution is also important: CPUs aren't getting faster at the same pace as before, so hardware manufacturers are moving toward combining multiple processors. Parallelization is becoming the main road to computing speed, so there's a need to write programs that can be parallelized well.
Functional Programming in C# (Buonanno)

What does the term "parallelize well" signify?

If we execute our code in parallel, the outcome may vary. If the parallel execution consistently produces the same results as the sequential one, we say that our code "parallelizes well". If it leads to unexpected or erratic behavior, we conclude that it does not. Whether code parallelizes well depends on how it is structured, and in particular on whether several threads can run it at the same time without interfering with each other.
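One way to make this notion concrete is to run the same computation sequentially and in parallel and compare the results. The sketch below assumes PLINQ (AsParallel() from System.Linq) as the parallelization mechanism; the class ParallelizesWellCheck and the helper GivesSameResults are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Linq;

public static class ParallelizesWellCheck
{
    // Hypothetical helper: applies f to 1..count sequentially and in parallel,
    // then compares the two result sets. Both are sorted first because a
    // parallel query does not guarantee the order of its output.
    public static bool GivesSameResults(Func<int, int> f, int count)
    {
        var sequential = Enumerable.Range(1, count).Select(f).OrderBy(v => v).ToArray();
        var parallel = Enumerable.Range(1, count).AsParallel().Select(f).OrderBy(v => v).ToArray();
        return sequential.SequenceEqual(parallel);
    }

    public static void Main()
    {
        // Squaring depends only on its argument, so both runs agree.
        Console.WriteLine(GivesSameResults(x => x * x, 1000)); // True
    }
}
```

A check like this can only reveal a problem, not rule one out: concurrency bugs are nondeterministic, so code that interferes with itself may still pass on a given run.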

Pure functions parallelize well

Consider the following code in which we attempt to calculate the sum of the first 1000 natural numbers, each multiplied by 2.

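using System;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        var list = Enumerable.Range(1, 1000);
        var res = list.Select(Double).ToList();

        Console.WriteLine($"The sum is: {res.Sum()}");
    }

    // Pure: the result depends only on x and nothing else is modified.
    static int Double(int x) => 2 * x;
}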

It's evident that 'Double' qualifies as a pure function: its output is determined solely by its input, and it has no side effects. Can we leverage parallelism to boost performance? Absolutely, with just a minor adjustment.

using System;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        var list = Enumerable.Range(1, 1000);
        var res = list.AsParallel().Select(Double).ToList(); // AsParallel()

        Console.WriteLine($"The sum is: {res.Sum()}");
    }

    // Pure: the result depends only on x and nothing else is modified.
    static int Double(int x) => 2 * x;
}

We have added a call to AsParallel(), a PLINQ extension method that asks the runtime to distribute the rest of the query across the available cores. It is a library call, not a compiler directive. The result is still correct.
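One detail worth keeping in mind is that a parallel query does not guarantee the order of its output unless we ask for it. The sum is unaffected because addition does not depend on order. The following is a minimal sketch, not part of the original example: the class PureParallelDemo and its method names are hypothetical, while AsParallel(), AsOrdered() and Sum() are standard PLINQ operators.

```csharp
using System;
using System.Linq;

public static class PureParallelDemo
{
    public static int Double(int x) => 2 * x;

    // Sequential baseline.
    public static int SequentialSum(int count) =>
        Enumerable.Range(1, count).Select(Double).Sum();

    // Parallel version: same pure function, same total.
    public static int ParallelSum(int count) =>
        Enumerable.Range(1, count).AsParallel().Select(Double).Sum();

    // AsOrdered() asks PLINQ to preserve the source order in the output.
    public static int[] ParallelOrdered(int count) =>
        Enumerable.Range(1, count).AsParallel().AsOrdered().Select(Double).ToArray();

    public static void Main()
    {
        Console.WriteLine(SequentialSum(1000)); // 1001000
        Console.WriteLine(ParallelSum(1000));   // 1001000
        Console.WriteLine(string.Join(", ", ParallelOrdered(5))); // 2, 4, 6, 8, 10
    }
}
```

For a workload this small, the parallel version is unlikely to be faster, since splitting and merging the work has its own cost. The point here is correctness, not speed.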

Hence, with pure functions, a straightforward adjustment allows us to parallelize the process without altering the outcome. What about impure functions?

Impure functions do not parallelize well

Consider the following code.

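using System;
using System.Linq;

class Program
{
    static int sum = 0;

    static void Main(string[] args)
    {
        var list = Enumerable.Range(1, 1000);
        var res = list.Select(Double).ToList(); // ToList() forces the query to run

        Console.WriteLine($"The sum is: {sum}");
    }

    // Impure: reads and modifies the shared static field sum.
    static int Double(int x) { sum = sum + 2 * x; return sum; }
}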

The objective remains the same: calculate the sum of all integers from 1 to 1000, each multiplied by 2. Run sequentially, this code returns the correct result, even if it is more obscure than the previous version.

In this scenario, however, 'Double' is not a pure function: its output depends not only on its input but also on the static variable sum, and each call modifies that variable as a side effect. What would be the consequences if we tried to parallelize it?

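using System;
using System.Linq;

class Program
{
    static int sum = 0;

    static void Main(string[] args)
    {
        var list = Enumerable.Range(1, 1000);
        var res = list.AsParallel().Select(Double).ToList(); // several threads now call Double

        Console.WriteLine($"The sum is: {sum}");
    }

    // Impure: the read-modify-write on sum is not atomic, so updates can be lost.
    static int Double(int x) { sum = sum + 2 * x; return sum; }
}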

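We can observe that the result is incorrect: the printed value is typically lower than the expected 1,001,000 and can change from one run to the next. The statement sum = sum + 2 * x is a read followed by a write, so two threads can read the same value of sum, and one of the two updates is then lost.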

Parallelizing functions with side effects may thus result in race conditions, data corruption, or other concurrency-related issues. It is therefore crucial to exercise caution with such functions: the order of execution and the management of shared resources become critical factors for the correctness and reliability of the program.
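There are two common ways to repair code like this, sketched below under stated assumptions. The first keeps the shared variable but makes each update atomic with Interlocked.Add. The second removes the shared variable altogether and lets PLINQ combine the partial results. The class SharedStateDemo and its method names are hypothetical and not part of the original example.

```csharp
using System;
using System.Linq;
using System.Threading;

public static class SharedStateDemo
{
    private static int sum = 0;

    // Still impure, but each update is atomic, so no addition is lost.
    public static int SumWithInterlocked(int count)
    {
        sum = 0;
        Enumerable.Range(1, count)
                  .AsParallel()
                  .ForAll(x => Interlocked.Add(ref sum, 2 * x));
        return sum;
    }

    // No shared state: each element is transformed by a pure lambda
    // and PLINQ's Sum() merges the per-thread partial results.
    public static int SumWithoutSharedState(int count) =>
        Enumerable.Range(1, count).AsParallel().Select(x => 2 * x).Sum();

    public static void Main()
    {
        Console.WriteLine(SumWithInterlocked(1000));    // 1001000
        Console.WriteLine(SumWithoutSharedState(1000)); // 1001000
    }
}
```

The second form is usually preferable: there is nothing to synchronize, so the threads never contend for the same variable.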

Conclusion

For our code to benefit from new advancements, especially parallelization, we should write as much of it as possible as pure functions. This approach is foundational in functional languages and should be rigorously applied in languages like C#, clearly separating the pure parts of our applications from the impure parts that carry side effects.
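As an illustration of this separation, here is a minimal sketch in which the computation is pure and the only side effect, writing to the console, is confined to Main. The class PureCoreImpureShell and the method SumOfDoubles are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PureCoreImpureShell
{
    // Pure core: no shared state, no I/O. Safe to run in parallel.
    public static int Double(int x) => 2 * x;

    public static int SumOfDoubles(IEnumerable<int> numbers) =>
        numbers.AsParallel().Select(Double).Sum();

    // Impure shell: the only place that performs a side effect (console output).
    public static void Main()
    {
        var total = SumOfDoubles(Enumerable.Range(1, 1000));
        Console.WriteLine($"The sum is: {total}"); // The sum is: 1001000
    }
}
```

Because the pure part takes its input as a parameter and returns its result, it can be tested and parallelized without touching the code that performs I/O.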

Final thoughts

If you wish to delve deeper into this topic, I recommend the following book, which covers all the concepts emphasized in this series and goes on to more advanced ones (monads, core patterns, lazy computation, asynchrony and so forth).

Functional Programming in C# (Buonanno)

Do not hesitate to contact me should you require further information.