Why functional languages rock with multi-core
Cores are cheap nowadays. Almost all new computers ship with 2 or more cores, and datacenter machines usually ship with more – 4, 8, 32… Once again the hardware industry has left the software industry behind; if only Moore’s law worked the same way for software…
But there is hope!
Functional programming languages used to be a niche for a small group of programming-language advocates and Emacs fanatics. I, for one, studied functional languages such as Lisp and ML at school, but never thought I’d ever use them in “real life” industry.
Things have started to change, and in my opinion this is largely due to the multi-core / huge datacenter / cloud computing shift in the software industry. The software industry has come to realize a few key points:
- One core cannot handle the load, no matter how strong and fast that core is. In the old days of IBM’s Deep Blue and the supercomputer age, there was hope that as hardware advanced cores would get infinitely stronger, and every other day there would be a new contestant for the fastest, most capable core. In the last 10 or so years, however, we have come to realize that hardware has its limitations, and clock rate is one of them. Cores, at least as far as we can tell today, cannot get faster indefinitely, so we need a different solution: multi-core CPUs. The challenge for the software industry is taking proper advantage of those cores. Programming for a single core used to be easy – you only had to come up with an O(good) algorithm and leave the real-time issues in the hands of the hardware guys. With multiple cores, though, programmers (and not the hardware guys) need to take responsibility for utilizing the cores, and that’s hard. Since increasing CPU clock rate is just not going to do it anymore, we’ve decided to go for the multi-core solution.
- Cloud computing. Cloud computing is here to stay. I won’t talk about the user value of cloud computing – you can find plenty of that in other media – but I will talk about the new challenges it creates for programmers. There are quite a few, most of them about scale: scaling your database, scaling your user sessions, etc. One of the most significant is scaling your algorithm by parallelizing it. With multiple cores, the way to scale an algorithm is to make it run in parallel – and this is hard; this is truly hard. One of the challenges in parallelizing an algorithm is synchronizing effectively over state. Java and other modern languages have built-in constructs to assist with synchronization, such as the synchronized block. This lets you take advantage of multiple threads running on multiple cores of the same CPU, but it has two downsides: one, it doesn’t let you take advantage of a multi-CPU datacenter, and two, it’s very hard to program without creating bottlenecks. In many cases what you’d see is over-synchronization, which results in bottlenecks and poor execution performance (not to mention the actual cost of the JVM entering the synchronized block itself); at the end of the day, your program might actually run faster single-threaded. The problem, just to make it clear, is that current imperative languages such as Java and C++ all keep state. The state lives in their variables, and when you want two or more threads to access (and modify) the same variable – the same state – you need to synchronize them. Getting synchronization correct is hard and often results in bottlenecks.
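A minimal Java sketch of the shared-state problem described above (the class and the numbers are mine, purely illustrative): four threads bump one shared counter, and every increment must pass through the same lock – exactly the kind of serialization point that becomes a bottleneck under load.

```java
// Illustrative only: a shared mutable counter guarded by a lock.
// Correct, but every thread contends for the same monitor, so the
// increments are effectively serialized across all cores.
public class SharedCounter {
    private long count = 0;

    // All threads queue up on this one monitor.
    public synchronized void increment() {
        count++;
    }

    public synchronized long get() {
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        SharedCounter counter = new SharedCounter();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 100_000; j++) {
                    counter.increment(); // serialized: one thread at a time
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        System.out.println(counter.get()); // 400000
    }
}
```

Drop the `synchronized` keywords and the count comes out wrong under contention; keep them and you pay for the lock on every single update – that is the trade-off the paragraph above describes.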
Just to make that clear, when I say multi-core I actually mean two things: multiple cores on the same CPU, i.e. the same physical machine, as well as multiple CPUs (machines) running in a datacenter.
Here’s where functional languages come to our rescue: they don’t keep state! Pure functional languages present only functions, which are pure computation and never keep state. Even in the not-so-pure functional languages, such as Scala, where the language does keep state, programmers are encouraged not to use it and are given the right constructs to use it less, favoring pure functions that simply return a value without modifying state. When you don’t keep state, you don’t need to synchronize state (a bit of synchronization is always needed, but you can keep it to a minimum). Functional languages present pure, stateless computation, and when computation is stateless it’s easy to run it in parallel on different parts of your data.
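To make the stateless-computation point concrete, here is a small sketch in Java’s functional style (the class and function names are mine, not from any library). Because the function touches no shared variable, the runtime is free to evaluate it on any core, in any order, with no locks at all.

```java
import java.util.stream.IntStream;

// Illustrative sketch: a pure function mapped in parallel over a range.
// No shared mutable state means no synchronized blocks are needed.
public class PureParallel {
    // Pure: the result depends only on the input, nothing is modified.
    static long square(long x) {
        return x * x;
    }

    public static void main(String[] args) {
        long sumOfSquares = IntStream.rangeClosed(1, 1_000)
                .parallel()                       // split the range across cores
                .mapToLong(PureParallel::square)  // stateless map step
                .sum();                           // the only "sync" is the final combine
        System.out.println(sumOfSquares); // 333833500
    }
}
```

Contrast this with the shared-counter style: here there is nothing to lock, so adding cores speeds the work up instead of adding contention.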
Now, I’m not saying it’s impossible to run parallel computations in imperative languages – that’s obviously not true. What I am saying is that it’s hard, very hard. In functional languages it’s easier.
It’s no surprise, therefore, that several functional (or semi-functional) languages have recently been on the rise, such as Scala and Erlang, as well as functional-style programming models for imperative languages, such as Google’s Map-Reduce.
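The map-reduce idea mentioned above can be sketched in miniature (this is not Google’s actual API – just the same shape, with names and data of my own choosing): a pure “map” step that splits lines into words, and a “reduce” step that counts occurrences per word, neither of which touches shared mutable state.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

// Toy word count in the map-reduce style: map lines to words,
// then reduce words to per-word counts. Both steps are stateless,
// so the lines can be processed on any core in any order.
public class MiniWordCount {
    public static Map<String, Long> wordCount(String[] lines) {
        return Arrays.stream(lines)
                .parallel()                                          // lines are independent
                .flatMap(line -> Arrays.stream(line.split("\\s+")))  // map: line -> words
                .collect(Collectors.groupingBy(
                        w -> w, Collectors.counting()));             // reduce: word -> count
    }

    public static void main(String[] args) {
        String[] lines = { "to be or not to be", "be fast" };
        System.out.println(wordCount(lines).get("be")); // 3
    }
}
```

In a real datacenter the map and reduce steps run on different machines over huge inputs, but the reason it scales is the same one given above: the map function keeps no state, so the work partitions cleanly.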
Now let’s put things in a historical context. Imperative programming languages such as C++ and Java are what currently drives the software industry, and I think this is about to change. Imperative and functional languages have co-existed for a long time, but the imperative family has had the lead for the past 30 years or so – and why is that? Performance, that’s why! On a single core there’s nothing like good old C for running a fast program. You may agree or disagree on how “pretty” the program is, but you cannot disagree that, performance-wise, the imperatives win, and performance is what counts: it allows for a superb user experience, nice new and complex computation-intensive features, etc. But history moves fast, and things have started to change. What’s becoming more and more mainstream today is cloud computing, with its massive amounts of data and computation. It used to be that programs were written for client-side installation, and clients were single-core machines with ever-growing clock rates. Now two things have happened simultaneously: one, increasing CPU clock rate has hit its limits, so the hardware industry is moving to multiple cores; and two, as networks become faster and more available, cloud computing has risen and presents us with the new challenges of massive amounts of data and massive numbers of users. With these two in place, and with the realization that it’s very hard to use imperative languages for writing parallel programs, the software industry has started its shift towards functional languages.
In the future, increased parallelism rather than clock rate will be the driving force in computing, and for this task functional languages are in the best position to take the lead.