Tuesday, June 10, 2008

Erlang: parallelism without modifying code

The Erlang programming language has been getting popular among programmers, for the capability of handling massive number of connections, such as the chat server used in Facebook.

One of the basic ideas of Erlang are that minimizing the side effect of assignments by prohibiting multiple assignments into the same variable. At first this looks a rigid restriction and memory-consuming practice, but once you learn the native list structure of Erlang (mostly the same as in the Lisp language) and the tail-recursion control flow, writing the Erlang code is not too difficult, while retaining the conciseness and the performance.

I've written a code of parallel mapping of a function to a list. The word mapping means applying the same function to each member of the source list and obtaining the results which retains the same sequence as in the source list. If the function does not have side effects (such as changing the values of shared data structures), the operation can be parallelized by splitting the list into the smaller sublists and invoking the mapping process for each sublist.

Erlang has the supporting libraries of invoking a process in a distributed Erlang node running in multiple computers, so the parallel mapping function is the simplest but very powerful tool to experience the collective computing power of parallelism.

The code is available here as a tar archive, of one file of Erlang source code.