Why Make Erlang a Functional Language?

Why Does Erlang Have Weird Syntax?

I’ve heard the argument many times. People “don’t like Erlang’s syntax so [they] don’t like Erlang.” I, for instance, didn’t understand the block terminator syntax when I was first learning Erlang, so I asked Yariv Sadan about it:

Erlang syntax came from Prolog. The ‘end’ keyword is used to end code blocks that are contained inside the body of a function, and the ‘.’ symbol is used to close top-level function definitions.

Yariv Sadan

But what is the purpose of such a strange syntax? Is Erlang’s syntax really that weird, or is it the single-assignment semantics (SAS) that people don’t like? Erlang’s SAS derives from the functional programming paradigm, which raises the real question: “Why is Erlang a functional language?”

The answer in a nutshell: language features such as Erlang’s message passing incur overhead, so efficiency must be regained elsewhere to compensate. It’s the same concept behind data structures: imposing semantic restrictions on data lets you know more about it. In this case, the data is the program source itself and the imposed restriction is data immutability.

Erlang’s data immutability can be split into two levels: intra-process variable immutability and inter-process data isolation.

Why Doesn’t Erlang Have Shared Data?

Concurrency is the justification for Erlang’s isolated memory model; the alternative is locking to synchronize destructive operations on shared data. Lock complexity makes scaling hard because locks aren’t composable and aren’t tied to the data they protect. If data isolation is enforced between concurrent components (what Erlang calls processes), composability can be maintained because destructive shared data modifications do not occur. Composability, or building instructions with instructions, is an inherent concept of programming and it’s why sequential programs are desirable.

This excerpt from an older java.lang.StringBuffer class illustrates a common locking misconception. The author wanted to compose the length and getChars calls into one atomic operation to ensure critical data wouldn’t be modified between them. The author attempted to compose the calls by wrapping them in a synchronized method, which locks this, not sb. Unfortunately, another thread could still modify sb in between the calls.

public final class StringBuffer {
    public synchronized StringBuffer append(StringBuffer sb) {
        int len = sb.length();
        ... // other threads may change sb.length(),
        ... // so len does not reflect the length of sb
        sb.getChars(0, len, value, count);
        ...
    }
    public synchronized int length() { ... }
    public synchronized void getChars(...) { ... }
    ...
}
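
For comparison, here is a minimal sketch of one way the two calls could be composed atomically with locks. SimpleBuffer is a hypothetical, simplified stand-in written for this illustration, not the real JDK class, and its method signatures are assumptions. The inner synchronized block also takes sb's lock, so no other thread can change sb between length() and getChars().

```java
import java.util.Arrays;

// Hypothetical, simplified stand-in for the excerpt above; not the real JDK class.
final class SimpleBuffer {
    private char[] value = new char[16];
    private int count = 0;

    public synchronized int length() {
        return count;
    }

    // Copies this buffer's first `len` chars into dst, starting at dstBegin.
    public synchronized void getChars(int len, char[] dst, int dstBegin) {
        System.arraycopy(value, 0, dst, dstBegin, len);
    }

    public synchronized SimpleBuffer append(String s) {
        ensureCapacity(count + s.length());
        s.getChars(0, s.length(), value, count);
        count += s.length();
        return this;
    }

    // The method locks `this`; the inner block additionally locks `sb`,
    // so length() and getChars() on sb run as one atomic unit.
    // Caveat: holding two locks can deadlock if two threads run
    // a.append(b) and b.append(a) at the same time.
    public synchronized SimpleBuffer append(SimpleBuffer sb) {
        synchronized (sb) {
            int len = sb.length();
            ensureCapacity(count + len);
            sb.getChars(len, value, count);
            count += len;
        }
        return this;
    }

    // Only called while holding this buffer's lock.
    private void ensureCapacity(int needed) {
        if (needed > value.length) {
            value = Arrays.copyOf(value, Math.max(needed, value.length * 2));
        }
    }

    @Override
    public synchronized String toString() {
        return new String(value, 0, count);
    }
}
```

The fix works, but only by knowing which second lock to take and by accepting a deadlock risk, which is the composability problem in miniature.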

Software transactions and message passing are fundamentally better synchronization abstractions than locks because of, among other things, composability and accuracy, at the cost of greater overhead. STM can be built from message passing and I consider both to be acceptable replacements for lock-based synchronization.
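
To make the message-passing alternative concrete, here is a rough Java sketch of the same buffer owned by a single thread. BufferActor and its Append and Snapshot message types are hypothetical names invented for this illustration, and a BlockingQueue stands in for an Erlang mailbox. Because the owner handles one message at a time, a snapshot's length and contents are always consistent without any explicit lock in the calling code.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: one thread owns the buffer, everyone else sends messages.
final class BufferActor {
    // Messages are plain immutable data.
    interface Message {}

    static final class Append implements Message {
        final String text;
        Append(String text) { this.text = text; }
    }

    static final class Snapshot implements Message {
        final BlockingQueue<String> replyTo;
        Snapshot(BlockingQueue<String> replyTo) { this.replyTo = replyTo; }
    }

    private final BlockingQueue<Message> mailbox = new LinkedBlockingQueue<>();
    private final Thread owner = new Thread(this::loop);

    BufferActor() {
        owner.setDaemon(true); // do not keep the JVM alive
        owner.start();
    }

    // Runs on the owner thread only; `state` never escapes it.
    private void loop() {
        StringBuilder state = new StringBuilder();
        try {
            while (true) {
                Message m = mailbox.take();
                if (m instanceof Append) {
                    state.append(((Append) m).text);
                } else if (m instanceof Snapshot) {
                    // The reply is an immutable String copy of the state.
                    ((Snapshot) m).replyTo.add(state.toString());
                }
            }
        } catch (InterruptedException e) {
            // stop() was called; let the thread end.
        }
    }

    void append(String text) {
        mailbox.add(new Append(text));
    }

    // Sends a Snapshot request and blocks until the owner replies.
    String snapshot() {
        BlockingQueue<String> reply = new LinkedBlockingQueue<>();
        mailbox.add(new Snapshot(reply));
        try {
            return reply.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for reply", e);
        }
    }

    void stop() {
        owner.interrupt();
    }
}
```

A caller would write something like `actor.append("foo"); actor.append("bar"); String s = actor.snapshot();`. The extra queue hops are the overhead mentioned above.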

Why Does Erlang Have Data Immutability?

It seems that inter-process data isolation would be enough for a clean synchronization scheme, yet this is not entirely true. Applying immutability to local, sequential data as well makes the model more complete and brings further benefits:

  • Immutability efficiency gains compensate for message passing overhead.

    Immutability of data can, in many cases, lead to execution efficiency in allowing the compiler to make assumptions that are unsafe in an imperative language.

    Wikipedia

    To reiterate what I said earlier, the restriction of data immutability allows the compiler to know more about the program and as a result, regain some efficiency.

  • Immutability simplifies type policies for message passing.

    In Scala, [an OOP language with message passing,] you can send between actors pointers to mutable objects. This is the classic recipe for race conditions, and it leaves you just where you started: having to ensure synchronized access to shared memory.

    Yariv Sadan

    To expand on the type policy issue a bit more, it’s simpler to make all data immutable, both local and shared. This reduces the need for special cases intended to disallow sending pointers, or data structures containing pointers, through message passing. This congruency keeps message passing at the core of Erlang’s semantic focus.

  • Immutability makes Garbage Collection simpler, faster, and soft real-time.

    The generational [Erlang] collector is simpler than in some languages, because there’s no way to have an older generation pointing to data in a younger generation (remember, you can’t destructively modify a list or tuple in Erlang).

    James Hague

    Again, more restrictions open up more ways to gain efficiency.
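
The type-policy point in the second bullet can be sketched in Java. ImmutableMessage and SharingDemo are hypothetical names made up for this illustration, and the "send" methods only stand in for handing a value to another actor. Sending a reference to a mutable list leaves sender and receiver sharing state, while an immutable message is unaffected by what the sender does afterwards.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical message type: the field is final and the contents are copied.
final class ImmutableMessage {
    private final List<String> items;

    ImmutableMessage(List<String> items) {
        // Defensive copy wrapped in an unmodifiable view.
        this.items = Collections.unmodifiableList(new ArrayList<>(items));
    }

    List<String> items() {
        return items;
    }
}

final class SharingDemo {
    // "Sends" the list by reference, as a system with mutable messages allows.
    static List<String> sendByReference(List<String> payload) {
        return payload;
    }

    // "Sends" an immutable copy instead.
    static ImmutableMessage sendImmutable(List<String> payload) {
        return new ImmutableMessage(payload);
    }

    public static void main(String[] args) {
        List<String> senderState = new ArrayList<>();
        senderState.add("a");

        List<String> shared = sendByReference(senderState);
        ImmutableMessage isolated = sendImmutable(senderState);

        senderState.add("b"); // the sender keeps mutating after the send

        System.out.println(shared.size());           // prints 2: receiver sees the change
        System.out.println(isolated.items().size()); // prints 1: receiver is isolated
    }
}
```

In Java this discipline is opt-in for every message type. In Erlang there is nothing to opt into because no term can be modified in place.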

Downsides to Erlang’s Non-destructive Memory Model

Many common algorithms are designed with destructive memory in mind. As a result, Erlang doesn’t have some of the same types of data structures and libraries as imperative languages. This, combined with the confusion of learning a completely new programming paradigm, can be a large deterrent to learning functional languages. But the libraries Erlang lacks are largely made up for by an extensive actor-based library platform.

Virtually all “big” Erlang libraries use Erlang’s features concurrency and fault tolerance. In the Erlang ecosystem, you can get web servers, database connection pools, XMPP servers, database servers, all of which use Erlang’s lightweight concurrency, fault tolerance, etc.
Yariv Sadan

Erlang is clearly a domain-specific language, focusing on problems that can be solved with parallelism. Erlang’s libraries favor this same domain and Erlang holds its own against even C in the targeted many-thread/process arena. Note the “thread-ring” test on the last line of the chart:

[Chart: note the thread-ring test]

An Erlang web server was also compared to Apache as a more practical comparison (again, intentionally a problem in Erlang’s domain). “Apache dies at about 4,000 parallel sessions. Yaws is still functioning at over 80,000 parallel connections.” So, Erlang may be hard to use because of its lack of conventional libraries or its different syntax, but as I’ve hopefully shown, the differences were chosen by design with good reason. Erlang is clearly one of the strongest languages in its parallelism domain and, after a little over 20 years, is a very mature language.

  • EStau
    nice
  • Luke, I'm just curious, do you happen to live near Antwerp (Belgium) ?

    The name 'Hoersten' was mentioned during one of my employer's recent gatherings and I was curious if it's you. You know, the world's a small place sometimes. :-)
  • That is not me though my family is from Germany. If we've come as far as the US I imagine we've spread to Belgium as well.
  • Tony Arcieri, creator of Reia, a Python-like scripting language built on the Erlang VM, has written a great article about single assignment variable myths. He definitely clears up a lot of things I had trouble explaining in my article. I don't have any argument against multiple assignment variables, only pointers which are pointers.
  • Banador
    Your site has a beautiful layout and it's readable. Thanks also for the information!

    Too bad Erlang sucks in the shootout. It gives some wrong signals about the language to newbies.
  • Thanks for the comment. Unfortunately, I know what you mean. Programmers seem to be conditioned to look for a catch-all language and languages that fit a specific domain are easily cast aside for the wrong reasons.
  • Greg M
    Nice article, the core point is very sound, but a couple of minor flaws: Firstly the syntax of Erlang comes largely from Prolog as you said, but Prolog is not a functional language (and one might argue that putting so much syntax from a logic-programming language into a functional language is a big part of what people are complaining about with Erlang syntax). And secondly I don't believe that too much work has gone into using the immutability properties of Erlang code as a basis for compile-time optimization, although in theory it's certainly possible. Unfortunately Erlang only has the nice properties of functional languages at the smallest scale, the explicit fine-grained concurrency makes it harder to reason about on a larger scale.
  • I agree with you on both points. The first one I've addressed countless times already in response to other comments.

    The second point, I believe, is partly why Erlang has been so popular in industry: it's a small and simple language. Simple fine-grained explicit concurrency works well and composes well enough with a very shallow learning curve. This was done by focusing on a smaller problem set and cutting some of the functional features available in academic languages. Implicit concurrency, for instance, is still very academic and doesn't have the pragmatic efficiencies that I tried to highlight in the conclusion of my post. There are already many fully-featured functional languages out there and you see how well they've done in industry. Point taken, though.

    Most functional languages start with abstraction and cut away until the hardware level is reached. Imperative languages tend to start with hardware and add abstraction. Erlang is one of the few functional languages that took the lessons of the highly abstracted functional languages and made it machine aware. Hybrids are the key and are why languages like Python and Ruby are becoming so popular.
  • Is this the library you mentioned in our conversation? I am pretty familiar with these modules--I even used a couple in my final project (pg stands out). Unfortunately there is nothing in here that I saw could be forced into a priority queue.

    I seem to remember going through these modules with you to see if anything would work. It was so long ago so I may remember wrong :)
  • There is no direct implementation of a priority queue in the Erlang stdlibs as far as I know but it has everything you need to build one. It's pretty reminiscent of C++'s stdlibs with sets, queues, trees, and dictionaries. Googling "erlang heap" turned up a heap sort which looks comparable to any imperative language. You wanted something pre-made though, correct?