Why Erlang Matters (sameroom.io)
423 points by _oakland on April 7, 2016 | 207 comments



AFAIK, Erlang is still (as of 2016) the only distributed actor model implementation with a preemptive scheduler.

All other major implementations (including Akka) use cooperative scheduling, i.e. they forbid blocking code in actors. Erlang allows it. This is huge.

And actor supervision is the best way to write reliable systems. I wrote some code in Akka without much effort or testing (streaming market data aggregation), and it is still running with a few years of uptime.


This unfortunately breaks down with (bad acting) NIFs. Thankfully you can mark 'em as the dirty evil little things that they are (with negligible overhead): ERL_NIF_DIRTY_JOB_CPU_BOUND. [1]

I implore anyone interested in Erlang or its surrounding languages to read its source code. [2] More specifically, the BEAM. I'll warn you that it's very 80's hackeresque, but in a good way. Incredibly pragmatic. The way they achieve their world-class scheduling? Tip: not via a perfect theoretical design cooked up in some comp-sci lab. [3] If you've been following game engine design within the last 9 years or so, you'll probably have a good idea of how they do it. [4][5]

[1]: https://medium.com/@jlouis666/erlang-dirty-scheduler-overhea...

[2]: https://github.com/erlang/otp/tree/maint/erts/emulator/beam

[3]: https://hamidreza-s.github.io/erlang/scheduling/real-time/pr...

[4]: http://blog.molecular-matters.com/2012/04/05/building-a-load...

[5]: There's a hell of a lot of depth to this problem domain. Charles Bloom's guiding principles for Oodle are spot on, and well worth a look.


> I'll warn you that it's very 80's hackeresque, but in a good way. Incredibly pragmatic.

I consider the BEAM VM as one of the marvels of software engineering.

You know it is good when you explain to other programmers that you can have something like an isolated memory process, just like an OS process, with preemption and only a few Ks of memory, with a low-latency GC, with distribution across machines built in -- and they don't believe you.

Even experienced senior developers are skeptical, saying stuff like "well yeah, but those are green threads then and they have to yield explicitly" -- nope, they don't; "so you have callbacks then somehow" -- no, not callbacks; "what do you mean separate memory spaces, it is a single OS process, right?" and so on. It sounds like magic -- this stuff shouldn't exist, right? But the awesome thing is, it does.

Moreover, it is not a hacked-up version from a lab someplace or a PhD dissertation proof of concept -- this is what powers banking, databases, messaging, and probably more than 50% of smartphone access to the internet today.


I'll just put this here in case anyone's interested.

https://github.com/mbrock/HBEAM

It's a Haskell executor of .beam programs, in a very early prototype stage, and I abandoned working on it 5 years ago (apparently).

Of course it's not meant to be competitive in any way, it's basically just for fun, and because I wanted to learn more about how Erlang works.

The function `interpret1` in the middle of this file has the main opcode switch.

https://github.com/mbrock/HBEAM/blob/master/src/Language/Erl...

I think it's about the smallest subset needed to run a factorial program, but also to implement the very basics of mailboxes with send/receive/timeout.

It uses GHC's software transactional memory for the mailboxes:

https://github.com/mbrock/HBEAM/blob/master/src/Language/Erl...

Someone, fork it and finish it! :-)


I'm interested. How does the preemptive scheduling work with blocking system calls? If an Erlang process tries to read from standard input using the read syscall (suppose we haven't set non-blocking mode on the fd), why does it not block the scheduler that it runs on? Or does Erlang implement its own set of syscall wrappers that use epoll under the hood?


There is an IO thread pool for some blocking operations like file IO, and there is also epoll for sockets, for example if I see this in the prompt on my laptop:

   $ erl
   Erlang/OTP 18  ... [smp:4:4] [async-threads:10] ...
The async-threads indicates it has started 10 IO threads. So if a process needs to read from a file on a slow disk, it will dispatch that request to one of those threads and then it will be descheduled (put to sleep).

The smp:4:4 says there are 4 schedulers configured and enabled. Usually a scheduler (an OS thread) runs on each core (also highly configurable, with custom topologies, affinities, etc.). Those schedulers pick and run Erlang processes.

Typically, to ensure fairness, each process is allowed to run a set number of reductions (think of them as roughly equivalent to bytecode instructions; even an internal C driver like a regex parser should periodically yield and report that it consumed a given number of reductions, so it can be descheduled).

Funny enough all that sounds a bit like an operating system, and that's a good way to conceptualize it. Erlang/OTP is like an OS for your code. A modern OS is expected to be resilient against bad processes messing with other processes' memory, multiple applications should run and be descheduled preemptively etc.


You've got it: processes are not allowed to run blocking syscalls under the hood. Transparently that work is handled by backing IO threads.


I'm interested as well! How does it work then?

(1) How does it preempt threads (my guess: it doesn't actually preempt threads; the interpreter yields after a certain number of instructions, or (if compiled) the compiler inserts conditional yields in each loop/function call/return)?

(2) How are memory spaces isolated (my guess: they aren't really, it's just that the memory allocator doesn't mix memory allocated by different threads)?


(1) In the normal case it just counts a certain number of VM instructions each process runs before it gets rescheduled. But it gets interesting with internal modules or C modules (for example, the regex matcher): in that case the C module, as it works through the data, periodically reports that it consumed some number of reductions and is possibly told to yield now. (On that note: in 19.0 we'll have dirty schedulers by default, so blocking long-running C code will be handled much better.)

(2) An Erlang VM instance (called a node) is an OS process (plus some helper processes, but they are not important in this case), and is of course one heap from the kernel's point of view. But the internal allocator keeps the spaces separated for BEAM's data. It is not always that basic: in some cases, for binary blocks and sub-blocks, it can actually share and ref-count them.

In the new release I like that a process's mailbox can live outside its main heap. That could be an interesting parameter to play with.


> (1) How does it preempt threads (my guess: it doesn't actually preempt threads; the interpreter yields after a certain number of instructions, or (if compiled) the compiler inserts conditional yields in each loop/function call/return)?

Yep. The interpreter lets each thread do 2000 reductions (roughly == function calls), or until it waits for new messages if that's sooner.
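The counting scheme described above can be sketched as a toy round-robin scheduler. This is an illustration only, not the BEAM's actual code: the `process`, `runSlice`, and `schedule` names are hypothetical, and the budget of 2 instructions stands in for the real reduction count (around 2000).

```go
package main

import "fmt"

// A toy "process": how far it has run (pc) and how long its program is.
type process struct {
	id     byte
	pc     int
	length int
}

const budget = 2 // reductions per slice; BEAM's is on the order of 2000

// runSlice executes up to `budget` instructions, appending the process id
// to the trace for each one. Returns true when the program has finished.
func runSlice(p *process, trace *[]byte) bool {
	for r := 0; r < budget && p.pc < p.length; r++ {
		*trace = append(*trace, '0'+p.id)
		p.pc++
	}
	return p.pc >= p.length
}

// schedule round-robins over the run queue: a process that exhausts its
// budget is preempted and sent to the back of the queue.
func schedule(queue []*process) string {
	var trace []byte
	for len(queue) > 0 {
		p := queue[0]
		queue = queue[1:]
		if !runSlice(p, &trace) {
			queue = append(queue, p)
		}
	}
	return string(trace)
}

func main() {
	a := &process{id: 1, length: 3}
	b := &process{id: 2, length: 3}
	fmt.Println(schedule([]*process{a, b})) // prints "112212"
}
```

The trace shows the interleaving: each process runs two "reductions", gets preempted, and finishes on its second turn. The process never opts in; the budget check happens on every instruction, which is the essence of the Erlang approach.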


> This unfortunately breaks down with (bad acting) NIFs.

Where possible, I think the Erlangy thing to do would be to just have a separate executable and communicate with that. Also ensures that things keep running if there's a segfault or something.


Exactly! Well, kinda.

You can always throw the NIFs onto other nodes. You can also write ports that are glorified external processes.

Other times, you just need low-latency and high throughput. In that case, you expect failure and design accordingly.


     (bad acting) NIFs. Thankfully you can mark 'em as the dirty evil little things that they are (with negligible overhead): 
As a side issue, is there an easy method to determine if a NIF is problematic in this regard? I've used jiffy[0] in several codebases, but I keep reading these warnings and wondering whether I should be doing so.

[0] https://github.com/davisp/jiffy


The warnings are mostly about NIFs you write yourself, which you typically avoid if possible. And jiffy itself goes quite a long way to cooperate with Erlang VM's internals (at least so I've heard).


There is one other. Haskell has green threads and a preemptive scheduler, and it has a pretty decent implementation of Erlang-inspired multi-node concurrency primitives and a higher-level framework (including supervisors, a gen_server equivalent, etc.) in the Cloud Haskell project. It does NOT have Erlang's deployment base and track record, but it is still a very promising framework, and very appealing if you like Haskell's type system.

http://haskell-distributed.github.io/


Just a nitpick, but Haskell's scheduler (like Go's) is "less preemptive" than Erlang's. If you have e.g.

    i := 0
    for {
        i += 1
    }
or equivalent in Go/Haskell, then the HS thread/goroutine running it cannot be interrupted, as interruption can only take place at certain points (e.g. allocation or function calls).

Erlang on the other hand assigns a certain amount of time units to each process, and each pure-Erlang operation consumes these, so an Erlang process can always be interrupted as it will eventually use up this time allocation and yield (unless it's actually calling a C function etc).


Haskell's scheduler preempts on memory allocation. Memory allocation is ubiquitous in Haskell; the example you are giving cannot really be written in Haskell as values are immutable. You could try to do something with IORef and bang patterns to avoid thunks being allocated but you are now in a very tiny corner case, much smaller than the surface area of NIFs that can cause you problems in both Erlang and Haskell.


Here's a minimal example:

  package main

  import "fmt"

  func main() {
    go hog()
    for i := 0; ; i++ {
      fmt.Println(i)
    }
  }

  func hog() {
    for {
    }
  }
It stalls after about three seconds on Go version 1.6.

Here's the issue on github:

https://github.com/golang/go/issues/543


Here's a modified version that doesn't stall:

  package main

  import "fmt"
  import "runtime"

  func main() {
    go hog()
    for i := 0; ; i++ {
      fmt.Println(i)
    }
  }

  func hog() {
    for {
      runtime.Gosched()
    }
  }


Admittedly, Haskell is less preemptive than Erlang, but definitely not at the same level as Go. A mutating variable is not what you commonly use in Haskell, so the idiomatic equivalent of your code would have memory accesses anyway, and it would be possible to interrupt it.


Haskell's scheduler switches at memory allocation.

When your CPU-intensive code does not allocate, it will not switch to another green thread.

So the point of the example above is the absence of memory allocation. And, with a decent code generator, Haskell could produce code like the above where registers are reused for new values (effectively, mutation).


Yes, an Erlang process can be interrupted at any point, unlike Go/Haskell, but this is at the cost of bytecode interpretation. I'm not saying it is bad, but there is a tradeoff.


I've been confused for some time as to why people get excited about green threads. From what I've read, the main advantage seems to be that you can have threads on hardware that doesn't support threads natively, which is cool if you're on that kind of hardware. There are also some spin-up advantages, I guess? But they don't get load-balanced across cores, right?

I feel like I'm missing something important.


Green threads aren't just a substitute for when you don't have system level threads. Instead, they're a way of structuring code that allows you to express highly concurrent programs without requiring the heavy overhead of launching and switching between operating system threads.

Linux switches between threads at some frequency; I think it used to be 100 Hz. It involves swapping out the process registers, doing some kernel bookkeeping, etc. -- this is called a "context switch" and it's quite costly. Also, Linux threads allocate at least one memory page (4 KB) for the stack. [If I'm wrong about these details, please correct me!]

Basically, the cost associated with an operating system thread comes from the fact that it has to be isolated from other system threads at a low level... whereas language runtimes that offer green threads impose their own safety via language construction, e.g., Erlang processes can't reach in and mess with other processes' memory (without C hacks).

So green threads can be much more efficient, but they require some care in the implementation, especially to support I/O, and to have fair and efficient load balancing, etc. Then you run N operating system threads to get balancing across cores, and distribute green thread work.


The advantage is you can have many more green threads than you could have OS threads (hundreds of thousands vs thousands), due to green threads being more lightweight. This allows a programming model based on message passing between green threads, which many people consider nicer to reason about. E.g. for a web server you could have a goroutine/Erlang process for each client with 50k clients and still have excellent performance, whereas if you had 50k OS threads you'd likely suffer performance issues and use a heap more RAM.


And this is a huge benefit in the ease of coding.

If you only have two or four or sixteen threads you write overtly threaded code. But when you have millions you don't. Your program looks single-threaded and yet operates better and more safely.


One advantage is that spinning up new green threads can be very quick. Starting a new kernel thread requires at least one syscall.

For example: in a network service, you have one thread listening for new connections; when a new connection is made, it starts a new thread, which calls the handler. The listener thread then goes back to listening for new connections.

Now, the advantages can depend on your green threading implementation. If a connection handler blocks on reading from disk or a DB, the listener thread can still wait for new connections and the other connection handlers can still operate, keeping your network application responsive without adding the latency of clone syscalls.

Of course you can achieve this in other ways.


First, you can run TONS of them, which is an enabler for program designs that native threads don't work well with.

Second, they are much lighter on memory (well, that comes with the first point, but still).

Third, the supervising VM/environment has more fine-grained control over them than with native threads.

And in any decent implementation, they absolutely are load-balanced across cores -- why wouldn't they be?


Specific to Erlang processes -

As others have indicated, they're extremely lightweight, cheap to create and throw away, and are load balanced across cores.

But also, each has isolated memory, with share nothing semantics (with a couple caveats) which means that an exception in one won't affect others -unless you want it to-. That's huge.

But as has also been mentioned, you can create many of them. Someone else threw out 50k; nevermind that, try a million of them on a single box. That kind of concurrency opens up an entirely new paradigm of coding. One that is actually very useful, because it turns out, a lot of problems are naturally concurrent problems, that we've been trained to think about in sequential patterns because of how hard concurrency is.

An example I like to give is from the real world - task scheduling. We had to write some simple task scheduling for an application. Each task was multiple steps, many of which were time based (i.e., "execute this command, wait X amount of time, execute another command, wait Y amount of time, execute a third command, once that is successful execute a fourth command"). The traditional way of doing this would be some sort of priority queue, with tasks weighted by how long from now until they were to be done. You check how long until the next event, sleep until then, fire it, then repeat. Simple, right?

Except... each event leads to more events. And event timings can change. And events can happen simultaneously, so you need a pool of threads to execute the events on. Locks everywhere. Task logic is very hard to isolate from the execution logic (i.e., the bit that says "do X, and create an event to execute at time Y" is hard to keep entirely separate from the bit that says "pull event from priority queue, throw it to a new thread to execute on, sleep until next event"), since there are so many interactions between the two that can affect one another, changing when events happen and when the queue puller needs to wake up.

In Erlang, though? Trivial. Write your entire task as a single job. I.e., do x, wait, do y, wait, do z, wait for a message that z has completed, do a, etc. Then spin up one of those for each task that you need and let the VM handle the concurrency aspects of it. Even additional complexity, like "in the event of a message, change the amount of time until the next event to be half of what it was", is trivial; it's all contained in the same module, and it all describes the lifecycle of a single task. All the concurrency -- the running of many of those tasks, their interactions, ensuring none is blocked -- is -free-.

This sounds like an obvious, ideal example once explained, and yet every person where I worked who was unfamiliar with Erlang (and even some of those who had coded a little in it, but hadn't grasped the paradigm as well), when told what we were trying to do, described it as "easy, we just need to use a priority queue and pull from it!"


I think it's not so much green threads as very many threads. And then they're not so much threads as independently threadable parts of the code.

You don't have to intentionally write a work queue and balance the number of readers vs writers, etc. You just let the runtime make things go as fast as possible.

They do get balanced across cores in almost all cases. Erlang is a functional language which really lends itself to this.


+1.

I think a preemptive scheduling model is really the only way to implement that model effectively. Any other scheduling method eventually causes the consistency of the programming model to break down as it's now pushing the cognitive burden about what to run, where, and when onto me as the developer again.


> AFAIK, Erlang is still (as of 2016) the only distributed actor model implementation with preemptive scheduler.

It is one of the open issues with Golang.

https://github.com/golang/go/issues/11462

User aclements notes:

> @GoranP, you can think of the Go scheduler as being partially preemptive. It's by no means fully cooperative, since user code generally has no control over scheduling points, but it's also not able to preempt at arbitrary points. Currently, the general rule is that it can preempt at function calls, since at that point there's very little state to save or restore, which makes preemption very fast and dramatically simplifies some aspects of garbage collection. Your loop happens to contain none of these controlled preemption points, which is why the scheduler can't preempt it, but as @davecheney mentioned, this is fairly uncommon in real code (though by no means unheard of). We would like to detect loops like this and insert preemption points, but that work just hasn't happened yet.


And goroutines are not distributed, while actors can be transparently deployed across remote physical nodes, both in Akka and Erlang.


How does this work with state? Isn't it inefficient to transfer the state of the actor to another node compared to just letting the actor run on the same machine?


The API only defines a way to start a new actor remotely.

Now, each actor, per actor model convention, has to survive restarts. In a distributed environment, the state can be persisted, e.g., in a distributed memory cache / storage (Hazelcast, Cassandra).


Also, coroutines (while also cool; I admire Go and its efforts to bring concurrency to the masses) are not actors. Actors can be persistent and have lifecycle.


It is specifically because of the preemptive scheduling that I find it worthwhile to write code that would otherwise be better suited to a different language in Elixir.

Numerical and statistical stuff, for example, is never going to be performant--but that's okay, because I'd much rather have a whole bunch of slow actors making incremental progress safely than a couple of blazing fireballs that could take down my system using NIFs or whatever.


So is preemptive or cooperative better, in your opinion? I couldn't figure it out from your comment.


Preemptive. Cooperative scheduling runs the risk that a long-running actor might starve the other actors.


Note that "better" means better for some kinds of things. Preemptive scheduling has a cost, just like running a program on top of, say, Linux costs more than if you coded it specifically to run on the chip itself with no OS.


Exactly. In terms of fault tolerance, a buggy actor is isolated in regards to scheduling others. Bad behavior has a more limited scope.

This isn't to say that one can't create a process that eventually has harmful consequences to the entire system, it just means it's much less likely. These little points of isolation add up. Scheduling, memory management, lifecycle, crashes, &c... It's why it's a bit funny when I hear Akka compared. Akka is quite impressive but I'd never consider it a true alternative to an Erlang or BEAM style runtime.


Akka is quite good in terms of reliability (I developed in both Akka and Erlang). All actors in Akka are supervised by default, and restarted if any exception happens.

This allows skipping most of error handling in Akka code. If something's wrong, your actor will be restarted. Works great!


This assumes failure is clean. Some failures are tougher; for example, imagine code going into an infinite loop. In either system you have waste, but in Erlang the scale of the disruption is kept isolated, where Akka would fail to release the thread. Otherwise, I'd definitely agree that Akka handles vanilla failures just fine by using supervision trees.


Cooperative scheduling can work fine so long as the scheduler can detect blocking and pull them out of the run queue. This still counts as cooperative.


Quasar has preemptive scheduling for fibers/actors on the JVM.


Could a function with a body like just

    for(;;){
    }
be preempted?


You could, but we removed time-slice-based preemption a few versions ago b/c it doesn't make sense for fibers if you also have access to heavyweight threads, so now we only preempt on IO/sleep etc. The thing about time-slice preemption is that it just doesn't help with lightweight threads anyway, as your machine can only support a tiny number of execution threads that require time-slice preemption, while you can have hundreds of thousands, or even millions, of lightweight threads. So runtimes like Erlang that don't give you direct access to kernel threads have no choice but to support time-slice preemption, but on the JVM we realized that it serves no purpose, so we took it out.


>now we only preempt on IO/sleep etc

The IO you are talking about - that is I assume IO methods/libraries that are specifically written/adapted for use in Quasar, yes? If you just use an off-the-shelf JDBC provider, you would block the entire thread when one fiber calls out to the database right?

If so, that is not preemption, that is cooperative multi-tasking.


It's not cooperative because the fibers don't explicitly yield control; they are preempted as soon as they perform an operation that does not require use of the CPU.

As to the choice of libraries, that is an artificial distinction. Runtimes like Erlang and Go also require libraries specifically written/adapted for the runtime; calling an off-the-shelf IO library from Erlang/Go would also block the entire thread. As integrating a library with Quasar does not require changing its interface, the question of whether to trap calls to specific implementations (which is trivial on the JVM) or require the use of fiber-friendly implementation (wrappers) is a question of design. So far, we've opted for the latter simply for the sake of "least surprise", but we may choose to do the former, too. Also, unlike Erlang/Go (I believe), any accidental blocking of the kernel thread is automatically detected by Quasar at run time, and reported with a stack-trace.

Quasar operates in the exact same manner as Erlang/Go, only that we've disabled time-slice preemption once we realized it neither helps nor is required on the JVM. The only real difference is in the nature of the ecosystem: while all pure-Erlang libraries are designed to work with fibers, most JVM libraries aren't, and so require a thin integration layer that is provided separately. OTOH, thanks to the size of the JVM ecosystem and the standard interface approach, I believe it is the case that today there are more IO libraries that support Quasar fibers (e.g. all servlet servers) than those supporting Erlang processes.


I'd like to mention Celluloid here as well. Generally I prefer Erlang/Elixir, but in Ruby-centered apps, Celluloid (especially run on the JVM with JRuby) solves a lot of concurrency problems easily.

> All other major implementations (including Akka) have cooperative scheduling, i.e. forbidding blocking code in actors. Erlang allows it.

Curious as to what people's thoughts on Celluloid::IO are with respect to solving the problem of blocking code in actors. Celluloid::IO provides an event-driven reactor that works in parallel with the actors. I haven't used it myself, but it seems an interesting approach.


Care to elaborate on 'streaming market data aggregation'?


Nothing too complicated — connecting to various FX liquidity providers (LMAX, Currenex, CFH etc.) via FIX protocol and merging their individual offers using VWAP (Volume weighted average price).
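For readers unfamiliar with the term, VWAP is just sum(price × volume) / sum(volume) over the offers being merged. A minimal sketch in Go (the `quote` type, `vwap` function, and the sample numbers are all hypothetical, not from the commenter's system):

```go
package main

import "fmt"

// quote is one liquidity provider's offer: a price for some volume.
type quote struct {
	price  float64
	volume float64
}

// vwap merges offers into a volume-weighted average price:
// sum(price*volume) / sum(volume).
func vwap(quotes []quote) float64 {
	var pv, v float64
	for _, q := range quotes {
		pv += q.price * q.volume
		v += q.volume
	}
	if v == 0 {
		return 0
	}
	return pv / v
}

func main() {
	// Hypothetical EUR/USD offers from three providers.
	offers := []quote{
		{1.1000, 1e6},
		{1.1002, 2e6},
		{1.1005, 1e6},
	}
	fmt.Printf("%.6f\n", vwap(offers)) // about 1.100225
}
```

In the actor setup the commenter describes, each provider connection would feed quotes to an aggregator process that recomputes this on every update.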


Cloud Haskell allows blocking code in actors.


> And actor supervision is the best way to write reliable systems.

Need some citation there, because we've been writing extremely reliable systems (multiple nines) for decades in various languages (mostly C, C++ and Java).

All these languages (including Erlang) come with pros and cons to write such systems, but so far, Erlang has failed to deliver on most of the promises that its advocates repeatedly make.


Please don't do programming language flamewars on HN. By that I mean please don't make sweeping statements that are inflammatory and contain no actual information.


> but so far, Erlang has failed to deliver on most of the promises that its advocates repeatedly make.

Speaking of citations...


Well, Erlang is a niche programming language; isn't that evidence enough that all these claims that Erlang will save us from multithreaded hell are vastly overblown?

We're doing multithreaded programming just fine in a lot of very varied languages.


Facebook, Whatsapp, almost every phone call you make for decades, and much of the backend for League of Legends were all built on Erlang.

Not that the others are unsuccessful from an engineering or financial standpoint, but the last one, Riot Games' League of Legends, recently hit 1 billion in revenue. It's a fact that it's the most popular video game on Earth today.

If it didn't work or "failed to deliver on its promises", none of these world-class, top-tier services would be where they are. To repeat myself so it sinks in: yes, it is a fact that every phone call you make hits Erlang code somewhere.

Because people actually stand in the face of Erlang and spout this nonsense to this day... I'll take the claim one step further and will back it up if challenged as if it's not already self-evident.

Erlang has proven itself better, for a longer period of time, than Java. Many have tried to replace Java. No one, and I mean no one, has succeeded in replacing Erlang.

Only C rivals it for carrying the world on its shoulders.


Erlang was developed for developing telecommunication systems and is still used heavily in that area today. When was the last time you heard of someone unable to make a phone call because a phone network went down?


That would be a powerful argument if you could prove the code path enabling this uses Erlang.

According to Wikipedia, shortly after Armstrong was let go by Ericsson, the company quickly ripped Erlang out of all its products and replaced it with C and C++.


That's not what Wikipedia says at all.

> In 1998 Ericsson announced the AXD301 switch, containing over a million lines of Erlang[...].[8] Shortly thereafter, Ericsson Radio Systems banned the in-house use of Erlang for new products, citing a preference for non-proprietary languages. The ban caused Armstrong and others to leave Ericsson.[9] The implementation was open-sourced at the end of the year.[5] Ericsson eventually lifted the ban; it re-hired Armstrong in 2004.[9][...]

> Erlang has now been adopted by companies worldwide, including Nortel and T-Mobile. Erlang is used in Ericsson’s support nodes, and in GPRS, 3G and LTE mobile networks worldwide.[10]

https://en.wikipedia.org/wiki/Erlang_%28programming_language...


Not to mention

"In 1998 Ericsson announced the AXD301 switch, containing over a million lines of Erlang and reported to achieve a high availability of nine "9"s"

That was the first commercial use of the language. The first commercial use of a 'niche' language, with over a million lines of code, achieved a downtime of just over half a second over 20 years. That's total downtime, too, not just 'unplanned'. And even if you take into account the numbers touted by critics of that quote, of 5 nines...that's still considered world class. For the first damn commercial product.

That's delivering.


You literally have no idea what you're talking about, said the guy who knows several people on the current OTP team at Ericsson.


> Erlang has failed to deliver on most of the promises that its advocates repeatedly make.

Need some citation there,


I am sorry you are being downvoted so much, since your comment is legitimate. Indeed, Erlang is a niche language.

Nonetheless, it powers WhatsApp and Facebook Messenger, with their billions of users. Quite a proof, I'd say.


I'm a very green Erlang noob, but given what I have seen of it, I find articles like this kind of strange. Sure, concurrent programming is difficult and we need to think hard about how to make programs run quickly in a multiprocessor environment, but the fundamental architecture of Erlang seems to be in conflict with big data and high-speed computing. It seems like a language that can scale much better, but has such enormous constant-time penalties that the scaling can't overcome the hurdle until you're talking about thousands of processors.

Every single state change requires a function call, every function call involves a pattern-matching algorithm that has to account for all of your program arguments (effectively almost the entire state of your thread!), and it isn't even strongly typed. And then there is synchronization and data sharing between threads, which seems a bit handwavy and effectively requires a database running in your program that the threads can poll.

If your problem set is lots and lots of small independent tasks that don't have to finish overly quickly, then Erlang is fantastic. Stuff like switching voice circuits for example. But I'm trying to imagine manipulating a 1TB dataset with thousands of worker threads if you have to make a copy on the stack for every change every worker makes.

I know people do some big data stuff with Erlang, so these have to be solved problems somehow, but I can't help but to suspect that they have to compromise some of the ideals espoused in this article to make it work.


> even strongly typed.

A nit: Erlang is strongly typed. You cannot (ok, excluding numbers) transparently convert a string into a number into a tuple like you could in C, where everything is merely memory, so casting allows trivial but potentially erroneous conversions. Erlang is dynamically typed, not statically typed. Dialyzer + type annotations (some inferred) allows for static analysis, but it's not directly part of the compiler, so it can't be strictly enforced (automatically).
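A quick shell session (illustrative, typed from memory) makes the strong-vs-static distinction concrete: types are only checked at runtime, but nothing is coerced silently.

```erlang
%% Dynamically typed: no declarations needed.
%% Strongly typed: no implicit coercions -- mixing a string
%% and an integer is a runtime error (badarith).
1> "1" + 1.
** exception error: an error occurred when evaluating an arithmetic expression
     in operator  +/2
        called as "1" + 1
2> list_to_integer("1") + 1.   %% conversions must be explicit
2
```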

EDIT:

Also, consider that erlang processes are a dual of objects in the OO sense. They can receive various messages and respond by changing state and/or transmitting messages. This interaction is similar to the way objects (instances of classes) behave in OO languages. The difference being that they're all running concurrently. So you could put a massive amount of state into one process. OR you could have a handful of processes that act together like you have a handful of classes in an OO language.

If I write a server for playing a game, I don't include all of a player's state and their connection in one process. I have a process that handles the network connection. It sends messages to a process (or processes) that handle player state. Which connect to some game state processes, and on until the network handler sends messages back to the player or gets more messages from the player.
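As a rough sketch of that split (module and message names are invented for illustration), the player-state process is just a tail-recursive receive loop; the network-handler process would hold its Pid and send it messages:

```erlang
%% Hypothetical player-state process: state lives only in the loop's
%% argument, and "method calls" are messages from other processes.
-module(player_state).
-export([start/1, loop/1]).

start(Name) ->
    spawn(fun() -> loop(#{name => Name, hp => 100}) end).

loop(State) ->
    receive
        {damage, N} ->          %% a state change = recursing with new state
            loop(State#{hp := maps:get(hp, State) - N});
        {get_hp, From} ->
            From ! {hp, maps:get(hp, State)},
            loop(State);
        stop ->
            ok
    end.
```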


You've hit on the tradeoff though. By decentralizing your state, you've increased your inter-process synchronization requirements. In the worst case everything ends up being tightly bound and your application runs like a single threaded application because everything is always blocked waiting for the state update from a remote thread.

Huge parallelism is easy if your data and processes are largely independent, but the real world is rarely so kind.


> By decentralizing your state, you've increased your inter-process synchronization requirements.

But you've also increased your reliability as well. Who cares if the tight single-threaded application with a shared heap can handle 100K connections, if as soon as one of those connections leads to a segfault, all the other 99,999 crash as well.


A single Erlang server can handle 2M connections[1].

[1] https://blog.whatsapp.com/196/1-million-is-so-2011?


> In the worst case everything ends up being tightly bound (...)

Your tradeoff is between “tightly bound by shared data structures” vs. “tightly bound by process synchronization”. I don't see how either is better than the other.

> the real world is rarely so kind

In the real world, from what I've seen, while everything is interconnected to everything else, not all the connections are equally strong or important. If you want to compute exact results, without possibility of failure, no matter what the computational cost, then sure, you need to take all the connections into account. If you can trade some accuracy for performance gains in the average case, you'll probably want to find ways to prevent minor failures from bringing down the entire system.


Erlang processes are objects in their own right, not "dual" to them.

The dual of object types (records of methods) are sum types: Object types are defined by how you can eliminate them (calling a method on an object), whereas sum types are defined by how you can introduce them (applying a constructor to suitable arguments).


He probably means that an Erlang process is equivalent to an object and message passing among processes is equivalent to method calls.

He probably means that an Erlang process is equivalent to an object and message passing among processes is equivalent to method calls.

In my experience with Elixir, objects, method calls and mutable data are easier to write than processes, messages and immutable data. It's not the mutable and immutable part, it's more about the boilerplate of spawning processes, receiving messages and matching them to dispatch them to the appropriate functions. Ruby's and Python's class and method definitions are much more compact and less error prone. Unfortunately they are not as good at parallelism. I wonder if it could be possible to have an OO language with an automatic Erlang process per object, automatic immutable data (just don't let variables be reassigned, like Erlang), and a way to define supervisor trees.
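For what it's worth, the boilerplate being described looks roughly like this in plain Erlang: a "counter object" that an OO language expresses in three lines (gen_server/GenServer removes some of this, but the ceremony is still real):

```erlang
%% A process standing in for an object: the API functions hide the
%% spawning, message sending, and reply matching from callers.
-module(counter).
-export([new/0, increment/1, value/1]).

new() ->
    spawn(fun() -> loop(0) end).

increment(Pid) ->
    Pid ! increment,
    ok.

value(Pid) ->
    Pid ! {value, self()},
    receive
        {value, V} -> V
    end.

loop(N) ->
    receive
        increment     -> loop(N + 1);
        {value, From} -> From ! {value, N}, loop(N)
    end.
```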


There is Pony (http://www.ponylang.org).


Interesting language, thanks. What's the amount of adoption to date?


Close to 0%. It's still in development. I would love to see this develop into a high-performance alternative / complement to Erlang.


Yes. In the same sense that closures and objects are equivalent.

http://c2.com/cgi/wiki?ClosuresAndObjectsAreEquivalent


Or use LFE - Lisp Flavoured Erlang by one of the designers of Erlang, Robert Virding. All the BEAM/OTP goodness with the meta-programming and homoiconicity of Lisp. Worth checking out.


>> I wonder if it could be possible to have an OO language with an automatic Erlang process per object, automatic immutable data (just don't let reassign values to variables, like Erlang), and a way to define supervisor trees.

Is it possible to implement it using the metaprogramming features in Elixir?


> "(ok, excluding numbers)"

That's a pretty big exclusion. There are times one really doesn't want type promotion from a particular number representation for performance or accuracy reasons, and if your dynamic typing system does not allow contracts to reject numerical types that will implicitly promote, you can run into huge performance problems.

An example: suppose you want to divide a list of numerators by a single denominator, producing a new list. Each division is not guaranteed to produce a whole number. What's the type of the resulting list? What is the type of each member of the resulting list? What is the type of a sum of the resulting list? Is the per-division cost roughly constant, or does it depend on the particular numerator, denominator pair?

Bear in mind not just differences between integer and real numbers, but that real numbers may be exact or inexact, and that there may be various representations of inexact numbers (from native floating point to various types of bigfloat).

Really strict typing -- whether in a dynamically typed language with contracts or similar mostly-runtime mechanisms, or in a statically typed language with type checking at compile time -- forces you to deal with these questions explicitly. In this case, you might specify that the source list of numerators is a list of positive integers, and that each division produces a single floating-point result (and thus you wind up with a list of floats; and you'd specify that the sum of the resulting list should be a float too). That sacrifices precision for performance, which will tend to matter depending on the content and length of the source list.

There are plenty of dynamically typed languages that get fiddly trying to avoid turning some or all of the operations into much more precise types than single float (or even doing exact representations, rational number style), and the performance impact can be dramatic.

Conversely, you may not want to lose precision as you move away from +-0.0f, so you may want to specify that exact arithmetic will be used in the operations in this example instead.


Because Erlang is a dynamically typed language, in your example, dividing a list of numerators can give you any type back. It could be a list of numbers, integers, floats, tuples, strings. It might not even be a list at all. This is just something you deal with in dynamically typed languages. You're making a case for statically typed languages, which is fine. But that's not Erlang (or Python, or Ruby etc. etc.).

Erlang doesn't have contracts, but it has pattern matching. You can write a function in Erlang that is guaranteed to return only, say, a list of integers, by either converting the elements or by rejecting lists of floats in later parts of your code. On top of this, you have Dialyzer, which can warn you when your code doesn't do this (if you typed your functions correctly).
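To make that concrete for the division example upthread: Erlang side-steps some of the open-endedness because `/` always produces a float and `div` only accepts integers, and guards plus a Dialyzer spec can pin down the rest. A sketch, not a contract system:

```erlang
%% '/' always yields a float (even 10 / 2 =:= 5.0); 'div' is integer
%% division. Guards reject unwanted numeric types at the boundary.
-module(ratios).
-export([scale/2]).

-spec scale([integer()], pos_integer()) -> [float()].
scale(Numerators, D) when is_integer(D), D > 0 ->
    [N / D || N <- Numerators, is_integer(N)].
```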


There are quite a few dynamically typed languages which let you restrict the numeric types that polymorphic operators can consume and produce. A long heritage of that is in Common Lisp (see the Type Specifiers section in CLtL), and several other Lisp-family languages allow this too (e.g. Racket, which has a pretty substantial contracts system).

So it's not new. It doesn't make Lisp any less dynamically typed as a language, and it is wholly optional. Newer dynamic languages also let one do some partial or "gradual" typing where a programmer wants to use it.

There is a cost to this at function calling time, but then there is also a cost if one manually programs in a check on the type of an argument, for instance. Good compilers, however, can prove that functions that aren't of (or exported to) global scope will never be called with anything other than the specified types, and will omit the type-checking code.

Additionally, there is plenty of research into interoperation between code in statically typed languages and dynamically typed ones.

Naturally you can always convert back after your arithmetic operator produces the wrong type. But that can be expensive in itself, and it hurts more when the arithmetic operator could have performed a much cheaper operation.

A way around this of course is to eventually de-polymorphize the potentially-expensive operator and program in a hopefully cheap type-check by hand, in write-generically-first/optimize-(or even make correct)-after fashion.


For years, I didn't see what the point of Erlang was. It might scale well, but if it takes an Erlang process running on a hundred processors to equal the performance of a single-threaded C application, what's the point?

When I learned a bit more about Erlang, I realized that the point is that there are a lot of applications that are IO bound rather than computation bound, and for those apps Erlang performs very well and is easy to program in. Crunching numbers the fastest isn't always the most important thing.


One of the main points of Erlang is fault tolerance. Everyone forgets about that and talks actors and scalability, which is fine. But without fault tolerance and processes with isolated heaps, it doesn't matter how fast the C code is: it can process millions of transactions a second, and then segfault, and the availability and transaction rate go to 0 on that node.

The other advantage is power of abstraction. With Erlang it is easy to describe and program a distributed system with lots of concurrent components. Sure, you can do it with C++, Java, Node.js etc., but those things are awkward there: either threads share memory and have to deal with mutexes, or you are in callback/promise hell.


I think you might be underestimating Erlang's performance. Running on the same single machine, Erlang's performance is comparable to Python on benchmarks [1] [2]. Its performance obstacles aren't really that different from Scheme's, and Racket does a little better than Erlang on benchmarks, so there's definitely room for improvement [3].

But Erlang isn't really for doing computation, it's for communication. If you have an existing erlang system that has a lot of data and you want to perform some computations on all of it, you'd probably use erlang to get the data onto the appropriate servers and then spawn another process, that can compute efficiently, with numpy or just C or similar.

Distributed systems aren't just a way of scaling performance beyond the number of CPUs you can have using the same memory. It's about reliability. Erlang's design enables you to create network services that have decades of uptime. It achieves this through its concurrency model, through its error handling approach, by allowing hotswapping code (an operation that's relatively easy to reason about without mutable state), and probably more ways that I as an outsider am not aware of.

[1] http://benchmarksgame.alioth.debian.org/u64q/measurements.ph...

[2] http://benchmarksgame.alioth.debian.org/u64q/measurements.ph...

[3] http://benchmarksgame.alioth.debian.org/u64q/measurements.ph...



The interesting aspect of scaling up is that it doesn't matter how fast you are at individual single-core computation. Fast single core computation, or even SIMD GPU processing, is largely an "easy" problem: get a stream of data going, or get a chunk of data into the system, and work away on it.

What makes scaling up hard is moving data around. Once you have more than a single computer, there is no way you can easily share memory between them, so you have to impose some kind of copying for the system to work. If you want to demux a stream for multiple workers, you have to distribute work to the workers. If you have massive amounts of data in a cluster, you have to move the computation to the nodes in the cluster on which the data resides.

Moving data around requires you to have good orchestration of "mostly stateless" computations, with a couple of pinches of persistence strewn in as well. You can do this well in any language, but what makes Erlang well suited for it is that it provides some decent primitives for you with a lot of time sunk into the architecture. Beating this architecture in any other system requires you to spend some time doing that. And chances are it isn't as general, so when the world around you changes, the framework you used is left behind.

Before Erlang, Tandem systems built hardware/software with many of the same ideas in them. They built these systems primarily for fault tolerance and robustness, but they found, somewhat to their surprise, the same architecture is good at scaling. The reason I believe, after 10 years of Erlang programming, is mostly that the computation model of isolated services forces you to think distribution into the system from day one. Your solution naturally gravitates toward the distributed model, and this in turn means it is easier to scale out later. The model also makes it hard to accidentally build a part of the system which can slow down everything. I think this should be given more credit than it is normally given.

And once you have your problem distributed, you call into that CUDA GPU code on the node to obtain the high computation speed. Or you call into your FPGA or DSP ASIC. Any problem on the CPU is slow because of its general purpose behavior (the exception: You are Fabrice Bellard)


> Before Erlang, Tandem systems built hardware/software with many of the same ideas in them.

Indeed, Jim Gray's (from Tandem) paper 'Why Do Computers Stop and What Can We Do About It' is quite good. It contains a detailed report of machine failures, both software and hardware, and details techniques for improving the MTBF.

Erlang's language and runtime seem to have picked up seminal ideas from there...


Indeed, Jim Gray's (from Tandem) paper 'Why Do Computers Stop and What Can We Do About It' is quite good

For your convenience: http://www.hpl.hp.com/techreports/tandem/TR-85.7.pdf


Wow, when I first read about OTP (via learning Elixir) my immediate thought was this sounds like the Tandem systems I coded in the early 90's. Glad I'm not the only one to make the connection.


> the fundamental architecture of Erlang seems to be in conflict with big data and high speed computing

Because they are different problems. Concurrency, meet parallelism.

Concurrency: many smaller tasks that can be multiplexed over one core. The core doesn't need to be particularly fast; it just needs to be able to handle multiple tasks in-flight at once. Web serving, etc.

Parallelism: one big, honking task that can be split over multiple cores (or nodes), and each core needs all the horsepower you can eke out, so it can finish its part of the task quicker. Big Data, etc.

Erlang is not particularly fast (bad for parallelism), but it excels at concurrency.
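The "many smaller tasks multiplexed" point is easy to demo: spawning tens of thousands of processes is routine in Erlang, since each one starts with only a few hundred words of heap. A toy sketch:

```erlang
%% Fan out Count tiny tasks as separate processes, then collect
%% one reply per process, in spawn order.
-module(fanout).
-export([run/1]).

run(Count) ->
    Parent = self(),
    Pids = [spawn(fun() -> Parent ! {self(), N * N} end)
            || N <- lists:seq(1, Count)],
    [receive {Pid, Square} -> Square end || Pid <- Pids].
```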

> I know people do some big data stuff with Erlang

Not sure who is doing big data with Erlang, or why anyone would want to -- unless they have some very fundamental misconceptions about the problem at hand and the tools available.

> compromise some of the ideals espoused in this article to make it work

The article was really nonsense. Most of the terms it throws around have nothing to do with Erlang specifically, and proclamations like Erlang being used for a hypothetical "data center on a chip" are... what? The author's gist basically boils down to this: Use supervision trees with unikernels. I have had the same fantasy, for what it's worth.


I know you weren't aiming for maximum accuracy in your definitions, but I think this is worth bringing up.

Parallelism and Concurrency are not mutually exclusive terms. All parallelism is concurrent.

Concurrency is whenever more than one thing is happening at the same time conceptually. Everything from iterators to threads.

Parallelism is whenever more than one thing is happening at the same time physically. From a user-space perspective, this means threads or a coprocessor (like a GPU).


Yes. The intuition is a good one of parallelism being a deliberate way of designing a system to run its parts simultaneously (physically), with concurrency being a property of a system that may or may not have its parts running simultaneously (conceptually, 'overlapping', whether physically or logically).

There are many, many ways to think about this difference, and it's fun (and beneficial!) to do so every once in a while.

> From a user-space perspective, this means threads

Interestingly enough, before SMP and multicore, thread-level "parallelism" was actually disguised by a time-sharing concurrent implementation. That remains true for most threading libraries in languages with a GIL, and any time you have more threads than physical cores. In fact, the primary purpose of threads was (and still mostly is) to get a semblance of concurrency... Even the Erlang implementation was single-threaded until 2008 or so.


Well obviously threads aren't always running at the same time since each cpu can do only one thing at a time (or a finite number, if we're counting hardware threads) and you can have more threads than cpus.

It's the fact that they might run at the same time. Also from the perspective of the programmer, there's no difference between two threads running at the same time or by time sharing, since preemptive multitasking is non-deterministic and has the same implications as true parallelism.


I think we're talking about different things? You seem to be referring to parallelism in the literal, general sense of the word, while I'm referring to computational parallelism. Basically, if Amdahl's Law doesn't apply, then you're looking at concurrency. So this statement

> from the perspective of the programmer, there's no difference between two threads running at the same time or by time sharing

is incorrect when you're talking about computational parallelism (as OP was), because you're not going to realize any speedups with time sharing. In that case, you're using threads as a concurrency mechanism -- not for parallelism.


Not sure who is doing big data with Erlang, or why anyone would want to

Nokia created an open-source Hadoop-replacement called Disco [0] that used Erlang for coordination/orchestration -- an underappreciated strength of the language -- of map-reduce jobs, where the jobs were written in Python (and later OCaml, etc.). They've shown that it can handily outperform Hadoop (at least in the canonical wordcount example shown in this talk[1] -- there may be other examples, I haven't actually watched the talk yet). They've used it to mine terabytes of logs, daily, as described in this talk[2] and others apparently have used it as well.

From the abstract[3] describing the first talk, about the project:

We will describe our experiences using Erlang within Nokia to build Disco, a lean and flexible MapReduce framework for large-scale data analysis that scales to large clusters and is used in production at Nokia. Disco is an open-source project that was started in 2008 when attempts to use Hadoop to analyze data proved to be a painful experience. The MapReduce step formed only a portion of the analytics stack, and it was felt that it would be faster to write a custom implementation that would integrate well, than adapt Hadoop with the amount of internal Hadoop expertise available. Among the crucial tasks of such an implementation would be to deal with cluster monitoring, fault- tolerance, and the management and scheduling of a large number of concurrent and distributed jobs. To keep the implementation simple, the use of a platform that provided first-class support for distribution and concurrency was imperative. This motivated the choice of Erlang/OTP to implement the core control plane of Disco. It bears stressing that this choice was driven primarily by pragmatic concerns, as opposed to any beliefs about the superiority of functional programming languages in general or Erlang in particular.

The project's homepage [0] has information, a link to its Github, etc.

[0] http://discoproject.org/

[1] https://youtu.be/IjOGUC-iR_Q

[2] http://vimeo.com/23550705

[3] http://cufp.org/2011/disco-using-erlang-implement-mapreduce-...


> Not sure who is doing big data with Erlang, or why anyone would want to

Riak?


Are people actually running heavy analytic workloads using Riak map-reduce?

This assumes a pedantic qualification of "big data" handling -- does storage count, as opposed to the actual processing of the data? But I think that's an important distinction in this context, as it strikes the heart of the dichotomy between concurrency and parallelism.

(Moreover, I've never really been sure how much Riak is being used for actual 'big data', as opposed to being a master-less, highly available repository for 'regular data' (distinction between 'big' and 'regular' deliberately left vague). More so since Riak is key-value as opposed to columnar, but I suppose that depends on your workload. But that's all beside the point here.)


I don't see why it is confusing that applications for high concurrency would solve problems differently than big data.

I am also an Erlang noob, but I don't think pattern matching has to account for all parameters. To my understanding, Erlang pattern matching is very efficient, and anything you would pattern match on in function parameters, you would otherwise have to handle with some type of logic inside the function anyway, so I don't see how pattern matching would negatively affect performance.

ETS tables to share data between threads actually make a lot of sense if you think of your program with the idea that no matter what "x" is, this process should properly handle it, because you can't guarantee when the message you sent will be processed.

An ETS table is a DB, but you can replicate it by having a single gen_server hold that data and use calls to retrieve and manipulate it. It gives you one spot for your data to be. If you had that same data in messages or function call stacks, by the time the process executed it, it could be stale.

It's useful if you need one "true source" of individual pieces of data that need to be synchronized. It's about decoupling data from processes.
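A minimal ETS session for readers who haven't seen one (shell transcript, typed from memory): a `public` table is readable and writable by any process on the node, without message round-trips.

```erlang
%% A named, public ETS table acting as the shared "one true source".
1> ets:new(prices, [set, public, named_table]).
prices
2> ets:insert(prices, {"AAPL", 108.7}).
true
3> ets:lookup(prices, "AAPL").
[{"AAPL",108.7}]
```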


Erlang may be useful for coordinating computation tasks, but, yes, even with HIPE it is not a good numerical language on its own.

It would be interesting to re-engineer a language today that tries to fit into Erlang's niche but has a stronger performance focus. Rust, Cloud Haskell, and Go all sort of cluster in the area I'm thinking about, but none are quite what I'm thinking of. Cloud Haskell is probably closest, but writing high-performing Haskell can be harder than you'd like. Rust shows you don't have to clone Erlang's immutability and all the associated issues it brings with it for inter-process safety, but Rust of course is "just" a standard programming language next to what Erlang brings for multi-node communication.


This is an easily solved problem. You've been around so what I'm about to tell you is nothing new, but...

In the aughts Ruby and Python were really slow so if you had computational-heavy problems you had to drop into C.

It worked but C isn't great. The thing is - we have a lot of languages that can do computational problems easily now - Rust, Go, Nim, and the list goes on....

It becomes relatively trivial to create libraries which could wrap these languages for Erlang/Elixir. In the few cases where Elixir isn't fast enough, just drop down to something else. Write a small piece of code in Rust (you may as well call Elixir/Rust peanut butter and jelly). Optionally, you could create some kind of DSL which compiles down to another language in Elixir. I did the same with xjs (Elixir syntax, Javascript semantics) [0], and I must say, for a 200-line hack it works really well.

This isn't even considering the fact that a lot more work could be invested in Erlang's VM. Yes, Ericsson is its corporate sponsor, but imagine if you had companies trying to make it fast the way they try with Ruby, Python, JS, or any number of other more complicated languages.

I think this is relatively low-hanging fruit. You can't tell me that Erlang and Elixir are harder to make fast than other dynamic languages (in most contexts).

[0] https://news.ycombinator.com/item?id=11444499


The drives for fault tolerance and distributed computing on one hand, and speed on the other, are all becoming prominent, leaving Erlang in a good place right now, but in need of other major changes in the computing world. Wrapping languages or 'dropping down to C' is no longer going to cut it, even if it is low-hanging fruit. Rust or Pony's guarantees only hold if you stay in their pen. We need a way of marrying PLs like Erlang/LFE/Elixir and Pony to newer hardware paradigms to take advantage of all those multicores, and potentially the custom FPGA and ASIC chip rigs that will be arriving to market.

Why Erlang matters is that it showed you can allow for failure, albeit brief and inconsequential failure, and still succeed. No zero-risk requirement here: acceptable bounds that are easy to see now, but revolutionary at the time.

Custom hardware is already in use at HFT and bitcoin-mining companies. The U.S. is going to try to beat China's Tianhe-2, currently the world's fastest supercomputer. I'm not sure why, since the Chinese scientists say it would take a decade of programming to utilize the potential of the Tianhe-2's hardware. If you think I'm calling the spirit of Lisp Machines from the dead, you're close ;)

I think the von Neumann hardware architecture, and the type of OS straddling it, are straining at the edges of high-stakes usage, not the common user. We don't need supercomputers; we need new hardware architectures at a lower level than 'super' that can be programmed in months, not decades. Programming languages in the OTP/BEAM category, and old, battle-tested languages like APL and J, which have always dealt with the array as their unit of computation, will be the basis for new languages, or they will be adapted into a new one. The money, big data, and mission-critical business needs will drive it to market.


You could write NIFs in Rust (well not sure if you can now, but I don't see any reason it couldn't be supported) for the high perf bits and use Erlang to coordinate, I figure. At least Rust code is less likely to explode and bring down the whole VM than C.


There seem to be at least a few people looking to build NIFs in Rust, e.g. https://github.com/hansihe/Rustler (found at https://news.ycombinator.com/item?id=11220615).


That looks really nice. Codegen + panic catchers make it pretty compelling.


Hey, I'm one of those people!

And yes, I am doing this, and the personal reason is for fault-tolerance and the professional reason (or how I get time to do it) is based on security criteria.


Any idea if a network-level FFI has been started? I'm thinking along the lines of the Haskell erlang-ffi [0].

    Speaks the Erlang network protocol and impersonates 
    an Erlang node on the network. Fully capable of 
    bi-directional communication with Erlang.
NIFs still limit you to < ~1ms computations, from what I understand, but impersonating a node (on another machine, even) seems a lot more flexible. Just wondering; NIFs in Rust are still a great idea.

[0] https://hackage.haskell.org/package/erlang


Have you seen c nodes[1]? That may be what you're looking for?

[1] http://erlang.org/doc/tutorial/cnode.html


There's support for "dirty NIFs" in new versions; R19 will make it the default. Dirty NIFs allow for long-running NIFs managed by the VM. In older versions, you can use enif_thread_create to create background workers, and your NIF will only block for as long as it takes to acquire a lock for your queue.

You can also use c nodes (or the JVM interface, which is pretty similar to c nodes I think).

... You can also use ports which define an interface for communicating with external processes.

The world is your oyster!
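Ports are probably the gentlest of those options: the external program runs as a separate OS process talking to the VM over stdin/stdout, so a crash there can't take down the node. A minimal shell example (transcript typed from memory; the port id will vary):

```erlang
%% Spawn an external OS process and read one line back from it.
1> Port = open_port({spawn, "echo hello"}, [{line, 255}]).
#Port<0.5>
2> receive {Port, {data, {eol, Line}}} -> Line end.
"hello"
```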


Not anymore: with dirty schedulers (as of 19.0, I think) you will be able to have any long-running C code as a NIF!


You can't really achieve Erlang's goals with a statically typed language, at least it would be very hard to make it easy to intuit whether a live reload will be sound.

A language that allows mutable state anywhere other than at process scope is also a no-go. In Erlang you're not supposed to think about which machines your code will run on as you write the business logic, so you have to assume that every process is running on a separate machine with no shared memory, hence no mutable state above the process level. And mutable state at local scope makes hot-swapping code messier, although that's easier to work with than static typing.

Nothing about Erlang is inherently slow. Someone could hire a bunch of developers from v8 or Spidermonkey or maybe just Mike Pall to write a better Erlang runtime.


Live reload is a funny thing with Erlang. When I claim it's a feature, but describe the pain it is to use, I get told nearly nobody uses it. When I claim that nobody uses it, I get told that lots of people use it. I'm not sure it's something that an Erlang competitor would have to get right. And it would be valid to use a different mechanism for live reloads, perhaps something that explicitly migrates state between OS processes instead. At the very least I think the Erlang community would have to agree that it's a dodgy, improvable process.

"A language that allows mutable state aside from at process scope is also a no-go."

Well, I did say the language I'm spec'ing probably doesn't exist. Rust is an interesting example of what can be done to make it so that not actually copying memory is safe, but you'd still have to do some work to make it do the copying more transparently across nodes, which is why I said it's "just" a regular language from Erlang's point of view.

"Nothing about Erlang is inherently slow."

I now believe that dynamically typed languages that are not built for speed from the beginning (LuaJIT being pretty much the only reason I even have to add that parenthetical) are inherently slow. I've been hearing people claim for 20 years that "languages aren't slow, only implementations are", and I've even echoed this myself in my younger days, yet (almost) none of the dynamic languages go faster than "an order of magnitude slower than C with a huge memory penalty" even today, after a ton of effort has been poured into them. Some of them still clock in in the 15-20x slower range. Erlang is a great deal simpler than most of them, and I don't know whether that would net an advantage (fewer things to have to constantly check dynamically, although "code reloading" works against some of these) or a disadvantage (less information for the JIT to work with). Still, at this point, if someone's going to claim that Erlang could go C speed or even close to it in the general case, I'm very firmly in the "show me and I'll believe it" camp.

At some point it's time to just accept the inevitable and admit that, yes, languages can be slow. If there is a hypothetical JS or Python or PHP interpreter that could run on a modern computer and be "C-fast", humans do not seem to be capable of producing it on a useful time scale.


> When I claim it's a feature, but describe the pain it is to use, I get told nearly nobody uses it. When I claim that nobody uses it, I get told that lots of people use it.

Heh, it is funny. I might have an idea why. It is hard to use regularly for day-to-day releases, simply because building correct appups and so on takes a lot of time. Most systems are already decomposed into separate nodes and can handle single-node maintenance, so that is what we do at least: take a node down, upgrade, bring it back up. Care has to be taken to handle mixed versions in a cluster, but that is easier than a proper 100% clean hot upgrade.

But having said that, I have used hotpatching by hand probably 5-6 times in the last couple of months. Once on a 12-node live cluster. That was to fix a one-off bug for a customer without having to wait for a full release; another time was to catch a function_clause error that was crashing a gen_server, and so on. It is very valuable to have that ability.
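For anyone curious, a by-hand hot patch like that is surprisingly small. A sketch from a remote shell (the module name here is made up):

```erlang
%% Attach a remote shell to the live node, then:
c(my_worker).            %% recompile the fixed source and load the new version
code:purge(my_worker).   %% drop the old version once no process still runs it
```

Running processes switch over on their next fully-qualified call (Module:function(...)), which is why a gen_server picks up the fix on the next message it handles.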

> Still, at this point, if someone's going to claim that Erlang could go C speed or even close to it in the general case, I'm very firmly in the "show me and I'll believe it" camp.

It doesn't matter that it doesn't go C speed: it has the fault tolerance, an expressive language, it is battle-tested, and it has good runtime inspection and monitoring capabilities. If someone came one day and said you lose all of those but you gain C speed, I wouldn't make that trade.


I think you are seeing two definitions of "live reload".

One is where you live-upgrade a full running release, including all applications and versions, and where you migrate state whose format has changed. All this in production, without any downtime. This is incredibly hard to get right. Erlang gives you a lot of tools (OTP & friends) to achieve this, but it is still very complex.

The other is reloading Erlang code in a runtime system. I.e. recompiling and reloading one or several modules in a runtime system. This is usually done during development (see Phoenix for Elixir for example) or perhaps even in production when you know what you're doing. This is relatively easy, with some risks of course if you are doing it in production.


I haven't seen it used directly, but it seems like Elixir macro based code could be altered and recompiled based on runtime configuration.

An example would be changing log level settings. Normally Elixir log blocks can be compiled entirely out when running in production mode. But it should be possible to fairly safely recompile with debug logs enabled and reload without missing a beat.
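As a sketch of what that could look like (the option below is the stock Elixir Logger setting of the time; the level is just an example): with compile-time purging on, the purged log calls vanish from the generated bytecode, and bringing them back means recompiling with a lower level and reloading the affected modules, which the BEAM supports at runtime.

```elixir
# config/prod.exs (illustrative): strip Logger calls below :info at compile time.
import Config

config :logger, compile_time_purge_level: :info
```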


We do this in our project, for two reasons: one for logging (as you mentioned) and the other for configuration (compiling configuration into a module for efficiency reasons). The Elixir primitives make this a breeze.


I don't know if live reload is widely used or not, but the other features of Erlang allow you to create systems that see decades of uptime. But without code hot swapping, your uptime is limited by how often you ship new code. Typically a distributed application will be made of independent programs on various machines that you can upgrade and spawn and kill at your leisure. In Erlang your entire distributed application is kind of like just one program, and what was an executable in the traditional model is now a module; so, to match the capabilities of a traditional system, you need to be able to upgrade modules without killing everything.

---

There are a number of other dynamically typed languages that have fast implementations. Javascript has a few; Common Lisp, Self, Julia, to name some others. They'll never be as fast as C, certainly not when comparing highly optimized programs, but they're fast. It looks like most dynamically typed languages can be made to run 10x slower than C. Compare that to CPython and HiPE which are more like 100x slower.

I don't think code reloading would hurt JIT performance too much. The prerequisites for runtime specialization of procedures basically accommodate everything hot swapping would need. I also think the way people use Erlang's type system is probably more amenable to conservative type inference than in the existing fast dynamic languages, and that's one of the more important metrics.


> yet (almost) none of the dynamic languages go faster than "an order of magnitude slower than C with a huge memory penalty" even today, after a ton of effort has been poured into them

Interesting. What are the exceptions you have in mind to warrant the "(almost)" qualifier? You mention LuaJIT. I've also heard that Q/KDB is quite fast. Anything else?


Agreed. It'd be really interesting to see such a language.

One language that I think is woefully underappreciated for how ubiquitous it is, is GLSL. With OpenGL 4 compute shaders you get surprisingly close to general-purpose use for the type of tasks that benefit from massive parallelism. And GLSL is really quite a nice language; driver bugs are the main thing holding it back.


Eeeeeeh. I don't know.

Maybe for highly SIMD stuff, but it seems like it pulls in a bunch of baggage around GPUs, etc.

On the other hand, it would force you to do your data partitioning right up-front (much like the SPUs on the PS3).


Sure, it's not very good for task parallelism (though some of the extensions that AMD is introducing for APUs are very interesting!) But if you've got an embarrassingly data-parallel problem, you can't beat its performance.


That performance is largely dependent on drivers + HW though, right?

Then again, I'm used to mobile GPUs where any conditional statement used to cause the shader to be evaluated 2^n times, once per branch combination, with the results gathered at the end (aka forget about any branching).

For my 2c I'm a fan of Elixir + Rust, Rust has a nice C ABI that should make it easy to embed.


pony [http://www.ponylang.org/] follows the actors-everywhere model, and has a strong performance focus. not sure how well it fits into erlang's niche, but at least the built-in actor model means a lot of erlang idioms and design patterns should port over easily.


No actor supervision there. Without supervision, there is no famous Erlang reliability.


Erlang excels at high throughput and fault tolerance. Number crunching, even in parallel, isn't really the goal.


I'm really tempted to use Erlang/Elixir for a project at work but unsure of its traction. Is it leading edge or trailing edge? I don't even know, but don't want to saddle the firm with a white elephant - even one that is impeccably fault-tolerant.

Is Erlang too esoteric?


Having spent quite a bit of time searching for Erlang packages on Github recently, I've found quite a few Erlang packages haven't had commits since around 2013. I believe there was a spike in adoption around 2012-2013 which resulted in a bunch of activity on Github. Which seemed to have tapered off recently. But you still find many core packages that are active - things like JSON parsers, templating libraries, and web frameworks are all lively. And OTP and Erlang itself are always being improved.

Plus there is an up-to-date package manager/global repo (https://hex.pm/) which is used by both Erlang and Elixir. Plus Rebar3, the primary build tool for Erlang projects, is actively developed.

But Elixir packages are very active in general, since it's relatively new, and these packages can be used within Erlang apps relatively easily as well. Plus the Erlang packages from 2013 that I've used have all been pretty stable. The quality of the libraries available on Github has been very high in my experience. I believe Erlang attracts experienced developers, which is reflected in the typical code quality and documentation.

My only wish is for the Erlang website (http://www.erlang.org/) to get a redesign. It gives the language an esoteric appearance to newbies, which is a shame because I absolutely love the language and think it's far superior for many of the types of projects for which people have been using Node.


I am excited by the continued improvement and enhancement of the language.

Erlang 18 brought maps, which people had been asking about for years. Erlang 18 also brought (via a feature flag) dirty schedulers, so you can have long-running C functions embedded without messing up process scheduling, also an often-requested feature. And it has probably the best thought-out handling of time in any language I've seen so far (synchronization, warping, moving backwards: http://erlang.org/doc/apps/erts/time_correction.html).
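A small sketch of what that time API gives you (module name made up): monotonic time is warp-free for measuring intervals, while system time tracks the (possibly corrected) wall clock.

```erlang
-module(time_demo).
-export([run/0]).

%% Measure an interval with monotonic time, which never jumps backwards
%% even if NTP adjusts the wall clock mid-measurement.
run() ->
    T0 = erlang:monotonic_time(millisecond),
    timer:sleep(20),
    Elapsed = erlang:monotonic_time(millisecond) - T0,
    %% system_time = monotonic_time + time_offset; the offset absorbs
    %% wall-clock corrections, so Elapsed stays meaningful regardless.
    true = (Elapsed >= 20),
    ok.
```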

Erlang 19 is exciting as well: 10x faster tracing, dirty schedulers turned on by default, a new state machine OTP module, an external plugin API for Mnesia storage (with LevelDB as one example), also something people had complained about, and 2x-3x faster spawning of external processes, plus many others.

The impressive part is that these changes are done to a 30 year old language.


mnesia_leveldb is scheduled for OTP 19? That's great news!

I can't tell you how many times I've cursed at the overfilled-hours-ago-but-mnesia-didnt-care DETS table shard. Getting an on-disk backend that can store more than 2GB at a time will be great!

It'll be even better if the writer code actually notices failures to write to the backing store and aborts transactions when they happen! [0] :)

[0] Seriously, who thought it was a good idea to ignore the return value from dets:insert/2? :(


I think this is the PR so far:

https://github.com/erlang/otp/pull/858


Man, whoever redesigns erlang.org needs to think really hard about how they're going to re-work the documentation section. It's really good as is:

* Color scheme and fonts are easy to read.

* One can switch between API reference for the current module and Users Guide for its containing application with one click.

* On the left-hand side one has a scrollable tree of modules in the current application, each of which is expandable to reveal the API calls within any given module.

* Sensible URL scheme: doc/man/$MODULE.html for a module's API docs and doc/apps/$APP/users_guide.html for an application's Users Guide.


It doesn't need much, just very slight tweaks to the typography so it's a bit cleaner + a bit more visual separation between function descriptions + slightly better handling of the navigation. It's just all a little bit jammed together atm, IMO. As I say, wouldn't take much.

One thing I've found I love is the PDF documentation downloads; I wasn't expecting much, but (the xmerl one, for example) they're great: step-by-step useful examples that I can shove on an ereader to go through. Really solid.


I think it depends on your hiring process.

Do you hire people who know how to code in language X and are really good at coding in language X but nothing else?

Or, do you hire people who are go-getters, want to use the best tool for the job, and want to learn?

Because if the latter, Erlang/Elixir might be esoteric, but Elixir specifically is a simpler language than currently more popular languages like Python or Ruby. It's also more elegant. I like to think of it as Scheme + Ruby syntax + pattern matching + concurrency. If that excites you, carry on. If that scares you, go for something more traditional (and no hard feelings) :).


I would confidently state the bigger problem would be salary. If you hire cheap programmers you probably can't afford to do Erlang development.


Seems like a strange view to have. The best paying work is generally in boring, enterprise approved languages. It's probably a competitive advantage for a small company to pick a niche language - I wager they would hire better quality engineers for less money.


The best paying work is not in boring, enterprise approved languages. I'm not sure where you heard that.


Sometimes I wonder why Elixir tries so hard to have Ruby syntax.

Erlang:

  loop_through([H|T]) ->
    io:format('~p~n', [H]),
    loop_through(T);
  loop_through([]) ->
    ok.

It seems convenient to use ';' to separate multiple clauses of a function, compared to using 'end' in Elixir (the function definition continues in the next block).

Elixir:

  def loop_through([h|t]) do
    IO.inspect h
    loop_through t
  end

  def loop_through([]) do
    :ok
  end


I believe, like Steve Yegge has said elsewhere, that programmers are lame (now paraphrasing) because they refuse to touch things that look odd.

Erlang's syntax may be more efficient, but a lot of people won't touch it because the syntax is unfamiliar. Elixir has a huge advantage here. It might not matter to you specifically, but in the realm of adoption that is huge.

And honestly I really enjoy the syntax...


I mostly enjoy the syntax. Kind of hate that they made parentheses for function calls optional, especially since no-parentheses calls have ambiguous cases that behave weirdly (like with chaining syntax). They should be enforced.
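One of the cases that bites people (a contrived but real parse): piping out of a no-parentheses call.

```elixir
# Without parentheses this parses as String.graphemes("Hello" |> Enum.reverse),
# not (String.graphemes("Hello")) |> Enum.reverse:
#
#   String.graphemes "Hello" |> Enum.reverse
#
# Writing the parentheses out removes the ambiguity:
"Hello" |> String.graphemes() |> Enum.reverse()
```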

Also, all the different import-related keywords are pretty confusing. Maybe that's improved though


Does do/end make it much easier to build macros with Elixir?

I don't find either syntax to be off-putting but that's just me.


True. I personally like Erlang's syntax more. I like single-assignment variables, I like the structure and feel of it, and so on. But Elixir is great too. I am glad it is there and taking advantage of the BEAM VM. It definitely appeals to many Rubyists, or those who used Python and are otherwise scared off by a different syntax.


Still pretty new to Elixir, so I welcome any suggestions, but from my understanding (and based on Elixir's guides [1]), I think that the more idiomatic way to approach this is to avoid writing such a loop_through/1 function and instead just use:

  Enum.each(<the list>, &IO.inspect(&1))
or likely

  <series of data transformations resulting in list> |> Enum.each(&IO.inspect(&1))
Both return the :ok at the end of the loop, so Enum.each/2 looks to be functionally equivalent.

1 - http://elixir-lang.org/getting-started/recursion.html


Much of the Ruby influence is because the creator of Elixir (José) was previously from the Rails core team.


Embedded systems programmer here. I'm learning Elixir and am very excited about the nerves project: http://nerves-project.org/ https://www.youtube.com/watch?v=kpzQrFC55q4 Edit: There also seems to be a good Elixir/Erlang community here in Berlin Germany where I live at the moment.


Do you have some links or contacts in/to this community? I am in the process of developing a quite interesting application in Erlang and I am struggling to find a community (in Germany) where I can either ask questions or find a pool of possible people to hire.


Hi, I'm quite new to the community myself, so I don't have any established contacts to share at the moment. I recommend checking here: http://www.meetup.com/Elixir-Berlin/ This company also has an office in Berlin now: https://www.erlang-solutions.com/contact.html


We've pushed a couple of Elixir apps in the past six months. There was definitely a bit of worry that no one would be able to maintain it, but I have to say that the code is very readable. So many languages integrate some functional aspects that the code doesn't look foreign in most cases.

Also, the Phoenix Framework (if you're building a web app) is really, really nice.


It depends on your location, I think.

Here in New York, it's not too hard to find Erlang enthusiasts, particularly ones that work on Wall Street.

That said, no one seemed to have even heard of the language in Dallas when I lived there.

So the long-story-short of this is that if you don't mind doing some remote-hiring, it's not too hard to find workers, and things like Ejabberd have a ton of tutorials on writing modules.


Any programming language can do IPC, but Erlang/Elixir is the only language I've seen that makes it really easy. Erlang IPC is transparent whether you're sending messages between two local processes or remote processes. Most languages have trouble even with local IPC (usually because of race conditions on data). Every other language I know of, including Go, needs custom code to handle remote IPC.
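The transparency boils down to `!` (send) accepting either a local pid or a registered name on another node. A sketch (the remote node name is made up):

```erlang
-module(send_demo).
-export([run/0]).

run() ->
    Self = self(),
    %% A tiny echo process.
    Echo = spawn(fun() ->
        receive {ping, From} -> From ! pong end
    end),
    %% Local send:
    Echo ! {ping, Self},
    %% A remote send looks identical; only the address changes, e.g.:
    %%   {echo_server, 'other@host'} ! {ping, self()}
    receive pong -> ok after 1000 -> timeout end.
```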


It just seems that all the effort going into Erlang would be better spent on making the OTP stuff as a library for a popular, cross-language platform. Like the JVM (or maybe .NET.)


sure. we can start when the jvm can do preemptive scheduling of processes

erlang isn't great because of a single feature or library, it's great because it's designed from the ground up for one very particular role. you can't just port the otp api to go or the jvm. you need the whole foundation


I'm really interested in BEAM languages, but the fault-tolerance / supervisor aspect of it doesn't speak to me. Aren't all modern application fault-tolerant, as long as you don't design something really poorly?

For example, I've never had a single HTTP request bring down an entire website -- that's already isolated. Same with message-queue listening processes. For general batch applications, I've always had them short-lived and running periodically, e.g. every minute, so even a complete crash there is isolated between runs.

One powerful aspect is how it strongly encourages you to design loosely-coupled message-passing systems that should be easier to scale out. But I'm not convinced that's enough to warrant a switch.


Erlang's fault-tolerance becomes really useful when you write a server that manages hundreds of thousands of simultaneous connections (a chat server being the typical example). With Erlang, each connection is managed by its own lightweight process (no callbacks, no promises, etc.). If a lightweight process fails, it doesn't bring down the other processes. Moreover, BEAM can signal other processes about the failed process (Erlang's supervision trees are based on this mechanism).

In a traditional architecture, you would use one thread for each connection (let's ignore the issue of the memory used by each thread), but when one thread fails, it can bring down all the connections instead of just the failing one.
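A minimal sketch of that isolation and of the failure signaling that supervision is built on (module name made up):

```erlang
-module(isolation_demo).
-export([run/0]).

run() ->
    %% Monitor a process that crashes; the crash is delivered to us
    %% as an ordinary message instead of taking us down with it.
    {Pid, Ref} = spawn_monitor(fun() -> exit(boom) end),
    receive
        {'DOWN', Ref, process, Pid, boom} -> got_down_signal
    after 1000 ->
        timeout
    end.
```

Links work the same way in reverse (the failure propagates), and supervisors are essentially processes that trap these signals and restart children.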


It's more so, you don't need to write defensive code and you're actively encouraged not to... The mantra is akin to "let it fail, it will recover."

Once you start doing it, it becomes more apparent what the difference is... And that's not to say you can't design fault tolerant systems, it's just "easier" to do so with Erlang/BEAM.


How fault-tolerant is your web server when the datacenter has a power outage? You can't build a fault-tolerant system with one computing node, by definition.

That means that if you are planning to provide proper availability, you want to work with a system of applications, not just one. That means you should look into creating a networking architecture that can handle all of that.

Most of the time it means that people just use tools that solve that for them, like load balancers. But it doesn't mean that somehow all modern applications are immune to failures.


The fault tolerance allows you to have processes and state reliably available for longer than the duration of an HTTP request.

You can have continuously running processes without relying on something outside of the language. You can more easily distribute such code as an Elixir package. The code can work without relying on e.g. cron or redis being available and configured.


So a side effect of the fault tolerance is that you can also easily redeploy small parts of your app. If you have a small logic bug, you can avoid bringing down the entire app to fix it.

There's also some more serious stuff, like your workers getting killed by the OS for whatever reason, where you might need to go in and restart them.

You can do this through most queue-based systems in other languages but having everything be built-in is useful.


You know, even if Erlang didn't have great SMP scaling, or wonderful distributed properties, I think its fault-tolerant, actor-model-ey nature would make it wonderful anyway.

The fact that you program expecting failures, and the forced isolation of everything, allows for incredibly "sturdy" code. The other features are fantastic, but they're just gravy as far as I'm concerned.


The future is already here: http://erlangonxen.org/


Ling is cool stuff, but I have the impression that it's a clean-sheet Erlang VM implementation, not a port of BEAM. BEAM is time-tested and battle-proven, and I'd really like to see BEAM itself rely less on the underlying OS (e.g. epmd as a separate OS process, quirks in how it uses select/poll). I know Peer Stritzinger did a lot of work on this[0], but hasn't (to my knowledge) yet released any of his work, sadly.

[0] http://www.grisp.org


Except they haven't really released anything in ages ... I keep checking back and it's very quiet.


Erlang is glacially slow. Even on a 20-core machine, a multithreaded Erlang implementation will usually be trounced by a good singlethreaded C++/Go/Java implementation. All this stuff about multicore scaling is baloney - who cares if it scales when it is still slow?


> multithreaded Erlang implementation will usually be trounced by a good singlethreaded C++/Go/Java implementation

And a Go/Java implementation can probably be trounced by hand-written assembly and ASIC accelerators.

> All this stuff about multicore scaling is baloney

You say baloney, I say money in the pocket. I've seen it scale, I've seen it work reliably in large clusters, and I've been able to inspect, debug, and hotpatch running systems while they keep running. I have seen systems which had non-critical components crash and auto-restart for days without impacting customers, and without needing teams of "devops" to babysit them.

Moreover, I've seen single- and multi-threaded C++ and Java applications with thread and data races which take weeks or months to find. Or they are screaming fast until they take a nosedive and segfault (often in some minor, stupid new feature which nobody uses). You know what the transaction processing rate of a segfaulted process is? 0 tps.

That's why teams like Whatsapp could get by with only 10 or so back-end engineers handling billions of messages / day from various devices new and old, while other companies need 10x or even 20x more than that.

It is not just about being able to run fast. Assembly runs very fast. It is also about having the right tools and abstractions to define a problem. Erlang has those and they come built in (the OTP library, the distribution protocol, etc.); C++ doesn't, so you have to start from STL and Boost and so on, then get serialization, monitoring, supervision, etc. bootstrapped.


speed isn't a scalar. sometimes you care about latency. sometimes you care about throughput. sometimes you care about arithmetic.

http://www.phoenixframework.org/blog/the-road-to-2-million-w...


Because at some point the volume of 'work' you process will be more than what a single CPU can handle.

I think of it like big corps vs startups. Startups may have more efficient engineers and development cycles and produce higher quality work because of the selectivity of their employees, but the sheer amount of work a big Corp can do is significantly more.

(Just an example, not saying big corps don't have talent - far from it)

When designing your system you need to decide which model it needs to follow. There are places for both.


To me the power of Erlang comes from the combination of a powerful VM and OTP. In the era of hype for SOA, OTP has quietly been delivering it for decades.

You can run multiple apps on the same VM and they will run concurrently and communicate with each other using protocols that are a core part of the language.

I agree, Erlang seems weird at first, simply because there is not anything like it. It also solved today's problems with software a decade ago.


One word: ejabberd. That's why erlang matters to me :)


Also, it has a great free book on it: http://learnyousomeerlang.com/


There are at least two great free books: http://www.erlang-in-anger.com/ ! :)


Written by the same author no less. :-)


This seems like it could've been called "Why Actor Systems Matter".

If you're on the JVM, I'm not sure what Erlang buys you in practice. It's slower and more obscure. It has a much smaller ecosystem. While process-safety is frequently touted, in the real world this is a non-issue among non-issues. It's just not an actual thing. It's not like the JVM goes around Segfaulting all the time.

I'm totally sold on Actor Systems and think it's something more programmers should expose themselves to. I'm just not sure there's much of an argument for Erlang vs the JVM unless you're completely sold on the notion of process isolation for some reason.


It's not at all about segfaults. Assuming the Erlang VM is as likely/unlikely to crash as the JVM, then it's a complete wash: Erlang processes are green-threads, not OS processes, so you don't gain anything there.

It's about the share-nothing architecture of Erlang: it means your processes are isolated and can be stopped/restarted independently, crash/hang without writing into the memory of another process, have their own garbage collection, and be moved to a different physical computer (or even data center) with no impact at the logical level. Erlang forces you to architect your program as a collection of nano-services. You can of course emulate some of it on the JVM, but you can't go lazy and cut corners with Erlang.


Akka checks most of those boxes AFAIK. And there's really not much you can do in the way of cheating unless I misunderstand you. Or at least idiomatically you wouldn't cheat in Scala anyways.

Speaking of which, I just realized the AtomicLong I'm using in my IdGenerationActor (performs an atomic increment of a processId counter in the database, then uses that with the AtomicLong as the input to Hashids; fast, in-process, cluster-safe short-Id generation that will leave popular Redis based solutions in the dust) is completely unnecessary. Copied and pasted from non-Actor code without consideration.

I guess Actors still can't keep you from doing dumb things yet. ;-)


this is just not true. every other week i have production issues because akka managed to exhaust the thread pool with long running/nonresponsive actors


You've done something wrong then? Or maybe using experimental features? I might play with them a bit, and it can be frustrating to wait, but I've never launched anything on -experimental before.

I was referring to "cheating", with the idea that you might pollute your Actors with... I dunno. Programmatic connection pooling for your database driver? Passing mutable messages around?

Both of those things would be very unusual in Actors written in Scala since pretty much everything is immutable by default. Seeing that sort of thing should at least raise some eyebrows.

As far as non-responsive, I've never seen that. But I do take care to use Futures where appropriate, and pipeTo. Maybe it's good habits, or maybe I'm just very lucky.

One of my first Akka projects is still running, still processing content it's notified of by Postgres through LISTEN/NOTIFY, still posting that content into Cloudant (basically a managed Lucene deployment in this case).

And it's been running since September 2014 without a restart AFAIK. Which would never have happened with a previous non-Actor solution we might have used, if for no other reason than that a network blip might detach the listener, whereas here the Actor just restarts.

My experience has been overwhelmingly positive. Despite doing the wrong thing occasionally.

If you have Actors that are hung, I guess the first thing to try is figuring out which one(s). After that, you could just schedule the supervisor to routinely PoisonPill them, and forcefully stop them after a grace period.

I'm not sure that OTP is going to help with this sort of issue either. You have an apparently blocking process that makes an Actor non-responsive. The fact that it's single-threaded within the Actor and stalls its mailbox is kind of the point of Actor systems AFAIK.

I guess one thing I feel like helps me is to keep my Actors small and doing one thing. It's hard to do too much damage when you only have a dozen lines of actual message handling. "One thing" is sometimes coordination/aggregation BTW. Which means I might have a GetRequestActor, PutRequestActor, DeleteRequestActor, etc. Which do the one thing. But then I also have a DatabaseActor that basically just coordinates/forwards for all those so you don't actually have to deal with them yourself.

And from there, if I decided that no GetRequestActor should take longer than 10 seconds to do its thing, I can easily schedule the DatabaseActor to forcefully stop any task exceeding that (without the need for an actual watch if you choose), which you can then log an ERROR for and work out why. And maybe some requests just take longer and that's OK. So maybe you then write a LongRunningRequestPathExtractor and put that pattern before the normal one. And now you can have multiple timeouts for different paths.

Or maybe you just record the epoch for different requests, and if you've had 1,000 updates since the last View request, you know the next one is going to trigger a reindex. So you give it extra time. Or you set it to allow stale results and reschedule the same call to occur again in 1 minute to minimize the disruption of indexing the changes. Just some thoughts.


>You've done something wrong then?

Well, the idea is with Erlang you can't.


I don't believe Erlang goes around magically imposing its own timeouts on message handling.


it doesn't, but an erlang process that is waiting on a message won't block other processes from running ever


The magic is called "preemptive multitasking".


My Stop-The-World GCs would like to have a word with you.


And maybe in a project with truly massive memory requirements that's an issue. I've never encountered it (as an issue, that is). But I only got into the JVM on 1.6 with JRuby, made the switch to Scala around 1.7, and have been on 1.8 pretty much since release.

A 3GB heap is pretty typical, but that's mostly just room for cache.

If you're writing web-apps, there's a ridiculous amount of opportunity for using Actors to good effect.

On the other hand I try to keep an eye on resources. I try not to throw millions of messages at an Actor. If I'm batch processing, I'm probably work-pulling, mapping and prefetching my Sources to attempt to minimize latency. Even if I could tune mailbox sizes to avoid all that I prefer not to code in upper bounds on input.

I would be surprised to hear this is a widespread problem in people's work. You don't often hear the same complaints from Rubyists for example and they're much worse off in that regard. So I feel like this case is massively overstated.

But if you have that problem I can certainly see how that might figure into your equation.


Not everything is web apps, either. If you've got soft realtime requirements (<30ms), GC will hurt you. We spend a lot of time doing twists and turns in Java to not allocate anything, to avoid that Gen-0 GC.

Erlang's GC model is just incredibly elegant. Almost like pooled allocators that we used to use in GameDev but in a completely transparent and intuitive way.


I'll concede that. But you have to admit this is an incredibly niche problem space.

OTOH the JVM will have much higher throughput on the other 99% of apps.

Don't get me wrong. I like Erlang. It's definitely in my top.. Uhm... 3 languages of interest. Being the only other language I know with Pattern Matching is a huge huge part of that.

For most developers I'd just be careful in making that jump. Because most of what this article is about isn't Erlang specific. It's more like a pretty small club of platforms/languages. IMO.


Yeah, but real-time is about latency not throughput :).

FWIW a lot of game engines use message passing between components/actors due to the low coupling and architectural benefits so it's not quite as small of a domain as you think.


Erlang has pretty decent JVM integration. I assume this is why there's a nice Erlang IntelliJ plugin.

http://erlang.org/doc/apps/jinterface/jinterface_users_guide...


I would love process isolation on the JVM (several smaller heaps), and I'm not even into the actor model.

Plus I can imagine accidental mutability really being a problem. In my Akka course much effort was spent on explaining how to avoid side effects of it (no pun intended).


I guess the accidental mutability is a good point. Working in Scala, I've never encountered that.

You do learn not to close over local scopes in futures pretty quickly though.


Please let me ask some noob questions.

> Erlang matters today because it demonstrates how these semantics can be elegantly packaged in one language, execution model, and virtual machine.

Why is it so important to demonstrate that these semantics can be packaged into such a homogeneous environment ? Is it even a good idea ? Is this way of doing things superior to having micro-services that talk to each, all managed by a supervisor system ? Wouldn't that allow us to take advantage of the unique upsides of multiple programming languages, virtual machines and execution models ?


You may underestimate the difficulty of writing a bullet-proof and feature-rich "supervisor system". That has pretty high complexity and lots of nasty edge-cases.

Erlang happens to nail that part with OTP.


I made an argument that we were reinventing many of the ideas of Erlang in the DCOS model (Docker, Kub, and cloud etc).

Text is here (CTRL-F erlang):

https://zwischenzugs.wordpress.com/2015/11/17/dockerconeu-20...

Video here:

https://www.youtube.com/watch?v=-qHwL8C9UoA


"Cache coherence doesn't scale."

This is a controversial statement, and an opinion that is not shared by many respected computer science researchers: http://research.cs.wisc.edu/multifacet/papers/tr2011-1_coher...


It does scale when the software running on it takes it into account. Even more so with some alternative approaches like directories. What OP means is that it does not scale for software to be oblivious to the underlying cache coherence and to operate on a "one flat shared RAM" assumption, e.g. false cache line sharing, etc.


Linked article "power-wall" (http://daimi.au.dk/~zxr/papers/treewalls.pdf) seems to be unavailable.

Does anyone have a working link to this article?



Doesn't the existence of Golang remove most of the reasons to use Erlang these days?


Personally, for me it was the other way around. Discovering Erlang (and Elixir) removed all uses of Go that I had. I primarily used Go for writing networked applications. I never needed crazy CPU throughput on a single point of execution. Erlang's way of modeling concurrency and the built-in facilities of OTP made me way more productive than re-inventing the wheel via channels.


Nah, Golang is wonderful and I use it for a lot of stuff, but it's sort of a different thing than Erlang.

Go is more of a systems-ish language that has an emphasis on concurrency, but also on being relatively C-like. It's much more general-purpose than Erlang.

Erlang is more about fault-tolerant, concurrent applications, and more specifically servers. It's not designed to be especially fast, it's designed to make it very simple to write systems with a ton of throughput, that can handle outages, and perform passably in the process.

They're different things, and the only thing they have in common is that they both advertise concurrency, really.


I was wondering this when Go came out. But time has shown they solve problems very differently, so they are not really direct competitors.

Erlang's primary difference from Go is that it can gracefully handle the case where a programmer has made an error in the program by accident. A typical Go program cannot gracefully recover from an error which was unforeseen and never considered by the programmer. The equivalent Erlang program, however, has mechanisms to safely clean up resources for the faulty part and then get on with solving work. This is the reason Erlang has a nice robustness story.


https://github.com/thejerf/suture

Supervisors for Golang. It looks excellent. I personally have fallen for Elixir, and I think the syntax, immutability, community, and first-class functionality like Phoenix.Presence make me bet that it'll be bigger than Python for jobs in 5 years.


Supervisors is only half the game here. Suppose A sends a message to B and decides to wait around for the answer. B now divides by zero.

In Go, your whole program is in consistency trouble and you have to write code to handle the case.

In Erlang, your monitor on B means you get told it is dead via an async exception delivered into your mailbox.

I know what kind of system I want to work with here :)


Wow. That's just incredible. I seem to keep saying that the more I learn about Erlang/Elixir.


Go isn't fault tolerant, doesn't support hot swapping, and lacks a lot of stuff Erlang does support. The only thing Go and Erlang share is channels.


The main thing Go and Erlang share is their approach to concurrency: write synchronous code using lightweight processes, instead of writing asynchronous code with callbacks/promises/futures/etc. The main difference is the fault tolerance Erlang brings (thanks to immutability and absence of memory sharing).


Especially with this stuff: http://www.jerf.org/iri/post/2930 https://github.com/thejerf/suture. Golang for the win!


This is certainly something useful for Go, but it's only one aspect of Erlang.

You still don't have the guarantee of thread-safety, you still don't have the immutability, you still don't have tail-call optimization, and perhaps most importantly, you don't have a REPL for hot-code updates.


One missing feature is that you cannot run the same code in the browser and on the server (which is very useful).

Is there a language that compiles to both Erlang and Javascript?


There is Armstrong's thesis (google it), which explained all the major design decisions much more nicely than this arrogant "Erlang matters".

The foundational principles are not only selecting a functional language and enforcing immutability, but explicitly rejecting all sharing (threads in the first place) and providing a theoretical basis for why the JVM is an unacceptable target for reliable, soft-realtime systems (no matter what the Scala guys might tell you): a crashed process affects no other, no data corruption, no locks, no messed-up stack. This is what is behind the "let it crash" meme.

Erlang doesn't merely "matter"; it is a masterpiece of software engineering to study and learn insights from. Especially how to make one's own decisions based on the right principles while rejecting the sectarian dogmas (run everywhere!) of the vast majority.

