How does the "your concurrent program can never have data races" claim go along ...

0cachecoherency · on May 3, 2015

Data races are a subset of all race conditions. We don't claim that it's impossible to wait on a network message that never arrives, of course!

However, Pony makes a messaging order guarantee that's much stronger than is typical for the actor model. It guarantees causal messaging. That is, any message that is a "cause" of another message (i.e. was sent or received by an actor prior to the message in question) is guaranteed to arrive before the "effect" if they have the same destination.

That guarantee is achieved with no runtime cost, which is pretty fun.

Pony still allows unbounded non-determinism, of course, but causal messaging gives a very comfortable guarantee to the programmer.

jlouis · on May 3, 2015

Your causal message ordering guarantee sounds exactly like the Erlang one. If a pair of processes (P1,P2) is given and P1 sends m1 then m2, then P2 will see them in that order, perhaps interleaved with messages from other Pids.

But your claim might be stronger, though I'm not sure in what sense it is stronger.

n3otec · on May 3, 2015

In Erlang, that only holds for a pair of processes, as you point out.

In Pony, it holds for the entire program. So, if A1 sends a message to A2, and then A1 sends a message to A3, and in response to the second message A3 sends a message to A2... in Pony, A2 is guaranteed to get the message from A1 before the message from A3.

jlouis · on May 3, 2015

Hmm, thinking some more on this, this holds locally in Erlang, but not in the distributed communication. If we have m1: A1 -> A2, m2: A1 -> A3 and then in m3: A3 -> A2, then the message ordering will always be that m1 comes before m3 in A2 (locally).

Locally, a message pass never fails and always delivers the message directly. It can't be in-flight, so m1 is placed in the mailbox before m2 happens, and thus m3 will always come later.

In the distributed setting, A1, A2 and A3 might be living on three different nodes, and then m1 might get delayed on the TCP socket while m2 and m3 goes through quickly. Thus we observe the reverse ordering, m3 followed by m1. This is probably why the causal ordering constraint is dropped in Erlang. Is pony distributed?

n3otec · on May 4, 2015

not yet, but it will be soon, and with the same causality guarantee

lostcolony · on May 4, 2015

Isn't that going to kill performance (and possibly even kill the system, in the event of partial netsplits or similar)?

e12e · on May 4, 2015

That sounds... interesting. So if A1 is on a different segment, but has a faster link (less congestion?) to A3, than A2, even if it takes 1000t time to send the first message from A1 to A2, and only 10t+t time = 11t time to send a message, from A1 to A3, and from A3 to A2 -- the plan is to guarantee ordering of messages m1, m3 at A2 (delaying m1 at least 989t)?

saryant · on May 4, 2015

How will Pony achieve this in a distributed context? Isn't this essentially asking for exactly-once delivery?

jlouis · on May 4, 2015

How do you intend to achieve this?

jlouis · on May 3, 2015

I suspected as much. It is a pretty good guarantee. Have you found any use for it yet in programs? That is, a program where the causal message ordering is preferred and removes code from the program?

Matthias247 · on May 4, 2015

That's just what I was thinking about: Where would this be useful?

On the other hand I could imagine that this could also cause problems: If you have a large program, could then this causality cause some messages to be delayed for a (too) long and not directly visible time in order to achieve the causality guarantee? But the claim was that there is no runtime overhead.

kaeluka · on May 4, 2015

Use: refactoring! I imagine:

You have an actor A13 that communicates with actor A2. The guarantee allows you to break up A13 into the two actors A1 and A3 that now both communicate with A2 and be done with it.

In erlang, you might not've done so, because you know nothing about message ordering in non-tree-shaped topologies.

xjia · on May 3, 2015

Any research paper or documentation on "That guarantee is achieved with no runtime cost"?

lostcolony · on May 3, 2015

Especially if they're trying to include distribution (per a comment elsewhere). Pretty sure you can't have linearalizable messages between nodes without additional runtime cost compared to the non-linearalized case.

AnkhMorporkian · on May 4, 2015

Distribution probably (almost definitely) kills the runtime cost, I can't imagine how it couldn't. I'm much more interested in the causality guarantee in the context of the distributed systems.

lostcolony · on May 5, 2015

Oh, agreed. But it's doable, provided you don't mind losing availability (which is...an interesting choice for a programming language), and you don't mind the runtime cost.

But I agree, exactly what guarantees they're trying to provide (and is it opt in?) is an interesting question.

kaeluka · on May 4, 2015

I found: Clebsch, S., & Drossopoulou, S. (2013). Fully Concurrent Garbage Collection of Actors on Many-core Machines. ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, 553–570. http://doi.org/10.1145/2509136.2509557

See p558, "Example 3"