Hacker News

> The problem is, half the point (and the difficulty) of a crdt is constructing it so incomplete combinations of state updates kinda still make sense.

The real problem is that we're trying to solve this at the wrong level of abstraction.

I think CRDTs can be reformulated as a subset of a larger replicated data type model that explicitly includes conflicts as part of the model. In that model, any chosen set of transformations can be programmatically projected into a corresponding set of conflicts; the resolutions to those conflicts can be pre-packaged and composable; some of those resolutions may need to be 'add a new transformation that chooses between the conflicts created by two other conflicting transformations' (basically exposing the problem up into the model of the data type); and a composition of chosen resolutions can be proved to be fully conflict-free.

The biggest benefits of this approach are: 1. It enables modeling domains where conflicts are an unavoidable part of the domain itself. 2. It allows tuning an initially pessimistic model, where all conflicts are exposed to the user, by gradually adjusting the conflict resolution choices so that it becomes more conflict-free over time, especially by composing known conflict-free subsets of the model into larger conflict-free subsets.
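
To make the "start pessimistic, tune toward conflict-free" idea concrete, here is a rough Python sketch. Everything in it is hypothetical (the `ConflictKind` names, `Model`, `last_writer_wins`), and it reads "conflict-free" in the weakest possible way: every kind of conflict the model can produce has a chosen automatic resolution. A real proof would also have to show the resolutions themselves converge; this only tracks coverage.

```python
from enum import Enum, auto
from typing import Callable, Dict, Set, Tuple

# Hypothetical conflict kinds that a set of transformations projects into.
class ConflictKind(Enum):
    CONCURRENT_SET = auto()
    EDIT_VS_DELETE = auto()
    PAID_VS_AMOUNT_CHANGED = auto()

Entry = Tuple[int, str]                      # (timestamp, value), assumed shape
Resolver = Callable[[Entry, Entry], Entry]

def last_writer_wins(a: Entry, b: Entry) -> Entry:
    # A pre-packaged resolution: highest timestamp wins, ties broken by value,
    # so the result does not depend on argument order.
    return max(a, b)

class Model:
    """Starts fully pessimistic: no resolvers, every conflict is exposed."""

    def __init__(self) -> None:
        self.resolvers: Dict[ConflictKind, Resolver] = {}

    def choose(self, kind: ConflictKind, resolver: Resolver) -> None:
        self.resolvers[kind] = resolver

    def exposed(self) -> Set[ConflictKind]:
        # Conflict kinds that would still be surfaced to the user.
        return {k for k in ConflictKind if k not in self.resolvers}

    def is_conflict_free(self) -> bool:
        return not self.exposed()
```

Calling `choose` for one kind at a time shrinks `exposed()`, and anything left in it is a conflict the domain has decided to keep visible.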



Author here. This looks like a very useful perspective to have! I.e., CRDTs with unresolved conflicts as part of the model, with composable properties.

Can you talk a bit more on this stuff? Like with a basic example maybe? Thanks!


Thanks for considering! I've been watching the CRDT space for the past couple years and every article, talk, and thread makes me more convinced that we need a broader model and a layer of abstraction to make this work accessible to a larger pool of programmers. I think the comment above is the most precise version of this claim that I've made so far.

One nice example is from a prior thread where I aim at the same idea ( https://news.ycombinator.com/item?id=28717848#28724977 ):

> edit A sets an invoice status to paid with a payment of 100, edit B changes the invoice amount from 100 to 120

Here, the merged result should be a conflict state that needs additional attention to resolve. Maybe this happens once and the resolution is to call the customer to work it out, or maybe this happens a lot and there's a whole conflict resolution tree that applies based on the details. This kind of conflict is not possible to "eliminate by model"; its presence is integral to the problem domain itself.
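
A minimal Python sketch of what "the merged result is a conflict state" could look like for this invoice example. The `Invoice` and `Conflict` types, the field names, and the `merge` function are all assumptions made up for illustration, not an existing CRDT library's API.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Invoice:
    amount: int
    paid: int
    status: str  # "open" or "paid" (hypothetical values)

@dataclass(frozen=True)
class Conflict:
    """A first-class merged state: both edits are kept, nothing is dropped."""
    base: Invoice
    edit_a: Invoice
    edit_b: Invoice
    reason: str

def merge(base: Invoice, a: Invoice, b: Invoice) -> Union[Invoice, Conflict]:
    merged = {}
    for field in ("amount", "paid", "status"):
        v0, va, vb = getattr(base, field), getattr(a, field), getattr(b, field)
        if va != v0 and vb != v0 and va != vb:
            return Conflict(base, a, b, f"both edits changed {field}")
        # Take whichever side changed the field (or the unchanged value).
        merged[field] = va if va != v0 else vb
    candidate = Invoice(**merged)
    # The edits touch different fields, yet together they break a domain rule.
    if candidate.status == "paid" and candidate.paid < candidate.amount:
        return Conflict(base, a, b, "marked paid but payment is less than amount")
    return candidate

base = Invoice(amount=100, paid=0, status="open")
edit_a = Invoice(amount=100, paid=100, status="paid")  # edit A
edit_b = Invoice(amount=120, paid=0, status="open")    # edit B
result = merge(base, edit_a, edit_b)                   # a Conflict, not an Invoice
```

The point is that a plain field-wise merge would happily produce "paid, amount 120, payment 100"; the conflict only shows up once the domain rule is part of the merge.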

But even if there is a conflict, we should still be able to represent it in our model and apply one of a number of pre-packaged resolutions. For example, if the invoice amount changed due to adding a new item to the order, then the resolution could branch with a choice: 1. split the item into a separate order, 2. remove the item and notify the person that made the change to the order, 3. set the invoice status to partially-paid and notify the customer of the remaining balance and the reason why it changed. If instead the price changed, then the resolutions could branch differently: 1. backdate the order to before the price change, 2. give the customer a discount, 3. set the invoice to partially-paid and notify the customer of the remaining balance, etc.
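
The branching above could be sketched as a table of pre-packaged resolutions keyed by the cause of the conflict. This is only an illustration in Python: the cause labels, the `Outcome` shape, and the function names are all invented here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass(frozen=True)
class PaidVsAmountConflict:
    paid: int
    new_amount: int
    cause: str  # "item_added" or "price_changed" (hypothetical labels)

@dataclass(frozen=True)
class Outcome:
    status: str
    balance_due: int
    actions: Tuple[str, ...]

Resolution = Callable[[PaidVsAmountConflict], Outcome]

def split_order(c: PaidVsAmountConflict) -> Outcome:
    return Outcome("paid", 0, ("move the added item to a separate order",))

def remove_item(c: PaidVsAmountConflict) -> Outcome:
    return Outcome("paid", 0, ("remove the added item", "notify the editor"))

def backdate_order(c: PaidVsAmountConflict) -> Outcome:
    return Outcome("paid", 0, ("backdate the order to before the price change",))

def give_discount(c: PaidVsAmountConflict) -> Outcome:
    return Outcome("paid", 0, (f"discount {c.new_amount - c.paid}",))

def partially_paid(c: PaidVsAmountConflict) -> Outcome:
    due = c.new_amount - c.paid
    return Outcome("partially-paid", due, ("notify the customer of the balance",))

# Which resolutions are offered depends on why the amount changed.
RESOLUTIONS: Dict[str, List[Resolution]] = {
    "item_added": [split_order, remove_item, partially_paid],
    "price_changed": [backdate_order, give_discount, partially_paid],
}

def options(conflict: PaidVsAmountConflict) -> List[Resolution]:
    return RESOLUTIONS[conflict.cause]
```

Note that `partially_paid` appears in both branches, which is the kind of reuse "pre-packaged and composable" is pointing at.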

Note that these choices are concerned with business processes and customer relationships, which have no "right" answer and could change business to business, customer to customer, or even order to order. We can still get desirable CRDT-like high-level guarantees out of this system (like "every invoice is paid in full or billed for the remaining balance") while modeling conflicts in it. Another benefit: two customers could request different default conflict resolutions; maybe one always wants to split into a new order and the other always wants to bill the added balance. With the conflict included as part of the model, interactions with both of these customers could be represented in the same system with different custom resolution strategies.
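
As a rough sketch of per-customer defaults plus the high-level guarantee, in Python. The customer names, the `Resolved` shape, and `paid_or_billed` are assumptions for illustration; the check is a simplified reading of "paid in full or billed for the remaining balance".

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

@dataclass(frozen=True)
class Conflict:
    customer: str
    paid: int
    old_amount: int
    new_amount: int

@dataclass(frozen=True)
class Resolved:
    invoice_amount: int
    paid: int
    billed: int                # remaining balance billed to the customer
    new_order_amount: int = 0  # value moved to a separate order, if any

def split_into_new_order(c: Conflict) -> Resolved:
    return Resolved(c.old_amount, c.paid, 0, c.new_amount - c.old_amount)

def bill_added_balance(c: Conflict) -> Resolved:
    return Resolved(c.new_amount, c.paid, c.new_amount - c.paid)

# Hypothetical per-customer default strategies.
DEFAULTS: Dict[str, Callable[[Conflict], Resolved]] = {
    "customer-1": split_into_new_order,
    "customer-2": bill_added_balance,
}

def resolve(c: Conflict) -> Union[Resolved, Conflict]:
    strategy = DEFAULTS.get(c.customer)
    # With no default chosen, the conflict stays in the model for a person to handle.
    return strategy(c) if strategy else c

def paid_or_billed(r: Resolved) -> bool:
    # "Every invoice is paid in full or billed for the remaining balance."
    return r.paid + r.billed == r.invoice_amount
```

Both strategies satisfy `paid_or_billed` even though they leave the invoice in different states, so the guarantee holds across customers without forcing one resolution on everyone.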

I'm not sure how to continue advancing this idea, or if I've missed something important that invalidates it. I have a feeling that this could be layered on top of existing CRDT work with a rich enough data model. My email is in my profile if you want to talk more. :)



