Lies I was told about collaborative editing, Part 2: Why we don't use Yjs
https://www.moment.dev/blog/lies-i-was-told-pt-2I’ve spent 3+ years fighting the same problems while building DocNode and DocSync, two libraries that do exactly what you describe.
DocSync is a client-server library that synchronizes documents of any type (Yjs, Loro, Automerge, DocNode) while guaranteeing that all clients apply operations in the same order. It’s a lot more than 40 lines because it handles many things beyond what’s described here. For example:
It’s local-first, which means you have to handle race conditions.
Multi-tab synchronization works via BroadcastChannel even offline, which is another source of race conditions that needs to be controlled.
DocNode is an alternative to Yjs, but with all the simplicity that comes from assuming a central server. No tombstones, no metadata, no vector clock diffing, supports move operations, etc.
I think you might find them interesting. Take a look at https://docukit.dev and let me know what you think.
And for real "action" there should be a delay/pause button to simulate conflicts like the ones described in the blog
The feedback about the delay/pause button is also good, thanks!
BUT, since you mention it, I'll say a bit here. It sounds like you have your own experience, and we'd love to hear about that. But OUR experience was: (1) we found (contrary to popular belief) that OT actually does not require a centralized server, (2) we found it to be harder to implement OT exactly right vs CRDTs, and (3) we found many (though not all) of the problems that CRDTs have, are also problems in practice for OT—although in fairness to OT, we think the problems CRDTs have in general are vastly worse to the end-user experience.
If there's interest I'm happy to write a similar article entirely dedicated to OT. But, for (3), as intuition, we found a lot of the problems that both CRDTs and OT have seem to arise from a fundamental impedance mismatch between the in-memory representation of the state of a modern editor, and the representation that is actually synchronized. That is, when you apply an op (CRDT) or a transform (OT), you have to transform the change into a (to use ProseMirror as an example) valid `Transaction` on an `EditorState`. This is not always easy in either case, and to do it right you might have to think very hard about things like "how to preserve position mappings," and other parts of editor state that are crucial to (say) plugins that manage locations of comment marks or presence cursors.
With all of that said, OT is definitely much closer to what modern editors need, in my opinion at least. The less-well-known algorithm we ended up recommending here (which I will call "Marjin Collab", after its author) is essentially a very lightweight OT, without the "transformation" step.
https://svn.apache.org/repos/asf/incubator/wave/whitepapers/...
> This is not always easy in either case, and to do it right you might have to think very hard about things like "how to preserve position mappings," and other parts of editor state that are crucial to (say) plugins that manage locations of comment marks or presence cursors.
Maintaining text editor state is normal. Yes you do need to convert the OT messages into whatever diff format your editor requires (and back), but that's the standard glue code.
The nice thing about OT is that you can just feed the positions of marks into the OT algorithm to get the new positional value. Worst case, you just have the server send the server side position when sending the OT event and the client just displays the server side position.
If you are using a centralized server and ProseMirror, there are several OT and pseudo-OT implementations. Most popularly, there is prosemirror-collab[4], which is basically "OT without the stuff you don't need with an authoritative source for documents." Practically speaking that means "OT without T", but because it does not transform the ops to be order-independent, it has an extra step on conflict where the user has to rebase changes and re-submit. This is can cause minor edit starvation of less-connected clients. prosemirror-collab-commit[5] fixes this by performing the rebasing on the server... so it's still "OT without the T", but also with an authoritative conflict resolution pseudo-T at the end. I personally recommend prosemirror-collab-commit, it's what we use, and it's extremely fast and predictable.
If you just want something pedogocically helpful, the blessed upstream collaborative editing solution for CodeMirror is OT. See author's blog post[1], the @codemirror/collab package[2], and the live demo[3]. In general this implementation is quite good and worth reading if you are interested in this kind of thing. ShareJS and OTTypes are both very readable and very good, although we found them very challenging to adopt in a real-world ProseMirror-based editor.
[1]: https://marijnhaverbeke.nl/blog/collaborative-editing-cm.htm...
[2]: https://codemirror.net/docs/ref/#collab
[3]: https://codemirror.net/examples/collab/
[4]: https://github.com/ProseMirror/prosemirror-collab
[5]: https://github.com/stepwisehq/prosemirror-collab-commit
DocSync, which is the sync engine mainly designed with DocNode in mind, I would say is a bit less mature.
I’d love it if you could take a look and see if there’s anything that doesn’t convince you. I’ll be happy to answer any questions.
That is one hot take!
It is very true that there are nuances you have to deal with when using CRDT toolkits like Yjs and Automerge - the merged state is "correct" as a structure, but may not match your scheme. You have to deal with that into your application (Prosemirror does this for you, if you want it, and can live with the invalid nodes being removed)
You can't have your cake and eat it with CRDTs, just as you can't with OT. Both come with compromises and complexities. Your job as a developer is to weigh them for the use case you are designing for.
One area in particular that I feel CRDTs may really shine is in agentic systems. The ability to fork+merge at will is incredibly important for async long running tasks. You can validate the state after an agent has worked, and then decide to merge to main or not. Long running forks are more complex to achieve with OT.
There is some good content in this post, but it's leaning a little too far towards drama creation for my tast.
In theory you can write better bindings yourself. In practice, if the official path falls over under normal editing, telling people to just do more integration work sounds a lot like moving the goalposts.
I think it's defensible to say that this point in particular is not indicting CRDTs in general because I do say the authors are trying to fix it, and then I link to the (unpublicized) first PR in that chain of work (which very few people know about!), and I specifically spend a whole paragraph saying I hope that I a forced to write an article in a year about how they figured it all out! If I was trying to be disingenuous, why do any of that?
EDIT: I live in Seattle and it is 12:34, so I must go to bed soon. But I will wake up and respond to comments first thing in the morning!
There seems to be a conflict of interest with describing Yjs's performance, which basically does the same thing along with Automerge.
But the product seems much more narrow than an actual tool run the whole business in markdown. I was hoping to see Logseq on steroids, and it feels like a tool builder primarily. I love the tool building aspect, but the fundamentals of simply organizing docs (docs, presentations, assets etc, the basics of a business) are either not part of the core offering or not presented well at all.
I love the idea of building custom tools on top of MD and it's part of my wishlist, but I feel little deceived by your tagline so I wanted to share that :)
If you are open to it, I'd love the opportunity to hear more. Here or email (alex@moment.dev) or our Discord (bottom right of our website) or Twitter/X... or whatever you prefer.
The logic that makes sense is you are using your own framing (Moment.dev will later be paid and people will be customers) to interpret Yjs.
Moreover, the 'social proof' posted by the following later on by 'auggierose' and 'skeptrune': - https://news.ycombinator.com/item?id=47396154 - https://news.ycombinator.com/item?id=47396139
Appears, to me, to be manufactured. The degree of consolidation in this 'SF/Bay Area tech cult' which I've noticed, although I am unsure if others are aware, that tries to help other members at the expense of quality, growing network wealth through favoritism rather than adherence to quality, is counterpoint to users whose interest is high quality software without capture.
While you may not like me describing this, it is not in your own interest to do this because it catabolizes the base layer that would sustain you. Social media catabolizes actual social networks, as AI catabolizes those who write information online. Behavior like this ruins the public commons over time.
It's hard to tell, but I think you also might be saying that criticizing the FOSS foundations of our product actually hurts the ecosystem. I actually am very open to that, and it's why we took so much time writing it since part 1 came out. But the Yjs-alternative technology we use is all also F/OSS, and we also do directly support it, with actual money from our actual bank account. All I'm recommending here is that others do the same. Sorry if that was not clear.
The rest of your reply, I'm not sure I grok. I think you might be suggesting that we are sock-puppeting `auggierose` or `skeptrune`, and that we are part of some (as you put it) "cult" of the Bay area! Let me be clear that neither of these things true. I don't know anyone at Mintlify personally, and in any event we are from Seattle not the Bay!
const result = step.apply(this.doc);
if (result.failed) return false;
I suspect this doesn't work.In general, the client implementation of collab is pretty simple. Nearly all of the subtlety lies in the server. But it, too, is generally not a lot of code, see for example the author's implementation: https://github.com/ProseMirror/website/tree/master/src/colla...
But there's another issue that the author hasn't even considered, and possibly it's the root cause why the prosemirrror (which I'd never heard of before btw) does the thing the author thinks is broken... Say you have a document like "请来 means 'please go'" and independently both the Chinese and English collaborators look at that and realise it's wrong. One changes it to "请走 means 'please go'" and the other changes it to "请来 means 'please come'". Those changes are in different spans, and so a merge would blindly accept both resulting in "请走 means 'please come'" which is entirely different from the original, but just as incorrect. Depending on how much other interaction the authors have, this could end up in a back and forth of both repeatedly changing it so the merged document always ended up incorrect, even though individually both authors had made valid corrections.
That example seems a bit hypothetical, but I've experienced the same thing in software development where two BAs had created slightly incompatible documents stating how some functionality should work. One QA guy kept raising bugs saying "the spec says it should do X", the dev would check the cited spec and change the code to match the spec. Weeks later, a different QA guy with a different spec would raise a bug saying "why is this doing X? The spec says it should do Y", a different dev read the cited spec, and changed the code. In this case, the functionality flip-flopped about 10 times over the course of a year and it was only a random conversation one day where one of them complained about a bug they'd fixed many times and the other guy said "hey, that bug sounds familiar" and they realised they were the two who'd been changing the code back and forth.
This whole topic is interesting to me, because I'm essentially solving the same problem in a different context. I've used CRDT so far, but only for somewhat limited state where conflicts can be resolved. I'm now moving to a note-editing section of the app, and while there is only one primary author, their state might be on multiple devices and because offline is important to me, they might not always be in sync. I think I'm probably going to end up highlighting conflicts, I'm not sure. I might end up just re-implementing something akin to Quill's system of inserts / deletes.
I also tried out the behaviour of their example. Slowing the sync time down to 3 seconds, and then typing "Why not" and then waiting for it to sync before adding " do this?" on client A and " joke?" on client B. The result was "Why not do this? joke?" when I'd have hoped that this would have been flagged as a conflict. Similarly, starting with "Why not?" and adding both " do this" and " joke" in the different clients produced "Why not do this joke?" even though to me, that should have been a conflict - both were inserting different content between "t" and "?".
Finally, changing "do" to "say" in client A and THEN changing "do" to "read" in client B before it updated, actually resulted in a conflict in the log window and the resultant merge was "Why not rayead this joke?" Clearly this merge strategy isn't that great here, as it doesn't seem to be renumbering the version numbers based on the losing side (or I've misunderstood what they're actually doing).
EDIT: I will say I'm not against AI writing tools or anything like that. But, for better or worse, that's just not what happened here.