Lies I was told about collaborative editing, Part 2: Why we don't use Yjs

https://www.moment.dev/blog/lies-i-was-told-pt-2

136antics | 3 days ago | 73 | HN

loading story #47400641

I remember reading Part 1 back in the day, and this is also an excellent article.

I’ve spent 3+ years fighting the same problems while building DocNode and DocSync, two libraries that do exactly what you describe.

DocSync is a client-server library that synchronizes documents of any type (Yjs, Loro, Automerge, DocNode) while guaranteeing that all clients apply operations in the same order. It’s a lot more than 40 lines because it handles many things beyond what’s described here. For example:

It’s local-first, which means you have to handle race conditions.

Multi-tab synchronization works via BroadcastChannel even offline, which is another source of race conditions that needs to be controlled.

DocNode is an alternative to Yjs, but with all the simplicity that comes from assuming a central server. No tombstones, no metadata, no vector clock diffing, supports move operations, etc.

I think you might find them interesting. Take a look at https://docukit.dev and let me know what you think.

antics10 hours ago | parent | next

Hello again Germán! Since the product we make is, basically, a local-first markdown file editor, I would humbly suggest that the less-well-known algorithm we recommend is thus also local-first. But, I fully believe that you do a ton of stuff that we don't, and if we had known about it at the time, we very definitely would have taken a close look! We did not set out to do this ourselves, it just kind of ended up that way.

freekh11 hours ago | parent | next

Cool! We also build client-server sync for our local-first CMS: https://github.com/valbuild/val Just as your docsync, it has to both guarantee order and sync to multiple types of servers (your own computer for local dev, cloud service in prod). Base format is rfc 6902 json patches. Read the spec sheet and it is very similar :)

huksley8 hours ago | root | parent

Looks really cool, I would love to use it in my DollarDeploy project. Documentation could be a bit better still, it is not clear, are content is pure markdown or it is typescript files? Which GitHub repo it synchronizes to? I prefer monorepo approach.

eviks10 hours ago | parent

Tiny fail at undo: insert 1 before E, Ctlr+Z, move left/right: left editor moves around E, right editor moves around the nonexistent 1

And for real "action" there should be a delay/pause button to simulate conflicts like the ones described in the blog

GermanJablo8 hours ago | root | parent

Yes, the undo issue is a known bug in the website demo because it's messing with Lexical's undo functionality. It's not actually a DocNode bug. I'll fix it soon.

The feedback about the delay/pause button is also good, thanks!

samlinnfer12 hours ago | parent | next

Just use OT like normal people, it’s been proven to work. No tombstones, no infinite storage requirements or forced “compaction”, fairly easy to debug, algorithm is moderate to complex but there are reference open source implementations to cross check against. You need a server for OT but you’re always going to have a server anyway, one extra websocket won’t hurt you. We regularly have 30-50k websockets connected at a time. CRDTs are a meme and are not for serious applications.

antics11 hours ago | parent | next

Author here, I did not specifically mention OT in the article, since our main focus was to help people understand the downsides of the currently-most-popular system, which is built on CRDTs.

BUT, since you mention it, I'll say a bit here. It sounds like you have your own experience, and we'd love to hear about that. But OUR experience was: (1) we found (contrary to popular belief) that OT actually does not require a centralized server, (2) we found it to be harder to implement OT exactly right vs CRDTs, and (3) we found many (though not all) of the problems that CRDTs have, are also problems in practice for OT—although in fairness to OT, we think the problems CRDTs have in general are vastly worse to the end-user experience.

If there's interest I'm happy to write a similar article entirely dedicated to OT. But, for (3), as intuition, we found a lot of the problems that both CRDTs and OT have seem to arise from a fundamental impedance mismatch between the in-memory representation of the state of a modern editor, and the representation that is actually synchronized. That is, when you apply an op (CRDT) or a transform (OT), you have to transform the change into a (to use ProseMirror as an example) valid `Transaction` on an `EditorState`. This is not always easy in either case, and to do it right you might have to think very hard about things like "how to preserve position mappings," and other parts of editor state that are crucial to (say) plugins that manage locations of comment marks or presence cursors.

With all of that said, OT is definitely much closer to what modern editors need, in my opinion at least. The less-well-known algorithm we ended up recommending here (which I will call "Marjin Collab", after its author) is essentially a very lightweight OT, without the "transformation" step.

loading story #47397641

samlinnfer9 hours ago | root | parent | next

Having a central server is not necessary, but we have one anyway and we use it, especially if you have a permissions system. It lets us use the "Google wave" algorithm which vastly simplifies things.

https://svn.apache.org/repos/asf/incubator/wave/whitepapers/...

> This is not always easy in either case, and to do it right you might have to think very hard about things like "how to preserve position mappings," and other parts of editor state that are crucial to (say) plugins that manage locations of comment marks or presence cursors.

Maintaining text editor state is normal. Yes you do need to convert the OT messages into whatever diff format your editor requires (and back), but that's the standard glue code.

The nice thing about OT is that you can just feed the positions of marks into the OT algorithm to get the new positional value. Worst case, you just have the server send the server side position when sending the OT event and the client just displays the server side position.

DonHopkins9 hours ago | root | parent

Josh eloquently explains how Google Wave's DACP (Distributed Application Canceling Protocol) works:

https://www.youtube.com/watch?v=4Z4RKRLaSug

ianhorn8 hours ago | root | parent

I always mentally slotted prosemirror-collab/your recommended solution in the OT category. What’s the difference between the “rebase” step and the “transformation” step you’re saying it doesn’t need?

MrJohz10 hours ago | parent | next

Are there any major libraries for OT? I've been looking into this recently for a project at work, and OT would be completely sufficient for our use case, and does look simpler overall, but from what I could tell, we'd need to write a lot of stuff ourselves. The only vaguely active-looking project in JS at least seems to be DocNode (https://www.docukit.dev/docnode), and that looks very cool but also very early days.

antics9 hours ago | root | parent | next

Author here. I think it depends what you're doing! OT is a true distributed systems algorithm and to my knowledge there are no projects that implement true, distributed OT with strong support for modern rich text editor SDKs like ProseMirror. ShareJS, for example, is abandoned, and predates most modern editors.

If you are using a centralized server and ProseMirror, there are several OT and pseudo-OT implementations. Most popularly, there is prosemirror-collab[4], which is basically "OT without the stuff you don't need with an authoritative source for documents." Practically speaking that means "OT without T", but because it does not transform the ops to be order-independent, it has an extra step on conflict where the user has to rebase changes and re-submit. This is can cause minor edit starvation of less-connected clients. prosemirror-collab-commit[5] fixes this by performing the rebasing on the server... so it's still "OT without the T", but also with an authoritative conflict resolution pseudo-T at the end. I personally recommend prosemirror-collab-commit, it's what we use, and it's extremely fast and predictable.

If you just want something pedogocically helpful, the blessed upstream collaborative editing solution for CodeMirror is OT. See author's blog post[1], the @codemirror/collab package[2], and the live demo[3]. In general this implementation is quite good and worth reading if you are interested in this kind of thing. ShareJS and OTTypes are both very readable and very good, although we found them very challenging to adopt in a real-world ProseMirror-based editor.

[1]: https://marijnhaverbeke.nl/blog/collaborative-editing-cm.htm...

[2]: https://codemirror.net/docs/ref/#collab

[3]: https://codemirror.net/examples/collab/

[4]: https://github.com/ProseMirror/prosemirror-collab

[5]: https://github.com/stepwisehq/prosemirror-collab-commit

loading story #47397493

GermanJablo8 hours ago | root | parent | next

Author of DocNode here. Yes, it’s still early days. But it’s a very robust library that I don’t expect will go through many breaking changes. It has been developed privately for over 2 years and has 100% test coverage. Additionally, each test uses a wrapper to validate things like operation reversibility, consistency across different replicas, etc.

DocSync, which is the sync engine mainly designed with DocNode in mind, I would say is a bit less mature.

I’d love it if you could take a look and see if there’s anything that doesn’t convince you. I’ll be happy to answer any questions.

loading story #47397466

samlinnfer9 hours ago | root | parent

https://github.com/josephg/sharejs

https://github.com/FirebaseExtended/firepad

https://github.com/Operational-Transformation/ot.js

https://github.com/ottypes/docs

Really nice demo: https://operational-transformation.github.io

loading story #47397324

chrisweekly11 hours ago | parent

"CRDTs are a meme and are not for serious applications."

That is one hot take!

argee11 hours ago | root | parent

Let's balance the discussion a bit.

https://josephg.com/blog/crdts-are-the-future/

loading story #47397925

loading story #47398769

loading story #47399426

auggierose10 hours ago | parent | next

And let's not forget that the official paper on Yjs is just plain wrong, the "proofs" it contains are circular. They look nice, but they are wrong.

breakingcups9 hours ago | parent

Could you elaborate on that or share a source? It sounds like it'd be not just interesting but important to learn.

loading story #47397408

samwillis11 hours ago | parent | next

It's disingenuous to suggest that "Yjs will completely destroy and re-create the entire document on every single keystroke" and that this is "by design" of Yjs. This is a design limitation of the official y-Prosemirror bindings that are integrating two distinct (and complex) projects. The post is implying that this is a flaw in the core Yjs library and an issue with CRDTs as a whole. This is not the case.

It is very true that there are nuances you have to deal with when using CRDT toolkits like Yjs and Automerge - the merged state is "correct" as a structure, but may not match your scheme. You have to deal with that into your application (Prosemirror does this for you, if you want it, and can live with the invalid nodes being removed)

You can't have your cake and eat it with CRDTs, just as you can't with OT. Both come with compromises and complexities. Your job as a developer is to weigh them for the use case you are designing for.

One area in particular that I feel CRDTs may really shine is in agentic systems. The ability to fork+merge at will is incredibly important for async long running tasks. You can validate the state after an agent has worked, and then decide to merge to main or not. Long running forks are more complex to achieve with OT.

There is some good content in this post, but it's leaning a little too far towards drama creation for my tast.

hrmtst938378 hours ago | parent | next

You can split CRDT libs and compose them however you want, but most teams never get past the blessed bindings, because stitching two moving targets together by hand is miserable even if you know both codebases. Then you're chasing a perf cliff and weird state glitches every time one side revs.

In theory you can write better bindings yourself. In practice, if the official path falls over under normal editing, telling people to just do more integration work sounds a lot like moving the goalposts.

11 hours ago | parent | next

{"deleted":true,"id":47395895,"parent":47395826,"time":1773644719,"type":"comment"}

antics11 hours ago | parent

Author here, sorry if this was not clear: that specific point was not supposed to be an indictment of all CRDTs, it was supposed to be much more narrow. Specifically, the Yjs authors clearly state that they purposefully designed its interface to ProseMirror to delete and recreate the entire document on every collab keystroke, and the fact that it stayed open for 6 YEARS before they started to try to fix it, does in my opinion indicate a fundamental misunderstanding of what modern text editors need to behave well in any situation. Not even a collaborative one. Just any situation at all.

I think it's defensible to say that this point in particular is not indicting CRDTs in general because I do say the authors are trying to fix it, and then I link to the (unpublicized) first PR in that chain of work (which very few people know about!), and I specifically spend a whole paragraph saying I hope that I a forced to write an article in a year about how they figured it all out! If I was trying to be disingenuous, why do any of that?

loading story #47398491

antics11 hours ago | parent | next

Hi folks, author here. I thought this was dead! I'm here to answer questions if you have them.

EDIT: I live in Seattle and it is 12:34, so I must go to bed soon. But I will wake up and respond to comments first thing in the morning!

drpotato11 hours ago | parent

Just wanted to say thanks! This is a great write up and resonates with issues I encountered when trying to productionise a yjs backed feature.

loading story #47397890

dsnr7 hours ago | parent | next

It should be noted that this is about text editing specifically, and for other use-cases YJS is using other code pathways/algorithms, but you have to be careful how you design your data structure for atomic updates.

anentropic8 hours ago | parent | next

I'm curious how these approaches compare with MRDTs implemented in Irmin

https://gowthamk.github.io/docs/mrdt.pdf

skeptrune10 hours ago | parent | next

we're about to implement collaborative editing at Mintlify and were considering yjs so this couldn't have come at a better time

antics10 hours ago | parent

Author here, my personal mission is for people implementing this to have clear, actionable advice. Which is something we did not when we started. If you want to chat about it I'm happy to help, just email me: clemmer.alexander@gmail.com

presspot3 days ago | parent | next

Replacing CRDT with 40 lines of code. Amazing.

kaiwenwang12 hours ago | parent | next

It appears Moment is producing "high-performance, collaborative, truly-offline-capable, fully-programmable document editor" - https://www.moment.dev/blog

There seems to be a conflict of interest with describing Yjs's performance, which basically does the same thing along with Automerge.

antics11 hours ago | parent

Author here. To be clear, we do not in ANY WAY compete with Yjs! We are a potential customer of Yjs. This article explains why we chose not to be a customer of Yjs, and why we don't think most people building real-time collaborative text editors should be, either.

crashabr10 hours ago | root | parent | next

You have an amazing tagline. This is the first time I read a tagline and thought: this is exactly what I was looking for.

But the product seems much more narrow than an actual tool run the whole business in markdown. I was hoping to see Logseq on steroids, and it feels like a tool builder primarily. I love the tool building aspect, but the fundamentals of simply organizing docs (docs, presentations, assets etc, the basics of a business) are either not part of the core offering or not presented well at all.

I love the idea of building custom tools on top of MD and it's part of my wishlist, but I feel little deceived by your tagline so I wanted to share that :)

antics9 hours ago | root | parent

This is great feedback, thank you. I will say that IS our goal... but we only really launched last week and are still figuring out what resonates with people and what they really want! It sounds like you're saying that the organization aspects are not there, which is very helpful to know... I am not quite sure I understand if you also think the toolbuilding is lacking?

If you are open to it, I'd love the opportunity to hear more. Here or email (alex@moment.dev) or our Discord (bottom right of our website) or Twitter/X... or whatever you prefer.

kaiwenwang10 hours ago | root | parent

That doesn't make sense. If you are a customer that implies you pay for it, so people can be users of Yjs which is free and open-source, but not customers.

The logic that makes sense is you are using your own framing (Moment.dev will later be paid and people will be customers) to interpret Yjs.

Moreover, the 'social proof' posted by the following later on by 'auggierose' and 'skeptrune': - https://news.ycombinator.com/item?id=47396154 - https://news.ycombinator.com/item?id=47396139

Appears, to me, to be manufactured. The degree of consolidation in this 'SF/Bay Area tech cult' which I've noticed, although I am unsure if others are aware, that tries to help other members at the expense of quality, growing network wealth through favoritism rather than adherence to quality, is counterpoint to users whose interest is high quality software without capture.

While you may not like me describing this, it is not in your own interest to do this because it catabolizes the base layer that would sustain you. Social media catabolizes actual social networks, as AI catabolizes those who write information online. Behavior like this ruins the public commons over time.

antics10 hours ago | root | parent

I'm not sure I fully understand, but to be clear, we actually do voluntarily pay for the Free and OSS software we use. For example, we support `react-prosemirror` directly with monetary compensation. And if we used Yjs, we would have paid for that too. So in that sense, I do think of us as customers!

It's hard to tell, but I think you also might be saying that criticizing the FOSS foundations of our product actually hurts the ecosystem. I actually am very open to that, and it's why we took so much time writing it since part 1 came out. But the Yjs-alternative technology we use is all also F/OSS, and we also do directly support it, with actual money from our actual bank account. All I'm recommending here is that others do the same. Sorry if that was not clear.

The rest of your reply, I'm not sure I grok. I think you might be suggesting that we are sock-puppeting `auggierose` or `skeptrune`, and that we are part of some (as you put it) "cult" of the Bay area! Let me be clear that neither of these things true. I don't know anyone at Mintlify personally, and in any event we are from Seattle not the Bay!

loading story #47397689

bawolff12 hours ago | parent | next

Reminds me a bit of google-mobwrite. I wonder why that fell out of favour.

minikomi8 hours ago | parent | next

Component library page in the docs gives 404

loading story #47397569

loading story #47398059

lostmsu9 hours ago | parent | next

From the "40 line CRDT replacement":

    const result = step.apply(this.doc);
    if (result.failed) return false;

I suspect this doesn't work.

antics9 hours ago | parent

Author here. I'll actually defend this. Most of the subtlety of this part is actually in document schema version mismatches, and you'd handle that at client connect, generally, since we want the server to dictate the schema version you're using.

In general, the client implementation of collab is pretty simple. Nearly all of the subtlety lies in the server. But it, too, is generally not a lot of code, see for example the author's implementation: https://github.com/ProseMirror/website/tree/master/src/colla...

ralferoo7 hours ago | parent | next

I just read part 1 as well as part 2, for me it raises an interesting question that wasn't addressed. I correctly guessed the question posed about the result of the conflict, and while it's true that's not the end result I'd probably want, it's also important because it gives me visibility of the other user's change. Both users know exactly what the other did - one deleted everything, the other added a u. If you end up with an empty document, the deleting user doesn't know about the spelling correction that may need to be re-applied elsewhere. Perhaps they just cut and pasted that section elsewhere in the document.

But there's another issue that the author hasn't even considered, and possibly it's the root cause why the prosemirrror (which I'd never heard of before btw) does the thing the author thinks is broken... Say you have a document like "请来 means 'please go'" and independently both the Chinese and English collaborators look at that and realise it's wrong. One changes it to "请走 means 'please go'" and the other changes it to "请来 means 'please come'". Those changes are in different spans, and so a merge would blindly accept both resulting in "请走 means 'please come'" which is entirely different from the original, but just as incorrect. Depending on how much other interaction the authors have, this could end up in a back and forth of both repeatedly changing it so the merged document always ended up incorrect, even though individually both authors had made valid corrections.

That example seems a bit hypothetical, but I've experienced the same thing in software development where two BAs had created slightly incompatible documents stating how some functionality should work. One QA guy kept raising bugs saying "the spec says it should do X", the dev would check the cited spec and change the code to match the spec. Weeks later, a different QA guy with a different spec would raise a bug saying "why is this doing X? The spec says it should do Y", a different dev read the cited spec, and changed the code. In this case, the functionality flip-flopped about 10 times over the course of a year and it was only a random conversation one day where one of them complained about a bug they'd fixed many times and the other guy said "hey, that bug sounds familiar" and they realised they were the two who'd been changing the code back and forth.

This whole topic is interesting to me, because I'm essentially solving the same problem in a different context. I've used CRDT so far, but only for somewhat limited state where conflicts can be resolved. I'm now moving to a note-editing section of the app, and while there is only one primary author, their state might be on multiple devices and because offline is important to me, they might not always be in sync. I think I'm probably going to end up highlighting conflicts, I'm not sure. I might end up just re-implementing something akin to Quill's system of inserts / deletes.

ralferoo6 hours ago | parent

I see someone has downvoted my actually relevant post. Not sure why, but anyway.

I also tried out the behaviour of their example. Slowing the sync time down to 3 seconds, and then typing "Why not" and then waiting for it to sync before adding " do this?" on client A and " joke?" on client B. The result was "Why not do this? joke?" when I'd have hoped that this would have been flagged as a conflict. Similarly, starting with "Why not?" and adding both " do this" and " joke" in the different clients produced "Why not do this joke?" even though to me, that should have been a conflict - both were inserting different content between "t" and "?".

Finally, changing "do" to "say" in client A and THEN changing "do" to "read" in client B before it updated, actually resulted in a conflict in the log window and the resultant merge was "Why not rayead this joke?" Clearly this merge strategy isn't that great here, as it doesn't seem to be renumbering the version numbers based on the losing side (or I've misunderstood what they're actually doing).

loading story #47400903

stainlu11 hours ago | parent | next

[dead]

vinayaksodar9 hours ago | parent | next

[dead]

truetraveller12 hours ago | parent

Very likely AI slop, very hard to read. Too many indications. HN should have another rule: explicitly mention if article was written (primarily) by AI.

antics12 hours ago | parent | next

I'm the author. Literally 0% of this was written with AI. Not an outline, not the arguments, not a single word in any paragraph. We agonized over every aspect of this article: the wording, the structure, and in particular, about whether we were being fair to Yjs. We moved the second and third section around constantly. About a dozen people reviewed it and gave feedback.

EDIT: I will say I'm not against AI writing tools or anything like that. But, for better or worse, that's just not what happened here.

truetraveller8 hours ago | root | parent

[dead]

comex11 hours ago | parent | next

It doesn’t strike me as AI. The writing is reasonably information-dense and specific, logically coherent, a bit emotional. Rarely overconfident or vague. If it is AI then there was a lot more human effort put into refining it than most AI writing I’ve read.

utopiah12 hours ago | parent

Funnily enough I had 2 HN tabs open, this one and https://news.ycombinator.com/item?id=47394004

#visit	13,133,975
#session	74,665
#live-session	0