
    > Moving to a monorepo didn't change much, and what minor changes it made have been positive.
I'm not sure that this statement in the summary jibes with this statement from the next section:

    > In the previous, separate repository world, this would've been four separate pull requests in four separate repositories, and with comments linking them together for posterity.
    > 
    > Now, it is a single one. Easy to review, easy to merge, easy to revert.
IMO, this is a huge quality-of-life improvement, and it prevents a lot of mistakes caused by not having the right revisions synced across different repos. This alone is a HUGE improvement: a dev can no longer accidentally end up with one repo on the right branch while forgetting to pull another repo to the matching branch, then hit weird issues due to this basic hassle.

When I've encountered this, we've had to use another repo to keep scripts that managed this. But this was also sometimes problematic, because each developer's setup had to be identical on their local file system (for the script to work), or we each had to create a config file pointing to where each repo lived.

This also impacts tracking down bugs and regression analysis; this is much easier to manage in a mono-repo setup because you can get everything at the same revision instead of managing synchronization of multiple repos to figure out where something broke.

I prefer microservices/microrepos _conceptually_, but we had the same experience as your quoted text - making changes to four repos, and backporting those changes to the previous two release branches, means twelve separate PRs to make a change.

Having a centralized configuration library (a shared Makefile that we can pull down into our repo and include into the local Makefile) helps, until you have to make a backwards-incompatible change to that Makefile and then post PRs to every branch of every repo that uses that Makefile.
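The shared-Makefile pattern described above might look something like the following sketch (all file and target names are hypothetical, assuming Go projects): each repo pulls down a common `common.mk` and includes it from its local Makefile, so a breaking change to `common.mk` fans out to every consuming repo and branch.

```make
# common.mk -- shared rules pulled down into each repo (hypothetical sketch).
LINT ?= golangci-lint

.PHONY: lint test
lint:
	$(LINT) run ./...
test:
	go test ./...
```

```make
# Makefile in a consuming repo: local targets plus the shared ones.
include common.mk

.PHONY: build
build:
	go build ./...
```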

Now we have almost the entirety of our projects back in one repository and everything is simpler: one PR per release branch, so three PRs (typically) for any change that needs backporting. Vastly simpler process and much less room for error.

Isn't there some middle ground between microservices and monorepos? For example, a few large clearly defined projects that talk to each other via versioned APIs?
Assuming you're just referring to repos: not really IMO.

As soon as you split 1 repo into 2 repos you need to start building tooling to support your 2 repos. If your infrastructure is sufficiently robust with 2 repos then you might as well have 3 or 4 or 10. If it's built to _only_ support 2 repos (or 3 or 4) then it's brittle out of the gate.

The value of a monorepo is that you completely eliminate certain classes of problems and take on other classes of problems. Classic trade off. Folks that prefer monorepos take the position that multirepo problems are much harder than monorepo problems most of the time.

My only counter argument here, is when those 4 things deploy independently. Sometimes, people will get tricked into thinking a code change is atomic because it is in one commit, when it will lead to a mixed fleet because of deployment realities. In that world, having them separate is easier to work with, as you may have to revert one of the deployments separately from the others.
That's just an argument for not doing "implicit GitOps", treating the tip of your monorepo's main branch as the source-of-truth on the correct deployment state of your entire system. ("Implicit GitOps" sorta-kinda works when you have a 1:1 correspondence between repos and deployable components — though not always! — but it isn't tenable for a monorepo.)

What instead, then? Explicit GitOps. Explicit, reified release specifications (think k8s resource manifests, or Erlang .relup files), one per separately-deploy-cadenced component. If you have a monorepo, then these live also as a dir in the monorepo. CD happens only when these files change.

With this approach, a single PR can atomically merge code and update one or more release specifications (triggering CD for those components), if and when that is a sensible thing to do. But there can also be separate PRs for updating the code vs. "integrating and deploying changes" to a component, if-and-when that is sensible.
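Concretely, an explicit release specification per component could be a small manifest checked into a deploy directory of the monorepo, with CD firing only when one of these files changes. A hypothetical sketch (all names and fields invented for illustration, loosely in the style of a k8s manifest):

```yaml
# deploy/billing-service/release.yaml -- hypothetical reified release spec.
# Code can merge to main without touching this file; deployment happens
# only when someone bumps the pin here, either in its own PR or atomically
# in the same PR as the code change, when that makes sense.
component: billing-service
image: registry.example.com/billing-service
# Pin an exact build, not a floating tag like :latest or "tip of main":
tag: "2024-11-07-3f9c1ab"
replicas: 3
```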

If the version numbers of all services built from the PR are identical, you at least have a pretty clear trail to figuring out WTF happened.

Even with a few services, we saw some pretty crunchy issues with people not understanding that service A had version 1.3.1234 of a module and Service B had version 1.3.1245 and that was enough skew to cause problems.

Distinct repos tend to have distinct builds, and sooner or later one of the ten you're building will glitch out and have to be run twice, or the trigger will fail and it won't build at all until the subsequent merge, and having numbers that are close results in a false sense of confidence.
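One way to get that "identical version numbers across services" property in a monorepo is to derive a single version string from the commit being built and stamp every artifact with it. A minimal sketch, with hypothetical service names and paths (falls back to "dev" when run outside a git repo):

```shell
# Derive one version string for the entire build.
VERSION="$(git rev-parse --short HEAD 2>/dev/null || echo dev)"

# Stamp every service built from this commit with the same version,
# so there is no 1.3.1234-vs-1.3.1245 skew between service A and B.
for svc in api worker frontend; do
  mkdir -p "build/$svc"
  printf '%s\n' "$VERSION" > "build/$svc/VERSION"
done
```

Because every artifact carries the same string, "which commit was service B built from?" has exactly one answer per build.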

Isn't a mixed fleet always the case once you have more than one server and do rolling updates?
I thought one of the whole points behind separate (non-mono) repos was to help enforce loose coupling, and that if you came to a point where a single feature change required PRs on 4 separate repos, that was an indicator your project needed refactoring because it was becoming too tightly coupled. The example in the article could have been interpreted to mean that they should refactor the functionality for interacting with the ML model into its own repo, so it could encapsulate that aspect of the project. Instead they doubled down on the tighter coupling by putting everything in a monorepo (which itself encourages tighter coupling).
I felt the same; the author seemed to downplay the success while every effect listed in the article felt like a huge improvement.
There’s nothing preventing you from having a single pull request in for merging branches over multiple repos. There’s nothing preventing you from having a parent repo with a lock file that gives you a single linear set of commits tracking the state of multiple repos.

That is, if you're not tied to using just GitHub, of course.
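One concrete realization of the "parent repo with a lock" idea is git submodules: each commit in the parent repo pins every child repo to an exact SHA, so the parent's history is a single linear record of combined states. A self-contained, hypothetical sketch using throwaway local repos:

```shell
# Build two toy child repos, then a parent repo that pins both.
set -e
work="$(mktemp -d)"
for child in frontend backend; do
  git init -q "$work/$child"
  git -C "$work/$child" -c user.email=dev@example.com -c user.name=dev \
    commit -q --allow-empty -m "initial"
done

git init -q "$work/meta"
cd "$work/meta"
for child in frontend backend; do
  # Recent git requires explicitly allowing local-path submodule URLs.
  git -c protocol.file.allow=always submodule add "$work/$child" "$child"
done
git -c user.email=dev@example.com -c user.name=dev \
  commit -q -m "Pin frontend and backend"

# Each child appears as a gitlink (mode 160000) recording its exact SHA:
git ls-tree HEAD
```

Advancing one child and committing the new pin in the parent is then a single, revertable commit, which is the "lock file" behavior the comment describes.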

Big monorepos and multiple repo solutions require some tooling to deal with scaling issues.

What surprises me is the attitude that monorepos are the right solution to these challenges. For some projects it makes sense yes, but it’s clear to me that we should have a solution that allows repositories to be composed/combined in elegant ways. Multi-repository pull requests should be a first class feature of any serious source code management system. If you start two projects separately and then later find out you need to combine their history and work with them as if they were one repository, you shouldn’t be forced to restructure the repositories.

It's not as much of a pain if your tooling supports git repos as dependencies. For example, a typical multi-repo PR for us with Rust is: 1) PR against the library 2) PR against the application that points the dependency at PR 1's branch and makes the changes 3) PR review 4) PR 1 is approved and merged 5) PR 2 is changed to point the dependency at the new commit on master 6) PR 2 is approved and merged
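In Cargo terms, steps 2 and 5 of the workflow above are just edits to a git dependency in Cargo.toml (crate name, URL, and branch here are hypothetical):

```toml
# Step 2: while both PRs are open, depend on the library PR's branch.
[dependencies]
mylib = { git = "https://example.com/org/mylib.git", branch = "feature/new-api" }

# Step 5, after PR 1 merges: point back at master, or pin the exact commit:
# mylib = { git = "https://example.com/org/mylib.git", rev = "3f9c1ab" }
```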

Same idea if you use some kind of versioning and release system. It's still a bit of a pain with all the PRs and coordination involved, but at every step every branch is consistent and buildable, you just check it out and hit build.

This is obviously more difficult if you have a more loosely coupled architecture like microservices. But that's self-inflicted pain.

Ironically, I was gonna come and comment on that same second block of text.

We went from monorepo to multi-repo at work because it's what our contractors recommended, and it's been a huge setback and disappointment for the devs.

I've asked for a code deploy and everything, and it's failed in prod due to a missing check-in.

The issue you faced stemmed from the previous best practice of "everything in its own repository." This approach caused major problems, such as the versioning challenges and data model inconsistencies you mentioned. The situations it leads to read like comedy sketches, but it's a real pain, especially when you're part of a team struggling with these problems. And it's almost impossible to convince a team to change direction once they've committed to it.

Now, though, it seems the pendulum has swung in the opposite direction, from "everything in its own repo" to "everything in one repo." This, too, will create its own set of problems, which can also be comedic, but frustrating to experience. For instance, what happens when someone accidentally pushes a certificate or API key and you need to force-push a rewritten history upstream? Coordinating that with 50 developers spread across 8 projects, all in a single repo, is its own ordeal.

Instead of facing the problems of either extreme, we could start out with a balanced approach. Start with one repository, or split frontend and backend if needed. For data pipelines that share models with the API, keep them in the same repository, creating a single source of truth for the data model. This method has often led to other developers telling me about the supposed benefits of "everything in its own repo." Just as I pushed back then, I feel the need to push back now against the monorepo trend.

The same can be said for monoliths and microservices, where the middle ground is often overlooked in discussions about best practices.

They all remind me of the concept of "no silver bullet" [0]. Any decision will face its own unique challenges, but a silver-bullet solution can create artificial challenges that are wasteful, painful, and most of all unnecessary.

[0] https://en.m.wikipedia.org/wiki/No_Silver_Bullet
