
Show HN: An open-source implementation of AlphaFold3

https://github.com/Ligo-Biosciences/AlphaFold3
This seems really neat!

DeepMind and AlphaFold are clearly moving in a closed-source direction, since they created Isomorphic Labs as a division of Alphabet essentially focused on doing this stuff closed source. In theory it seems nice for academic tools to have an open source version, although I'm not familiar enough with this field to point to a specific benefit of it.

So what's your plan for the company itself? Do you intend to continue working on this open source project as part of your business model, or was it more of a one-off? Your website seems very nonspecific about what exactly you intend to be selling.

Our long-term goal is to design enzymes for chemical manufacturing. We decided to build AlphaFold3 because we had seen how useful AlphaFold2 had been for the protein design field. No one else was building it fast enough for us, so we decided we should do it ourselves. We are also committed to training and open-sourcing the full version with ligand and nucleic acid prediction capabilities, since it is so useful for the biotech industry.
Have you considered publishing your own paper about your implementation? It would make it easier to cite in the literature later on. Would major journals accept such a paper? I would assume they would if they really had questions about reproducibility.
OpenFold, which was AlphaFold2's open-source implementation, was published in Nature Methods. We will prepare a similar publication once the model is more mature and we have a nice set of experiments showing the model's interesting properties.
Hi, how are predictions verified? Does one still use experimental techniques (X-ray crystallography, cryo-EM, etc.) once you have the prediction? Or are predictions so close to reality that you can progress without experiment?
The predictions can be verified by comparing the predicted structure to the experimentally solved structure, either crystal or cryo-EM. The model is still training and improving; we will release the benchmarking results after it's complete.
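For anyone curious what "comparing the predicted structure to the experimentally solved structure" can look like in code, here is a minimal sketch. It computes a distance RMSD (dRMSD), which compares all pairwise distances inside each structure and so needs no superposition step. This is only an illustration, not necessarily the metric this project will benchmark with; the function names and the toy coordinates are made up for the example, and real inputs would be matched atoms (for example C-alpha atoms) parsed from structure files.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """Return the distances between every pair of points, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def drmsd(predicted, experimental):
    """Distance RMSD between two structures with matched atoms.

    Both arguments are sequences of (x, y, z) tuples in the same atom order.
    The result is invariant to rotating or translating either structure.
    """
    if len(predicted) != len(experimental):
        raise ValueError("structures must contain the same number of atoms")
    if len(predicted) < 2:
        raise ValueError("need at least two atoms to compare distances")
    d_pred = pairwise_distances(predicted)
    d_exp = pairwise_distances(experimental)
    squared_error = sum((p - e) ** 2 for p, e in zip(d_pred, d_exp))
    return math.sqrt(squared_error / len(d_pred))


# Toy "experimental" coordinates (hypothetical, in angstroms).
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]

# The same shape rotated 90 degrees about z and shifted: dRMSD stays near zero.
moved = [(-y + 10.0, x + 5.0, z + 2.0) for x, y, z in experimental]
print(drmsd(moved, experimental))

# A "prediction" with one displaced atom gives a clearly nonzero score.
perturbed = experimental[:-1] + [(0.0, 3.8, 2.0)]
print(drmsd(perturbed, experimental))
```

One caveat with this choice: because it only looks at internal distances, dRMSD cannot tell a structure from its mirror image, which is one reason superposition-based RMSD and other scores are also used in practice.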
Thanks for releasing this, I've been looking forward to a truly open version I can use in a commercial setting. What a way to launch the company!
You probably want to change the name of this implementation as it's not truly AlphaFold3. I wouldn't be surprised if you got a C&D from DM for using the name.
Yes, this is a good point. We are actively speaking with our counsel to check this. Thanks for flagging, though.
I did a very brief stint on computational proteomics. That stuff is absolutely next level.
Amazing! What kind of things did you work on?
Does this win the Folding@home competition, or is/was that a different goal than what AlphaFold3 and ligo-/AlphaFold3 already solve for?

Folding@Home https://en.wikipedia.org/wiki/Folding@home :

> making it the world's first exaflop computing system

Folding@home runs physics-based molecular dynamics simulations (Rosetta is what the separate Rosetta@home project uses). For structure prediction, physics-based approaches are outperformed by deep learning methods such as AlphaFold2/3.
If I'm understanding correctly, the model code itself is only a tiny proportion of the challenge. The training compute and training data are far bigger parts.

Google has access to training compute on a scale perhaps nobody else has.

Is that really the case though? Available compute sounds unlikely to be the limiting factor here, compared to data which is way scarcer than what's being used to train LLMs, and I suspect Google used mostly publicly available data for training unless they signed deals beforehand with biotechnology companies which have access to more data. That's possible of course, but that doesn't feel very google-y.
Yes, all data Google used was public. We have enough compute from YC (thanks YC!) to do this. The main thing is the technical infrastructure - processing the data, efficient loading at training time, proper benchmarking, etc. We are building these now.
What's your next step? Why did you decide to focus on enzyme design?
We think enzymes are super cool! You can build molecular assembly lines at the atomic scale with them. Many pharmaceuticals, such as the diabetes drug Januvia, are already manufactured with enzymes. Engineering them is a big bottleneck though: it takes years and millions of dollars. We want to speed this up with AI-powered design. The next step is the ligand-protein prediction capability of AlphaFold3, which is also super useful for modelling enzyme-substrate interactions.