"But now none of the open source software can compete with AI generative fill, AI denoising, and now AI rotation."

This is a common pattern across many fields. The truly top-end companies are always running ahead of open source.

But that doesn't mean it's a permanent situation. It just means you're looking at it from a point in time where the commercial vendors have gotten there and open source hasn't yet. Open source will get there, and by then Adobe will be ahead on something else.

I've played a bit with "comfyui" over the past few days, a bizarre name for an AI image generation power tool. (It does other things too, but I have no experience there to know how good it is at those.) It drips with power. The open source world is not generally behind on raw capability. As is often the case, open source's deficiency, for generative fill for instance, is that A: it offers too much control, too many knobs (e.g., "which of several dozen models would you like to start with?"), and while that's awesome if you know what you're doing, it is not yet at the "circle this and click 'remove'" stage, and B: the motivation and firepower to integrate this all into a slick package is not there.

I can definitely do an AI generative fill with open source software, but I'll be exporting an image into comfyui, either building my own generative fill program or grabbing some rando's program online, which may or may not use compatible models or require me to install additional bespoke functionality into comfyui, doing my work, and re-exporting it. The job gets done, but it's much more complicated, and most people don't care about the extra capabilities this workflow yields, so for them it's just cost.

It's a very normal pattern in the open source world. Nothing about the current situation gives me particular cause to worry about it.

To be concrete, here's a YouTube video that's on the more advanced side of what you can do in the open source world, which is probably still ultimately simplistic compared to what some people do: https://www.youtube.com/watch?v=ijqXnW_9gzc That entire series is worth a look, and there's more it doesn't cover. You can get incredible control over these tools in the open source world, but it involves listening to some guy on YouTube trying to explain why you might sometimes want to use a thing called "dpmpp_2m_sde_gpu"... not exactly normie-friendly.

I mean, we've been able to do generative fill and denoising better in open source for a while; it's just not as easy (except for video, really).

What Adobe does is wrap those things in an easy-to-use app, and then charge for it, and hopefully not change their licensing again to grab everyone's shit again.

Regarding the scheduler (dpmpp): sure, Adobe doesn't tell you those things, but that's because they found one that worked, removed the options, and packaged it up with a bow. Comfy, a111, forge, etc. are more complex because they give you EVERYTHING and let you have at it. There are frontends that wipe all that away, but they aren't successful because, like in the Linux world, people in open source want to be able to tinker with all the internals and shit. That's also why open source tends to see some groundbreaking optimizations, like taking the Flux model from requiring 30+ GB of VRAM to running on 6 GB of VRAM lol
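
To make the "found one that worked, removed the options" point concrete, here's a rough Python sketch of the difference between the two styles. Everything in it is hypothetical: the names, the preset values, and the function signatures are made up for illustration and are not the API of comfyui, a111, forge, or anything Adobe ships. It only builds a job description and doesn't run a model.

```python
from dataclasses import asdict, dataclass, replace


# Hypothetical knobs, loosely modeled on what power-user tools expose.
@dataclass(frozen=True)
class FillSettings:
    model: str
    sampler: str
    steps: int
    cfg_scale: float
    denoise: float


# One combination somebody tuned once. The values are invented for this sketch.
REMOVE_OBJECT_PRESET = FillSettings(
    model="some-inpainting-model",
    sampler="dpmpp_2m_sde_gpu",
    steps=30,
    cfg_scale=7.0,
    denoise=1.0,
)


def power_user_fill(image: str, mask: str, settings: FillSettings) -> dict:
    """The 'give you EVERYTHING' path: every knob is the caller's problem."""
    return {"image": image, "mask": mask, **asdict(settings)}


def remove_object(image: str, mask: str) -> dict:
    """The 'circle this and click remove' path: same engine, zero knobs."""
    return power_user_fill(image, mask, REMOVE_OBJECT_PRESET)


# The packaged path takes two arguments and hides the rest.
simple_job = remove_object("photo.png", "mask.png")

# The tinkerer's path can override anything, e.g. more sampling steps.
custom_job = power_user_fill(
    "photo.png", "mask.png", replace(REMOVE_OBJECT_PRESET, steps=50)
)
```

The capability is identical in both paths. The packaged one is just the full-control one with the choices already made, which is the part that takes product effort rather than research.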