> The biggest problem: developers don’t want GPUs. They don’t even want AI/ML models. They want LLMs. System engineers may have smart, fussy opinions on how to get their models loaded with CUDA, and what the best GPU is. But software developers don’t care about any of that. When a software developer shipping an app comes looking for a way for their app to deliver prompts to an LLM, you can’t just give them a GPU.

I'm increasingly coming to the view that there is a big split among "software developers" and AI is exacerbating it. There's an (increasingly small) group of software developers who don't like "magic" and want to understand where their code is running and what it's doing. These developers gravitate toward open source solutions like Kubernetes, and often just want to rent a VPS or at most a managed K8s solution. The other group (increasingly large) just wants to `git push` and be done with it, and they're willing to spend a lot of (usually their employer's) money to have that experience. They don't want to have to understand DNS, Linux, or anything else beyond whatever framework they are using.

A company like fly.io absolutely appeals to the latter. GPU instances at this point are very much appealing to the former. I think you have to treat these two markets very differently from a marketing and product perspective. Even though they both write code, they are otherwise radically different. You can sell the latter group a lot of abstractions and automations without them needing to know any details, but the former group will care very much about the details.

> There's an (increasingly small) group of software developers who don't like "magic" and want to understand where their code is running and what it's doing. These developers gravitate toward open source solutions like Kubernetes

Kubernetes is not the first thing that comes to mind when I think of "understanding where their code is running and what it's doing"...

Indeed, I have to wonder how many people actually understand Kubernetes. Not just as a “user”, but exactly everything it is doing behind the scenes…

Just an “idle” Kubernetes system is a behemoth to comprehend…

I keep seeing this opinion and I don't understand it. For various reasons, I recently transitioned from a dev role to running a 60+ node, 14+ PB bare metal cluster. 3 years in, and the only thing ever giving me trouble is Ceph.

Kubernetes is etcd, apiserver, and controllers. That's exactly as many components as your average MVC app. The control-loop thing is interesting, and there are a few "kinds" of resources to get used to, but why is it always presented as this insurmountable complexity?

I ran into a VXLAN checksum offload kernel bug once, but otherwise this thing is just solid. Sure it's a lot of YAML but I don't understand the rep.
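
The control-loop idea mentioned above can be sketched in a few lines. This is only an illustrative toy, not how Kubernetes controllers are actually implemented: `reconcile`, `apply_actions`, and the dict-based state are hypothetical names, and a real controller watches the API server rather than in-memory dicts.

```python
# Toy reconcile loop: compare desired state with observed state and
# compute the actions needed to converge. All names are hypothetical.


def reconcile(desired: dict, actual: dict) -> list:
    """Return (action, app, count) tuples that move `actual` toward `desired`."""
    actions = []
    for app, want in desired.items():
        have = actual.get(app, 0)
        if have < want:
            actions.append(("start", app, want - have))
        elif have > want:
            actions.append(("stop", app, have - want))
    # Anything still running that is no longer desired gets stopped entirely.
    for app, have in actual.items():
        if app not in desired and have > 0:
            actions.append(("stop", app, have))
    return actions


def apply_actions(actual: dict, actions: list) -> dict:
    """Apply actions to a copy of the observed state.

    This stands in for actually starting and stopping containers.
    """
    new_state = dict(actual)
    for action, app, count in actions:
        delta = count if action == "start" else -count
        new_state[app] = new_state.get(app, 0) + delta
        if new_state[app] == 0:
            del new_state[app]
    return new_state


desired = {"web": 3, "worker": 2}
actual = {"web": 1, "cron": 1}

# One pass of the loop; a real controller would repeat this indefinitely.
actions = reconcile(desired, actual)
actual = apply_actions(actual, actions)
```

Once the observed state matches the desired state, another call to `reconcile` returns an empty list, which is the property that makes the loop safe to run repeatedly.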

“etcd, apiserver, and controllers.”

…and containerd and csi plugins and kubelet and cni plugins and kubectl and kube-proxy and ingresses and load balancers…

These components are very different in complexity and scope. Let's be real: a seasoned developer is mostly familiar with load balancers and ingress controllers, so this will be mostly about naming and context. I agree, though, that once you learn about k8s it becomes less mysterious, but that also means the author hasn't pushed it to the limits. Outages in the control plane can be pretty nasty, and they are easy to cause if you work under the illusion that everything in k8s is kind of free.
A really simple setup for many smaller organisations wouldn't have a load balancer at all.
No load balancer means... entering one node only? Doing DNS RR over all the nodes? If you don't have a load balancer in front, why are you even using Kubernetes? Deploy a single VM and call it a day!

I mean, in my homelab I do have Kubernetes and no LB in front, but it's a homelab for fun and to learn K8s internals. But in a professional environment...

No code at all even - just use excel
... and kubernetes networking, service mesh, secrets management
You aren't forced to use a service mesh and complex secrets management schemes. If you add them to the cluster, it is because you value what they offer you. It's the same thing as Kubernetes itself: I'm not sure what people are complaining about. If you don't need what Kubernetes offers, just don't use it.

Go back to good ol' corosync/pacemaker clusters with XML and custom scripts to migrate IPs and set up firewall rules (and if you have someone writing them for you, why don't you have people managing your k8s clusters?).

Or buy something from a cloud provider that "just works" and eventually go down in flames with their indian call centers doing their best but with limited access to engineering to understand why service X is misbehaving for you and trashing your customer's data. It's trade-offs all the way.

> …and containerd and csi plugins and kubelet and cni plugins (...)

Do you understand you're referring to optional components and add-ons?

> and kubectl

You mean the command line interface that you optionally use if you choose to do so?

> and kube-proxy and ingresses and load balancers…

Do you understand you're referring to whole classes of applications you run on top of Kubernetes?

I get it that you're trying to make a mountain out of a mole hill. Just understand that you can't argue that something is complex by giving as your best examples a bunch of things that aren't really tied to it.

It's like trying to claim Windows is hard, and then your best example is showing a screenshot of AutoCAD.

How are kubelet and CNI “optional components”? What do you mean by that?
CNI is optional: you can have workloads bind ports on the host rather than use an overlay network. (That said, CNI plugins and kube-proxy are extremely simple and reliable in my experience. They use VXLAN and iptables, which are built into the kernel and which you already use in any organization that might run a cluster, or which are the basic building blocks of your cloud provider.)

CSI is optional: you can just not use persistent storage (use the S3 API or whatever) or declare PersistentVolumes that are bound to a single machine or a group of machines (shared NFS mount or whatever).

I don't know how GP thinks you could run without the other bits though. You do need kubelet and a container runtime.

kubelet isn't, but CNI technically is (or can be abstracted to a minimum; I think the old network support might have been removed from kubelet nowadays).
I consider a '60+ node' Kubernetes cluster very small. Kubernetes at that scale is genuinely excellent! At 6000, 60000, and 600000 nodes it becomes very different and goes from 'Hey, this is pretty great' to 'What have I done?' The maintenance costs of running more than a hundred clusters are incredibly nontrivial, especially as a lot of folks end up taking something open-source and thinking they can definitely do a lot better (you can.... there's a lot of "but"s there though).
OK but the alternative if you think Kubernetes is too much magic when you want to operate hundreds of clusters with tens of thousands of nodes is?

Some bash and Ansible and EC2? That is usually what Kubernetes haters suggest one does to simplify.

At a certain scale, let's say 100k+ nodes, you magically run into 'it depends.' It can be kubernetes! It can be bash, ansible, and ec2! It can be a custom-built vm scheduler built on libvirt! It can be a monster fleet of Windows hyper-v hosts! Heck, you could even use Mesos, Docker Swarm, Hashicorp Nomad, et al.

The main pain point I personally see is that everyone goes 'just use Kubernetes', and this is an answer, but it is not the answer. The way it steamrolls all conversations leads to a lot of the frustration around it, in my view.

The wheels fall off Kubernetes at around 10k nodes. One of the main limitations is etcd, in my experience. Google recently fixed this problem by making Spanner offer an etcd-compatible API: https://cloud.google.com/blog/products/containers-kubernetes...

Etcd is truly a horrible data store, even the creator thinks so.

At that point you probably need a cluster of k8s clusters, no?

For anyone unfamiliar with this the "official limits" are here, and as of 1.32 it's 5000 nodes, max 300k containers, etc.


Hey fellow k8s+ceph on bare metaler! We only have a 13-machine rack and 350 TB of raw storage. No major issues with Ceph after 16.x and all-NVMe storage though.
I think "core" kubernetes is actually pretty easy to understand. You have the kubelet, which just cares about getting pods running, which it does by using pretty standard container tech. You bootstrap a cluster by reading the specs for the cluster control plane pods from disk, after which the kubelet will start polling the API it just started for more of the same. The control plane then takes care of scheduling more pods to the kubelets that have joined the cluster. Pods can run controllers that watch the API for other kinds of resources, but one way or another, most of those get eventually turned into Pod specs that get assigned to a kubelet to run.

Cluster networking can sometimes get pretty mind-bending, but honestly that's true of just containers on their own.

I think just that ability to schedule pods on its own requires about that level of complexity; you're not going to get a much simpler system if you try to implement things yourself. Most of the complexity in k8s comes from components layered on top of that core, but then again, once you start adding features, any custom solution will also grow more complex.

If there's one legitimate complaint when it comes to k8s complexity, it's the ad-hoc way annotations get used to control behaviour in a way that isn't discoverable or type-checked like API objects are, and you just have to be aware that they could exist and affect how things behave. A huge benefit of k8s for me is its built-in discoverability, and annotations hurt that quite a bit.

One may think Kubernetes is complex (I agree), but I haven't seen an alternative that simultaneously:

* Lets you host hundreds or thousands of interacting containers across multiple teams in a sane manner

* Lets you manage and understand, to the full extent, how it is done.

Of course there are tons of organizations that can (and should) easily give up one of these, but if you need both, there isn't a better choice right now.

But how many orgs need that scale?
Something I've discovered is that if you're a small team doing something new, off the shelf products/platforms are almost certainly not optimized to your use case.

What looks like absurd scale to one team is a regular Tuesday for another, because "scale" is completely meaningless without context. We don't balk at a single machine running dozens of processes for a single web browser, we shouldn't balk at something running dozens of containers to do something that creates value somehow. And scale that up by number of devs/customers and you can see how thousands/hundreds of thousands can happen easily.

Also the cloud vendors make it easy to have these problems because it's super profitable.

You can run single-node k3s on a VM with 512MB of RAM and deploy your app with a hundred lines of JSON, and it inherits a ton of useful features that are managed in one place and can grow with your app if/as needed. These discussions always go in circles between Haters and Advocates:

* H: "kubernetes [at planetary scale] is too complex"

* A: "you can run it on a toaster and it's simpler to reason about than systemd + pile of bash scripts"

* H: "what's the point of single node kubernetes? I'll just SSH in and paste my bash script and call it a day"

* A: "but how do you scale/maintain that?"

* H: "who needs that scale?"

The sad thing is there probably is a toaster out there somewhere with 512MB of RAM.
Most of the ones that are profitable for cloud providers.
I agree with the blog post that using K8s + containers for GPU virtualization is a security disaster waiting to happen. Even if you configure your container right (which is extremely hard to do), you don't get seccomp-bpf.

People started using K8s for training, where you already had a network isolated cluster. Extending the K8s+container pattern to multi-tenant environments is scary at best.

I didn't understand the following part though.

> Instead, we burned months trying (and ultimately failing) to get Nvidia’s host drivers working to map virtualized GPUs into Intel Cloud Hypervisor.

Why was this part so hard? Doing PCI passthrough with the Cloud Hypervisor (CH) is relatively common. Was it the transition from Firecracker to CH that was tricky?

This has actually brought up an interesting point. Kubernetes is nothing more than an API. Should someone be working on building a multi-tenant Kubernetes (so that customers don't need to manage nodes or clusters) which enforces VM-level security (obviously you cannot safely co-locate multiple tenants' containers on the same VM)?
Great opportunity for someone ballsy to write a book about kubernetes internals for the general engineering population.

Bonus points for writing a basic implementation from first principles capturing the essence of the problem kubernetes really was meant to solve.

The 100 pages kubernetes book, Andriy Burkov style.

You might be interested in this:


It probably won't answer the "why" (although any LLM can answer that nowadays), but it will definitely answer the "how".

The Kubernetes in Action book is very good.
I actually have the book and I agree it is very good.
> Great opportunity for someone ballsy to write a book about kubernetes internals for the general engineering population.

What would be the interest of it? Think about it:

- kubernetes is an interface and not a specific implementation,

- the bulk of the industry standardized on managed services, which means you have no idea what the actual internals driving your services are,

- so you read up on the exact function call that handles a specific aspect of pod auto scaling. That was a nice read. How does that make you a better engineer than those who didn't?

I don't really care about the standardized interface.

I just want to know how you'd implement something that would load your services and dependencies from a config file, bind them all together, distribute the load across several local VMs, and make it still work if I kill the service or increase the load.

In less than 1000 lines.

> I don't really care about the standardized interface.

Then you seem to be confused, because you're saying Kubernetes but what you're actually talking about is implementing a toy container orchestrator.

If you have a system that's actually big or complex enough to warrant using Kubernetes, which, to be frank, isn't really that much considering the realities of production, the only thing more complex than Kubernetes is implementing the same concepts but half-assed.

I really wonder why this opinion is so commonly accepted by everyone. I get that not everything needs most Kubernetes features, but it's useful. The Linux kernel is a dreadfully complex beast full of winding subsystems and full of screaming demons all over. eBPF, namespaces, io_uring, cgroups, SELinux, so much more, all interacting with each other in sometimes surprising ways.

I suspect there is a decent likelihood that a lot of sysadmins have a more complete understanding of what's going on in Kubernetes than in Linux.

> If you have a system that's actually big or complex enough to warrant using Kubernetes (...)

I think there's a degree of confusion over your understanding of what Kubernetes is.

Kubernetes is a platform to run containerized applications. Originally it started as a way to simplify the work of putting together clusters of COTS hardware, but since then its popularity drove it to become the platform instead of an abstraction over other platforms.

What this means is that Kubernetes is now a standard way to deploy cloud applications, regardless of complexity or scale. Kubernetes is used to deploy apps to raspberry pis, one-box systems running under your desk, your own workstation, one or more VMs running on random cloud providers, and AWS. That's it.

Really? There are plenty of valid criticisms of kubernetes, but this doesn't strike me as one of them. It gives you tons of control over all of this. That's a big part of why it's so complex!
IMO, it's rather hard to fully know all of kubernetes and what it's doing, and the kind of person who demands elegance in solutions will hate it.
Yeah no I wouldn't touch Kubernetes with a 10' pole. Way too much abstraction.
If my understanding is right, the gist seems to be that you create one or more Docker containers that your application can run on, describe the parameters they require (e.g. RAM size, CUDA capability, when you need more instances), and Kubernetes provisions them out to the machines available to it based on those parameters. It's abstract but very tractably so IMO, and it seems like a sensible enough way to achieve load balancing if you keep it simple. I plan to try it out on some machines of mine just for fun/research soon.
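
To make that gist concrete, here is a rough sketch of placing workloads on machines by their declared requirements. It is a deliberately naive first-fit placement, not the real kube-scheduler (which filters and scores nodes using many more criteria); `Node`, `PodSpec`, `schedule`, and the machine names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_ram_mb: int
    has_gpu: bool = False
    pods: list = field(default_factory=list)


@dataclass
class PodSpec:
    name: str
    ram_mb: int
    needs_gpu: bool = False


def schedule(pod: PodSpec, nodes: list):
    """Place `pod` on the first node that satisfies its requirements.

    Returns the chosen node's name, or None if nothing fits
    (roughly what a pod stuck in a pending state would mean).
    """
    for node in nodes:
        fits_ram = node.free_ram_mb >= pod.ram_mb
        fits_gpu = node.has_gpu or not pod.needs_gpu
        if fits_ram and fits_gpu:
            node.free_ram_mb -= pod.ram_mb  # reserve the capacity
            node.pods.append(pod.name)
            return node.name
    return None


nodes = [Node("small", free_ram_mb=1024), Node("gpu-box", free_ram_mb=8192, has_gpu=True)]
pods = [
    PodSpec("api", ram_mb=512),
    PodSpec("trainer", ram_mb=4096, needs_gpu=True),
    PodSpec("cache", ram_mb=1024),
    PodSpec("huge", ram_mb=16384),
]

placements = {pod.name: schedule(pod, nodes) for pod in pods}
```

Here `api` lands on `small`, `trainer` and `cache` land on `gpu-box`, and `huge` is left unplaced because no machine has enough free RAM.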
It's systemd but distributed across multiple nodes and with containers instead of applications. Instead of .service files telling the init process how to start and monitor executables, you have charts telling the controller how to start and monitor containers.
It's worth noting that "container" and "process" are pretty similar abstractions. A lot of people don't realize this, but a container is sort of just a process with a different filesystem root (to oversimplify). That arguably is what a process should be on a server.
No, they are not. I'm not sure who started this whole container is just a process thing, but it's not a good analogy. Quite a lot of things you spin up containers for have multiple processes (databases, web servers, etc).

Containers are inherently difficult to sum up in a sentence. Perhaps the most reasonable comparison is to liken them to a "lightweight" VM, but the reasons people use them are so drastically different from VMs at this point. The most common use case for containers is having a decent toolchain for simple, somewhat reproducible software environments. Containers are mostly a hack to get around the mess we've made in software.

Having multiple processes under one user in an operating system is more akin to having multiple threads in one process than you think. The processes don't share a virtual memory space or kernel namespaces and they don't share PID namespaces, but that's pretty much all you get from process isolation (malware works because process isolation is relatively weak). The container adds a layer that goes around multiple processes (see cgroups), but the cgroup scheduling/isolation mechanism is very similar to the process isolation mechanism, just with a new root filesystem. Since everything Linux does happens through FDs, a new root filesystem is a very powerful thing to have. That new root filesystem can have a whole new set of libraries and programs in it compared to the host, but that's all you have to do to get a completely new looking computing environment (from the perspective of Python or Javascript).

A VM, in contrast, fakes the existence of an entire computer, hardware and all. That fake hardware comes with a fake disk on which you put a new root filesystem, but it also comes with a whole lot of other virtualization. In a VM, CPU instructions (eg CPUID) can get trapped and executed by the VM to fake the existence of a different processor, and things like network drivers are completely synthetic. None of that happens with containers. A VM, in turn, needs to run its own OS to manage all this fake hardware, while a container gets to piggyback on the management functions of the host and can then include a very minimal amount of stuff in its synthetic root.

> I'm not sure who started this whole container is just a process thing, but it's not a good analogy. Quite a lot of things you spin up containers for have multiple processes (databases, web servers, etc).

It came from how Docker works, when you start a new container it runs a single process in the container, as defined in the Dockerfile.

It's a simplification of what containers are capable of and how they do what they do, but that simplification is how it got popular.

> Containers are inherently difficult to sum up in a sentence.

Super easy if we talk about Linux. It's a process tree being spawned inside its own set of kernel namespaces, security measures and a cgroup to provide isolation from the rest of the system.
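
One way to see the namespace part of that description is to look at which namespaces a process belongs to. The sketch below assumes a Linux system with `/proc` mounted; the helper name `namespaces` is made up for illustration, and reading another process's namespace links may be denied without sufficient privileges, in which case the helper simply returns what it could read.

```python
import os


def namespaces(pid="self"):
    """Map namespace name -> identifier string for a process.

    Reads the symlinks under /proc/<pid>/ns, each of which points at
    something of the form '<type>:[<inode>]'. Returns an empty dict when
    the directory is unavailable (non-Linux system, or no such process).
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        entries = os.listdir(ns_dir)
    except OSError:
        return result
    for entry in entries:
        try:
            result[entry] = os.readlink(os.path.join(ns_dir, entry))
        except OSError:
            # Typically a permission error when inspecting another process.
            continue
    return result


mine = namespaces()
init = namespaces(1)

# Namespaces this process has in common with PID 1 as seen from here.
shared = {name for name in mine if init.get(name) == mine[name]}
```

Run on a plain host with enough privileges, `shared` would usually contain every namespace; inside a container, PID 1 is the container's own init, so comparing against a process on the host is what reveals the differing identifiers.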

If someone doesn't understand "container", I'm supposed to expect them to understand all the namespaces and their uses, cgroups, and the nitty gritty of the wimpy security isolation? You are proving my point that it's tough to summarize by using a bunch more terms that are difficult to summarize.

Once you recursively expand all the concepts, you will have multiple dense paragraphs, which don't "summarize" anything, but instead provide full explanations.

If you throw out the Linux tech from my explanation, it would become a general description which holds up even for Windows.
I enjoy the details, but I don’t get paid to tell my executives how we’re running things. I get paid to ship customer facing value.

Particularly at startups, it’s almost always more cost effective to hit that “scale up” button from our hosting provider than do any sort of actual system engineering.

Eventually, someone goes “hey we could save $$$$ by doing XYZ” so we send someone on a systems engineering journey for a week or two and cut our bill in half.

None of it really matters, though. We’re racing against competition and runway. A few days less runway isn’t going to break a startup. Not shipping as fast as reasonable will.

This is context based dichotomy, not a person-based one.

In my personal life, I’m curiosity-oriented, so I put my blog, side projects and mom’s chocolate shop on fully self hosted VPSs.

At my job managing a team of 25 and servicing thousands of customers for millions in revenue, I’m very results-oriented. Anyone who tries to put a single line of code outside of a managed AWS service is going to be in a lot of trouble with me. In a results-oriented environment, I’m outsourcing a lot of devops work to AWS, and choosing to pay a premium because I need to use the people I hire to work on customer problems.

Trying to conflate the two orientations with mindsets / personality / experience levels is inaccurate. It’s all about context.

This is a false dichotomy. The truth is we are constantly moving further and further away from the silicon. New developers don't have as much need to understand these details because things just work; some do care because they work at a job where it's required, or because they're inherently interested (a small number).

Over time we will move further away. If the cost of an easily managed solution is low enough, why do the details matter?

> The truth is we are constantly moving further and further away from the silicon.

Are we? We're constantly changing abstractions, but we don't keep adding them all that often. Operating systems and high-level programming languages emerged in the 1960s. Since then, the only fundamentally new layer of abstraction was virtual machines (JVM, browser JS, hardware virtualization, etc). There's still plenty of hardware-specific APIs, you still debug assembly when something crashes, you still optimize databases for specific storage technologies and multimedia transcoders for specific CPU architectures...

Maybe "fundamentally" is an extremely load-bearing word here, but just in the hardware itself we see far more abstraction than we saw in the 60s. The difference between what we called microcode in an 8086 and what is running in any processor you buy in 2025 is an abyss. It almost seems like hardware emulation. I could argue that the layers of memory caching that modern hardware has are themselves another layer vs the days when we sent instructions to change which memory banks to read. The fact that some addresses are very cheap and others are not, and the complexity is handled in hardware, is very different from stashing data in extra registers we didn't need this loop. The virtualization any OS does for us is much deeper than even a big mainframe that was really running a dozen things at once. It only doesn't look like additional layers if you look from a mile away.

The majority of software today is written without knowing even which architecture the processor is going to be, how much of the processor we are going to have, whether anything will ever fit in memory... hell, we can write code that doesn't know not just the virtual machine it's going to run in, but even the family of virtual machine. I have written code that had no idea if it was running in a JVM, LLVM or a browser!

So when I compare my code from the 80s to what I wrote this morning, the distance from the hardware doesn't seem even remotely similar. I bet someone is writing hardware specific bits somewhere, and that maybe someone's debugging assembly might actually resemble what the hardware runs, maybe. But the vast majority of code is completely detached from anything.

At the company I work for, I routinely mock the software devs for solving every problem by adding yet another layer of abstraction. The piles of abstractions these people layer on are mind-numbingly absurd. Half the things they are fixing, if not more, are created by the abstractions in the first place.
Yeah, I remember watching a video of (I think?) a European professor who helped with an issue devs were having in developing The Witness. Turns out they had a large algorithm they developed in high level code (~2000 lines of code? can't remember) to place flora in the game world, which took minutes to process, and it was hampering productivity. He looked at it all, and redid almost all of it in something like <20 lines of assembly code, and it achieved the same result in microseconds. Unfortunately, I can't seem to find that video anymore...

Frankly though, when I bring stuff like this up, it feels like I'm the one being mocked rather than the other way around, like we're the minority. And sadly, I'm not sure if anything can ultimately be done about it. People just don't know what they don't know. Some things you can't tell people despite trying to, they just won't get it.

It's Casey Muratori, he's an American gamedev, not a professor. The video was a recorded guest lecture for a uni in the Netherlands though.

And it wasn't redone in assembly, it was C++ with SIMD intrinsics, which might as well just be assembly.


See Plato's Cave. As an experienced dev, I have seen sunlight and the outside world, and so many devs think shadow puppets in a cave are life.


That really sounds like no one bothered profiling the code. Which I'd say is underengineered, not over.
what's your point exactly? what do you hope to achieve by "bringing it up" (I assume in your workplace)?

most programmers are not able to solve a problem like that in 20 lines of assembly or whatever, and no amount of education or awareness is going to change that. acting as if they can is just going to come across as arrogant.

> There's still plenty of hardware-specific APIs, you still debug assembly when something crashes, you still optimize databases for specific storage technologies and multimedia transcoders for specific CPU architectures...

You might, maybe, but an increasing proportion of developers:

- Don't have access to the assembly to debug it

- Don't even know what storage tech their database is sitting on

- Don't know or even control what CPU architecture their code is running on.

My job is debugging and performance profiling other people's code, but the vast majority of that is looking at query plans. If I'm really stumped, I'll look at the C++, but I've not yet once looked at assembly for it.

This makes sense to me. When I optimize, the most significant gains I find are algorithmic. Whether it's an extra call, a data structure that needs to be tweaked, or just utilizing a library that operates closer to silicon. I rarely need to go to assembly or even a lower level language to get acceptable performance. The only exception is occasionally getting into architecture specifics of a GPU. At this point, optimizing compilers are excellent and probably have more architecture details baked into them than I will ever know. Thank you, compiler programmers!
> At this point, optimizing compilers are excellent

the only people that say this are people who don't work on compilers. ask anyone that actually does and they'll tell you most compilers are pretty mediocre (tend to miss a lot of optimization opportunities), some compilers are horrendous, and a few are good in a small domain (matmul).

It's more that the gods of Moore's Law have given us so many transistors that we are essentially always I/O blocked, so it effectively doesn't matter how good our assembly is for all but the most specialized of applications. Good assembly, bad assembly, whatever: the point is that your thread is almost always going to be blocked waiting for I/O (disk, network, human input) rather than on something that a fancy loop optimization enabling better branch prediction can fix.
I don’t understand how you could say something like HTTP or Cloud Functions or React aren’t abstractions that software developers take for granted.
These days, even if one writes in machine code, it will be quite far away from the real silicon, as that code has little to do with what the CPU is actually doing. I suspect that C source code from, say, the nineties was closer to the truth than modern machine code.
The abstractions manifest more on the language level. No memory management, simplified synchronization primitives, no need for compilation.

Not sure virtual machines are fundamentally different. In the end, if you have 3 virtual or 3 physical machines, the most important difference is how fast you can change their configuration. They will still have all the other concepts (network, storage, etc.). The automation that comes with VMs is better than it was for physical machines (probably), but then automation for everything got better (not only for machines).

The details matter because someone has to understand the details, and it's quicker and more cost-effective if it's the developer.

At my job, a decade ago our developers understood how things worked, what was running on each server, where to look if there were problems, etc. Now the developers just put magic incantations given to them by the "DevOps team" into their config files. Most of them don't understand where the code is running, or even what much of it is doing. They're unable or unwilling to investigate problems on their own, even if they were the cause of the issue. Even getting them to find the error message in the logs can be like pulling teeth. They rely on this support team to do the investigation for them, but continually swiveling back-and-forth is never going to be as efficient as when the developer could do it all themselves. Not to mention it requires maintaining said support team, all those additional salaries, etc.

(I'm part of said support team, but I really wish we didn't exist. We started to take over Ops responsibilities from a different team, but we ended up taking on Dev ones too and we never should've done that.)


This blog has a brilliant insight that I still remember more than a decade later: we live in a fantasy setting, not a Sci-fi one. Our modern computers are so unfathomably complex that they are demons, ancient magic that can be tamed and barely manipulated, but not engineered. Modern computing isn't Star Trek TNG, where Captain Picard and Geordi LaForge each have every layer of their starship in their heads with full understanding, and they can manipulate each layer independently. We live in a world where the simple cell phone in our pocket contains so much complexity that it is beyond any 10 human minds combined to fully understand how the hardware, the device drivers, the OS, the app layer, and the internet all interact between each other.

> We live in a world where the simple cell phone in our pocket contains so much complexity that it is beyond any 10 human minds combined to fully understand how the hardware, the device drivers, the OS, the app layer, and the internet all interact between each other.

Try tens of thousands of people. A mobile phone is immensely more complicated than people realize.

Thank you for writing it so eloquently. I will steal it.

> why do the details matter?

This statement encapsulates nearly everything that I think is wrong with software development today. Captured by MBA types trying to make a workforce that is as cheap and replaceable as possible. Details are simply friction in a machine that is obsessed with efficiency to the point of self-immolation. And yet that is the direction we are moving in.

Details matter, process matters, experience and veterancy matter. Now more than ever.

I used to think this, but it only works if the abstractions hold. It’s as if we gave up random access memory and went back to tape drives: suddenly the abstractions matter.

My comment elsewhere goes into a bit more detail, but basically silicon stopped being able to make single threaded code faster in about 2012 - we have just been getting “more parallel cores” since. And now at wafer scale we see 900,000 cores on a “chip”. When 100% parallel code runs 1 million times faster than your competitors' code, when following one software engineering path leads to code that can run 1M X faster, then we will find ways to use that excess capacity - and the engineers who can do it get to win.

I’m not sure how LLMs face this problem.


As soon as the abstractions leak or you run into an underlying issue you suddenly need to understand everything about the underlying system or you're SOOL.

I'd rather have a simpler system whose underlying abstractions I already understand.

The overhead of this is minimal when you keep things simple and avoid shiny things.

That's why abstractions like PyTorch exist. You can write a few lines of Python and get good utilization of all those GPU cores.
You described two points in a spectrum in which:

One end is PaaS like Heroku, where you just git push. The other end is bare metal hosting.

Every option you mentioned (VPS, Managed K8S, Self Hosted K8S, etc) falls somewhere between these two ends of the spectrum.

If a developer falls into any of these "groups" or has a preference/position on any of these solutions, they are just called juniors.

Where you end up in this spectrum is a matter of cost benefit. Nothing else. And that calculation always changes.

Those options only make sense where the cost of someone else managing it for you for a small premium gets higher than the opportunity/labor cost of you doing it yourself.

So, as a business, you _should_ not have a preference to stick to. You should probably start with PaaS, and as you grow, if PaaS costs get too high, slowly graduate into more self-managed things.

A company like fly.io is a PaaS. Their audience has always been, and will always be application developers who prefer to do nothing low-level. How did they forget this?

Aren’t we just continually moving up layers of abstractions? Most of the increasingly small group doesn’t concern itself with voltages, manually setting jumpers, hand-rolling assembly for performance-critical code, cache line alignment, raw disk sector manipulation, etc.

I agree it’s worthwhile to understand things more deeply but developers slowly moving up layers of abstractions seems like it’s been a long term trend.

> I'm increasingly coming to the view that there is a big split among "software developers" and AI is exacerbating it.

I don't think this split exists, at least in the way you framed it.

What does exist is workload, and problems that engineers are tasked with fixing. If you are tasked with fixing a problem or implementing a feature, you are not tasked with learning all the minute details or specifics of a technology. You are tasked with getting shit done, which might even turn out to not involve said technology. You are paid to be a problem-solver, not an academic expert on a specific module.

What you tried to describe as "magic" is actually the balance between broad knowledge vs specialization, or being a generalist vs specialist. The bulk of the problems that your average engineer faces requires generalists, not specialists. Moreover, the tasks that actually require a specialist are rare, and when those surface the question is always whether it's worth investing in a specialist. There are diminishing returns on investment, and throwing a generalist at the problem will already get some results. You give a generalist access to an LLM and he'll cut down on the research time to deliver something close to what a specialist would deliver. So why bother?

With this in mind, I would go as far as to say that the scenario backhandedly described as "want to understand where their code is running and what it's doing" (as if no engineer needs to have insight on how things work?), as opposed to the dismissive "just wants to `git push` and be done with it" scenario, can actually be classified as a form of incompetence. You, as an engineer, only have so many hours per day. Your day-to-day activities involve pushing new features and fixing new problems. To be effective, your main skill set is to learn the system in a JIT way, dive in, fix it, and move on. You care about system traits, not low-level implementation details that may change tomorrow on a technology you may not even use tomorrow. If, instead, you feel the need to waste time on topics that are irrelevant to the immediate needs of your role, you are failing to deliver value. I mean, if you frame yourself as a Kubernetes expert who even knows commit hashes by heart, does that matter if someone asks you, say, why a popup box is showing off-center?

I'm not entirely certain. Or perhaps we're all part of both groups.

I want to understand LLMs. I want to understand my compiler, my gc, my type system, my distributed systems.

On the other hand, I don't really care about K8s or anything else, as long as I have something that works. Just let me `git push` and focus on making great things elsewhere.

I am the former. I also make cost benefit based decisions that involve time. Unless I have very specific configuration needs, the git push option lets me focus on what my users care about and gives me one less thing that I need to spend my time on.

Increasingly, Fly even lets you dip into more complex configurations too.

I’ve got no issue with using Tofu and Ansible to manage my own infrastructure but it takes time to get it right and it’s typically not worth the investment early on in the lifecycle.

>who don't like "magic" and want to understand where their code is running and what it's doing.

I just made this point in a post on my substack. Especially in regulated industries, you NEED to be able to explain your AI to the regulator. You can't have a situation where a human says "Well, gee, I don't know. The AI told me to do it."

"Enjoys doing linux sysadmin" is not the same as "Wants to understand how things work". It's weird to me that you group those two kinds of people in one bucket.
I feel like fly.io prioritizes a great developer experience and I think that appeals to engineers who both do and don't like magic.

But the real reason I like fly.io is because it is a new thing that allows for new capabilities. It allows you to build your own Cloudflare by running full virtual machines colocated next to appliances in a global multicast network.

> they're willing to spend a lot of (usually their employer's) money

May just be my naïveté, but I thought that something like ECS or EKS is much cheaper than an in-house k8s engineer.

> There's an (increasingly small) group of software developers who don't like "magic" and want to understand where their code is running and what it's doing.

That problem started so long ago and has gotten so bad that I would be hard pressed to believe there is anyone on the planet who could take a modern consumer PC and explain what exactly is going on in the machine without relying on any abstractions to understand the actual physical process.

Given that, it’s only a matter of personal preference where you draw the line for magic. As other commenters have pointed out, your line allowing for Kubernetes is already surprising to a lot of people.

> I'm increasingly coming to the view that there is a big split among "software developers" and AI is exacerbating it

This is admittedly low effort but the vast majority of devs are paid wages to "write CRUD, git push and magic" their way to the end of the month. The company does not afford them the time and privilege of sitting down and analyzing the code with a fine-tooth comb. An abstraction that works is good enough.

The seasoned seniors get paid much more and afforded leeway to care about what is happening in the stack, since they are largely responsible for keeping things running. I'm just pointing out it might merely be a function of economics.

I call this difference being a developer who is on call vs being a developer who is not on call
I don't think this is entirely correct. I'm working for a company that does IT Consulting and so I see many Teams working on many different Projects and one thing I have learned the hard way is that Companies and Teams that think they should do it all themselves are usually smaller companies and they often have a lot of Problems with that attitude.

Just an example I recently came across: Working for a smaller company that uses Kubernetes and manages everything themselves with a small team. The result: They get hacked regularly and everything they run is constantly out of date because they don't have the capacity to actually manage it themselves. And it's not even cheaper in the long run because Developer Time is usually more expensive than just paying AWS to keep their EKS up to date.

To be fair, in my home lab I also run everything bare metal and keep it updated, but I run everything behind a VPN connection and run a security scanner every weekend that automatically kills any service in which it finds a CVE above Medium severity, and I fix it when I get the time to do it.

As a small Team I can only fix so much and keep so much up to date before I get overwhelmed or the next customer Project gets forced upon me by Management with Priority 0, who cares about security updates.

I'd strongly suggest to use as much managed service as you can and focus your effort as a team on what makes your Software Unique. Do you really need to hire 2-3 DevOps guys just to keep everything running when GCP Cloud Run "just werks"?

Everything we do these days runs on so many levels of abstraction anyway, it's no shame to share cost of managing the lower levels of abstraction with others (using managed Service) and focus on your product instead. Unless you are large enough to pay for whole teams that deal with nothing but infrastructure to enable other teams to do Application Level Programming you are, in my limited experience, just going to shoot yourself in the foot.

And again, just to emphasize it: I like to do everything myself because for privacy reasons I use as few services that aren't under my control as possible, but I would not recommend this to a customer because it's neither economical nor does it work well in my, albeit limited, experience.

I might be an outlier. I like to think I try for a deeper understanding of what I’m using. Like, fly uses firecracker vms afaik. Sometimes, especially for quick projects or testing ideas I just want to have it work without wrangling a bunch of AWS services. I’m typically evaluating is this the right tool or service and what is the price to convenience? For anything potentially long term, what’s the amount of lock in when or if I want to change providers?
Also "They want LLMs" lol. I can't remember being asked. I don't want to use AI for coding.
I agree that split exists, and that the former is more rare, but in my experience the split is less about avoiding magic and more about keeping control of your system.

Many, likely most, developers today don't care about controlling their system/network/hardware. There's nothing wrong with that necessarily, but it is a pretty fundamental difference.

One concern I've had with building LLM features is whether my customers would be okay with me giving their data over to the LLM vendor. Say I'm building a tool for data analysis, is it really okay to a customer for me to give their table schemas or access to the data itself to OpenAI, for example?

I rarely hear that concern raised though. Similarly when I was doing consulting recently, I wouldn't use copilot on client projects as I didn't want copilot servers accessing code that I don't actually own the rights to. Maybe it's overprotective though, I have never heard anyone raise that concern so maybe it's just me.

It’s not about wanting, it’s about what the job asks for. As a self employed engineer I am paid to solve business problems in an efficient way. Most of the time it just makes more business sense for the client and for me to pay to just have to git push if there are no performance challenges needing custom infrastructure.
I don't agree, I think you're just describing two sides of the same coin.

As a software developer I want strong abstractions without bloat.

LLMs are so successful in part because they are a really strong abstraction. You feed in text and you get back text. Depending on the model and other parameters your results may be better or worse, but changing from eg. Claude to ChatGPT is as simple as swapping out one request with another.
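
A rough sketch of why that boundary is so easy to swap across. The two backend classes here are hypothetical stand-ins, not real vendor clients, and no real vendor API is modeled; the only point is that application code sees a single "string in, string out" method.

```python
from typing import Protocol


class TextModel(Protocol):
    """The whole abstraction: a prompt string goes in, a reply string comes out."""

    def complete(self, prompt: str) -> str: ...


class EchoBackend:
    """Hypothetical stand-in for one vendor's client."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class ShoutBackend:
    """Hypothetical stand-in for a second vendor's client."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


def summarize(model: TextModel, text: str) -> str:
    # Application code depends only on the narrow interface, so swapping
    # vendors means passing a different object, not rewriting this function.
    return model.complete(f"Summarize: {text}")


if __name__ == "__main__":
    for backend in (EchoBackend(), ShoutBackend()):
        print(summarize(backend, "GPUs are a leaky abstraction"))
```

Output quality will differ between real backends, but nothing in `summarize` has to know about drivers, GPU memory, or scheduling.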

If what I want is to run AI tasks, then GPUs are a poor abstraction. It's very complicated (as Fly have discovered) to share them securely. The amount of GPU you need could vary dramatically. You need to worry about drivers. You need to worry about all kinds of things. There is very little bloat to the ChatGPT-style abstraction, because the network overhead is a negligible part of the overall cost.

If I say I don't want magic, what I really mean is that I don't trust the strength of the abstraction that is being offered. For example, when a distributed SQL database claims to be PostgreSQL compatible, it might just mean it's wire compatible, so none of my existing queries will actually work. It might have all the same functions but be missing support for stored procedures. The transaction isolation might be a lie. It's not that these databases are bad, it's that "PostgreSQL as a whole" cannot serve as a strong abstraction boundary - the API surface is simply too large and complex, and too many implementation details are exposed.

It's the same reason people like containers: running your application on an existing system is a very poor abstraction. The API surface of a modern linux distro is huge, and includes everything from what libraries come pre-installed to the file-system layout. On the other hand the kernel API is (in comparison) small and stable, and so you can swap out either side without too much fear.

K8S can be a very good abstraction if you deploy a lot of services to multiple VMs and need a lot of control over how they are scaled up and down. If you're deploying a single container to a VM, it's massively bloated.

TLDR: Abstractions can be good and bad, both inherently, and depending on your use-case. Make the right choice based on your needs. Fly are probably correct that their GPU offering is a bad abstraction for many of their customers' needs.

I don't think you got this split right.

I prefer to either manage software directly with no wrappers on top, or use a fully automated solution.

K8S is something I'd rather avoid. Do you enjoy writing configuration for your automation layer?

All professional developers want two things: do their work as fast as possible and spend as little budget as possible to make things work. That's the core operating principle of most companies.

What's changing is that managed solutions are becoming increasingly easy to set up and increasingly cheap at smaller scales.

While I do personally enjoy understanding the entire stack, I can't justify self-hosting and managing an LLM until we run so many prompts a day that it becomes cheaper for us to run our own GPUs compared to just running APIs like OpenAI/Anthropic/Deepseek/...
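
That break-even point is simple arithmetic. A back-of-the-envelope sketch, where every number is a made-up placeholder rather than a real price, and the function names are my own:

```python
def monthly_api_cost(prompts_per_day: float, cost_per_prompt: float, days: int = 30) -> float:
    """Pay-per-use cost: scales linearly with prompt volume."""
    return prompts_per_day * cost_per_prompt * days


def break_even_prompts_per_day(
    self_host_monthly_cost: float, cost_per_prompt: float, days: int = 30
) -> float:
    """Daily prompt volume at which a fixed self-hosting bill equals the API bill."""
    if cost_per_prompt <= 0:
        raise ValueError("cost_per_prompt must be positive")
    return self_host_monthly_cost / (cost_per_prompt * days)


if __name__ == "__main__":
    # Hypothetical figures: GPU rental plus the engineer time to run it,
    # versus an average per-prompt price from an API vendor.
    gpu_and_ops_per_month = 6000.0
    api_cost_per_prompt = 0.01
    threshold = break_even_prompts_per_day(gpu_and_ops_per_month, api_cost_per_prompt)
    print(f"Self-hosting pays off above roughly {threshold:,.0f} prompts/day")
```

The fixed side of the comparison should include labor, not just hardware, which is usually what pushes the threshold higher than people expect.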

I was thinking about this just yesterday. I was advertised a device for an aircraft to geo-assist taxiing? (I have never flown so I don’t know why). The comments were the usual “old man shouts at cloud” angry that assistive devices make lives easier for people.

I feel this is similar to what you are pointing out. Why _shouldn’t_ people be the “magic” users? When was the last time one of your average devs looked into how ESM loading works? Or the Python interpreter, or V8? Or how it communicates with the OS and lower level hardware interfacing?

This is the same thing. Only you are goalpost shifting.

> There's an (increasingly small) group of software developers who don't like "magic" and want to understand where their code is running and what it's doing. (...) The other group (increasingly large) just wants to `git push` and be done with it

I think we're approaching the point where software development becomes a low-skilled job, because the automatic tools are good enough to serve business needs, while manual tools are too difficult to understand by anyone but a few chosen ones anyway.

I think it's true that engineers who want to understand every layer of everything in depth, or who want to have platform ownership, are not necessarily the same group as the more "product itself" focused sort who want to write something and just push it. But I'm not sold at all that any of these groups, in a vacuum, has substantial demand for GPU compute, unless that's someone's area of interest for a pet project.
> The other group (increasingly large) just wants to `git push` and be done with it, and they're willing to spend a lot of (usually their employer's) money to have that experience. They don't want to have to understand DNS, linux, or anything else beyond whatever framework they are using.

lol, even understanding git is hard for them. Increasingly, software engineers don't want to learn their craft.

The way I think about it is this: any individual engineer (or any individual team) has a limited complexity budget (in other words, how much can you fit in your meat brain). How you spend it is a strategic decision. Depending on your project, you may not want to waste it on infra so you can fit a lot of business logic complexity.
increasingly small is right. i'm definitely part of that former group but sadly more and more these days i just feel dumb for being this way. it usually just means that i'm less productive than my colleagues in practice as i'm spending time figuring out how things work while everybody else is pushing commits. maybe if we were put in a hypothetical locked room with no internet access i'd have a slightly easier time than them but that's not helpful to anybody.

once upon a time i could have said that it's better this way and that everybody will be thankful when i'm the only person who can fix something, but at this point that isn't really true when anybody can just get an LLM to walk them through it if they need to understand what's going on under the hood. really i'm just a nerd and i need to understand if i want to sleep at night lol.

You lost me at "Kubernetes".
The former have the mentality of being independent, at the cost of their ability to produce a result as quickly. The latter are happy to be dependent, because the result is more important than the means. Obviously this is a spectrum.
It depends on the product you're building. At my last job we hosted bespoke controlnet-guided diffusion models. That means k8s+GPUs was a necessity. But I would have loved to use something simpler than k8s.
I don’t think this comment does justice to fly.io.

They have incredible defaults that can make it as simple as just running ‘git push’ but there isn’t really any magic happening, it’s all documented and configurable.

Where does this dichotomy between Kubernetes and superficial understanding come from? It is not consistent with my experience, and I can't even speculate about its origin.
It's been a while since I tried, but my experience trying to manually set up GPUs was atrocious, and with investigation generally ending at the closed-source NVidia drivers it's easy to feel disempowered pretty quickly. I think my biggest learning from trying to do DL on a manually set up computer was simply that GPU setup was awful and I never wanted to deal with it. It's not that I don't want to understand it, but with NVidia software you're essentially not allowed to understand it. If open source drivers or open GPU hardware were released, I would gladly learn how that works.
The latter group sounds like they're more managers than software developers.
The view that developers just want LLMs is plain wrong. The age of AI is just starting.
Somebody who doesn’t want to understand DNS, Linux, or anything beyond their framework is a hazard. They’re not able to do a competent code review on the vomit that LLMs produce. (Am I biased much?)
I'd be in the latter group if my budget were infinite. Alas!
> They don't want to have to understand DNS, linux, or anything else beyond whatever framework they are using.

Tell me whether there are many bricklayers who want to understand the chemical composition of their bricks.

IaaS or PaaS?

Who owns and depreciates the logs, backups, GPUs, and the database(s)?

K8s docs > Scheduling GPUs: https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus... :

> Once you have installed the plugin, your cluster exposes a custom schedulable resource such as amd.com/gpu or nvidia.com/gpu.

> You can consume these GPUs from your containers by requesting the custom GPU resource, the same way you request cpu or memory
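
As a small illustration of what that request looks like, here is a sketch that builds a Pod manifest with a GPU limit and prints it as JSON. The pod name and image are placeholders, and `gpu_pod_manifest` is a helper invented for this example; the `nvidia.com/gpu` resource name only exists once the device plugin is installed.

```python
import json


def gpu_pod_manifest(name: str, image: str, gpus: int = 1,
                     resource: str = "nvidia.com/gpu") -> dict:
    """Build a Pod manifest that asks for `gpus` units of a device-plugin resource."""
    if gpus < 1:
        raise ValueError("request at least one GPU")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "OnFailure",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # GPUs go under `limits` and are requested as whole integers.
                    "resources": {"limits": {resource: gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder name and image, not real artifacts.
    print(json.dumps(gpu_pod_manifest("gpu-job", "example.com/my-model:latest"), indent=2))
```

The printed JSON can be piped to `kubectl apply -f -`, since kubectl accepts JSON manifests as well as YAML.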

awesome-local-ai: Platforms / full solutions https://github.com/janhq/awesome-local-ai?platforms--full-so...

But what about TPUs (Tensor Processing Units) and QPUs (Quantum Processing Units)?

Quantum backends: https://github.com/tequilahub/tequila#quantum-backends

Kubernetes Device Plugin examples: https://kubernetes.io/docs/concepts/extend-kubernetes/comput...

Kubernetes Generic Device Plugin: https://github.com/squat/generic-device-plugin#kubernetes-ge...

K8s GPU Operator: https://docs.nvidia.com/datacenter/cloud-native/gpu-operator...

Re: Sunshine server and Moonlight for 120 FPS 4K HDR access to GPU output over the Internet: https://github.com/kasmtech/KasmVNC/issues/305#issuecomment-... :

> Still hoping for SR-IOV in retail GPUs.

> Not sure about vCPU functionality in GPUs

Process isolation on vCPUs with or without SR-IOV is probably not as advanced as secure enclave approaches.

Intel SGX is a secure enclave capability, which is cancelled on everything but Xeon. FWIU there is no SGX for timeshared GPUs.

What executable loader reverifies the loaded executable in RAM after init time?

What LLM loader reverifies the in-RAM model? Can Merkle hashes reduce the cost of NN state verification?
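
To make the Merkle idea concrete, here is a minimal sketch under my own assumptions: the model is treated as an opaque byte blob, the chunk size is arbitrary, and all function names are invented for illustration. It does not describe any existing loader. With a trusted root hash, a single chunk can be spot-checked using a logarithmic number of sibling hashes instead of rehashing the whole blob.

```python
import hashlib


def _leaf(chunk: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + chunk).digest()   # prefix separates leaves from inner nodes


def _node(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_tree(blob: bytes, chunk_size: int = 4096) -> list[list[bytes]]:
    """Return every tree level, leaves first; the last level holds only the root."""
    chunks = [blob[i:i + chunk_size] for i in range(0, len(blob), chunk_size)] or [b""]
    level = [_leaf(c) for c in chunks]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]               # odd count: duplicate the last node
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [_node(level[i], level[i + 1]) for i in range(0, len(level), 2)]


def proof_for(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Collect the sibling hash at each level on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof


def verify_chunk(chunk: bytes, index: int, proof: list[bytes], root: bytes) -> bool:
    """Recompute the path to the root from one chunk and compare with the trusted root."""
    node = _leaf(chunk)
    for sibling in proof:
        node = _node(node, sibling) if index % 2 == 0 else _node(sibling, node)
        index //= 2
    return node == root


if __name__ == "__main__":
    weights = bytes(range(256)) * 50                  # stand-in for model weights
    levels = build_tree(weights, chunk_size=1024)
    root = levels[-1][0]
    chunk = weights[3 * 1024:4 * 1024]
    print(verify_chunk(chunk, 3, proof_for(levels, 3), root))
```

This only lowers the cost of checking sampled chunks. It says nothing about who re-runs the check or when, and duplicating the last node on odd levels is a simplification that a hardened design would avoid.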

Can it be proven that a [chat AI] model hosted by someone else is what is claimed; that it's truly a response from "model abc v2025.02"?

PaaS or IaaS

Having worked with many of the latter and having had the displeasure of educating them on *nix systems fundamentals: ugh, oof, I hate this timeline, yet I also feel a sense of job security.

We used to joke about this a lot when Java devs would have memory issues and not know how to adjust the heap size in init scripts. So many “CS majors” who are completely oblivious to anything happening outside of the JVM, and plenty happening within it.

Eh, the way I see it the entire practice of computer science and software engineering is built on abstraction -- which can be described as the ability to not have to understand lower levels -- to only have to understand the API and not the implementations of the lowest levels you are concerned with, and to have to pay even less attention to lower levels than that.

I want to understand every possible detail about my framework and language and libraries. Like I think I understand more than many do, and I want to understand more, and find it fulfilling to learn more. I don't, it's true, care to understand the implementation details of, say, the OS. I want to know the affordances it offers me and the APIs that matter to me, I don't care about how it's implemented. I don't care to understand more about DNS than I need. I definitely don't care to spend my time futzing with kubernetes -- I see it as a tool, and if I can use a different tool (say heroku or fly.io) that lets me not have to learn as much -- so I have more time to learn every possible detail of my language and framework, so I can do what I really came to do, develop solutions as efficiently and maintainably as possible.

You are apparently interested in lower levels of abstraction than I am. Which is fine! Perhaps you do ops/systems/sre and don't deal with the higher levels of abstraction as much as I do -- that is definitely lucrative these days, there are plenty of positions like that. Perhaps you deal with more levels of abstraction but don't go as deep as me -- or, and I totally know it's possible, you just have more brain space to go as deep or deeper on more levels of abstraction as me. But even you probably don't get into the implementation details of electrical engineering and CPU design? Or if you do, and also go deep on frameworks and languages, I think you belong to a very very small category!

But I also know developers who, to me, don't want to go too deep on any of the levels of abstraction. I admit I look down on them, as I think you do too; they seem like copy-paste coders who will never be as good at developing efficient maintainable solutions.

I started this post saying I think that's a different axis than what layers of abstraction one specializes in or how far down one wants to know the details. But as I get here, while I still think that's likely, I'm willing to consider that these developers I have not been respecting -- are just going really deep in even higher levels of abstraction than me? Some of them maybe, but honestly I don't think most of them, but I could be wrong!

> They don't want to have to understand DNS, linux, or anything else beyond whatever framework they are using.

This is baffling. What’s the value proposition here? At some point the customer will be directly asking an AI agent to create an app for them and it will take care of coding/deployment for them.

Some people became software developers because they like learning and knowing what they're doing, and why and how it works.

Some people became software developers because they wanted to make easy money back when the industry was still advertising bootcamps (in order to drive down the cost of developers).

Some people simply drifted into this profession by inertia.

And everything in-between.

From my experience there are a lot of developers who don't take pride in their work, and just do it because it pays the bills. I wouldn't want to be them but I get it. The thing is that by delegating all their knowledge to the tools they use, they are making themselves easy to replace, when the time comes. And if they have to fix something on their own, they can't. Because they don't understand why and how it works, and how and why it became what it is instead of something else.

So they call me and ask me how that thing works...

This is my experience as well. I answer many such calls from devs as part of my work.

I can usually tell at the end of a call which group they belong to. I've been wrong a few times too.

As long as they don't waste my time I'm fine with everyone, some people just have other priorities in life.

One thing I'd say is in my experience there are many competent and capable people in every group, but non-competent ones are extremely rare in the first group.

The value proposition is that you know how to fix something when it eventually breaks, because you fundamentally understand it.