I'm interested in these kinds of kernels for running very high-performance, network/IO-specific services on bare metal, with minimal system complexity/overhead and hopefully better (potential) stability and security.

The big concern I have however is hardware support, specifically networking hardware.

I think a very interesting approach would be to boot the machine with a FreeBSD or Linux kernel, just for hardware and network support, and use a sort of Rust OS/abstraction layer for the rest, bypassing or simply not using the originally booted kernel for all the userland-specific stuff.

Couldn't you just boot the Linux kernel directly and launch a generic app as PID 1 instead of a full-blown init system with a bunch of daemons?

That's basically what you're getting with Docker containers and a shared kernel. AWS Lambda does something similar with dedicated kernels via Firecracker VMs.

Yes, but I wanted to bypass having the complexity of the Linux kernel completely, too.

Basically a single app talking directly to the network (the world), with as little else as possible in between.

Yes, you can. You can even have a different PID 1 configure whatever is needed and then replace its core image with the new PID 1.
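
A minimal sketch of what such a PID 1 could look like, in case it helps. Everything here is illustrative: the paths and the /usr/bin/myserver binary are made up, and it assumes the kernel was booted with init= pointing at this program.

    /* tiny-init.c: mount the essentials, then become the app (hypothetical example). */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/mount.h>

    int main(void) {
        /* A bare kernel boot provides almost nothing; mount the usual pseudo-filesystems. */
        mount("proc", "/proc", "proc", 0, NULL);
        mount("sysfs", "/sys", "sysfs", 0, NULL);
        mount("devtmpfs", "/dev", "devtmpfs", 0, NULL);

        /* Replace PID 1's image with the actual service (the "replace its core image" step). */
        char *argv[] = { "/usr/bin/myserver", NULL };
        execv(argv[0], argv);

        perror("execv");  /* only reached if exec failed; PID 1 exiting panics the kernel */
        return 1;
    }
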
If you want truly high-performance networking, you can bypass the kernel altogether with DPDK. So you don't have to worry about alternative kernels for other tasks at all. On the downside, DPDK takes over the NIC entirely, removing the kernel from the equation, so if you need the kernel to see network traffic for some reason, it won't work for you.

You can check out hardware support here: https://core.dpdk.org/supported/nics/
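
To make the tradeoff concrete, here's a rough sketch of the DPDK programming model: the application claims the port and busy-polls the hardware RX ring directly, so the kernel never sees the traffic. The port number, queue sizes, and mempool parameters below are arbitrary, and real code needs error checking plus EAL arguments.

    /* Rough DPDK sketch: the app owns the NIC and polls it directly. */
    #include <rte_eal.h>
    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    int main(int argc, char **argv) {
        rte_eal_init(argc, argv);                       /* take over hugepages, PCI devices, etc. */

        struct rte_mempool *pool = rte_pktmbuf_pool_create(
            "mbuf_pool", 8192, 256, 0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());

        struct rte_eth_conf conf = {0};
        uint16_t port = 0;                              /* first DPDK-bound port */
        rte_eth_dev_configure(port, 1, 1, &conf);       /* 1 RX queue, 1 TX queue */
        rte_eth_rx_queue_setup(port, 0, 1024, rte_eth_dev_socket_id(port), NULL, pool);
        rte_eth_tx_queue_setup(port, 0, 1024, rte_eth_dev_socket_id(port), NULL);
        rte_eth_dev_start(port);

        struct rte_mbuf *bufs[32];
        for (;;) {
            /* Busy-poll the hardware RX ring; the kernel is out of the picture. */
            uint16_t n = rte_eth_rx_burst(port, 0, bufs, 32);
            for (uint16_t i = 0; i < n; i++)
                rte_pktmbuf_free(bufs[i]);              /* ...process, then free */
        }
    }
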

This was true a decade ago; with modern io_uring, DPDK is probably an anti-pattern.
Interesting, it's been a while since I looked at this stuff, so I did a little searching and found this: https://www.diva-portal.org/smash/get/diva2:1789103/FULLTEXT...

Their conclusion is io_uring is still slower but not by much, and future improvements may make the difference negligible. So you're right, at least in part. Given the tradeoffs, DPDK may not be worth it anymore.

There are also just a bunch of operational hassles with using DPDK or SPDK. Your usual administrative commands don't work. Other operations aren't intermediated by the kernel -- instead, devices are 100% dedicated to the application. Device counters usually tracked by the kernel aren't. Etc. It can be fine, but if io_uring doesn't add too much overhead, it's a lot more convenient.
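
For comparison, a minimal sketch of the io_uring side using liburing (the ring_recv name and the already-connected sockfd are assumptions; socket setup and error handling are omitted). Packets still flow through the kernel stack, but submissions and completions are batched through shared rings rather than one syscall per operation.

    /* Minimal liburing sketch: one recv() on an existing socket, via the ring. */
    #include <liburing.h>
    #include <stddef.h>

    int ring_recv(int sockfd, char *buf, size_t len) {
        struct io_uring ring;
        io_uring_queue_init(64, &ring, 0);              /* 64-entry SQ/CQ, default flags */

        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_recv(sqe, sockfd, buf, len, 0);   /* queue the recv */
        io_uring_submit(&ring);                         /* one syscall for the whole batch */

        struct io_uring_cqe *cqe;
        io_uring_wait_cqe(&ring, &cqe);                 /* block until completion */
        int n = cqe->res;                               /* bytes received, or -errno */
        io_uring_cqe_seen(&ring, cqe);

        io_uring_queue_exit(&ring);
        return n;
    }
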
"io_uring had a maximum throughput of 5.0 Gbit/s "

Wut? More than 10 years ago, a cheap beige box could saturate a 1Gbps link with a kernel as it came from e.g. Debian w/o special tuning. A somewhat more expensive box could get a good share of a 10Gbps link (using jumbo frames), so these new results are, er, somewhat underwhelming.

That's an interesting and valuable study. I was slightly disappointed though that only a single host was used in the 'network' performance tests:

"SR-IOV was used on the NIC to enable the use of virtual functions, as it was the only NIC that was available during the study for testing and therefore the use of virtual functions was a necessity for conducting the experiments."

Not by much?? You're exaggerating..
If you use io_uring, you're subject to vulnerabilities in the kernel network stack, which you have no control over.
I'm not sure that's true for a good chunk of the workloads that dpdk really shines on.

A lot of the benefit of DPDK is colocating your data and network stack in the same virtual memory context. I can see io_uring getting you there if you're serving fixed files as a CDN, kind of like Netflix's appliances, but for cases where you're actually doing branchy work on the individual requests, DPDK is probably a little easier to scale up to the faster network cards.

I might be wrong, but if it's ABI compatible, the same drivers will work?

P.S.: I was wrong

>While we prioritize compatibility, it is important to note that Asterinas does not, nor will it in the future, support the loading of Linux kernel modules.

https://asterinas.github.io/book/kernel/linux-compatibility....

Linux doesn't even maintain ABI compatibility with itself; nobody else is going to manage it. What might work is that there are a couple of projects that maintain just enough API compatibility to reuse driver code from Linux (IIRC FreeBSD does this for some graphics drivers). But even then you're gambling on whether Linux decides to change implementation details one day, since internal APIs explicitly aren't stable.
The Linux kernel community takes ABI compatibility for userland very seriously. That developers in userland are frequently unwilling to understand issues surrounding ABI stability is not the fault of the Linux kernel.
Oh sure, the user-space ABI is stable; I meant kernel-space. Although I realize now that I failed to write that explicitly.
The past 30 years of the Linux kernel's evolution have proven that there is no need for a stable kernel ABI. That would make refactoring, adding new features, and porting to new platforms exceedingly difficult. Pretty much all of the proprietary kernel modules have either become open source or been replaced by open source alternatives. The Linux community doesn't need closed source kernel modules for VMware anymore, and even Nvidia has finally given up on their closed source GPU drivers. Proprietary Linux kernel modules have no place in the modern world.
It depends on your goals, but at least Torvalds believes driver availability is important, and an unstable ABI is known to hinder driver availability.
> even Nvidia has finally given up on their closed source GPU drivers.

lol. No. They just added a CPU and then offloaded all the closed source userspace driver code to it, leaving behind the same dumb open-sourceable kernel driver shim as before (i.e. instead of talking to userspace it talks to the GPU's CPU).

> The past 30 years of the Linux kernel's evolution has proven that there is no need for a stable kernel ABI.

What the last 30 years have shown is that there is actually a need for it; otherwise DKMS wouldn't be a thing. Heck, Intel's performance profiler can't keep up with the kernel changes, which means you get to pick between running an up-to-date kernel and being able to use the open source out-of-tree kernel module. The fact that Linux is alone in this should make it clear it's wrong. Heck, Android even wrote their own HAL to try to make it possible to update the kernel on older devices. It's an economics problem that the Linux kernel gets to pretend doesn't exist, but it's a bad philosophical position. It's possible to support refactoring and porting to new platforms while providing ABI compatibility, and Linux is way past the point where it would even be a minor inconvenience - all the code has ossified quite a bit anyway.

In general, the stable ABI is the one between kernel and userspace, while the ABI (and potentially even the API) on the inside (i.e. for drivers) can change with every kernel version (part of why it's so important to maintain drivers in-tree).
They mention this in https://github.com/asterinas/asterinas/blob/2af9916de92f8ca1...

> While we prioritize compatibility, it is important to note that Asterinas does not, nor will it in the future, support the loading of Linux kernel modules.

It's a lot "simpler" to support a Linux userland, since that means one needs to "just" emulate all the Linux syscalls, than to implement the literally countless internal APIs needed for drivers etc., as that would otherwise mean reimplementing essentially the whole Linux kernel, which is neither realistic nor very useful.
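
A purely illustrative sketch of what that boundary looks like: compatibility means filling in a table keyed by Linux syscall numbers and matching the Linux calling convention, not reimplementing internal driver APIs. The handler name and table size are made up; only the syscall number and errno value are real x86-64 values.

    /* Hypothetical syscall dispatch in a Linux-compatible (non-Linux) kernel. */
    #include <stdint.h>

    typedef int64_t (*syscall_fn)(uint64_t, uint64_t, uint64_t,
                                  uint64_t, uint64_t, uint64_t);

    /* Trivial example handler: x86-64 syscall 39 is getpid(). */
    static int64_t my_getpid(uint64_t a1, uint64_t a2, uint64_t a3,
                             uint64_t a4, uint64_t a5, uint64_t a6) {
        (void)a1; (void)a2; (void)a3; (void)a4; (void)a5; (void)a6;
        return 1;  /* whatever PID this kernel assigned to the calling task */
    }

    static syscall_fn syscall_table[512] = {
        [39] = my_getpid,   /* __NR_getpid on x86-64 */
        /* ...a few hundred more (read, write, mmap, futex, ...) to run a real userland */
    };

    /* Called from the syscall entry path with the register-passed arguments. */
    int64_t dispatch(uint64_t nr, uint64_t a1, uint64_t a2, uint64_t a3,
                     uint64_t a4, uint64_t a5, uint64_t a6) {
        if (nr >= 512 || !syscall_table[nr])
            return -38;     /* -ENOSYS for anything not (yet) implemented */
        return syscall_table[nr](a1, a2, a3, a4, a5, a6);
    }
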
And that's not all that simple, as has been experienced by the Solaris (never released(?) Linux branded zones), illumos (lx brand), and Windows (WSL1) developers who have tried to make existing kernels act like Linux.

It’s probably easier if the kernel’s key goal is to be compatible with the Linux ABI rather than being compatible with its earlier self while bolting on Linux compatibility.

> emulate all the Linux syscalls

and emulate the virtual filesystems (/proc/...)

No, it means you can run a Linux userland/apps on this kernel, to the level/depth they currently support, of course.

They might not yet implement everything that's needed to boot a standard Linux userland, but you could, say, boot straight into a web server built for Linux instead of booting into init, for example.

Why don't you just use a SmartNIC and P4? It won't get faster than running on the NIC itself.