You’re correct that it would be absurd to build a DC, but you left out the next-best thing, and the one that is VERY financially attractive: colo’ing. I can rent 1U for around $50-75/month, or if I want HA-ish (same rack in the same DC isn’t exactly HA, but it solves for hardware failure anyway), 5U would probably run $200-250/month or so, and that lets you run two nodes with HAProxy or what-have-you, sharing a virtual IP, fronting three worker nodes running K8s, or a Proxmox cluster, or whatever. The hardware is also stupidly cheap, because you don’t need anything remotely close to new, so for about $200/node, you’ll have more cores and memory than you know what to do with.
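
To make that concrete, here's a minimal sketch of the front-end pair (the IPs, names, and ports are made up; you'd tune modes, health checks, and timeouts for your app). keepalived owns the virtual IP and moves it to the surviving node; HAProxy spreads traffic across the workers:

    # /etc/keepalived/keepalived.conf (on both front-end nodes;
    # the standby uses "state BACKUP" and a lower priority)
    vrrp_instance VI_1 {
        state MASTER
        interface eth0
        virtual_router_id 51
        priority 100
        virtual_ipaddress {
            203.0.113.10/24   # the shared virtual IP
        }
    }

    # /etc/haproxy/haproxy.cfg (identical on both nodes)
    defaults
        mode tcp
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    frontend www
        bind *:443
        default_backend workers

    backend workers
        balance roundrobin
        server k8s1 10.0.0.11:30080 check
        server k8s2 10.0.0.12:30080 check
        server k8s3 10.0.0.13:30080 check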

The DC will handle physical service (remote hands) for you if something breaks; you just pay for parts and labor.

All of this requires knowledge, of course, but it’s hardly an impossible task. Go look at what the more serious folk in r/homelab (or r/datacenter) are up to; it’ll surprise you.

For $200/month, I can have 2 ALBs, 2 ECS services, 2 CloudWatch log groups, and 2 RDS instances on AWS (one each for dev and prod) and a GitHub Team account with enough included runner minutes to cover most deployments. A colo is going to be more hassle, and I'll have to monitor more things (like system upgrades and intrusion attempts). I'd also have to amortize parts and labor as part of the cost, which is going to push the price up. If I need all that capacity, then the colo is definitely the better bet. But if I don't, and a small shop usually doesn't, then managed infrastructure is going to be preferable.
For $200/month and all that auxiliary infrastructure, those two RDS instances will be running on the equivalent compute power of an iPhone 5s…
A Postgres `db.m6g.large` (the cheapest non-burstable instance) runs $114/month for a single AZ, and that's not counting storage or bandwidth. A `db.t4g.medium` runs $47/month, again, not counting storage or bandwidth. An ALB that somehow only consumed a single LCU per month would run $22. The rest of the mentioned items will vary wildly depending on the application, but between those and bandwidth - not to mention GitHub's fees - I sincerely doubt you'd come in anywhere close to $200. $300 maybe, but as the sibling comment mentioned, the instances you'll have will be puny in comparison.
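
Back-of-envelope with just the per-item quotes above (one m6g.large for prod, one t4g.medium for dev, two minimal ALBs), before storage, bandwidth, ECS, CloudWatch, or GitHub fees:

    $ echo $(( 114 + 47 + 22 + 22 ))
    205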

> I'll have to monitor more things (like system upgrades and intrusion attempts)

You very much should be monitoring / managing those things on AWS as well. For system upgrades, `unattended-upgrades` can keep security patches (or anything else if you'd like, though I wouldn't recommend that unless you have a canary instance) up to date for you. For kernel upgrades, historically it's meant reboots, though there has been a smattering of live-patching tools like Ksplice, kGraft, and the latest addition, from GEICO of all places: tuxtape [0].
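
On Debian/Ubuntu, for instance, setting that up is roughly this (the default policy only pulls from the security pocket, which matches what I'd recommend):

    $ sudo apt install unattended-upgrades
    $ sudo dpkg-reconfigure --priority=low unattended-upgrades

    # which writes /etc/apt/apt.conf.d/20auto-upgrades:
    APT::Periodic::Update-Package-Lists "1";
    APT::Periodic::Unattended-Upgrade "1";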

> I'd also have to amortize parts and labor as part of the cost, which is going to push the price up.

Given the prices you laid out for AWS, that's not multi-AZ, though even single-AZ can of course fail over, just with downtime. So I'll say you get 2U with two individual servers, the DBs either doing logical replication with failover, or something like DRBD [1] presenting the two servers' storage as a single block device (you'd still need a failover mechanism for the DBs either way).

That's about $400 for two 1U servers, and maybe $150/month at most for colo space. Even against the (IMO unrealistically low) $200/month quote for AWS, you've recouped the hardware after 8 months and are saving $50/month from then on. Re: parts and labor, luckily, parts for old servers are incredibly cheap. PC3-12800R 16GiB sticks are $10-12. CPUs are also stupidly cheap. Assuming Ivy Bridge era (yes, it's old; yes, it's still plenty fast for nearly any web app), even the fastest available (E5-2697v2) is $50 for a matched pair.
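
For the logical-replication option, the core of it on modern Postgres is just a publication and a subscription (hostnames and credentials here are placeholders; you'd still need your own promotion/failover step on top, as mentioned):

    # on the primary (requires wal_level = logical)
    $ psql -d app -c "CREATE PUBLICATION allpub FOR ALL TABLES;"

    # on the standby
    $ psql -d app -c "CREATE SUBSCRIPTION allsub \
        CONNECTION 'host=10.0.0.11 dbname=app user=repl' \
        PUBLICATION allpub;"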

I don't say all of this as guesswork; I run 3x Dell R620s along with 2x Supermicros in my homelab. My uptime for services is better than at most places I've worked (of course, I'm the only one doing work on them, I get that). They run 24/7/365, and in the ~5 years or so I've had them, the only trouble the Dells have given me is one bad PSU (each server has redundant PSUs, so no big deal) and a couple of bad sticks of RAM. One Supermicro has been slightly less reliable, but to be fair, (a) it has a hodgepodge of parts, and (b) I modded its BIOS to allow NVMe booting, so it's not entirely SM's fault.

EDIT: re: backups in your other comment: run ZFS as your filesystem (for a variety of reasons), snapshot periodically, and then send the snapshots off-site to any number of block storage providers. Keep the last few days' worth on the servers as well, with finer granularity the closer you get to the present. If you need to roll back, it's incredibly fast to do so.
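
A minimal sketch of that loop, with placeholder pool/dataset/host names:

    # take a snapshot (run from cron/systemd on whatever cadence you like)
    zfs snapshot tank/data@2024-06-02

    # first run: full send to seed the off-site copy
    zfs send tank/data@2024-06-01 | ssh backup zfs receive -F backuppool/data

    # afterwards: incremental send between consecutive snapshots
    zfs send -i tank/data@2024-06-01 tank/data@2024-06-02 \
        | ssh backup zfs receive backuppool/data

    # rolling back to the most recent snapshot is one command
    zfs rollback tank/data@2024-06-02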

[0]: https://github.com/geico/tuxtape

[1]: https://linbit.com/drbd/
