> The right way of cooking RDS in AWS is to go serverless from the start

Nitpick, but there is no Serverless for RDS, only Aurora. The two are wildly different in their architecture and performance characteristics. Then there's RDS Multi-AZ Cluster, which is about as confusingly named as they could manage, but I digress.

Let's take your stated Minimum ACU of 1 as an example. That gives you 2 GiB of RAM, with "CPU and networking similar to what is available in provisioned Aurora instances." Since I can't find anything more specific, I'll compare it to a `t4g.small`, which has 2 vCPU (and since it's ARM, those are physical cores, not threads) and 0.128 / 5.0 Gbps [0] baseline/burst network bandwidth, which is 16 / 625 MBps. That burst is best-effort, and it only lasts for 5 – 60 minutes [1] "depending on instance size." Since this instance is tiny, I'm going to assume the low end of that scale. Also, since this is Aurora, we have to account for both [2] client <--> DB and DB-compute (each node, if more than one) <--> DB-storage bandwidth. Aurora Serverless v2 is $0.12 per ACU-hour, or $87.60/month at a constant 1 ACU, plus storage, bandwidth, and I/O costs.
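
To make the unit math explicit, here it is as a quick Python sketch (using the usual 730 hours/month billing convention):

    # t4g.small network bandwidth, from the instance type docs [0]
    baseline_gbps, burst_gbps = 0.128, 5.0
    print(baseline_gbps * 1000 / 8)  # 16.0 MB/s baseline
    print(burst_gbps * 1000 / 8)     # 625.0 MB/s burst (best-effort)

    # Aurora Serverless v2 compute at a constant 1 ACU
    acu_hourly = 0.12                # $/ACU-hour, region-dependent
    print(acu_hourly * 730)          # $87.60/month before storage and I/O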

So we have a Postgres-compatible DB with 2 CPUs, 2 GiB of RAM, and 128 Mbps of baseline network bandwidth that's shared between application queries and the cluster volume (call it ~64 Mbps each under load). Since Aurora doesn't use the OS page cache, its `shared_buffers` will be set to ~75% of RAM, or 1.5 GiB. Memory will also be consumed by the various background processes (the WAL writer, background writer, and auto-vacuum daemon), and each connection spawns a process of its own. For that last reason, unless you're operating at toy scale (single-digit connections at any given time), you need some kind of connection pooler with Postgres. Keeping in the spirit of letting AWS do everything, they have RDS Proxy, which, despite the name, also works with Aurora. That's $0.015/ACU-hour, with a minimum of 8 ACUs for Aurora Serverless, or $87.60/month.
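
If you'd rather not pay RDS Proxy's minimum spend, the usual alternative is a client-side pool. A minimal sketch with psycopg2 (endpoint and credentials are placeholders):

    from psycopg2.pool import ThreadedConnectionPool

    # Reuse a small, fixed set of Postgres backends instead of letting
    # every request spawn (and tear down) its own server process.
    pool = ThreadedConnectionPool(
        minconn=2,
        maxconn=10,  # keep well under the server's max_connections
        host="app.cluster-xyz.us-east-1.rds.amazonaws.com",  # placeholder
        dbname="app", user="app", password="...",
    )

    conn = pool.getconn()
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT 1")
    finally:
        pool.putconn(conn)

The catch is that a client-side pool only dedupes connections within a single app instance; once you're running many tasks, you're back to wanting pgbouncer or RDS Proxy in front.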

Now, you could of course just let Aurora scale up in response to network utilization, and skip RDS Proxy. You'll eventually bottleneck / it won't make any financial sense, but you could. I have no idea how to model that pricing, since it depends on so many factors.

I went on about network bandwidth so much because it catches people by surprise, especially with Aurora, and doubly so with Postgres for many workloads. The reason is Postgres's WAL amplification from full page writes [3]. If you have a UUIDv4 (or any other non-k-sortable) PK, the B+tree is getting thrashed constantly, leading to slower reads and writes. Aurora doesn't suffer from the full page writes problem (which is still worth reading about and understanding), but it does still have the same index thrashing, and it has the same issues as Postgres with Heap-Only Tuple updates [4]. Unless you've carefully designed your schema around this, it's going to impact you, and you'll have more network traffic than you expected. Add to that devs' love of chucking everything into JSON[B] columns, and the tuples are going to be quite large.
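
If you're stuck with UUID PKs, the usual mitigation is a time-ordered variant like UUIDv7, so new keys land on the right edge of the B+tree instead of thrashing random pages. A hand-rolled sketch of the idea (approximating RFC 9562; use a proper library in practice):

    import os, time, uuid

    def uuid7_ish() -> uuid.UUID:
        # A 48-bit unix-ms timestamp up front makes keys k-sortable.
        ms = time.time_ns() // 1_000_000
        value = (ms << 80) | int.from_bytes(os.urandom(10), "big")
        value = (value & ~(0xF << 76)) | (0x7 << 76)  # version = 7
        value = (value & ~(0x3 << 62)) | (0x2 << 62)  # RFC 4122 variant
        return uuid.UUID(int=value)

    print(sorted(uuid7_ish() for _ in range(3)))  # time-ordered (to the ms)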

Anyway, I threw together an estimate [5] with just Aurora (1 ACU, no RDS Proxy, modest I/O), 2x ALBs with an absurdly low consumption, and 2x ECS tasks. It came out to $232.52/month.

[0]: https://docs.aws.amazon.com/ec2/latest/instancetypes/gp.html...

[1]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-inst...

[2]: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide...

[3]: https://www.rockdata.net/tutorial/tune-full-page-writes/

[4]: https://www.postgresql.org/docs/current/storage-hot.html

[5]: https://calculator.aws/#/estimate?id=8972061e6386602efdc2844...

I don't know much about Aurora; there was/is a little too much magic there for my taste. But I feel like once we start with "Postgres-compatible DB", we can't necessarily reason about how things perform under the hood in terms of ordinary Postgres servers. Is there a detailed breakdown of Aurora and its performance/architecture out there? My experience is that AWS is cagey about the details to maintain competitive advantage.
There are some details available[0], although they are scarce.

I procured my Aurora intel several years ago from a lengthy phone conversation with an exceptionally knowledgeable (and excessively talkative) AWS engineer who had worked on Aurora. As part of our engagement with AWS, the engineer provided detailed explanations of Aurora’s architecture and its do's and don'ts, and was willing to share many non-sensitive technical details. The engineer was very proud of AWS’ accomplishments (and I concur that their «something serverless» products are remarkable engineering feats as well as significant cost-saving solutions for me and my clients). Generally speaking, a sound understanding of distributed architectures and networks should be sufficient to grasp Aurora Serverless; the actual secret sauce lies in the fine-tuning and optimisations.

[0] https://muratbuffalo.blogspot.com/2024/07/understanding-perf...

You can find some re:Invent talks about it, but the level of depth may not be what you want.

The tl;dr is they built a distributed storage system that is split across 3 AZs, each with 2 storage nodes. Storage is allocated in 10 GiB chunks, called protection groups (perhaps borrowing from Ceph’s placement group terminology), with each chunk replicated 6x across the nodes in those AZs; 4/6 are required for write quorum. Since readers are all reading from the same shared volume, replica lag is typically minimal. Finally, there are fewer checkpoints and full page writes, or possibly none (I'm honestly not positive; I have more experience with MySQL-compatible Aurora).
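
As a toy model of the quorum rule (illustrative only, and hugely simplified):

    # 3 AZs x 2 storage nodes = 6 copies per 10 GiB protection group.
    WRITE_QUORUM = 4  # writes need 4/6 acks
    READ_QUORUM = 3   # reads need 3/6, guaranteeing overlap with writes

    def write_commits(acks_per_az: dict) -> bool:
        # Losing one whole AZ (2 copies) still leaves 4 for writes;
        # losing one more node stalls writes but reads still have 3.
        return sum(acks_per_az.values()) >= WRITE_QUORUM

    print(write_commits({"az1": 0, "az2": 2, "az3": 2}))  # True
    print(write_commits({"az1": 0, "az2": 1, "az3": 2}))  # False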

If you’ve used a networked file system with synchronous writes, you’ll know that it’s slow. This is of course exacerbated in a distributed system requiring 4/6 nodes to ack. To work around this, Aurora has “temporary local storage” on each node, fixed at a size proportional to the instance size. This is used for sorts that spill to disk and for building secondary indices. It has the nasty side effect that if your table is too large for the local storage, you can’t build new indices, period. AWS will tell you to “upsize the instance,” but IMO it’s extremely disingenuous to tout 128 TiB volumes without mentioning that if a single table gets too big, your schema becomes essentially fixed in place.
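
You can at least watch for spills before this bites you; Postgres's own counters work the same against Aurora (a sketch; the DSN is a placeholder):

    import psycopg2

    conn = psycopg2.connect("dbname=app")  # placeholder DSN
    with conn.cursor() as cur:
        # temp_files / temp_bytes track sorts and index builds that
        # spilled past work_mem -- on Aurora, onto that fixed-size
        # node-local storage.
        cur.execute("""
            SELECT datname, temp_files, temp_bytes
            FROM pg_stat_database
            WHERE datname = current_database()
        """)
        print(cur.fetchone())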

Similarly, MySQL normally has something called a change buffer that it uses for updating secondary indices during writes. Can’t have that with Aurora’s architecture, so Aurora MySQL has to write through to the cluster volume, which is slow.

AWS claims that Aurora is anywhere from 3-5x faster than the vanilla versions of the respective DBs, but I have never found this to be true. I’ve also had the goalposts shifted when arguing this point, with them saying “it’s faster under heavy write contention,” but again, I have not found this to be true in practice. You can’t get around data locality. EBS is already networked storage; requiring 4/6 quorum across 3 physically distant AZs makes it even worse.

The 64 TiB limit of RDS is completely arbitrary AFAIK, and is purely to differentiate Aurora. Also, if you have a DB where you need that, and you don’t have a DB expert on staff, you’re gonna have a bad time.

Thanks for the correction; it is indeed Aurora, although it can be found under the «RDS» console in AWS.

Aurora is actually not a database but is a scalable storage layer that operates over the network and is decoupled from the query engine (compute). The architecture has been used to implement vastly different query engines on top of it (PgSQL, MySQL, DocumentDB – a MongoDB alternative, and Neptune – a property graph database / triple store).

The closest abstraction I can think of to describe Aurora is a VAX/VMS cluster – where the consumer sees a single entity, regardless of size, whilst the scaling (out or back in) remains entirely opaque.

Aurora does not support RDS Proxy for PostgreSQL or its equivalents for other query engine types because it addresses cluster access through cluster endpoints. There are two types of endpoints: one for read-only queries (the «reader endpoint» in Aurora parlance) and one for read-mutate queries (the «writer endpoint»). Aurora supports up to 15 read replicas behind the reader endpoint, but there can be only one writer endpoint.

Reader endpoints improve the performance of non-mutating queries by distributing the load across read replicas. The default Aurora cluster endpoint always points to the writer instance. Consumers can either default to the writer endpoint for all queries or segregate non-mutating queries to reader endpoints for faster execution.

This behaviour is consistent across all supported query engines, such as PostgreSQL, Neptune, and DocumentDB.
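
In application code, the split is just two connections and a routing decision; a sketch with psycopg2 (hostnames and credentials are placeholders; Aurora reader endpoints carry a "-ro-" infix):

    import psycopg2

    writer = psycopg2.connect(
        host="app.cluster-xyz.us-east-1.rds.amazonaws.com",     # placeholder
        dbname="app", user="app", password="...")
    reader = psycopg2.connect(
        host="app.cluster-ro-xyz.us-east-1.rds.amazonaws.com",  # placeholder
        dbname="app", user="app", password="...")

    def run(sql, mutates=False):
        # Mutations must go to the single writer; reads can fan out
        # across the replicas behind the reader endpoint.
        conn = writer if mutates else reader
        with conn.cursor() as cur:
            cur.execute(sql)
            if mutates:
                conn.commit()
            else:
                return cur.fetchall()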

I do not think it is correct to state that Aurora does not use the OS page cache – it does, as there is still a server with an operating system somewhere, despite the «serverless» moniker. In fact, due to its layered distributed architecture, there is now more than one OS page cache, as described in [0].

Since Aurora is only accessible over the network, it introduces unique peculiarities where the standard provisions of storage being local do not apply.

Now, onto the subject of costs. A couple of years ago, an internal client who ran provisioned RDS clusters in three environments (dev, uat, and prod) reached out to me with a request to create infrastructure clones of all three clusters. After analysing their data access patterns, peak times, and other relevant performance metrics, I figured that they did not need provisioned RDS and would benefit from Aurora Serverless instead, which is exactly what they got (unbeknownst to them, which I consider another net positive for Aurora). The dev and uat environments were configured with a lower maximum ACU, whilst production had a higher maximum ACU, as expected.
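
Mechanically, the per-environment difference is just the scaling configuration; a boto3 sketch (cluster identifiers and ACU bounds are illustrative, not the client's actual values):

    import boto3

    rds = boto3.client("rds")

    # Same cluster shape everywhere; only the ACU ceiling differs.
    for env, max_acu in {"dev": 2, "uat": 2, "prod": 16}.items():
        rds.modify_db_cluster(
            DBClusterIdentifier=f"app-{env}",  # placeholder identifiers
            ServerlessV2ScalingConfiguration={
                "MinCapacity": 0.5,
                "MaxCapacity": max_acu,
            },
            ApplyImmediately=True,
        )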

Switching to Aurora Serverless resulted in a 30% reduction in the monthly bill for the dev and uat environments right off the bat and nearly a 50% reduction in production costs compared to a provisioned RDS cluster of the same capacity (if we use the upper ACU value as the ceiling). No code changes were required, and the transition was seamless.

Ironically, I have discovered that the AWS cost calculator consistently overestimates the projected costs, and the actual monthly costs are consistently lower. The cost calculator provides a rough estimate, which is highly useful for presenting the solution cost estimate to FinOps or executives. Unintentionally, it also offers an opportunity to revisit the same individuals later and inform them that the actual costs are lower. It is quite amusing.

[0] https://muratbuffalo.blogspot.com/2024/07/understanding-perf...

> Aurora is actually not a database but is a scalable storage layer that operates over the network and is decoupled from the query engine (compute).

They call it [0] a database engine, and go on to say "Aurora includes a high-performance storage subsystem.":

> "Amazon Aurora (Aurora) is a fully managed relational database engine that's compatible with MySQL and PostgreSQL."

To your point re: the RDS console, though, they do say that it's "part of RDS."

> The architecture has been used to implement vastly different query engines on top of it (PgSQL, MySQL, DocumentDB – a MongoDB alternative, and Neptune – a property graph database / triple store).

Do you have a source for this? That's new information to me.

> Aurora does not support RDS Proxy for PostgreSQL

Yes it does [1].

> I do not think it is correct to state that Aurora does not use the OS page cache – it does

It does not [2]:

> "Conversely, in Amazon Aurora PostgreSQL, the default value [for shared_buffers] is derived from the formula SUM(DBInstanceClassMemory/12038, -50003). This difference stems from the fact that Amazon Aurora PostgreSQL does not depend on the operating system for data caching." [emphasis mine]

Even without that explicit statement, you could infer it from the fact that the default value for `effective_cache_size` in Aurora Postgres is the same as that of `shared_buffers`, the formula given above.
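
This is easy to check against a live instance (a sketch; the DSN is a placeholder):

    import psycopg2

    conn = psycopg2.connect("dbname=app")  # placeholder DSN
    with conn.cursor() as cur:
        for guc in ("shared_buffers", "effective_cache_size"):
            cur.execute("SHOW " + guc)
            print(guc, "=", cur.fetchone()[0])
    # On Aurora Postgres the two come out equal (~75% of instance RAM);
    # on vanilla Postgres, effective_cache_size is normally much larger
    # than shared_buffers, precisely because the OS page cache exists.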

> Switching to Aurora Serverless resulted in a 30% reduction in the monthly bill for the dev and uat environments right off the bat

Agreed, for lower-traffic clusters you can probably realize savings by doing this. However, it's also likely that for Dev/Stage/UAT environments, you could achieve the same or greater via an EventBridge rule that starts/stops the cluster such that it isn't running overnight (assuming the company doesn't have a globally distributed workforce).
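
The whole thing fits in a tiny Lambda plus two schedules; a sketch (the cluster name and cron expressions are illustrative):

    import boto3

    rds = boto3.client("rds")

    # Wire up two EventBridge schedules, e.g.
    #   cron(0 20 ? * MON-FRI *) -> {"action": "stop"}
    #   cron(0 6 ? * MON-FRI *)  -> {"action": "start"}
    def handler(event, _context):
        cluster = "app-dev"  # placeholder
        if event["action"] == "stop":
            rds.stop_db_cluster(DBClusterIdentifier=cluster)
        else:
            rds.start_db_cluster(DBClusterIdentifier=cluster)

One caveat: AWS automatically restarts a stopped cluster after seven days, so this schedules around workdays rather than mothballing anything indefinitely.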

What bothers me most about Aurora's pricing model is charging for I/O. And yes, I know they have an alternative pricing model that doesn't do so (but its baseline is of course higher); it's the principle of the thing. The amortized cost of wear to disks should be baked into the price. It would be difficult even for a skilled DBA with plenty of Linux experience to accurately estimate how many I/Os a given query might take. In a vacuum, with a cold cache, it's not that bad: estimate or look up statistics for row sizes, determine whether any predicates can use an index (and if so, the correlation of the column[s]), estimate index selectivity, if any, confirm expected disk block size vs. Postgres page size, and make an educated guess. Add concurrent queries that may be altering the tuples you're viewing, and it's much harder. Add a distributed storage layer, which I assume attempts to boxcar data blocks for transmission much like EBS does, and it's nearly impossible. Now try doing that if you're a "cloud native" type who hasn't the faintest idea what blktrace [3] is.
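
For a single known query, you can at least pull page counts out of Postgres and put a rough dollar figure on them; a sketch (the table name is a placeholder, and the $0.20 per million I/Os figure for Aurora Standard should be checked against current pricing):

    import re
    import psycopg2

    conn = psycopg2.connect("dbname=app")  # placeholder DSN
    with conn.cursor() as cur:
        # BUFFERS shows pages touched; 'read=' counts are the ones that
        # missed shared_buffers and actually hit the storage layer.
        cur.execute("EXPLAIN (ANALYZE, BUFFERS) "
                    "SELECT * FROM orders WHERE id = 42")  # placeholder query
        plan = "\n".join(row[0] for row in cur.fetchall())

    reads = sum(int(n) for n in re.findall(r"read=(\d+)", plan))
    print(f"~{reads} page reads, ~${reads / 1e6 * 0.20:.6f} per execution")

Note this is only a rough proxy: Aurora bills I/O requests at its storage layer, which don't map one-to-one onto Postgres page reads.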

[0]: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide...

[1]: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide...

[2]: https://aws.amazon.com/blogs/database/determining-the-optima...

[3]: https://linux.die.net/man/8/blktrace