Pg_parquet: An extension to connect Postgres and parquet

https://www.crunchydata.com/blog/pg_parquet-an-extension-to-connect-postgres-and-parquet
Parquet itself is actually not that interesting. The extension should be able to read (and even write) Iceberg tables.

Also, how does it compare to pg_duckdb (which adds DuckDB execution to Postgres including reading parquet and Iceberg), or duck_fdw (which wraps a DuckDB database, which can be in memory and only pass-through Iceberg/Parquet tables)?

Cool, would this be better than using a ClickHouse / DuckDB extension that reads Postgres and saves to Parquet?

What would be the recommended way to regularly export old data to S3 as Parquet files? Use a cron job that launches a second Postgres process, connects to the database, and extracts the data, or use the regular database instance? Wouldn't that slow down the instance too much?
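
For illustration, a minimal sketch of what such a cron job might run, assuming pg_parquet's `COPY ... TO 's3://...' WITH (format 'parquet')` form as described in the linked post. The helper name, table, column, and bucket are hypothetical, and the script only builds the SQL string; it does not connect to anything.

```python
from datetime import date


def build_archive_copy(table: str, ts_column: str, cutoff: date, bucket: str) -> str:
    """Build a COPY statement that exports rows older than `cutoff` to S3 as Parquet.

    Hypothetical helper: identifiers are interpolated directly, so they must
    come from trusted configuration, never from user input.
    """
    day = cutoff.isoformat()
    # Assumed object layout: one Parquet file per table and cutoff date.
    target = f"s3://{bucket}/{table}/before_{day}.parquet"
    return (
        f"COPY (SELECT * FROM {table} WHERE {ts_column} < DATE '{day}') "
        f"TO '{target}' WITH (format 'parquet');"
    )


if __name__ == "__main__":
    # Example values only; pipe the output to psql from the cron entry.
    print(build_archive_copy("events", "created_at", date(2024, 1, 1), "my-archive-bucket"))
```

Since `COPY` executes inside whichever server the job connects to, pointing it at a read replica rather than the primary is one way to keep the export load off the main instance.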

I wish RDS made it easy to add custom extensions like this.
Why not just federate Postgres and Parquet files? That way the query planner can push down as much of the query as possible and reduce how much data has to move around.
Congratulations! I'm happy to see the PostgreSQL license.