MindCast News

AI-powered news platform providing intelligent analysis and truth scoring for the digital age.

© 2025 MindCast News. Intelligent broadcasting powered by AI.

TECHNOLOGY

Show HN: PgDog – Shard Postgres without extensions

May 20, 2025 • 12:32 AM
Source: Hacker News

Hey HN! Lev here, author of PgDog (https://github.com/pgdogdev/pgdog). I’m scaling our favorite database, PostgreSQL. PgDog is a new open source proxy, written in Rust, with first-class support for sharding, without changes to your app or needing database extensions.

Here’s a walkthrough of how it works: https://www.youtube.com/watch?v=y6sebczWZ-c

Running Postgres at scale is hard. Eventually, one primary isn’t enough, at which point you need to split it up. Since there is currently no good tooling out there to do this, teams end up breaking their apps apart instead.

If you’re familiar with PgCat, my previous project, PgDog is its spiritual successor but with a fresh codebase and new goals. If not, PgCat is a pooler for Postgres also written in Rust.

So, what’s changed and why a new project? Cross-shard queries are supported out of the box. The new architecture is more flexible, completely asynchronous and supports manipulating the Postgres protocol at any stage of query execution. (Oh, and you guessed it: I adopted a dog. Still a cat person though!)

Not everything is working yet, but simple aggregates like max(), min(), count(*) and sum() are in. More complex functions like percentiles and average will require a bit more work. Sorting (i.e. ORDER BY) works, as long as the values are part of the result set, e.g.:

    SELECT id, email FROM users WHERE admin = true ORDER BY 1 DESC;

PgDog buffers and sorts the rows in memory, before sending them to the client. Most of the time, the working set is small, so this is fine. For larger results, we need to build swap to disk, just like Postgres does, but for OLTP workloads, which PgDog is targeting, we want to keep things fast. Sorting currently works for bigint, integer, and text/varchar. It’s pretty straightforward to add all the other data types, I just need to find the time and make sure to handle binary encoding correctly.

All standard Postgres features work as normal for unsharded and direct-to-shard queries. As long as you include the sharding key (a column like customer_id, for example) in your query, you won’t notice a difference.

How does this compare to Citus? In case you’re not familiar, Citus is an open source extension for sharding Postgres. It runs inside a single Postgres node (a coordinator) and distributes queries between worker databases.

PgDog’s architecture is fundamentally different. It runs outside the DB: it’s a proxy, so you can deploy it anywhere, including managed Postgres like RDS, Cloud SQL and others where Citus isn’t available. It’s multi-threaded and asynchronous, so it can handle thousands, if not millions, of concurrent connections. Its focus is OLTP, not OLAP. Meanwhile, Citus is more mature and has good support for cross-shard queries and aggregates. It will take PgDog a while to catch up.

My Rust has improved since my last attempt at this and I learned how to use the bytes crate correctly. PgDog does almost zero memory allocations per request. That results in a 3-5% performance increase over PgCat and a much more consistent p95. If you’re obsessed with performance like me, you know that small percentage is nothing to sneeze at. Like before, multi-threaded Tokio-powered PgDog leaves the single-threaded PgBouncer in the dust (https://pgdog.dev/blog/pgbouncer-vs-pgdog).

Since we’re using pg_query (which itself bundles the Postgres parser), PgDog can understand all Postgres queries. This is important because we can not only correctly extract the WHERE clause and INSERT parameters for automatic routing, but also rewrite queries. This will be pretty useful when we’ll add support for more complex aggregates, like avg(), and cross-shard joins!

Read/write traffic split is supported out of the box, so you can put PgDog in front of the whole cluster and ditch the code annotations. It’s also a load balancer, so you can deploy it in front of multiple replicas to get 4 9’s of uptime.

One of the coolest features so far, in my opinion, is distributed COPY. This works by hacking the Postgres network protocol and sending individual rows to different shards (https://pgdog.dev/blog/hacking-postgres-wire-protocol). You can just use it without thinking about cluster topology, e.g.:

    COPY temperature_records (sensor_uuid, created_at, value) FROM STDIN CSV;

The sharding function is straight out of Postgres partitions and supports uuid v4 and bigint. Technically, it works with any data type, but I just haven’t added all the wrappers yet. Let me know if you need one.

What else? Since we have the Postgres parser handy, we can inspect, block and rewrite queries. One feature I was playing with is ensuring that the app is passing in the customer_id in all queries, to avoid data leaks between tenants. Brain dump of that in my blog here: https://pgdog.dev/blog/multi-tenant-pg-can-be-easy.

What’s on the roadmap: (re)sharding Postgres using logical replication, so we can scale DBs without taking downtime. There is a neat trick on how to quickly do this on copy-on-write filesystems (like EBS used by RDS, Google Cloud volumes, ZFS, etc.). I’ll publish a blog post on this soon. More at-scale features like blocking bad queries and just general “I wish my Postgres proxy could do this” stuff. Speaking of which, if you can think of any more features you’d want, get in touch. Your wishlist can become my roadmap.

PgDog is being built in the open. If you have thoughts or suggestions about this topic, I would love to hear them. Happy to listen to your battle stories with Postgres as well.

Happy hacking!

Lev

Title: Introducing PgDog: A Rust-Based Proxy for Sharding PostgreSQL Without Extensions

Hello, PostgreSQL enthusiasts! I'm Lev, the creator of PgDog (https://github.com/pgdogdev/pgdog), a new open-source proxy for scaling PostgreSQL using sharding, written in Rust and designed to work without requiring changes to your application or database extensions.

Sharding PostgreSQL at scale is no easy feat, and as databases grow, splitting the primary instance becomes necessary. Unfortunately, the current tooling options are limited, and teams often resort to splitting their applications instead of their databases. PgDog aims to solve this challenge with its first-class support for sharding and a new approach to distributing data.

PgDog is the spiritual successor to PgCat, another Rust-based pooler for PostgreSQL, with a fresh codebase and new goals. The focus of PgDog is to support cross-shard queries out-of-the-box while improving flexibility, asynchronous processing, and PostgreSQL protocol manipulation at any stage of query execution.

The project is still under development, but simple aggregates (max(), min(), count(*), and sum()) already work across shards. More complex functions such as percentiles and averages need more work. Sorting (ORDER BY) is supported as long as the sorted values are part of the result set. PgDog buffers and sorts rows in memory before sending them to the client, and sorting currently works for bigint, integer, and text/varchar columns.
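To make the cross-shard merge step concrete, here is a minimal Python sketch of how partial results from several shards could be combined. It is an illustration only, not PgDog's code (PgDog is written in Rust); the function names and the sample per-shard rows are hypothetical.

```python
# Hypothetical per-shard results for:
#   SELECT max(age), min(age), count(*), sum(age) FROM users;
# Each tuple is (max, min, count, sum) as returned by one shard.
shard_partials = [
    (64, 18, 3, 120),
    (71, 21, 2, 92),
    (55, 19, 4, 150),
]


def merge_aggregates(partials):
    """Combine per-shard (max, min, count, sum) rows into one global row."""
    maxes, mins, counts, sums = zip(*partials)
    # The global max is the max of maxes, the global min is the min of mins,
    # and counts and sums simply add up.
    return (max(maxes), min(mins), sum(counts), sum(sums))


def merge_sorted_rows(shard_rows, column, descending=False):
    """Buffer every shard's rows in memory, then sort by a result-set column."""
    buffered = [row for rows in shard_rows for row in rows]
    return sorted(buffered, key=lambda row: row[column], reverse=descending)


# Hypothetical rows for: SELECT id, email FROM users ... ORDER BY 1 DESC;
rows_by_shard = [
    [(1, "a@example.com"), (4, "d@example.com")],
    [(3, "c@example.com")],
]
merged = merge_sorted_rows(rows_by_shard, column=0, descending=True)
```

The sort key has to be present in the returned rows because the merging layer only sees what the shards send back, which matches the restriction described above.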

PgDog is designed to work seamlessly with standard PostgreSQL features for unsharded and direct-to-shard queries. If you include the sharding key (e.g., customer_id) in your query, you won't notice any differences.
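The routing idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the shard count, the `route` helper, and the SHA-256 stand-in hash are all hypothetical, and the real project reports using the hash function from Postgres partitions rather than anything shown here.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size


def shard_for(key: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a bigint sharding key to a shard index.

    Stand-in hash (SHA-256), chosen only because it is deterministic;
    it is NOT the Postgres partition hash.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(params: dict) -> list:
    """Return the shard indexes a query should be sent to."""
    if "customer_id" in params:
        # Sharding key present: direct-to-shard query.
        return [shard_for(params["customer_id"])]
    # No sharding key: the query has to fan out to every shard.
    return list(range(NUM_SHARDS))
```

A query that carries `customer_id` touches exactly one shard, which is why it behaves like a query against a single unsharded database.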

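Compared to Citus, a popular open-source PostgreSQL sharding extension, PgDog's architecture is fundamentally different. Citus runs inside a single Postgres node (a coordinator) and distributes queries between worker databases. PgDog operates outside the database as a proxy, so you can deploy it with managed PostgreSQL services like Amazon RDS, Google Cloud SQL, and others where Citus isn't available. PgDog is also multi-threaded and asynchronous, allowing it to handle thousands or even millions of concurrent connections, and its focus is OLTP rather than OLAP. Citus, for its part, is more mature and has good support for cross-shard queries and aggregates, so it will take PgDog a while to catch up.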

My Rust has improved since PgCat, and PgDog now makes almost zero memory allocations per request. That results in a 3-5% performance increase over PgCat and a much more consistent p95. As before, the multi-threaded, Tokio-powered PgDog leaves the single-threaded PgBouncer in the dust (https://pgdog.dev/blog/pgbouncer-vs-pgdog).

PgDog's use of the PostgreSQL parser (through pg_query) allows it to understand all PostgreSQL queries, extract WHERE clauses and INSERT parameters, and rewrite queries when necessary. This feature will be invaluable when implementing more complex aggregates and cross-shard joins in the future.
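One reason query rewriting matters for averages: per-shard averages cannot simply be averaged again. A common approach, shown here as a Python sketch, is to ask each shard for sum() and count() and divide at the end. The post does not say how PgDog will implement avg(), so treat this as one possible technique; the sample numbers are made up.

```python
# Hypothetical per-shard results of a rewritten query:
#   SELECT sum(price), count(price) FROM orders;
# Each tuple is (sum, count) from one shard.
partials = [(300.0, 3), (50.0, 1)]


def merge_avg(partials):
    """Compute a global average from per-shard (sum, count) pairs."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    # avg() over zero rows is NULL in SQL; None plays that role here.
    return total / count if count else None


# Averaging the per-shard averages gives the wrong answer when shards
# hold different numbers of rows.
naive = sum(s / c for s, c in partials) / len(partials)
correct = merge_avg(partials)
```

With these numbers the naive value is 75.0 while the correct average is 87.5, because the first shard holds three of the four rows.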

PgDog also supports read/write traffic splitting and load balancing, allowing it to be deployed in front of the whole cluster, eliminating the need for manual code annotations.
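A read/write split combined with round-robin load balancing can be sketched as follows. This Python example is a deliberately naive illustration: it classifies queries by their first keyword, whereas the post says PgDog uses the real Postgres parser via pg_query. The `Router` class and host names are hypothetical.

```python
import itertools


class Router:
    """Send reads to replicas in round-robin order and writes to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # cycle() yields replicas in order, forever; None means no replicas.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def pick(self, query: str) -> str:
        words = query.split(None, 1)
        first_word = words[0].upper() if words else ""
        # Keyword heuristic only: cases such as SELECT ... FOR UPDATE or
        # statements inside a write transaction need a real parser.
        if first_word == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = Router("primary", ["replica-1", "replica-2"])
targets = [
    router.pick("SELECT * FROM users"),
    router.pick("select 1"),
    router.pick("INSERT INTO users (id) VALUES (1)"),
    router.pick("SELECT now()"),
]
```

Because the proxy makes this decision, the application no longer needs annotations that say which connection a query should use.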

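One of the most exciting features of PgDog is its distributed COPY support. PgDog works at the level of the Postgres network protocol to split an incoming COPY stream and send individual rows to different shards, so you can load data without thinking about cluster topology. The sharding function comes straight from Postgres partitions and currently supports uuid v4 and bigint keys.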
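The row-splitting idea behind a distributed COPY can be illustrated with a short Python sketch that buckets CSV rows by a uuid key. It is an assumption-laden toy: it works on a CSV string rather than the wire protocol, and the modulo on the UUID's integer value is a stand-in for the Postgres partition hash. The function name and sample rows are hypothetical.

```python
import csv
import io
import uuid
from collections import defaultdict


def split_copy_rows(csv_text: str, key_column: int, num_shards: int) -> dict:
    """Group CSV rows by destination shard, keyed on a uuid column."""
    buckets = defaultdict(list)
    for row in csv.reader(io.StringIO(csv_text)):
        key = uuid.UUID(row[key_column])
        # Stand-in sharding function: UUID as an integer, modulo shard count.
        buckets[key.int % num_shards].append(row)
    return dict(buckets)


# Hypothetical rows shaped like (sensor_uuid, created_at, value).
sample = (
    "00000000-0000-4000-8000-000000000001,2025-05-20T00:00:00Z,21.5\n"
    "00000000-0000-4000-8000-000000000002,2025-05-20T00:01:00Z,21.7\n"
    "00000000-0000-4000-8000-000000000003,2025-05-20T00:02:00Z,21.9\n"
)
buckets = split_copy_rows(sample, key_column=0, num_shards=2)
```

Each bucket would then be forwarded to its own shard as a separate COPY stream, while the client sees a single COPY statement.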

As PgDog continues to develop, features like (re)sharding PostgreSQL using logical replication and more at-scale capabilities like blocking bad queries will be added. The project is being built in the open, and I welcome any feedback, suggestions, or battle stories from the PostgreSQL community.

Join me in happy hacking and check out PgDog at https://github.com/pgdogdev/pgdog!

Lev

CREDIBILITY ANALYSIS

Truth Score: 75% (Moderate Credibility, MEDIUM confidence)

Credibility Analysis:

  • Moderate credibility source: Hacker News has mixed reliability
  • Well-structured content with good grammar and appropriate length
  • Contains factual indicators like statistics, dates, or research references
  • Balanced language with minimal bias indicators
  • Limited source verification or attribution

Score scale (0% to 100%):

  • High (80-100%)
  • Moderate (60-79%)
  • Low (40-59%)
  • Very Low (0-39%)