Beyond the Single PostgreSQL Node, Part One

When Disco was the hot new music, the people who built database management systems based their praxis around the then-correct assumption that hardware was extremely expensive, and hence that procuring things like more storage, let alone more machines on which they would run, could be counted on to require significant capital resources. This could only happen on the time scale of quarters, even if it was deemed worthwhile to fight the fights needed to procure them.

The above conditions informed the practice of hardening single nodes against failure without adding significant hardware resources in the process. This practice continues, and it's a good thing to have.


Over the past few decades, hardware resources have been dropping precipitously in cost even when acquired via capital expenditure, and tax (read subsidy here) regimes have been altered to put a massive thumb on the scale to favor operational expenditures. This in turn has driven technologies that made it easy to make those operational expenditures, and to make it possible to make such expenditures in small increments.

Now, doubling a node's storage or lighting a powerful new node and having it working in production can and does happen on the scale of minutes, not least because the first chunk of money spent in order to do so is well within the discretion of an individual developer, and a junior one at that.

Topic Transition!

That's a lot of paragraphs on history and political economy, and you may be wondering how they're pertinent to the matter at hand. Wonder no more, as we've gotten there.

Now that it's cheap and easy to light even a database node, generally a pretty expensive type of node to light, we can use techniques to approach problems that would have been untenably profligate in the Disco era. Among these are replication.

It's not all sweetness and light, though. The proliferation of new nodes has created new problems. When you switch the database to that hot new hardware, lots of new systems can need to know about that change, and there can be a large and complex problem of coordination. You might consider DNS as a way to promulgate this news, and now, as jwz once said of regexes, you have two problems.

Fortunately, there's a way to isolate the places where the news needs to go, and a way to do it that normally involves zero downtime deployment: database-aware proxies, specifically pgbouncer. It's pretty safe to put all your eggs in that basket because pgbouncer has been hammered on hard enough that failure of the underlying node is far and away the most common source of failure for it, i.e. you can often get away with having just one node, and you can in dire emergency know that lighting a new node to replace it is a matter of the configuration management you already have in place, and an IP takeover, which can happen on the scale of seconds with practice.

How, you may ask?

That's the cliffhanger for the next installment of Beyond the Single PostgreSQL Node.

Thanks as always to my employer, Aiven, for providing me with the resources to do this work.


Popular posts from this blog

Cool Stuff in PostgreSQL 10: Transition Table Triggers

Cool Stuff in PostgreSQL 10: Auto-logging

You don't need PL/pgsql!