I have a few goals in mind for this weekend (wait, it's Saturday afternoon already? Well, you know, the rest of this weekend):
- build my first presentation with reveal.js for my upcoming talk at Postgres Open in Dallas
- watch the Green Bay Packers win their season opener tomorrow at noon
- crush my foolish opponents in Fantasy Football
But those can all wait until I relate this story to you. Imagine you are a database administrator. Now, some of you may not know what a database administrator (dba) does, but imagining one isn't too hard. A developer writes code, a sys admin keeps the data center running, a manager makes it possible for these people to work without interruption; but you're supposed to imagine being a dba. A dba worries only about the database.
To make the database happy a dba will always want one of a few things:
- More space. Database admins are always going on and on about how there's not enough space, and the business should learn how to archive old data, and has anybody ever heard of a Data Warehouse?
- An upgrade. Most dbas seem to think that by getting "new features" and "performance improvements" they will somehow find contentment. But they will never be content, because the next alpha always has a newer feature.
- More RAM and more cores. Always.
- More space. Yes, the database just kept growing while I was writing this, and we don't have enough room for all these backups. Tell me where to move these backups, and so help me if you say /dev/null I will slug you.
One day you are presented with an opportunity. You can migrate the database to a new server where you will be given more space, more memory, more compute and a chance to upgrade to the newest version. Hoisted by your own petard, you must now perform the migration with minimal downtime.
But you run the world's most advanced open source database in PostgreSQL, and that's not actually a very hard proposition if the database is on version 9.0 or newer! Version 9.0 came out almost exactly five years ago, so hopefully you are somewhat with it. And I'm not judging if you're not, but you need to catch up. Unsupported databases and Production environments just don't mix.
Anyway, in version 9.0 PostgreSQL implemented streaming replication, which has since become the de facto standard for enterprise database availability. Streaming replication is a powerful feature that lets you build a readable clone of your production database that keeps up with changes rapidly. Just how rapidly depends on many factors, but that's not the point of this story.
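For the record, the moving parts on a standby of this vintage are just a copy of the primary's data directory plus a little recovery.conf telling it where to stream from. Here's a rough sketch; the hostname and replication user are made up, and pg_basebackup is the easy way to take the base copy from 9.1 onward (on a 9.0 primary you'd use pg_start_backup() and rsync instead):

# Assumes the primary already has wal_level set for standbys, max_wal_senders > 0,
# and a pg_hba.conf "replication" entry for the replicator user.
pg_basebackup -h old-db.example.com -U replicator -D /pgdata/standby -P

# recovery.conf in the standby's data directory:
cat > /pgdata/standby/recovery.conf <<'EOF'
standby_mode = 'on'
primary_conninfo = 'host=old-db.example.com port=5432 user=replicator'
EOF

pg_ctl -D /pgdata/standby start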
In this story, you enable streaming replication from your primary (the old database server) to your new hardware. On the night of the cutover, when you're ready to upgrade, you: stop web traffic for a few minutes, promote your replica to master and shut down the old service (don't want anybody connecting to that), run pg_upgrade -k to upgrade without copying a bunch of files around (no backing down now), start the new cluster, and analyze your tables. If you've built in a nice maintenance window, you might take the time to VACUUM some big tables or rebuild bloated indexes while nobody's connected, but it's not a must. When you're done, the app starts up, and whoever was responsible for changing all the connection strings (wait, was that your job? I thought Chef would handle it...) did so, and the app is now happy because the database is happy.
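Sketched out in shell, the database half of that cutover night looks roughly like this; the paths and binary directories are invented for illustration, so adjust for your own layout:

# Promote the replica on the new hardware. (pg_ctl promote exists from 9.1 on;
# on a 9.0 standby you touch the trigger_file named in recovery.conf instead.)
pg_ctl -D /pgdata/9.0 promote

# Shut down the old primary so nobody can connect to it by accident.
pg_ctl -D /old/pgdata/9.0 stop -m fast

# pg_upgrade wants both clusters stopped, so stop the freshly promoted one too.
# (The 9.4 cluster in /pgdata/9.4 is assumed to be initdb'd and empty.)
pg_ctl -D /pgdata/9.0 stop -m fast
pg_upgrade -k -b /usr/pgsql-9.0/bin -B /usr/pgsql-9.4/bin -d /pgdata/9.0 -D /pgdata/9.4

# Start the new cluster and rebuild planner statistics, which pg_upgrade
# does not carry over (the analyze_new_cluster.sh script it generates works too).
pg_ctl -D /pgdata/9.4 start
vacuumdb --all --analyze-only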
If you're really good at your job, you've practiced this a few times before doing it for realzies to make sure there are no quirks to the upgrade process, but eh. If you want to walk on a tightrope in your imaginary career as a dba, that's your problem. You can take it up with the imaginary CTO after you have to roll back four hours in because you hadn't realized the extension system changed between 9.0 and 9.4, and now this new production database won't load data because of "citext"!? *
Everybody is happy. The production app is running, and you can now set up a new streaming replica from this, the new primary, since the laws of high availability dictate you should have at least two of everything. You sync the data directories for the first time, eager to start up your replica, only it won't start. It complains:
ERROR: could not open file "pg_xlog/00000002.history": No such file or directory
FATAL: could not receive timeline history file from the primary
What happened? In fact, PostgreSQL has - or rather had - a bug of exactly this fashion. When PostgreSQL promotes a new primary server, it creates a marker of the timeline split in the form of a small text file placed in the WAL file directory. This file makes it possible to achieve Point-In-Time Recovery under some rather complex failover and fail-back scenarios. It cannot, however, be used to recover to a point in time before the upgrade, and pg_upgrade knows this. So pg_upgrade deletes the .history file. This isn't a problem for the operation of the new primary. But it makes it impossible to start a streaming replica, for lack of a file that's probably less than 100 bytes. Or at least, it did before Bruce Momjian patched the bug and back-ported the fix through 9.0.
But before that you were stuck. If you spend some time reading the recovery documentation you may come across recovery_target_timeline and think that setting this to 'latest' will help. But it doesn't. Instead it makes the replica think there is a newer history entry, and it will just start complaining about wanting "pg_xlog/00000003.history" now. So it seems that you will have to recreate the file. You can find a very nice summary of the .history file on the Postgres wiki. Since the info is in .pdf, though, it tends to be harder to index, so you may have trouble locating the doc if you don't already know it's there.
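For completeness, this is the setting in question; it goes in the standby's recovery.conf, and in this particular pickle it only changes which history file the replica whines about:

# recovery.conf on the would-be standby -- tempting, but not the fix here.
recovery_target_timeline = 'latest'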
Which brings me in a rather roundabout fashion to what I asked at the very beginning of this entry (how long ago was that? Good god, Lemon): are you familiar with the internals of the .history file? Probably not, but they're very simple. A numbered history file contains an entry for each earlier point in the WAL stream where a new timeline began. Sound hard? It's not.
# cat 00000002.history
1	1404/A0000017	Some comments about when the timeline split.
That's it. One digit, a magic number, and some comments. Now, about that magic number...? If we actually needed the ability to recover back to a different timeline, we'd need the correct address. But we're never going to go back to that timeline, because it's from before our upgrade. All we'd need to recreate a lost file is a good enough number. And you can get one by running:
# SELECT pg_current_xlog_location();
pg_current_xlog_location
--------------------------
1/38F70328
(1 row)
Mock up a .history file in your WAL directory with these values, et voila. The replica will immediately be able to start.
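Concretely, the mock-up might look like this on the new primary; $PGDATA and the comment text are placeholders, and the location is just the one the query above handed back:

# The file name matches the timeline the replica asked for (00000002.history),
# and the single entry describes where old timeline 1 left off.
cat > $PGDATA/pg_xlog/00000002.history <<'EOF'
1	1/38F70328	Recreated by hand; pg_upgrade removed the original.
EOF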
Be seeing you!
* Please note that upgrading extensions from pre-9.1 versions of PostgreSQL is generally a very simple matter, and I don't mean to imply that you'll have problems with citext or any other extension. Generally, the toughest thing you'll have to do is download a patch and run CREATE EXTENSION <extension> FROM unpackaged;
And you can handle that. Just don't wait until the night of the upgrade to find out you need a patch and have to start looking. Plan ahead, imaginary dba. Gosh.