Ah, the joys of upgrading the home lab. It’s almost guaranteed that something goes wrong, since I don’t really spend much time maintaining my environment. I wanted to update my vCenter Appliance from 5.5 Update 3d to Update 3e. I normally use the built-in update functionality of the vCenter Appliance VAMI page. That has been one of my favourite features, and it has never failed. Well, until now.
The download and update process worked until the final reboot. After that, I noticed that I could not log in to vCenter, so I logged back in to VAMI. The vCenter Server service was not running. This is the time for a deep breath, because it’s not gonna be pretty. I did try my luck with rebooting the appliance, but of course that didn’t help. In my experience, if the vCenter Server does not start, it’s almost always the database. Log time.
The vpxd.log did have some errors in it. There was also an error related to ODBC. Database, my friend! There’s a block error, and that can’t be good.
The Postgres logs at “/storage/db/vpostgres/pg_log” had some interesting errors as well. The blocks didn’t match, so there had been a write error at some point. Most likely one of my usual reboots (= yanking the cable) corrupted the database. Luckily, the fix was quite simple.
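If you want to pull the interesting lines out of those logs quickly, something like the following might do. It is only a sketch: the directory is the one mentioned above, the function name recent_pg_errors is made up for this example, and the keywords are the standard PostgreSQL severity levels rather than the exact messages I saw.

```shell
# Log directory taken from the post; override PG_LOG_DIR to point at a copy of the logs.
PG_LOG_DIR="${PG_LOG_DIR:-/storage/db/vpostgres/pg_log}"

# Print the last 20 lines that mention ERROR, FATAL or PANIC across all
# files in a log directory. The 20-line limit is arbitrary.
recent_pg_errors() {
    dir="${1:-$PG_LOG_DIR}"
    grep -hE 'ERROR|FATAL|PANIC' "$dir"/* 2>/dev/null | tail -n 20
}

recent_pg_errors
```

Run it on the appliance as a user that can read the log directory, or pass another directory as the first argument.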
First things first, back up your vCenter Appliance before doing anything! To access the database, you need to enable bash for the postgres user. This VMware KB walks you through it quite nicely. After that’s done, we need to get rid of the bad block. There’s another KB on how to do this. It only mentions vFabric Postgres, but it works just fine for the vCenter Appliance, since in practice it’s the same database product under the hood. Follow the guide to remove the bad block. To stop and start the Postgres service, you can either use the command in the KB, or just use “service vmware-vpostgres stop” and “service vmware-vpostgres start” as the root user. Note that you need to use the full path for the pg_ctl command, which is “/opt/vmware/vpostgres/1.0/bin/pg_ctl”. It won’t work without the path, and the KB fails to mention that.
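As a rough sketch of the stop/start part only (the actual bad-block removal is in the KB and is not reproduced here), the commands could be wrapped like this. The service name and the pg_ctl path are the ones quoted above. The data directory is an assumption on my part, taken as the parent of the pg_log directory, and the helper names (run, stop_db, start_db, db_status) are invented for this example. By default it only prints the commands.

```shell
# Path and service name as quoted in the post.
PG_CTL="/opt/vmware/vpostgres/1.0/bin/pg_ctl"
PG_SERVICE="vmware-vpostgres"
# Assumption: the data directory is the parent of the pg_log directory.
PG_DATA="${PG_DATA:-/storage/db/vpostgres}"

# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# As root: stop the database before the repair and start it again afterwards.
stop_db()  { run service "$PG_SERVICE" stop; }
start_db() { run service "$PG_SERVICE" start; }

# pg_ctl is called by its full path, since the bare command name did not work.
db_status() { run "$PG_CTL" -D "$PG_DATA" status; }

stop_db
db_status
start_db
```

Set DRY_RUN=0 to actually execute. The service commands are meant for root, while pg_ctl is normally run as the postgres user, so in practice you would run those from separate shells.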
After the fix, the C# client worked fine, but I did get a new error when using the Web Client. You can get rid of that by simply clearing the browser cache.
Now I can finally add my repurposed Mac Mini to my cluster!