Lustre: Recovering LVM metadata

Over the weekend, while I was out of town on vacation of course, the Lustre server decided to take a crap. I checked my email that morning and found notices of Lustre clients unable to connect. When I checked the server, I found it had reset itself. I'm not sure exactly why: there was a power failure at the time, but the servers have fully redundant power supplies, so a single PSU failure shouldn't have caused a reset. The reset seemed to be what caused the problem. Lustre wasn't mounting correctly, and one of the OSTs was missing.

Part of the Lustre installation is to set up the OSTs as LVM targets. My guess is that this makes it easier to move a target from system to system, since a simple scan will show the device. So why was one of the targets not showing up in the scan? Multipath was working and the multipath device was there, but pvscan was not listing it as a physical volume. Luckily CentOS (well, Red Hat) has great documentation, and I found this document to be of great help: http://www.centos.org/docs/5/html/Cluster_Logical_Volume_Manager/mdatarecover.html
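For anyone wanting to script the "device is there but LVM doesn't see it" check, here is a rough sketch. The helper name pv_listed and the device path /dev/mapper/mpath3 are placeholders I made up for illustration, not what was on my server.

```shell
#!/bin/sh
# pv_listed DEVICE
# Reads the output of `pvs --noheadings -o pv_name` on stdin and succeeds
# only if DEVICE is listed as a physical volume.
# (pv_listed is a hypothetical helper name, not an LVM command.)
pv_listed() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }'
}

# On the real server (as root) the check would look something like this;
# /dev/mapper/mpath3 is a placeholder device path:
#   multipath -ll        # confirm the multipath device itself is present
#   pvs --noheadings -o pv_name | pv_listed /dev/mapper/mpath3 \
#       || echo "device exists but LVM does not list it as a PV"
```

If multipath shows the device and this check still fails, the LVM label or metadata on the device is the likely suspect, which is the situation the Red Hat document covers.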

Unlike the situation in that document, the lvs command was not reporting any errors; it simply wasn't showing the missing target. A nice feature of LVM is that it keeps a backup of the metadata used to create the volumes (under /etc/lvm/backup by default), which can be used to restore that information to the drive. I tried vgcfgrestore to restore the data and got an error message saying a particular UUID was not found. Great, with that I could continue.
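The UUID that vgcfgrestore complains about should also appear in the backup file, so it is worth cross-checking before touching the disk. A small sketch of that lookup follows, assuming the usual text layout of the backup file (a physical_volumes section containing id and device lines). The function name list_backup_pvs and the volume group name vg_ost are my own placeholders.

```shell
#!/bin/sh
# list_backup_pvs FILE
# Prints "UUID DEVICE-HINT" for every physical volume recorded in an LVM
# metadata backup file such as /etc/lvm/backup/<vgname>.
# (list_backup_pvs is a hypothetical helper name.)
list_backup_pvs() {
    awk '
        /^[[:space:]]*physical_volumes[[:space:]]*[{]/ { in_pvs = 1; next }
        /^[[:space:]]*logical_volumes[[:space:]]*[{]/  { in_pvs = 0 }
        # Inside physical_volumes, remember each PV id, then print it
        # alongside the device hint that follows it.
        in_pvs && $1 == "id"     { gsub(/"/, "", $3); id = $3 }
        in_pvs && $1 == "device" { gsub(/"/, "", $3); print id, $3 }
    ' "$1"
}

# Example (vg_ost is a placeholder volume group name):
#   list_backup_pvs /etc/lvm/backup/vg_ost
```

The device recorded in the backup is only a hint from the time the backup was written, so match on the UUID rather than on the device name.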

Using that UUID and the backed-up LVM metadata, I ran pvcreate to recreate the physical volume with its original UUID, pointing it at the backup file so the metadata was written back to the drive. After that, vgcfgrestore was able to find the device and restore the volume group. I then used lvchange to bring the logical volume back online, mounted the device, and got Lustre working again.
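The sequence above, written out as a sketch. I did not save the exact commands I ran, so every value below (volume group, logical volume, device, UUID, mount point) is a placeholder to replace with your own. The script only prints the commands unless DRY_RUN=0 is set, since pvcreate against the wrong device is destructive.

```shell
#!/bin/sh
# All values below are hypothetical placeholders.
VG=vg_ost                                      # volume group name
LV=ost0                                        # logical volume name
DEV=/dev/mapper/mpath3                         # multipath device missing from pvscan
UUID=XXXXXX-XXXX-XXXX-XXXX-XXXX-XXXX-XXXXXX    # UUID from the vgcfgrestore error
BACKUP=/etc/lvm/backup/$VG                     # LVM's default backup location
MNT=/mnt/ost0                                  # mount point for the OST

# run: print the command; execute it only when DRY_RUN=0 is set explicitly.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

recover_pv() {
    # Recreate the PV label with its original UUID, using the backup file.
    run pvcreate --uuid "$UUID" --restorefile "$BACKUP" "$DEV"
    # Restore the volume group metadata now that the PV can be found.
    run vgcfgrestore "$VG"
    # Activate the logical volume again.
    run lvchange -ay "$VG/$LV"
    # Mount the OST so Lustre can use it.
    run mount -t lustre "/dev/$VG/$LV" "$MNT"
}

recover_pv
```

Run it once as-is to review the printed commands, and only then rerun with DRY_RUN=0 as root.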
