General Parallel File System (GPFS) - The Disk Descriptor

The Disk Descriptor


When a disk is defined as an NSD for use by GPFS, a descriptor is written to the disk so that it can be identified when the GPFS daemon starts. The descriptor is written in the first few sectors of each disk and contains information such as the disk name and ID. The layout of the disk descriptor is as follows:

Sector 2 contains the NSD id, which GPFS matches with a GPFS disk name in the /var/mmfs/gen/mmsdrfs file. It is written when the mmcrnsd command is run.

Sector 1 contains the "FS unique id", which is assigned when the NSD is assigned to a file system. This id is matched in the File System Descriptor (FSDesc) to a GPFS disk name. The id is written when one of the GPFS commands mmcrfs, mmadddisk, or mmrpldisk is run.

Sectors 8+ contain a copy of the FSDesc, but it may not be the most current copy. This area of the descriptor is written when mmcrfs, mmadddisk, or mmrpldisk is run. A small subset (1, 3, 5, or 6) of the NSDs in the file system contain the most current version of the FSDesc. These are called the "descriptor quorum" or "desc" disks, and can be seen using the command mmlsdisk -L.
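Before any recovery attempt it can be useful to keep a raw copy of the sectors described above. The following is a minimal sketch, not a GPFS tool: it assumes 512-byte sectors (check the real sector size of your device), it does not decode anything because the on-disk format is not described here, and the names read_sectors and dump_descriptor_area are made up for illustration.

```python
SECTOR_SIZE = 512  # assumption: 512-byte sectors; verify for your device


def read_sectors(path, first_sector, count, sector_size=SECTOR_SIZE):
    """Return the raw bytes of `count` sectors starting at `first_sector`."""
    with open(path, "rb") as dev:
        dev.seek(first_sector * sector_size)
        data = dev.read(count * sector_size)
    if len(data) != count * sector_size:
        raise IOError("short read from %s" % path)
    return data


def dump_descriptor_area(path, sector_size=SECTOR_SIZE):
    """Collect the regions named in the text, undecoded.

    Only the first sector of the FSDesc copy is read here; the text says
    the copy starts at sector 8 but does not say how long it is.
    """
    return {
        "fs_unique_id_sector": read_sectors(path, 1, 1, sector_size),
        "nsd_id_sector": read_sectors(path, 2, 1, sector_size),
        "fsdesc_first_sector": read_sectors(path, 8, 1, sector_size),
    }
```

The function only opens the device for reading, so it cannot clobber anything itself. Reading a real block device normally requires root.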

When GPFS starts up or is told that there are disk changes, it scans all the disks it has locally attached to see which ones have which NSD ids. (A hint file from the last search is kept in /var/mmfs/gen/nsdmap.) If it does not see an NSD id on a disk, it assumes the disk is not a GPFS disk. A mount request checks again that the physical disk it sees has the correct NSD id and also that it has the correct "FS unique id" from the most recent FSDesc.
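The matching step described above can be modeled as a simple lookup. This is an illustrative sketch only, not GPFS code: classify_disks is a hypothetical name, the inputs are plain dictionaries standing in for "what was read from each disk" and "what mmsdrfs records", and the ids and disk names in the usage example are invented.

```python
def classify_disks(found_ids, known_nsds):
    """Sort scanned devices by what their NSD id sector says.

    found_ids:  device path -> NSD id read from the disk, or None if absent
    known_nsds: NSD id -> GPFS disk name (as recorded in mmsdrfs)
    """
    matched = {}    # device -> GPFS disk name
    not_gpfs = []   # devices with no NSD id at all
    unknown = {}    # devices whose NSD id is not in the configuration
    for device, nsd_id in found_ids.items():
        if nsd_id is None:
            not_gpfs.append(device)
        elif nsd_id in known_nsds:
            matched[device] = known_nsds[nsd_id]
        else:
            unknown[device] = nsd_id
    return matched, not_gpfs, unknown
```

For example, with made-up values:

```python
matched, not_gpfs, unknown = classify_disks(
    {"/dev/sdb": "id-A", "/dev/sdc": None, "/dev/sdd": "id-Z"},
    {"id-A": "gpfs1nsd"},
)
```

Here /dev/sdb maps to gpfs1nsd, /dev/sdc is treated as a non-GPFS disk, and /dev/sdd carries an id the configuration does not know, which is what a disk looks like after its NSD id has been overwritten.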

You can see the descriptors on a physical disk (GPFS AIX 3.2.1.9 or later, Linux 3.2.1.16/3.3.0.2 or later) using the mmfsadm command:

mmfsadm test readdescraw /dev/$devname
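To capture this output for several disks at once, the command can be wrapped in a small script. This is a sketch under assumptions: it expects mmfsadm to be on the PATH of a node where GPFS is installed (subprocess raises FileNotFoundError otherwise), the helper names are hypothetical, and the device paths you pass in are your own.

```python
import subprocess


def readdescraw_command(device):
    """Build the argument list for the mmfsadm command shown above."""
    return ["mmfsadm", "test", "readdescraw", device]


def dump_descriptors(devices):
    """Run the command for each device and collect its result.

    Returns device -> (exit code, stdout, stderr), all as text.
    """
    results = {}
    for device in devices:
        proc = subprocess.run(
            readdescraw_command(device),
            capture_output=True,
            text=True,
        )
        results[device] = (proc.returncode, proc.stdout, proc.stderr)
    return results
```

Saving the collected output while the disks are healthy gives you something to compare against later.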

How can a disk descriptor be damaged?

Historically there have been many ways in which the descriptor information has been clobbered:

  1. Creating an EXT file system on top of a GPFS disk.
  2. Installing Linux on a new machine that can see the disks, with options that allow the installer to "take" a disk as swap space or as a disk to install system images on.
  3. Creating an AIX Logical Volume over a raw disk.
  4. Assigning the disk to an Oracle raw volume on which a database is created.
  5. Running the mmcrnsd command with -v no, so that GPFS skips the check that would tell you the disk is already in use. This puts a new NSD id on the disk, and the old identity is lost.
  6. Running mmcrfs, mmadddisk, or mmrpldisk with -v no, so that GPFS skips the check that would tell you the disk is already in use. This clobbers the FS unique id and the FSDesc on the disk and may scatter other user data or system metadata around the disk.

GPFS will not notice that the descriptor sectors have been clobbered until the next time it needs to mount the file system.
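Because nothing flags the damage until the next mount, one way to notice it earlier is to fingerprint the two id sectors and compare them against a saved baseline. The sketch below is an assumption-laden illustration, not a GPFS facility: it assumes 512-byte sectors, it covers only sectors 1 and 2 (the FSDesc area is left out because the text says it is rewritten by normal commands), and the function names are hypothetical.

```python
import hashlib


def id_sector_fingerprint(path, sector_size=512):
    """Return a SHA-256 hex digest of sectors 1 and 2 of `path`."""
    with open(path, "rb") as dev:
        dev.seek(1 * sector_size)          # skip sector 0
        data = dev.read(2 * sector_size)   # sectors 1 and 2
    if len(data) != 2 * sector_size:
        raise IOError("short read from %s" % path)
    return hashlib.sha256(data).hexdigest()


def changed_disks(baseline, paths, sector_size=512):
    """List the paths whose fingerprint differs from the saved baseline.

    baseline: path -> digest recorded earlier; a path missing from the
    baseline is reported as changed.
    """
    return [
        p for p in paths
        if id_sector_fingerprint(p, sector_size) != baseline.get(p)
    ]
```

A fingerprint that changes after a legitimate mmcrnsd, mmcrfs, mmadddisk, or mmrpldisk run is expected, so the baseline has to be refreshed after those commands.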

Can I recover from a lost descriptor?

The only scenario from the list that is recoverable is number 5, which clobbers only the NSD id. This assumes the mistake is noticed before the disk is added to another file system. Attempt the recovery only under the supervision of the IBM GPFS support team.

If you are very lucky (and are extremely careful) and still have a good nsdmap file or a backed-up version of the mmsdrfs file, you can either force a rewrite of the old NSD id back to sector 2, or change the mmsdrfs file manually to use the new NSD id for the existing GPFS disk name. The mmsdrfs file may also contain a "(free)" line for the mmcrnsd command that was issued; that line needs to be cleaned up as well.

The Moral of the Story

Never use "-v no" on mmcrnsd, mmcrfs, mmadddisk, or mmrpldisk. Without it, GPFS performs its check and gives you an appropriate warning if the disk is already in use. This protects against specifying the wrong physical disk, whether because disks have different names on different nodes, because of typos, or because of the various other forgetful things that humans do.

The only useful purpose of "-v no" is for situations in which you were able to run mmdeldisk -p or mmdelfs -p successfully but, because some disks were unavailable, GPFS could not overwrite their old descriptor sectors to mark them as no longer part of a file system. Even in this situation, use it extremely carefully.

