Monday, 27 July 2015

Cloning rootvg using alt_disk_copy

In a previous article, I covered using alt_disk_copy (see the Resources section) to clone your rootvg disks for ease of back-out when doing AIX® upgrades or upgrades of applications that reside on the rootvg disks. In that article, I did not cover hardware migrations as this was out of scope. In this article, I discuss how this can be achieved. The man page for alt_disk_copy says of the -O option: "Performs a device reset on the target altinst_rootvg. This causes the alternate disk install to not retain any user-defined device configurations. This flag is useful if the target disk or disks become the rootvg of a different system."
In a nutshell, this means that any devices that have had their attributes changed, typically by the system administrator, are reset to their default values. This includes, but is not restricted to, changes to the following:
  • Ethernet cards: reset of IP addresses and hostname
  • Fibre (SAN) cards: reset of any attribute changes, like fc_err_recov
  • sys0: reset of any attribute changes, like maxuproc
  • Reset of the diag hardware email notification list
How you carry out the migration or move using the cloned rootvg disk method depends on your operating requirements. The most common use is to clone a rootvg disk and insert it into the new machine for a new base build. This process can be done with the source machine still running. Indeed, once the cloned disk is across, it is practical to mirror that disk on the new machine, then insert a new disk in the source and re-mirror onto that one, so you have no downtime on the source machine.
However, in this demonstration, I point out procedures you could use to migrate to new hardware, taking the fibre cards and the SAN-attached disks from the source machine to the new machine. As a general rule, there is no need to move the Ethernet cards, as any new machine should already have these present.
Review your existing configuration

If you are only concerned with cloning rootvg to build a new base machine, there is no point in gathering current configurations. Only if you are migrating from the current hardware to new hardware do you need configuration listings, so that you can change attributes on the new system.
When migrating to new hardware, review your inittab settings and comment out the services that you do not want to come up until the migration is complete. It is also advisable to take configuration listings of the current AIX machine, so that any attributes reset by alt_disk_copy can be changed back once the new system is brought up.
The following is not an exhaustive list of the configuration listings you need to check and amend on the new system when migrating, but it is a good starting point:
  • Listings of all VGs and their LV contents
  • Listing of all the PVs
  • lsattr -El output of all fibre and SCSI cards
  • For aio servers on AIX 6 and 7: ioo -a
  • For aio servers on AIX 5.3 and below: lsattr -El aio0
  • Listings of Ethernet cards (including ether-channel if present)
  • Take note of IPs and gateway addresses
  • lscfg -vp output
  • lsdev output (for example, lsdev -Cc disk)
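
The listings above could be captured in one pass with a small script along these lines. This is only a sketch: the output directory /tmp/alpha_config, the function names, and the device names fcs0 and ent0 are assumptions, so substitute whatever lsdev reports on your own system.

```shell
#!/bin/sh
# Sketch: save configuration listings before the migration.
# OUTDIR and the device names (fcs0, ent0) are assumptions; adjust to suit.
OUTDIR=${OUTDIR:-/tmp/alpha_config}

# save_listing NAME COMMAND [ARGS...]
# Runs COMMAND and stores its output (and any errors) in $OUTDIR/NAME.out.
save_listing() {
    name=$1
    shift
    mkdir -p "$OUTDIR" || return 1
    "$@" > "$OUTDIR/$name.out" 2>&1
}

# collect_config: gathers the listings described above. Run as root on AIX.
collect_config() {
    save_listing lspv lspv
    save_listing lsvg lsvg
    for vg in $(lsvg); do
        save_listing "lsvg_$vg" lsvg -l "$vg"
    done
    save_listing sys0 lsattr -El sys0
    save_listing fcs0 lsattr -El fcs0
    save_listing ent0 lsattr -El ent0
    save_listing ioo ioo -a        # AIX 6 and 7; use lsattr -El aio0 on 5.3
    save_listing lscfg lscfg -vp
    save_listing disks lsdev -Cc disk
    save_listing adapters lsdev -Cc adapter
    save_listing ifconfig ifconfig -a
    save_listing routes netstat -rn
}
```

Calling collect_config as root on host alpha leaves one .out file per listing in the output directory. Copy that directory off the machine so the files can be compared against host bravo later.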

Prepare to move

In this demonstration, you can assume:
  • alpha is the source host.
  • bravo is the destination host.

Be mirrored up

On host alpha, make sure the rootvg is mirrored. In this demonstration, we can see that rootvg is not fully mirrored: fslv01, mounted on /opt/db2_09_01, has only one copy. So you need to fix that before unmirroring rootvg.
# lsvg -l rootvg
rootvg:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
hd5                 boot       1       2       2    closed/syncd  N/A
hd6                 paging     4       8       2    open/syncd    N/A
hd8                 jfs2log    1       2       2    open/syncd    N/A
hd4                 jfs2       8       16      2    open/syncd    /
hd2                 jfs2       36      72      2    open/syncd    /usr
hd9var              jfs2       3       6       2    open/syncd    /var
hd3                 jfs2       16      32      2    open/syncd    /tmp
hd1                 jfs2       8       16      2    open/syncd    /home
hd10opt             jfs2       6       12      2    open/syncd    /opt
hd11admin           jfs2       1       2       2    open/syncd    /admin
lg_dumplv           sysdump    8       16      1    open/syncd    N/A
livedump            jfs2       2       4       2    open/syncd    /var/adm/ras/livedump
fwdump              jfs2       1       2       2    open/syncd    /var/adm/ras/platform
fslv01              jfs2       8       8       1    open/syncd    /opt/db2_09_01
loglv01             jfslog     1       2       2    open/syncd    N/A

# mklvcopy fslv01 2
# syncvg -l fslv01
# lsvg -l rootvg
rootvg:
LV NAME             TYPE       LPs     PPs     PVs  LV STATE      MOUNT POINT
hd5                 boot       1       2       2    closed/syncd  N/A
hd6                 paging     4       8       2    open/syncd    N/A
hd8                 jfs2log    1       2       2    open/syncd    N/A
hd4                 jfs2       8       16      2    open/syncd    /
hd2                 jfs2       36      72      2    open/syncd    /usr
hd9var              jfs2       3       6       2    open/syncd    /var
hd3                 jfs2       16      32      2    open/syncd    /tmp
hd1                 jfs2       8       16      2    open/syncd    /home
hd10opt             jfs2       6       12      2    open/syncd    /opt
hd11admin           jfs2       1       2       2    open/syncd    /admin
lg_dumplv           sysdump    8       16      1    open/syncd    N/A
livedump            jfs2       2       4       2    open/syncd    /var/adm/ras/livedump
fwdump              jfs2       1       2       2    open/syncd    /var/adm/ras/platform
fslv01              jfs2       8       16      2    open/syncd    /opt/db2_09_01
loglv01             jfslog     1       2       2    open/syncd    N/A 
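
On a rootvg with many logical volumes, a single-copy LV is easy to miss by eye. The check could be scripted as in the sketch below, which flags any LV whose PP count equals its LP count. The function name unmirrored_lvs is made up for this example.

```shell
#!/bin/sh
# unmirrored_lvs: reads `lsvg -l <vg>` output on stdin and prints the name of
# every logical volume whose PP count equals its LP count (a single copy).
# Header lines and wrapped mount-point fragments are skipped because their
# third and fourth fields are not both numeric.
unmirrored_lvs() {
    awk '$3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && $3 == $4 { print $1 }'
}

# Usage on AIX:
#   lsvg -l rootvg | unmirrored_lvs
```

Against the first listing above, this prints fslv01. A dump device is often deliberately left with one copy, so treat any output as a prompt to check rather than as an error.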

Confirm that your boot image is good and mirrored; if it is not, alt_disk_copy will fail:
# lslv -m hd5
hd5:N/A
LP    PP1  PV1               PP2  PV2               PP3  PV3
0001  0001 hdisk0            0001 hdisk1

Identify the disk to move

Before alt_disk_copy is executed, it is prudent to physically identify the disk that will be removed. Depending on how your disks are placed, use either the hot-swap smit utility or smit diag to identify the disk location using the menu selections:
smit diag
task selection
identify /attention indicator

Make sure you know which disk is hdisk0 and which is hdisk1. A useful tip here is to put a small sticker on each drive with its corresponding hdisk name, so you can be sure you do not remove the wrong disk. In this demonstration, we will initially be moving hdisk1 to the new hardware, so first locate the hdisk1 location code, using:
# lscfg -vp |grep -w hdisk1
  hdisk1           U787F.001.DPM0Y7F-P1-T10-L4-L0   16 Bit LVD SCSI Disk Drive (18200 MB)

Now using smit diag, use the above hdisk1 location code to match up with the device location code you want to identify. Once the hdisk1 light comes on, label it, for reference.
Next, do the following:
  • run bosboot
  • unmirrorvg rootvg
  • reducevg rootvg to remove hdisk1
# bosboot -a
# unmirrorvg rootvg hdisk1
0516-1246 rmlvcopy: If hd5 is the boot logical volume, please run 'chpv -c <diskname>'
        as root user to clear the boot record and avoid a potential boot
        off an old boot image that may reside on the disk from which this
        logical volume is moved/removed.
0516-1804 chvg: The quorum change takes effect immediately.
0516-1144 unmirrorvg: rootvg successfully unmirrored, user should perform
        bosboot of system to reinitialize boot records.  Then, user must modify
        bootlist to just include:  hdisk0.
# reducevg rootvg hdisk1
# lspv
hdisk0          00c23bed7c1b3d4b                    rootvg          active
hdisk1          00c23bedf2976238                    None
hdisk2          00525c6a888e32cd                    apps_vg         active
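
Before starting the clone, it is worth confirming that the target disk really is free. A minimal sketch of that check, parsing lspv output, follows; the function name disk_vg is an assumption made for this example.

```shell
#!/bin/sh
# disk_vg DISK: reads `lspv` output on stdin and prints the volume group
# column for DISK ("None" when the disk belongs to no volume group).
disk_vg() {
    awk -v d="$1" '$1 == d { print $3 }'
}

# Usage on AIX:
#   [ "$(lspv | disk_vg hdisk1)" = "None" ] && echo "hdisk1 is free"
```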

Start the alt_disk_copy

Now we can clone the current AIX rootvg, which is on hdisk0, onto hdisk1. The format for the alt_disk_copy is:
alt_disk_copy -Od hdisk1

Where:
  • O, described earlier, removes all user-defined ODM entries.
  • d is the destination disk where the cloned rootvg is to go; in this example, it is hdisk1.
You can also specify bootlist options, but I prefer to control this myself as part of the checklist during the migration. Listing 1 below contains the full output from running the alt_disk_copy command.


Listing 1. alt_disk_copy output

# alt_disk_copy -Od hdisk1
Calling mkszfile to create new /image.data file.
Checking disk sizes.
Creating cloned rootvg volume group and associated logical volumes.
Creating logical volume alt_hd5
Creating logical volume alt_hd6
Creating logical volume alt_hd8
Creating logical volume alt_hd4
Creating logical volume alt_hd2
Creating logical volume alt_hd9var
Creating logical volume alt_hd3
Creating logical volume alt_hd1
Creating logical volume alt_hd10opt
Creating logical volume alt_hd11admin
Creating logical volume alt_lg_dumplv
Creating logical volume alt_livedump
Creating logical volume alt_fwdump
Creating logical volume alt_fslv01
Creating logical volume alt_loglv01
Creating logical volume alt_lv00
Creating /alt_inst/ file system.
/alt_inst filesystem not converted.
        Small inode extents are already enabled.
Creating /alt_inst/admin file system.
/alt_inst/admin filesystem not converted.
        Small inode extents are already enabled.
Creating /alt_inst/home file system.
/alt_inst/home filesystem not converted.
        Small inode extents are already enabled.
Creating /alt_inst/opt file system.
/alt_inst/opt filesystem not converted.
        Small inode extents are already enabled.
Creating /alt_inst/opt/db2_09_01 file system.
/alt_inst/opt/db2_09_01 filesystem not converted.
        Small inode extents are already enabled.
Creating /alt_inst/storix file system.
Creating /alt_inst/tmp file system.
/alt_inst/tmp filesystem not converted.
        Small inode extents are already enabled.
Creating /alt_inst/usr file system.
/alt_inst/usr filesystem not converted.
        Small inode extents are already enabled.
Creating /alt_inst/var file system.
/alt_inst/var filesystem not converted.
        Small inode extents are already enabled.
Creating /alt_inst/var/adm/ras/livedump file system.
/alt_inst/var/adm/ras/livedump filesystem not converted.
        Small inode extents are already enabled.
Creating /alt_inst/var/adm/ras/platform file system.
/alt_inst/var/adm/ras/platform filesystem not converted.
        Small inode extents are already enabled.
Generating a list of files
for backup and restore into the alternate file system...
Backing-up the rootvg files and restoring them to the
alternate file system...
Modifying ODM on cloned disk.
Building boot image on cloned disk.
Resetting all device attributes.
NOTE: The first boot from altinst_rootvg will prompt to define the new
system console.
Resetting all device attributes.
NOTE: The first boot from altinst_rootvg will prompt to define the new
system console.
forced unmount of /alt_inst/var/adm/ras/platform
forced unmount of /alt_inst/var/adm/ras/platform
forced unmount of /alt_inst/var/adm/ras/platform
forced unmount of /alt_inst/var/adm/ras/livedump
forced unmount of /alt_inst/var/adm/ras/livedump
forced unmount of /alt_inst/var
forced unmount of /alt_inst/var
forced unmount of /alt_inst/usr
forced unmount of /alt_inst/usr
forced unmount of /alt_inst/tmp
forced unmount of /alt_inst/tmp
forced unmount of /alt_inst/storix
forced unmount of /alt_inst/opt/db2_09_01
forced unmount of /alt_inst/opt/db2_09_01
forced unmount of /alt_inst/opt
forced unmount of /alt_inst/opt
forced unmount of /alt_inst/home
forced unmount of /alt_inst/home
forced unmount of /alt_inst/admin
forced unmount of /alt_inst/admin
forced unmount of /alt_inst
forced unmount of /alt_inst
Changing logical volume names in volume group descriptor area.
Fixing LV control blocks...
Fixing file system superblocks...
Bootlist is set to the boot disk: hdisk1 blv=hd5

# lspv
hdisk0          00c23bed7c1b3d4b                    rootvg          active
hdisk1          00c23bedf2976238                    altinst_rootvg

After alt_disk_copy runs, unless otherwise specified, the default action is to set the bootlist to the disk that the clone resides on; in this case, it is hdisk1. We need to err on the side of caution and make sure host alpha will still reboot from hdisk0 in case we have to back out of the migration. So change the bootlist to hdisk0:
# bootlist -m normal hdisk0
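
The change can be confirmed by displaying the bootlist with the -o flag. The sketch below pulls out the first device so it can be compared in a checklist script. The function name first_boot_device is an assumption, and it expects output in the "hdisk0 blv=hd5" form seen at the end of Listing 1.

```shell
#!/bin/sh
# first_boot_device: reads `bootlist -m normal -o` output on stdin and prints
# the first device in the list.
first_boot_device() {
    awk 'NR == 1 { print $1 }'
}

# Usage on AIX:
#   [ "$(bootlist -m normal -o | first_boot_device)" = "hdisk0" ] || echo "check bootlist"
```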

Then shut down host alpha.
At this point, we could have kept host alpha up and dynamically removed the disk by using:
# rmdev -dl hdisk1 

However, eventually we will need to shut host alpha down to remove the remaining rootvg hdisk. So we might as well shut it down now.

Booting the cloned disk

In this scenario, we are moving the current hardware cards from host alpha and inserting them into host bravo. Hosts alpha and bravo are both currently down.
  • Remove the network cable and the SAN fibre cables, if present, along with the fibre cards, and insert these cards into host bravo; then plug the cables in.
  • On host bravo, remove any existing internal boot disks, as we will be replacing these with the disks from host alpha.
  • On host alpha, remove hdisk1 and insert it into host bravo in the location that was previously populated with the boot disks.
  • Bring up host bravo; upon boot up, go into the SMS menu.
When host bravo comes up, it displays the firmware/SMS menu prompts. Press 1 to enter the SMS menu, as shown below in Figure 1.


Figure 1. SMS
Screen shot of the SMS menu 

Go to the boot order list and select the disk to boot up on. There should only be one disk to boot up. Figure 2 shows the newly cloned disk to boot up on.


Figure 2. SMS boot list
Screen shot of the SMS boot list 

Upon booting off that disk, the AIX kernel starts to load as shown in Figure 3. The disk comes up and loads AIX.


Figure 3. Booting up
Screen shot of the AIX kernel booting up 

You are then prompted to define the system console; select 1 for this console, as shown in Figure 4. The console definition has been reset because it is an ODM user-defined attribute.


Figure 4. Identify new console
Screen shot of the system console 

Once AIX is booted up, log in and check your disk:
# lspv
hdisk0          00c23bedf2976238                    rootvg          active

Notice that the disk formerly named hdisk1 on host alpha is now hdisk0 on host bravo. Change the bootlist to boot off this disk until the migration is complete:
# bootlist -m normal hdisk0

Configure the network card with the IP address that was used on host alpha and set the card speed, if required. Confirm that you are connected to the network, test that you can ping beyond your gateway, and ensure that you can reach the DNS via any of the following commands:
nslookup
dig
host

Confirm that you can connect to host bravo via telnet/ssh from a remote session. You may have to start sshd manually on host bravo, as it may not be running:
# startsrc -s sshd

Bring in the other rootvg disk

If all is OK at this point, we can bring in the other rootvg disk from host alpha:
  • Remove the last boot disk (hdisk0) from host alpha.
  • Insert the disk in host bravo; this disk is now hdisk1.
  • Run cfgmgr to discover the disk.
We are not mirroring the disk just yet. First, we need to make sure all the SAN disks are discovered. If we have issues with this, we can take the newly inserted disk back to the original box in case we need to back out of the migration.
If bravo comes up with old_rootvg, just remove the definition with:
# alt_rootvg_op -X old_rootvg




Review hardware attributes and bring in the VGs

On host bravo, compare the aio, sys0, and SCSI/fibre card attributes against the listings taken on host alpha, and check for any changes that need to be made. Adjust these settings, either via smit or the chdev command.
Discover the WWNs for the fibre cards. These are obtained by looking at the output from the lscfg -vp command and searching for the SAN cards fcs0, fcs1, and so on. Once obtained, use these to re-zone the SAN disks on the switch, then import the VGs. If you have more than one SAN VG across different switches, I recommend leaving in only the cable for the switch that holds one of the VGs. Pull the disks in using cfgmgr, then import that VG. Once imported, plug in the other fibre cable, re-run cfgmgr, and import the next VG. The reason is that otherwise all the disks come down the fibre paths at once, and when you import, all the disks will be imported into that one VG, which is probably not what you want. (You do not get this issue with SCSI RAID controllers that have more than one VG.)
If you have problems with importvg, be sure to check the attached cables. You may have to run varyonvg afterwards, as AIX does not automatically vary on the VG if importvg encounters issues.
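
The discover, import, and vary-on sequence for a single VG could be wrapped as in the sketch below. The names apps_vg and hdisk2 are placeholders (the disk numbering on host bravo may differ from host alpha), and the run and import_san_vg helpers are invented for this example. With DRY_RUN left at 1 the commands are only printed, so the sequence can be reviewed before anything is changed.

```shell
#!/bin/sh
# Sketch: bring in one SAN volume group at a time.
# apps_vg and hdisk2 are placeholder names; DRY_RUN=1 (the default here)
# only prints the commands so the sequence can be reviewed first.
DRY_RUN=${DRY_RUN:-1}

# run COMMAND [ARGS...]: print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# import_san_vg VGNAME DISK
import_san_vg() {
    vg=$1
    disk=$2
    run cfgmgr                                 # discover the newly zoned disks
    run importvg -y "$vg" "$disk" || return 1  # import the VG under its original name
    run varyonvg "$vg"                         # in case importvg did not vary it on
    run lsvg -l "$vg"                          # confirm the LVs are listed
}

# Usage: import_san_vg apps_vg hdisk2
# Set DRY_RUN=0 to execute the commands for real (as root, on AIX).
```

Running it once per VG, with only that VG's cable plugged in, follows the one-switch-at-a-time approach described above.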
Next, mount the file-systems.
# mount -a

Check to ensure that all file-systems are mounted and all LVs are open.
# lsvg -l <vg name>

Uncomment the entries that were commented out in the inittab.
Refresh inittab, so it is re-read.
# init q

Now start the applications up and test.

Mirror up

If all looks good at this point and the sanity checks are good, mirror up the other disk. You need to use the force option when bringing it into rootvg.
# extendvg -f rootvg hdisk1 
# mirrorvg rootvg hdisk1
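
The steps above stop at mirrorvg. As an addition to the original procedure, the usual follow-up after mirroring rootvg is to build a boot image on the new mirror and list both disks in the bootlist. The sketch below assumes the disk names used in this demonstration; the run and finish_mirror helpers are invented for this example, and DRY_RUN=1 only prints the commands.

```shell
#!/bin/sh
# Sketch: follow-up steps after mirrorvg, assuming hdisk0 is the current boot
# disk and hdisk1 is the newly added mirror. DRY_RUN=1 only prints commands.
DRY_RUN=${DRY_RUN:-1}

# run COMMAND [ARGS...]: print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# finish_mirror BOOTDISK NEWDISK
finish_mirror() {
    bootdisk=$1
    newdisk=$2
    run bosboot -ad "/dev/$newdisk"                 # build a boot image on the new mirror
    run bootlist -m normal "$bootdisk" "$newdisk"   # allow booting from either disk
    run bootlist -m normal -o                       # display the resulting bootlist
}

# Usage: finish_mirror hdisk0 hdisk1
```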

Oh no, it's all gone terribly wrong!

If the migration or move fails, remove the second (boot) disk from host bravo. Remember that this disk should not have been put into rootvg on host bravo until all the applications have tested OK. It therefore still holds the ODM settings from host alpha, so you are good to go for a back-out. Take the procedures described in this article for moving the cards and cables and simply do them in reverse.
In some situations, the disk may still be shown with the alt_disk tag associated with it. Simply remove this tag before bringing the disk into rootvg and mirroring up.
# alt_rootvg_op -X altinst_rootvg

Conclusion
I have highlighted just one method of a hardware migration. Though this has not been a step-by-step guide, it does provide one way you can approach a migration. There are different variations you can employ to do a hardware migration or move using alt_disk_copy. The alt_disk_copy utility is a good tool to use when you wish to migrate to new hardware and the machines are located in the same room.
