XenServer – Pool Master Recovery (The Missing Part 1 to XenServer Hosts in Halted Mode)

In July of 2012 I wrote a “part 2” regarding XenServer hosts in halted mode. However, I seem to have misplaced part 1, which I’ve rewritten after needing to reference these steps again recently.

There are several events which can cause a XenServer pool to become corrupt. In a recent instance of mine, the pool master was unable to communicate with the HA storage repository (SR) and fenced. I also had another instance where several hosts shut down unexpectedly, and the pool master was among them. Here are the steps I performed to recover the pool master.

  1. Work on recovering the pool, elect the server you want to become the master, and on that box run “xe pool-emergency-transition-to-master”
  2. Once that is completed, on the newly elected/transitioned master, run “xe pool-recover-slaves”
  3. Once that is complete, you should be able to run “xe host-list” and see all of your hosts listed
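
The three steps above can be strung together as a small shell function. This is only a sketch: recover_pool is a name I made up for illustration, and it assumes you run it on the host you have chosen as the new master. The xe subcommands are the same ones listed in the steps.

```shell
#!/bin/sh
# Sketch only: recover_pool is a hypothetical wrapper, not a XenServer tool.
# Run it on the host that should become the new pool master.
recover_pool() {
    # Step 1: make this host the pool master.
    xe pool-emergency-transition-to-master || return 1
    # Step 2: tell the other pool members to follow the new master.
    xe pool-recover-slaves || return 1
    # Step 3: list the hosts so you can confirm every member is back.
    xe host-list
}

# Usage (on the new master):
#   recover_pool
```

The function stops at the first command that fails, so a failed transition never goes on to the recover step.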


Based in part on information from: XenServer System Recovery Guide

Hung VM, unable to force reboot/shutdown

I have been working with a few vendor-provided VMs which run Linux. For some reason this specific set of Linux VMs does not properly respond to reboot or shutdown commands when the VMs are hung. This is even true of force-shutdown. The following process works great for virtual servers that are non-responsive in a XenServer environment, after normal reboot/shutdown attempts have failed.

  1. “xe vm-list name-label={vm logical name}” to get the uuid of the VM that is hung
  2. “list_domains” to list the domain uuid’s so you can determine the domain # of the VM above by matching the uuids from this output with the uuid for your VM from the previous command.
  3. “/opt/xensource/debug/destroy_domain -domid XX” where XX is the domain number from the previous command
  4. “xe vm-reboot name-label={vm logical name} --force”
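
Matching uuids by eye in step 2 is error prone, so here is a hedged sketch of a helper that does the lookup. It assumes list_domains prints rows of the form "id | uuid | state" separated by "|" characters; check the output on your own host before trusting it. domid_for_uuid is a hypothetical name of my own.

```shell
#!/bin/sh
# Sketch only: domid_for_uuid is a hypothetical helper.
# Assumption: list_domains prints "id | uuid | state" rows separated by "|".
domid_for_uuid() {
    # $1 = VM uuid; the list_domains output is read from stdin.
    awk -F'|' -v want="$1" '
        {
            id = $1; uuid = $2
            gsub(/[ \t]/, "", id)      # strip column padding
            gsub(/[ \t]/, "", uuid)
            if (uuid == want) { print id; exit }
        }'
}

# Usage on the XenServer host:
#   list_domains | domid_for_uuid "$vm_uuid"
```

It prints nothing when no row matches, which you can treat as "this VM has no running domain on this host".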



Based in part on information from: http://www.r2dtop.com/xenserver-6-virtual-machine-crash-and-hang-issue/


Install Dell Open Manage on XenServer 5.6 and higher

Installing Dell Open Manage System Administrator (OMSA) on your XenServer provides visibility into the hardware through a web interface. It also permits you to perform SNMP monitoring against the hardware.

First, download the OMSA package from the Dell Support Website (http://support.dell.com) and select your server hardware and XenServer for the operating system.

If you are installing XenServer from scratch, you can simply insert the OMSA CD when prompted for any Supplemental Packs during a normal install. No additional work is necessary, you’re all set!

If you already have XenServer installed, and you are running version 5.6 or higher, simply perform the following:
1) Connect to the console of the XenServer Host (ssh or via XenCenter Console tab)
2a) Insert the CD and type: mount /dev/cdrom /mnt
2b) Or you can copy it to the server using SCP and then mount it using: mount -o loop {omsa file name}.iso /mnt
3) Install it using
# cd /mnt
# ./install.sh
# /opt/dell/srvadmin/sbin/srvadmin-services.sh start
# cd ..
# umount /mnt

4) Now you can use your web browser and point it to https://serverip:1311/

You will login using the same root username and password you use to access your XenServer Host.

XenServer: Changing management adapter in pool

After going through several rounds of problems moving a management adapter for a XenServer pool, I have found the following working process. However, it is because of this process that Citrix makes very clear that you should configure it properly in the first place, and if you need to make changes post-installation, to make them BEFORE you join the host to a pool. Also, you must change the subnet when changing interfaces, even if that means moving to a temporary, non-existent IP address space and then moving back to the correct IP address space once you are on the correct network interface.

However, let’s say you have a pool in production and you need to make the change…

  1. Perform a metadata backup and back up your virtual machines before performing the rest of this procedure.
  2. Disable High Availability from XenCenter, if enabled.
  3. Disable external authentication (Active Directory)
  4. Log on to a pool member from the physical console and change the management interface IP address
  5. From the xsconsole, go to Network and Management Interface > Configure Management Interface.
    1. Note: xsconsole freezes when the change is applied. You can use the key sequence CTRL+Z to gain access to the command prompt and run the CLI steps below. Then, use the command fg %1 to return to xsconsole and exit cleanly.
  6. From the CLI: use the following command: xe pif-reconfigure-ip uuid= IP= gateway= netmask= DNS= mode=
  7. To locate the correct PIF uuid for pif-reconfigure command, use the following command: xe pif-list params=uuid,host-name-label,device,management
  8. From the CLI, run the following command: xe-toolstack-restart
  9. The server enters the emergency mode. Verify that the server is using the new IP address. You can ping it from another host. Try a Secure Shell connection to it, or use the ifconfig command. Verify that the server is in emergency mode by running xe host-is-in-emergency-mode from the CLI. You should get True as the output.
  10. Repeat steps 4 through 9 on each of the pool members.
  11. Change the management interface IP address on the pool master using the same procedure (steps 4 through 7).
  12. Run the following command on the pool master: xe-toolstack-restart
  13. From the CLI, on each of the pool members, run xe pool-emergency-reset-master master-address=IP_OF_THE_MASTER.
  14. Verify the correct status of the pool. Connect with XenCenter to the new master’s IP address and check everything from there.
  15. Re-enable High Availability and external authentication, if required
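
Because a typo in the pif-reconfigure-ip command can leave a host unreachable, I find it helps to print the full command and read it over before running it. The sketch below only builds and prints the command line for a static address; it does not run anything. pif_static_cmd is a hypothetical helper name, the addresses in the usage line are placeholder documentation addresses, and the parameter names are the ones from step 6.

```shell
#!/bin/sh
# Sketch only: pif_static_cmd is a hypothetical dry-run helper.
# It prints the xe command for review instead of executing it.
pif_static_cmd() {
    # $1=PIF uuid  $2=IP  $3=netmask  $4=gateway  $5=DNS
    if [ "$#" -ne 5 ]; then
        echo "usage: pif_static_cmd UUID IP NETMASK GATEWAY DNS" >&2
        return 1
    fi
    echo "xe pif-reconfigure-ip uuid=$1 mode=static IP=$2 netmask=$3 gateway=$4 DNS=$5"
}

# Usage (placeholder values, substitute your own):
#   pif_static_cmd "$pif_uuid" 192.0.2.21 255.255.255.0 192.0.2.1 192.0.2.53
```

Once the printed line looks right, paste it into the host console to apply it.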

If during this process, any of your pool-slave hosts reboot and show missing management interface, and no network cards, please see our post over at: https://reddingitpro.wordpress.com/2012/04/07/xenserver-missing-network-cards-pool-member/

You can also view a video walk through of this process at: http://www.citrix.com/tv/#videos/4330

Adapted from CTX123477

XenServer: Missing network interface – pool member

Several times I have seen that after a shutdown and restart of a XenServer host that is part of a pool, the pool members come up with no management interface because no network cards are shown. The main reason I have seen for this is that the pool master changes its IP address. This could be something as simple as a manual IP address change, a DHCP address change, or a change of the PIF (physical interface) used for management. In these cases, if the pool slave cannot find the master, it will go into emergency mode to protect the VMs. However, the problem this presents is that there are no network cards available on the slave, no management interface, and the VMs which were running on that server (even if you’re using shared storage) are unavailable.

The resolution is very quick and simple…

First, verify that you are in emergency mode by running the following from the command line interface on the pool slave host: “xe host-is-in-emergency-mode”. If it returns TRUE then read on; if it returns FALSE then this will not resolve your problem.

Next, verify what the IP address is of your running POOL MASTER (this assumes your POOL MASTER is still running, otherwise you will need to perform an emergency transition)….

On the pool slave, run the following “xe pool-emergency-reset-master master-address=xxx.xxx.xxx.xxx” — where xxx.xxx.xxx.xxx is the IP address of your working pool master…

Upon success it will notify you it will make the change within 10 seconds….

After 10 seconds, run “xe host-is-in-emergency-mode” — if it returns false you should be all set. You may need to refresh (disconnect/reconnect) to the pool in XenCenter.
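
The check, reset, and re-check sequence above can be sketched as one function to run on the pool slave. rejoin_master is a hypothetical name, and the comparison is done in lower case because I have seen the true/false output written both ways.

```shell
#!/bin/sh
# Sketch only: rejoin_master is a hypothetical helper. Run it on the pool slave.
rejoin_master() {
    master_ip=$1    # IP address of the working pool master
    state=$(xe host-is-in-emergency-mode | tr '[:upper:]' '[:lower:]')
    if [ "$state" != "true" ]; then
        echo "host is not in emergency mode; nothing to do"
        return 0
    fi
    # Point this slave at the master, then give it time to apply the change.
    xe pool-emergency-reset-master master-address="$master_ip" || return 1
    sleep 10
    # Succeeds only if the host has left emergency mode.
    state=$(xe host-is-in-emergency-mode | tr '[:upper:]' '[:lower:]')
    [ "$state" = "false" ]
}

# Usage:
#   rejoin_master xxx.xxx.xxx.xxx
```

A non-zero exit status means the host is still in emergency mode, in which case check the master address before trying again.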

If your pool master is unavailable, or all of your hosts are showing no network adapter, then you will need to transfer the master role to one of the servers. From the command line on that server, run: “xe pool-emergency-transition-to-master”

That will make this host the new pool master. Return to the menu system “xsconsole” and document that IP address as the pool master, and then continue with the steps above.


XenServer: Hung VM

I’ve experienced several instances where a VM appears to hang and is non-responsive, not only at the console level, but also to the XenServer hypervisor and XenCenter. Attempts to force shutdown the server using xe vm-reboot or xe vm-shutdown fail with the error “Another operation involving the object is currently in progress class: VM”.

This has worked consistently to recover this VM.

1 – “xe vm-list” to get the uuid of the VM that is hung
2 – “list_domains” to list the domain uuid’s so you can determine the domain # of the VM above by matching the uuids from this output with the uuid for your VM from the previous command.
3 – “/opt/xensource/debug/destroy_domain -domid XX” where XX is the domain number from the previous command
4 – “xe vm-reboot uuid=XXXX --force” where XXXX is the uuid from the first vm-list command for your VM.
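
Step 3 is the dangerous one: domain 0 is the XenServer control domain, so passing the wrong number to destroy_domain can take down the host itself. Below is a hedged sketch of a guard around that step. safe_destroy_domain and the DESTROY_DOMAIN override variable are hypothetical names of my own; the default path is the one from step 3.

```shell
#!/bin/sh
# Sketch only: safe_destroy_domain and DESTROY_DOMAIN are hypothetical names.
# The default command path is the one used in step 3.
safe_destroy_domain() {
    domid=$1
    case "$domid" in
        ''|*[!0-9]*)
            echo "refusing: '$domid' is not a domain number" >&2
            return 1 ;;
        0)
            echo "refusing: domain 0 is the control domain" >&2
            return 1 ;;
    esac
    "${DESTROY_DOMAIN:-/opt/xensource/debug/destroy_domain}" -domid "$domid"
}

# Usage:
#   safe_destroy_domain 12
```

Setting DESTROY_DOMAIN=echo gives you a dry run that just prints the arguments.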

XenServer 6.0 – Import/Export OVF

We had received several OVFs from a vendor who exported their VMs from VMware, and we needed to import them into our XenServer 6.0 environment. After learning that this functionality is now built into Citrix XenServer and no longer requires XenConverter, we were excited. However, our initial test import failed. After re-reading the documentation and searching several forums, nothing appeared to resolve the problem: the import would start and several seconds later it would fail.

So we imported the images into our VMware environment to ensure the OVFs were good, and even exported them again just to make sure the OVF files themselves were not the issue.

We then tried to export a XenServer VM via OVF and it failed as well. However, we could import and export XVA files without issues. Okay, so we had it narrowed down. A bit more research brought us to this Citrix Blog about TransferVM


We attempted this but it said that the package was already installed.

We then contacted Citrix, who said to try navigating to /opt/xensource/packages/files/transfer-vm and then running uninstall-transfer-vm.sh

However that didn’t work, it prompted for a UUID but it didn’t document anything about the UUID

We brought this back to our test environment and it worked fine, we uninstalled and then installed and our OVF imports work properly. The difference between the test environment and production is that production is in a pool, whereas the test is standalone.

I have tried to find documentation on which UUID it is looking for but at this point I’ve tried it with the pool, host, and sr UUIDs to no avail. I might have to resort to cycling hosts out of the pool into standalone mode and reinstalling the transfer-vm component and then rejoining the pool.

Unrecoverable error during 5.5 restore (from failed 5.6)

This weekend I decided to upgrade our 2 XenServer 5.5 servers in a farm configuration to 5.6 FP1. However, I found conflicting information on how to perform the actual upgrade. The mistake I made was to put the server into maintenance mode before shutting it down. When performing the upgrade you must keep the pool master in normal mode, with all VMs migrated off of it, and then shut it down, which will place the farm into a recovery mode. While in this mode you are supposed to perform the in-place upgrade in a rolling style. I misread that step. So instead I ran the upgrade with the pool master in maintenance mode (thus it was no longer the true pool master, as it had nominated another server to be the master). It let me perform the upgrade, and everything appeared to be working fine. The server rebooted and I was greeted by the regular XSConsole. However I noticed two things:
1) XenCenter still saw the server as offline;
2) XSConsole showed that there were no network interfaces (NO NICS).

After researching the issue, I discovered it was caused by an improper upgrade, but no fear, there is a built-in restore option. Simply insert the upgrade CD and reboot… It will prompt with a restore option. And it was working great until about 95%, where it errored out saying:
“Installer only supports having a single kernel of each type installed. Found 2 of kernel-xen”

Apparently if you have any prior backups on the server, plus the one made during the upgrade, the restore will fail. I found a Citrix Forum post http://forums.citrix.com/message.jspa?messageID=1521356 which described my specific situation, and I attempted the recovery with no success. Having only mild Linux experience, it took me a while to discover what I was missing from that forum post, since I am a Microsoft guy. Here are the actual steps for a Windows guy:
1) Reboot the server with the 5.6 upgrade CD
2) When prompted for advanced setup, press F2 (it will quickly auto select standard install if you aren’t watching)
3) It will prompt you for which advanced setup mode, type “shell” and press enter (no quotes)
4) Setup will continue and dump you to a command line
5) Type “vi /opt/xensource/installer/backend.py” and press enter (again, without the quotes)
6) You are now in the vi editor, which is a pain. You can google for how to navigate, but for the purposes of this, type “/kernel” and press enter, then repeat that until you see the line beginning with “assert len(out) == 1, “Installer only supports having a single kernel ”
7) with the cursor over that line, type dd (this should delete the entire line)
8) Then move the cursor over to “return out[0]” and press “a” to enter append mode, change it to read “return out[-1]”, then press “esc” and type “ZZ” (case sensitive).
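
If you would rather not fight with vi, the same two edits from steps 6 through 8 can be sketched with sed. This is an assumption on my part, not something from the forum post: it assumes sed is available in the installer shell and that the assert sits on a single line, as described above. patch_backend is a hypothetical name. It keeps a copy of the original file with a .orig suffix and only touches the first “return out[0]” that follows the assert.

```shell
#!/bin/sh
# Sketch only: patch_backend is a hypothetical helper.
# Assumptions: sed exists in the installer shell, and the assert is one line.
patch_backend() {
    file=$1
    # Keep the untouched original next to the patched file.
    cp "$file" "$file.orig" || return 1
    # Within the span from the assert line to the next "return out[0]":
    # delete the assert line and change the return to use the last entry.
    sed -e '/Installer only supports having a single kernel/,/return out\[0\]/{' \
        -e '/Installer only supports having a single kernel/d' \
        -e 's/return out\[0\]/return out[-1]/' \
        -e '}' "$file.orig" > "$file"
}

# Usage from the installer shell:
#   patch_backend /opt/xensource/installer/backend.py
```

Compare the file against the .orig copy afterwards to confirm that only those two lines changed.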

Microsoft Licensing and Virtualization

Just a reminder that when performing a P2V from a server which uses OEM licensing, moving that license to new hardware violates the EULA. So we need to ensure that during the proposal phase we’re purchasing an open license for the server we’re virtualizing. In many cases, after a P2V, during the initial boot, if it was OEM licensing, it will force an immediate activation with no grace period. Attempts to activate online or through the automated phone system will fail. You must talk to an agent, who may or may not let you activate the OEM software on different hardware.

You can re-enter the product key, and it will cause a new activation ID to be generated which will work with an agent most of the time. But again, this still technically violates the OEM EULA. Also know that OEM media will not accept open license keys, only OEM keys.

One other option exists as well for OEM. If you purchased your OEM version of software within the last 90 days, you can simply purchase an Open License Software Assurance (without license) which is typically around 30% of full license cost, and it will effectively convert your OEM license to a standard Open License.

XenServer: SNMP Monitoring

To enable basic SNMP monitoring of the XenServer host, you have the following options:

  • Dell/HP Embedded XenServer comes with SNMP enabled by default

  • Downloaded versions come with SNMP disabled; enable it using Citrix ctx116187

  • In brief, from SSH:
    service snmpd start       (this enables SNMP service)
    chkconfig snmpd on       (this ensures SNMP starts after reboot)
  • Disregard the other “installation” steps in the KB, as they apply to release 3.x and prior; no more recent KB is available
  • You can customize and lock down SNMP via vi /etc/snmp/snmpd.conf (if you know what you’re doing and know the vi editor)
