I recently encountered a problem where one server in a pool shut down unexpectedly, in a way that caused the VMs running on that host to fail. We restarted the host and found that about half of the VMs returned to the pool and could be started on another pool member; however, a handful of VMs were unable to start. Using information I have previously posted, I checked the power-state of these VMs and found them in a halted state. However, they did not appear in the output of the list_domains command. Further attempts at recovery failed.
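If you want to script the power-state check rather than eyeball it, something like the following rough sketch may help. It assumes the host has the xe CLI, that `xe vm-list params=uuid,name-label,power-state` prints blank-line-separated records of `key ( RO): value` lines, and that Python 3 is available; the function names are my own and not part of any Citrix tooling. It does not wrap list_domains, so comparing against that output is still a manual step.

```python
import subprocess


def parse_xe_records(text):
    """Parse xe list-style output into a list of dicts, one per record.

    Assumes records are separated by blank lines and each line looks like
    'name-label ( RW): some value'.
    """
    records, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the record collected so far.
            if current:
                records.append(current)
                current = {}
            continue
        key_part, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'key: value' line, ignore it
        # Drop the '( RO)' / '( RW)' access marker from the key.
        key = key_part.split("(")[0].strip()
        current[key] = value.strip()
    if current:
        records.append(current)
    return records


def halted_vms(text):
    """Return the name-labels of all records whose power-state is 'halted'."""
    return [
        record.get("name-label", "")
        for record in parse_xe_records(text)
        if record.get("power-state") == "halted"
    ]


def fetch_vm_list():
    """Run xe in dom0 and return its raw text output (needs the xe CLI)."""
    return subprocess.check_output(
        ["xe", "vm-list", "params=uuid,name-label,power-state"],
        universal_newlines=True,
    )


if __name__ == "__main__":
    for name in halted_vms(fetch_vm_list()):
        print(name)
```

Keeping the parsing separate from the xe call means the parser can be tried against saved output on a workstation before running anything on the host.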
At that point we took a closer look at the system and, by running the df command from the console, discovered that the dom0 drive had zero free disk space. I connected using WinSCP, browsed to the log directory, and deleted most of the old and large log files, which freed up over 59% of the disk space. Another reboot later, the disk space issue was resolved.
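The same check can be done without WinSCP. Below is a minimal sketch, assuming Python 3 is available on the machine you run it from; it reports how much of the filesystem is free and lists the biggest files under a directory so you can decide what to delete. The `/var/log` path and the function names are my assumptions for illustration, so point it at wherever your logs actually live. It only reports and never deletes anything.

```python
import os
import shutil


def free_fraction(path):
    """Return the free share (0.0 to 1.0) of the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def largest_files(directory, limit=10):
    """Return up to `limit` (size_in_bytes, path) pairs, largest first."""
    found = []
    for root, _dirs, names in os.walk(directory):
        for name in names:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue  # skip symlinks so a target is not counted twice
            try:
                found.append((os.path.getsize(full), full))
            except OSError:
                continue  # file vanished or is unreadable, skip it
    found.sort(reverse=True)
    return found[:limit]


if __name__ == "__main__":
    log_dir = "/var/log"  # assumed log location, adjust as needed
    print("free space on /: {:.1%}".format(free_fraction("/")))
    for size, path in largest_files(log_dir):
        print("{:>12} {}".format(size, path))
```

Running it before and after a cleanup gives a quick sanity check that the space really came back.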
In this case, however, there was a second issue: the host in this state was also hosting the Citrix license server, and because this host was unable to contact the license server, it couldn't start VMs. And since this VM was halted instead of stopped, I couldn't start it on a different host yet. Going into the license manager in XenCenter, I removed licensing on the host, which placed it into a 28-day grace period. Once this was completed, I could restart the halted VMs and then repoint the host back to the license server to retain the Enterprise license feature set.