It’s possible this is common knowledge, but since I’ve never run into it before, and since the workaround seems a little obscure, I figured I’d put it up here for future reference. Hopefully one or two of you find it useful!
Once again, the SSDs in my Synology 1511+ have caused an issue. The first time it was my fault (repeat after me, RAID0 is never the answer), when the failure of a single disk brought the volume down. I took the SSD that “failed”, ran it through every test I could think of, and when it passed them all, put it back into the array. I know, I know. Shush.
Of course, as all of you are thinking, that drive failed again. But since I’d moved it to a RAID1 set, crisis was averted! The drive is still under warranty, so process the RMA paperwork, send it off, no worries. That is, until the second SSD also failed. :-/
The volume crashed hard again, and the two iSCSI datastores that were presented to the lab went down with it. Three VMs were affected: a domain controller, an OpenVPN client that builds a point-to-point tunnel to the hosted side of the lab, and a <purposely.vague.description> general purpose workstation that runs various software </purposely.vague.description>. Nothing irreplaceable, although having the internal DNS and DHCP services fail with the AD controller did cause much wailing and gnashing of teeth from my internal iPad users.
So the datastores are gone, the VMs are gone, heck, even the two drives that make up the volume are gone. No big deal, it won’t take more than a couple hours to spin up new VMs and get things back to normal. The only nagging issue is that the VMs that have been nuked still show up in vCenter.
I can’t delete them, or even remove them from inventory, since those options are greyed out in the right-click menu.
The knowledge base article that deals with this is #1008752, and the workaround is to delete the hostAgentStats files from /var/lib/vmware/hostd/stats on the host that has the VMs registered. In my case, that was two separate hosts, so I did this on each.
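For reference, here is a minimal sketch of that cleanup as a shell function. It is not the exact command sequence from the KB. The function name clear_host_agent_stats and the optional directory argument are made up for illustration, and it assumes every file of interest starts with "hostAgentStats". Restarting hostd afterwards via /etc/init.d/hostd is also an assumption, so check the KB for your version before running anything on a real host.

```shell
#!/bin/sh
# Sketch only: remove the hostAgentStats files from hostd's stats directory.
# clear_host_agent_stats is a hypothetical helper name; the directory defaults
# to the path mentioned in the post and can be overridden to try it elsewhere.
clear_host_agent_stats() {
    stats_dir="${1:-/var/lib/vmware/hostd/stats}"

    if [ ! -d "$stats_dir" ]; then
        echo "stats directory not found: $stats_dir" >&2
        return 1
    fi

    # Assumption: every file of interest starts with "hostAgentStats".
    for f in "$stats_dir"/hostAgentStats*; do
        # If the glob matched nothing, $f is the literal pattern; skip it.
        [ -e "$f" ] || continue
        echo "removing $f"
        rm -f -- "$f"
    done
}

# Usage on the host (assumption: hostd is restarted afterwards so that it
# rebuilds its stats; on ESXi that is typically done with the init script):
#   clear_host_agent_stats
#   /etc/init.d/hostd restart
```

Pointing the function at a scratch directory first is a cheap way to confirm it only touches the hostAgentStats files and leaves everything else alone.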
Interesting note: the vCenter Web Client is sometimes very slow to pick up the fact that the files have been deleted. It must be caching something, and logging out and back in should resolve it. Or you could do what I did: flip over to the trusty C# client, which updates in real time, and remove the VMs right away.
There you go. Yes, I know that KB refers to ESX2.5 and ESX3.5, but it worked on ESXi 5.5U1 so I’m going with it. If there’s a better way to do this, please leave everyone a note in the comments!