19 Sep 2010

Windows 2008 Server Core ESX Guest Virtual Hardware Upgrade – BSOD on boot

I recently upgraded the totalnetsolutions.net internal network from ESX 3.5 to ESXi 4.1. The ESX Host upgrade itself is simple, and not worth mentioning. When complete, however, you have an option to upgrade the Guest OS Virtual Hardware from v4 to v7. Support for USB devices, thin-provisioned disks, and supposed speed improvements come with the upgrade.

The process should always be:

1. Upgrade VMware Tools to the latest available version. This pre-stages the drivers for the newest hardware, even though it’s not “installed” yet.
2. Reboot the guest and make sure it boots and runs properly after all upgrades (host and guest) have been completed.
3. Back up the entire guest VM, including the VMX and VMDK files.
4. Upgrade the virtual hardware through vSphere.
5. Boot the VM and verify all settings are working properly.
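
Step 3 is the one most easily skipped, so here is a minimal sketch of it in Python. It assumes the guest is powered off and that its datastore folder is reachable as an ordinary filesystem path from wherever the script runs; the function name and the example paths are hypothetical, and the file patterns cover only the VMX and VMDK files named above.

```python
import shutil
from pathlib import Path


def backup_vm_files(vm_dir, backup_dir, patterns=("*.vmx", "*.vmdk")):
    """Copy a guest's VMX and VMDK files into backup_dir.

    vm_dir and backup_dir are hypothetical paths supplied by the caller.
    Returns the names of the files that were copied.
    """
    source = Path(vm_dir)
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for pattern in patterns:
        for path in sorted(source.glob(pattern)):
            # copy2 also preserves timestamps, which helps when comparing
            # the backup against the original later.
            shutil.copy2(path, target / path.name)
            copied.append(path.name)
    return copied


# Example call (paths are placeholders, not from the original setup):
# backup_vm_files("/mnt/datastore1/fileserver", "/mnt/backup/fileserver")
```

Copying by pattern rather than copying the whole folder keeps logs and swap files out of the backup; widen the patterns if you also want snapshots or NVRAM files.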

I started the upgrades in the Unix lab. The Red Hat Enterprise Linux (4 and 5) and Ubuntu (10) systems went without a hitch: the VMware Tools automatic upgrade worked, the systems rebooted fine, and after upgrading the virtual hardware I didn’t have to change a thing in the guests. The Solaris 10 x86 guest had some issues, however. I believe a rescan was all that was required to fix it, but we were planning to rebuild the box anyway, so we treated the issues as the final “nail in the coffin” for the old hardware.

On the Windows side, we have 2 pools in our ESX environment: one for test machines, and one running our production environment. We have Domain Controllers (and separate forests) in both environments, but all file and Exchange services live only in production.

The Windows 2003 DC / Exchange 2003 server came up fine, although it lost its network configuration because the adapter MAC changed. That had to be reset, but it is a simple fix.

All Windows 2008 DCs in the test lab, including the RODC, came up fine, but with the same “lost network configuration” hiccup. These systems all have the NTDS data and logs on the C: drive.

The Windows 2008 Server Core DC / File server, however, was a different story. Upon reboot, the server kept giving a BSOD and rebooting, so I couldn’t read the error. As this system is the primary (200GB) file server, primary DNS server (including conditional forwarding to the test lab), and the DC that handles the most load (DNS weight on the Windows 2003 DC is slightly lower), fixing the Blue Screen was of major importance. This is how I fixed it:

1. Safe Mode and “Last Known Good Configuration” didn’t work, so press F8 during boot and choose “Disable automatic restart on system failure”. This lets you read the BSOD message. In our case, it was simply “File Not Found”, which means there is no minidump, and you might be sunk.
2. On a whim, since it is a DC, I tried to boot into Directory Services Restore Mode, hoping the “not found” file was AD related… and was right.
3. This leads us down the path of this support article.
4. Immediately upon booting, I ran:
ntdsutil files integrity
which gave this error:
Could not initialize the Jet engine: Jet Error -566.
Failed to open DIT for AD DS/LDS instance NTDS. Error -2147418113
5. Searching for these errors doesn’t turn up much that is useful, but we know it’s a failure to read the DIT. This could be a permissions problem, or horrid corruption.
6. I quit ntdsutil to try to check the files on the E: drive, where they lived, only to find there was no E: drive. With no MMC, it’s diskpart to the rescue.
7. diskpart
DISKPART> list disk
  Disk ###  Status      Size     Free     Dyn  Gpt
  --------  ----------  -------  -------  ---  ---
  Disk 0    Online        24 GB      0 B
  Disk 1    Offline      100 GB      0 B
  Disk 2    Offline      100 GB      0 B

8. I ran:
select disk 1
online disk
select disk 2
online disk
exit

9. Now I could read the E: drive, so I tried ntdsutil files integrity again… and got the same error message. Checking the disk, everything looked fine. In Linux, I’d check write access with a quick “touch filename”; here I had to use notepad instead, only to discover the entire disk was marked read-only. Back to diskpart!
diskpart
select disk 1
attributes disk clear readonly
select disk 2
attributes disk clear readonly

10. Now ntdsutil ran properly. I rebooted into normal mode, and the system was fixed!
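
If you have more than a couple of disks to fix, steps 7 through 9 can be scripted. The following is only a sketch in Python, not something I ran during the recovery: it parses text in the shape of the `list disk` output shown above, picks out the disks whose status is Offline, and builds a diskpart script that brings each one online and clears its read-only flag. The function names and the sample text are illustrative assumptions, and the script spells the command out as `online disk`.

```python
import re

# Sample text in the shape of the `list disk` output from step 7.
SAMPLE_LIST_DISK = """\
  Disk ###  Status      Size     Free     Dyn  Gpt
  --------  ----------  -------  -------  ---  ---
  Disk 0    Online        24 GB      0 B
  Disk 1    Offline      100 GB      0 B
  Disk 2    Offline      100 GB      0 B
"""

# Matches data rows such as "Disk 1    Offline ..."; the header row
# ("Disk ###") and the dashed separator do not match.
DISK_ROW = re.compile(r"^\s*Disk\s+(\d+)\s+(\S+)")


def offline_disks(list_disk_output):
    """Return the numbers of disks whose Status column reads Offline."""
    disks = []
    for line in list_disk_output.splitlines():
        match = DISK_ROW.match(line)
        if match and match.group(2).lower() == "offline":
            disks.append(int(match.group(1)))
    return disks


def build_diskpart_script(disk_numbers):
    """Build diskpart commands that online each disk and clear read-only."""
    lines = []
    for number in disk_numbers:
        lines.append(f"select disk {number}")
        lines.append("online disk")
        lines.append("attributes disk clear readonly")
    lines.append("exit")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_diskpart_script(offline_disks(SAMPLE_LIST_DISK)), end="")
```

Save the generated text to a file (the name is up to you) and feed it to diskpart with its /s switch, for example diskpart /s fix-disks.txt. Review the list of disks before running it, since the script acts on every disk reported as offline.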

I haven’t seen posts from other people whose disks were marked offline and read-only on their VMs after an upgrade, but this only happened on the Windows 2008 Server Core system, and only to its non-system disks.


4 Responses to “Windows 2008 Server Core ESX Guest Virtual Hardware Upgrade – BSOD on boot”


  1. Wilbert
    2011-05-17 at 07:56

    Nice one! Solved my problem during the VMware HW upgrade to version 7.

    Another quick option to check if the disk is read-only:

    1) Diskpart
    2) List disk
    3) Select disk x
    4) attributes disk

    Thanks!

  2. 2011-05-17 at 08:57

    Thanks Wilbert. It would have been a lot simpler if I had even thought “the disk might be read-only after the remount, too…” initially. But, we learn as we try new things, right?

  3. 2012-06-22 at 12:03

    Cheers Robert – just had this happen to me:

    Used VMware Fusion to create a VM while waiting for the server to be delivered with ESXi on it.

    Used Converter to ‘install’ the VM on the server, and had the same BSOD.

    Set up as 2nd DC in the main forest and a file server.

    Procedure worked perfectly.

    Many Thanks,

    Tony

