Domain Controllers


Many are the times we’ve run into DNS configuration problems with Microsoft AD.  After being asked for advice a few more times than normal this year, I’ve pulled together several emails for this list of “Troubleshooting Microsoft AD-integrated DNS” highlights below.  We’ll first cover the generic topics of checking the configuration of your server configuration,  then the configuration of the zones themselves. For each topic, we’ll do a checklist followed by an explanation.

Server configuration:

Checklist

  1. Is the server (Windows 2003 or higher) pointing to itself for primary DNS in the network configuration?
  2. If a standalone DC: Does the server have *no* secondary DNS in the network configuration?
  3. If there are multiple DCs: Does the server list only other DCs in the secondary DNS server list in the advanced network configuration?
  4. Does the server have proper forwarders in the DNS server configuration (to the parent domain or to the ISP, but not both)?
  5. In a command prompt, run the following:
    ipconfig /registerdns
    net stop netlogon
    net start netlogon
  6. Read DNS and System logs to make sure there are no issues being reported.
  7. wait 20 minutes

Explanation

One of the major problems we run into is that customers will put the ISP DNS servers in the network configuration on the DC, not in the DNS Forwarders list in the DNS Server configuration.  The DC *is* a DNS server.  It needs to talk to itself, so that it can register crucial DNS settings in its own database.  If its own database can’t find the information requested (such as www.google.com), then the DNS Server service is responsible for looking that data up, and then caching it so that it’s readily available for other clients, too.  This misconfiguration also has the problem of generating DDNS update requests back to the ISP DNS servers, which are ignored at best, and a security leak at worst (like for military/government installations).

I like to tell my Unix customers “the first rule of administering Active Directory is to go get another cup of coffee.” This forces them to take their hands off the keyboard and wait for cross-site replication (hopefully) before making another change.  It’s a good reminder for the seasoned Windows admins, as well.

Zone Configuration

Reverse Lookup Zones

We’ll cover reverse lookup zones before forward lookup zones, for two reasons: 1) customers screw up reverse lookup configuration much more often than forward lookup configuration ; 2) no SRV records in Reverse zones (normally).

Checklist

If you have non-Microsoft DNS servers or multiple AD domains in your environment

  1. Does the server have reverse DNS zones defined?
  2. Does any *other* server (in the DNS Forwarders configuration list) have the same reverse DNS zone defined?
  3. Do the defined reverse zones allow “unsecured dynamic updates”?
  4. Are all IP subnets in your network defined as reverse DNS zones on the primary DNS servers (the last forwarders in the network before the ISP)?
  5. Do you have aging and scavenging turned on in the server settings?  If so (you should), do you have all clients automatically renewing their records (Windows clients will by default)?

If you only have a single AD domain, or no non-Microsoft DNS servers

  1. Does the server have reverse DNS zones defined for all IP subnets (including IPv6) in your network?
  2. Do those reverse DNS zones allow dynamic updates?
  3. Is aging of old records enabled with sane no-refresh and refresh values  in the reverse zones?

Explanation

Each DNS Zone is a database.  There can only be one authoritative owner of the database, defined by the SOA record on the Zone.  Any other DNS servers get their information from this SOA, either by normal queries, or by zone transfer (AD replication does a kind of zone transfer).  If two servers are set up with the same zone (create 0.168.192.in-addr.arpa reverse DNS zone in dns1.contoso.com and ns1.worldwidetoys.com, for example), then there is no mechanism to transfer the information between those two servers.

For example: any individual client will only talk to the DNS server it’s configured to talk to (client1.contoso.com gets its DNS info from dns1.contoso.com and winxp1.worldwidetoys.com gets its information from ns1.worldwidetoys.com). Each client will also send updates only to its own DNS server.  This means that client1.contoso.com will register its IP 192.168.0.10 with dns1.contoso.com, and winxp1.worldwidetoys.com will register its IP 192.168.0.20 with ns1.worldwidetoys.com.  These two records will never be synched between dns1.contoso.com and ns1.worldwidetoys.com.  Therefore, when winxp1.worldwidetoys.com asks ns1.worldwidetoys.com “who has 192.168.0.10?”, ns1.worldwidetoys.com will answer “nobody!”.

The DNS admin must fix this problem by manually registering all of the records from ns1.worldwidetoys.com in the zone stored in dns1.contoso.com, deleting the 0.168.192.in-addr.arpa zone from ns1.worldwidetoys.com, and then setting up a forwarder or conditional forwarder to dns1.contoso.com.  Now, that same query results in ns1.worldwidetoys.com looking in its own database, finding no answer, and reaching out to its forwarders to ask, “who has 192.168.0.10?”.  Similarly, when winxp1.worldwidetoys.com goes to register 192.168.0.20, it is directed, via the SOA record, to send that registration to dns1.contoso.com.  This is why reverse zones often need to allow unsecured dynamic updates.

Forward Lookup Zones

I have a customer who needs this much data now – I’ll follow up with the Forward Lookup zones in a separate post later this week.

I spent more time than I’d care to admit trying to write an LDIF import file for a customer today. I started with a file provided by someone else, which is of course the root of my problem.  After adding the appropriate ” ” after each “:” character (which is absolutely required), when importing it, I was receiving the following error:

There is a syntax error in the input file
Failed on line 21. The last token starts with '-'.
An error has occurred in the program

So I opened the file in Notepad, and saw nothing wrong. I sent it back to the Linux box it came from, opened it in vi, verified it had dos line endings, and still saw nothing wrong with the format, according to the MSDN Document on the subject of LDIF Schema modifications.

Only after scrolling through the file several times did I notice that line 20 wasn’t actually blank. it was a single horizontal tab character.

To recap:

  1. LDIF formatting is extremely specific, including breaking on whitespace appearance.
  2. LDIF formatting is extremely specific, including breaking on whitespace missing.
  3. LDIF formatting is extremely specific, including requiring the “-” to be a line literally on its own.

I upgraded the TNS lab this past week from Windows 2008 to Windows 2008 R2, including replacing the 4 Domain Controllers (rather than upgrading). It gave me a chance to review the procedure for moving a Certificate Server to a new system, which I hadn’t done since 2005. For those who haven’t tried, the procedure for moving a Certificate Server is reasonably well documented at the Microsoft Support site here: http://support.microsoft.com/kb/555012. The part of this that’s especially tricky, especially in our lab, is the renaming of the DC.

In our lab we have an empty forest root, as per the old (Windows 2000-era) Microsoft recommendations, to match several large customer environments. Because it’s a lab, and no clients connect to it, we only have a single DC. I snapshotted it as a backup, and went through the procedure to rename a domain controller, also well documented by Microsoft, this time at TechNet.

For review, the procedure we planned to run was:
netdom computername dc04 /add:dc01.lwtest.corp
netdom computername dc04 /makeprimary:dc01.lwtest.corp
shutdown -r -t 0
netdom computername dc01 /enum
netdom computername dc01 /verify
netdom computername dc01 /rem:dc04.lwtest.corp

I’m still not sure what caused it, but in this case, this command failed:
netdom computername dc04 /makeprimary:dc01.tns.lab
At this point, I couldn’t make the old name primary again (I would get an “Access Denied” error), so I rebooted to see which name had taken. And that’s where things went bad.

When the DC came up, we were getting this error: Netlogon EventID 5602. Source: NETLOGON, EventID: 5602, Data: “An internal error occurred while accessing the computer’s local or network security database.”

Because the DC rename hadn’t completed successfully, the computer couldn’t actually log into itself to load AD. Very bad for the root of the forest. I wasn’t able to find anything helpful in my searches, so thought I’d let you know the fix:

Name it back to the old name and try again:
Reboot into Safe Mode.
netdom computername localhost /makeprimary:dc04.lwtest.corp
shutdown -r -t 0

Boot normally
netdom computername localhost /makeprimary:dc04.lwtest.corp
netdom computername dc01 /enum
netdom computername dc01 /verify
shutdown -r -t 0

After *that* reboot, make sure, with the verify command, that the old name took, and that you can log in, and just try the rename again.

I couldn’t get the “rename back” to take untill after the attempt in safe mode. Strange, but it’s working great now! Hopefully this will help someone.

I recently upgraded the totalnetsolutions.net internal network from ESX 3.5 to ESXi 4.1. The ESX Host upgrade itself is simple, and not worth mentioning. When complete, however, you have an option to upgrade the Guest OS Virtual Hardware from v4 to v7. Support for USB devices, thin-provisioned disks, and supposed speed improvements come with the upgrade.

The process should always be:

1. Upgrade VMware Tools to the latest available version. This pre-stages the drivers for the newest hardware, even though it’s not “installed” yet.
2. Reboot the guest and make sure it boots and runs properly after all upgrades (host and guest) have been completed.
3. Back up the entire guest VM, including the VMX and VMDK files.
4. Upgrade the virtual hardware through vSphere
5. Boot the VM and verify all settings are working properly.

I started the upgrades in the Unix lab. The Red Hat Enterprise Linux (4 and 5) and Ubuntu (10) systems went without a hitch. VMware Tools automatic upgrade went properly, systems rebooted fine, and after upgrading the virtual hardware, I didn’t have to change a thing in the guests. The Solaris 10 x86 guest, had some issues, however. I believe a rescan was all that was required to fix it, but we were planning on rebuilding the box anyways, so used the issues as the final “nail in the coffin” to the old hardware.

On the Windows side, we have 2 pools in our ESX environment: one for test machines, and one running our production environment. We have Domain Controllers (and separate forests) in both environments, but all file and Exchange operations only live in production.

The Windows 2003 DC / Exchange 2003 server came up fine, although it lost its network configuration (adapter MAC changed), so that had to be reset, but is a simple fix.

All Windows 2008 DCs in the test lab, including the RODC, came up fine, but with the same “lost network configuration” hiccup. These systems all have the NTDS data and logs on the C: drive.

The Windows 2008 Server Core DC / File server, however, was a different story. Upon reboot, the server kept giving a BSOD and rebooting, so I couldn’t read the error. As this system is the primary (200GB) file server, primary DNS server (including conditional forwarding to the test lab), and the DC that handles the most load (DNS weight on the Windows 2003 is slightly lower), fixing the Blue Screen was of major importance. This is how it’s been fixed:

1. Safe Mode and “Last known Config” didn’t work, so hit F8 on the boot process to choose “Do not restart on system failure”. This allows you to read the BSOD message. In our case, it was simply “File Not Found”. Which means, no minidump, and you might be sunk.
2. On a whim, since it is a DC, I tried to boot into Directory Services Restore Mode, hoping the “not found” file was AD related… and was right.
3. This leads us down the path of this support article.
4. Immediately upon booting, I ran: ntdsutil files integrity which gave this error:
Could not initialize the Jet engine: Jet Error -566.
Failed to open DIT for AD DS/LDS instance NTDS. Error -2147418113
5. Searching shows there’s not much useful here, but we know it’s a failure to read the DIT. This could be security, or horrid corruption.
6. I quit ntdsutil to try to check the files on the E: drive, where they lived, only to find there was no E: drive. With no MMC, it’s diskpart to the rescue.
7. diskpart
DISKPART> list disk
Disk ### Status Size Free Dyn Gpt
-------- ---------- ------- ------- --- ---
Disk 0 Online 24 GB 0 B
Disk 1 Offline 100 GB 0 B
Disk 2 Offline 100 GB 0 B

8. I ran:
select disk 1
online
select disk 2
online
exit

9. Now I can read the E: drive, so try ntdsutil files integrity again… and get the same error message. Checking the disk, everything looked fine. In Linux, I’d check permissions with a quick “touch filename”, but notepad needed to be used here, only to discover the entire disk was marked read-only. Back to diskpart!
diskpart
select disk 1
attributes disk clear readonly
select disk 2
attributes disk clear readonly

10. Now ntdsutil runs properly, reboot into normal mode, and the system is fixed!

I haven’t seen posts of other people having disks get marked offline and unreadable on their VMs after an upgrade, but this only happened on the Windows 2008 system, and it’s non-system disks.

I built a Windows Server 2008 Server Core DC last week. It’s an interesting exercise because you have to use an unattend.txt file. I found quite a few places online that listed RODC unattend.txt files, but not full read-write DC unattend.txt files. So, attached to this post you’ll find the unattend.txt I used, but also, of more interest, I’m attaching the full help file directly from the server, which I used to create the file.

FIrst, you have to install the server and set an IP address – my previous posts on IP changes on DCs all used netsh commands as well, so if you followed thouse, you should be somewhat prepared for Server Core. I already had a WIndows Server 2003 DC in the environment, so that will be my primary DNS server for the install, untill DCPromo edits the settings.
netsh interface ipv4 set address local static 10.1.1.6 255.255.255.0 10.1.1.1 10
netsh interface ipv4 set dns local static 10.1.1.5
netsh interface ipv4 set wins local static 10.1.1.5

Now networking is set up, we can rename the computer: netdom renamecomputer %computername% /NewName:dc02 and join the domain with etdom join dc02 /domain:foo.local /UserD:FOO\Administrator /reboot:5 /PasswordD:*. The “5″ after the reboot flag says to reboot 5 seconds after completion, and the “*” at the end says to prompt you for your password. I join the system to the domain manually first, because then I can WSUS patch it (if WSUS is in the network), or open up the firewall for any other patching software I have.

Once the server is back from reboot, activate, update the firewall to allow remote MMC connections (if you’re not doing that through GPO already), and install new roles.
slmgr.vbs -ato
netsh advfirewall firewall set rule group="Remote Administration" new enable=yes

The following roles are optional, depending on the service of the server. Mine has DNS and the File Server roles, but not DHCP. None of these are required to install AD Domain Services!
start /w ocsetup DNS-Server-Core-Role
start /w ocsetup DHCPServerCore
start /w ocsetup FRS-Infrastructure
start /w ocsetup DFSN-Server
start /w ocsetup DFSR-Infrastructure-ServerEdition

If this is the first Windows Server 2008 DC in your environment, you’ll need to take the Windows Server 2008 DVD to the DC with the Infrastructure Master role (required for /gpprep only) and run the following (E: assumed as DVD-ROM drive):
e:\sources\adprep\adprep.exe /forestprep
e:\sources\adprep\adprep.exe /domainprep
e:\sources\adprep\adprep.exe /domainprep /gpprep
(Also run adprep /rodcPrep if you plan on building RODCs.)

Now you’re ready to do the DCPromo itself. Create an unattend.txt file. To add a DC to an existing domain, you can use:
[DCInstall]
AutoConfigDNS=Yes
ConfirmGc=Yes
DatabasePath=E:\Windows\NTDS
LogPath=c:\windows\NTDS
RebootOnSuccess=Yes
ReplicaDomainDNSName=foo.local
ReplicaOrNewDomain=Replica
ReplicationSourceDC=dc01.foo.local
SafeModeAdminPassword=passwordhere
SysVolPath=e:\windows\SysVol
UserDomain=foo.local
/Password:passwordhere

DCPromo will wipe out the passwords when it starts, or you can fill in “*” instead of the password, to be prompted. When it’s done, the server will reboot and be a new Global Catalog / DC in your domain. DCPromo will install neccessary binaries and configure the firewall for DC Services for you. It’s quite slick.

And as promised, here are the DCPromo Unattend Options for reference for creating your own unattend.txt.

« Previous PageNext Page »