Technology


I upgraded the TNS lab this past week from Windows 2008 to Windows 2008 R2, including replacing the 4 Domain Controllers (rather than upgrading them in place). It gave me a chance to review the procedure for moving a Certificate Server to a new system, which I hadn’t done since 2005. For those who haven’t tried, the procedure for moving a Certificate Server is reasonably well documented at the Microsoft Support site here: http://support.microsoft.com/kb/555012. The trickiest part, particularly in our lab, is renaming the DC.

In our lab we have an empty forest root, as per the old (Windows 2000-era) Microsoft recommendations, to match several large customer environments. Because it’s a lab, and no clients connect to it, the forest root only has a single DC. I snapshotted it as a backup, and went through the procedure to rename a domain controller, also well documented by Microsoft, this time at TechNet.

For review, the procedure we planned to run was:
netdom computername dc04 /add:dc01.lwtest.corp
netdom computername dc04 /makeprimary:dc01.lwtest.corp
shutdown -r -t 0
netdom computername dc01 /enum
netdom computername dc01 /verify
netdom computername dc01 /rem:dc04.lwtest.corp

I’m still not sure what caused it, but in this case, this command failed:
netdom computername dc04 /makeprimary:dc01.lwtest.corp
At this point, I couldn’t make the old name primary again (I would get an “Access Denied” error), so I rebooted to see which name had taken. And that’s where things went bad.

When the DC came up, we were getting Netlogon event ID 5602 (Source: NETLOGON): “An internal error occurred while accessing the computer’s local or network security database.”

Because the DC rename hadn’t completed successfully, the computer couldn’t actually log into itself to load AD. Very bad for the root of the forest. I wasn’t able to find anything helpful in my searches, so I thought I’d share the fix:

Name it back to the old name and try again:
Reboot into Safe Mode.
netdom computername localhost /makeprimary:dc04.lwtest.corp
shutdown -r -t 0

Boot normally
netdom computername localhost /makeprimary:dc04.lwtest.corp
netdom computername dc01 /enum
netdom computername dc01 /verify
shutdown -r -t 0

After *that* reboot, use the verify command to make sure the old name took, check that you can log in, and then just try the rename again.

I couldn’t get the “rename back” to take until after the attempt in Safe Mode. Strange, but it’s working great now! Hopefully this will help someone.

I had a Bourne shell (sh) script whose exit status I needed to capture, but it was being run through “tee” to write a log file, so “$?” always returned the exit status of “tee”, not the script. In a nutshell, it went something like this:
#!/bin/sh
DO_LOG=$1
LOGNAME="`hostname`.out"
if [ "$DO_LOG" -eq "1" ]; then
# Logging is turned on, so relaunch ourself with logging disabled, and tee the output to the logfile
sh $0 0 | tee $LOGNAME
exit $?
fi
#... Do lots of things in the script
exit $ERRORCODE
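
The effect is easy to reproduce in isolation. Here is a minimal sketch, assuming only a POSIX-style sh and tee; the function name pipeline_status is made up for illustration, and the exit code 3 is an arbitrary stand-in for one of the script's error codes:

```shell
#!/bin/sh
# Hypothetical demo: the left side of the pipe exits with 3, but "$?"
# afterwards holds the status of the last command in the pipeline (tee).
pipeline_status() {
    sh -c 'exit 3' | tee /dev/null
    echo "status seen after the pipe: $?"
}

pipeline_status   # prints: status seen after the pipe: 0
```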

Now, the important thing here is that the script sets very specific error codes (we have 16 defined) based on different error states, so that a tool like HP Opsware can give us different reports based on the exit status. When run with “0” for no logging, this works great, but it requires the controlling tool to capture logs, and not all do (especially cheap “for” loops in a shell script).

But when run with logging enabled, all of the fancy error code handling (45 lines of subroutines’ worth) gets lost, because “$?” is equal to the status code of the “tee” command. Bash scripters out there will say “but what about $PIPESTATUS ?” If we could use bash, the code would be:
#!/bin/bash
DO_LOG=$1
LOGNAME="`hostname`.out"
if [ "$DO_LOG" -eq "1" ]; then
# Logging is turned on, so relaunch ourself with logging disabled, and tee the output to the logfile
sh $0 0 | tee $LOGNAME
exit ${PIPESTATUS[0]}
fi
#... Do lots of things in the script
exit $ERRORCODE

(Note the single line change in the conditional exit.)

But, I don’t have the luxury of bash (thanks AIX and FreeBSD and Solaris 8), so we needed to get fancy…
#!/bin/sh
DO_LOG=$1
LOGNAME="`hostname`.out"
if [ "$DO_LOG" -eq "1" ]; then
# Logging is turned on, so relaunch ourself with logging disabled, and tee the output to the logfile
cp /dev/null $LOGNAME
tail -f $LOGNAME &
TAILPID=$!
sh $0 0 >> $LOGNAME 2>&1
RETURNCODE=$?
kill $TAILPID
exit $RETURNCODE
fi
#... Do lots of things in the script
exit $ERRORCODE

In this last example, we’re creating the empty logfile by copying /dev/null to the logname, then starting a backgrounded “tail” command on the empty file. Because we haven’t disconnected STDOUT in the backgrounding, we will still get the screen output we desire from “tail”. The script now only writes *its* output, with redirected STDOUT and STDERR, to the log file, which is already being tailed to the actual screen. At the end of the script, we capture the true exit code, clean up the tail ugliness, and exit with the desired status code.

This does have a serious downside: if the script encounters an error and exits before reaching the “kill”, the “tail” is left running indefinitely on Linux and Solaris, since the kernel there will simply reparent the orphaned process to init. So, if you take this method, be very careful to capture all errors you may possibly encounter. Or, just use a better scripting tool. :)
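
For what it's worth, there is also a way to keep “tee” and still get the script's own exit status in plain sh, with no background “tail” to clean up. This is a different technique from the one above: it passes the status around the pipe on a spare file descriptor. The sketch below is hedged accordingly. The helper name run_logged and its rl_ variables are hypothetical, descriptors 3 and 4 are assumed to be unused, and I have not verified it on every old /bin/sh (Solaris 8 in particular), so test it on your own platforms first.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND, copies its stdout and stderr to both
# the screen and LOGFILE via tee, and returns COMMAND's exit status
# instead of tee's.
run_logged() {
    rl_log=$1
    shift
    # fd 4 becomes a copy of the real stdout, so tee can still reach the screen
    exec 4>&1
    # Inside the backticks, fd 3 points at the command substitution itself.
    # COMMAND's output goes down the pipe to tee; only its exit status is
    # written to fd 3, so that is all the backticks capture.
    rl_status=`{ { "$@" 2>&1 3>&- 4>&-; echo $? >&3; } | tee "$rl_log" >&4; } 3>&1`
    exec 4>&-
    return "$rl_status"
}

# In the wrapper from this post, the logging branch would become roughly:
#   run_logged "$LOGNAME" sh "$0" 0
#   exit $?
```

The trade-off is readability: the descriptor shuffling is harder to follow than the “tail” version, but nothing is left running if the script dies early.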

I have recently pushed the main ESX host for TNS to 70% RAM overcommit since upgrading to 4.1. Interestingly (if expectedly), performance is now the same as it was on 3.5 with 2 fewer VMs and only 50% overcommit. But it’s still pretty poor in the “Lab” performance pool, even after changing that pool from “low” to “normal” shares. So we finally ordered new memory, doubling the server to 16 GB. It goes in Sunday night, so we’ll see how things perform next week when Rob’s on site with customers.

I’ve been fighting K9Mail for weeks now, trying to get it to sync with MailStreet’s (“exchange.ms”) hosted Exchange. If you’ve already followed the instructions at the K9Mail Wiki with no success, read on.

Thanks to the k9mail wiki on debugging connection issues and the fact that I already had the Android SDK installed, I was able to solve the 2 related errors I was getting. I would either get an “HTTP 404 not found” or an “HTTP 501 Not Implemented” depending on the settings I chose. With no additional settings other than suggested in the Wiki, I’d get a “501 not implemented”. If I tried to set a mailbox path, or a WebDAV path, I’d get the HTTP 404 Not Found.

In the debugging log, I saw that the system was calling http://mail.$domain.exchange.ms/$webDAVpath/Inbox, and if I set the path to a full URL, the full URL was still getting appended to that base. When I attempted to hit those same paths in a full browser, I’d always get an HTTP 404. So, digging in my history in Firefox, I found the following (cleaned) path:

http://mail.$domain.exchange.ms/exchange/$emailaddress/

In this case $emailaddress was my Exchange mail address with the “@” stripped out. Appending “Inbox” to the end of this path resulted in a valid load of my OWA inbox.

I then plugged /exchange/$emailaddress/ into the WebDAV box in K9Mail, and my email immediately loaded up.
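
If you want to work out your own value before typing it on the phone, the “strip the @” rule is a one-liner. This is only a sketch: webdav_path is a hypothetical helper name, user@example.com is a placeholder address, and the /exchange/ prefix is simply what worked for my MailStreet account, so other hosts may differ.

```shell
#!/bin/sh
# Hypothetical helper: build the WebDAV path from an Exchange mail
# address by removing the "@", as described above.
webdav_path() {
    wp_stripped=`printf '%s' "$1" | tr -d '@'`
    printf '/exchange/%s/\n' "$wp_stripped"
}

webdav_path 'user@example.com'   # prints /exchange/userexample.com/
```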

Now I have Android syncing my calendars and contacts, and k9mail is handling my massive inbox!

Upgrading software is always required to keep things secure: Windows, WordPress, Mac OS X, Linux, Office, Firefox, etc. So I just finished upgrading TotalNetSolutions.net again. Hopefully I’ll be able to be better about this, now that WordPress does automatic upgrades.

I’ve been doing the automatic upgrades on one of my other sites since they came out. They’re easy, fast, and even more painless than the 3-step upgrade that works so well. So now, I should be able to keep TNS much further away from the “cobbler’s kids” syndrome that so many small companies’ systems suffer from.
