The past week has given me major troubles.  I was tasked with performing a restore of a large database from our offsite storage.  Upon getting the tapes back I found that their indexes were no longer available and I would need to read them in from the tapes… there were only 107 tapes.  Not knowing the software well enough to accomplish this quickly, I contacted support, where things began to get more… “interesting”.

After four hours on the phone I was able to determine the two tapes that would be needed to recover the 79 GB database file and started reading in the specific saveset that was needed.  Two hours later I was able to start a restore, which failed: 2 GB of the restore file was missing.  After another two hours on the phone with support I was told “Let’s reposition the tape.  It could take a while, on newer technology I’ve seen it take an hour, on LTO1 and LTO2 drives I’ve seen it take 8 hours.”  You guessed it, I have LTO2 drives.  Fortunately I have a multitude of drives to reposition the tapes with, so it won’t impact backups.  Unfortunately I have a time limit on the restore that’s fast approaching.

So what do you do when you back up your file systems?  Do you simply believe that the software you back up with validates your tapes, or do you test them regularly?  Are you satisfied with seeing an email at the end of a backup routine stating “SUCCESS”?  The answer is simply NO.  Your backups are only as good as your ability to restore from them.  Keeping that in mind, along with all the different technologies and services available, what do you choose?

For us the answer is simple.  We require low cost, reliable, secure offsite data storage, as do most companies nowadays.  TAPE.  We’ve looked into colocation services and replicated SANs with virtual tape backup, but the cost far exceeds its benefits.  Tape technology has been proven over and over for decades.  There is no cost effective replacement for a good old fashioned tape, even taking into consideration the troubles it can give you.  Our entire datacenter can be put onto 6 tapes costing $25 each.  4.8TB for $150.

Any good backup initiative should be followed up with an equally adequate restore plan.  So the next time you recommend a backup solution, plan regular test restores to go with it, because there’s nothing worse than spending an entire week restoring one file.
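
One way to make a test restore more than a "SUCCESS" email is to compare the restored file against a checksum recorded when the backup was taken.  The sketch below is only an illustration: it assumes GNU sha256sum is available, and the function names record_checksum and verify_restore are made up for this example, not part of any backup product.

```shell
#!/bin/sh
# Hypothetical helpers for checking a test restore; not tied to any backup software.

# record_checksum FILE: print the SHA-256 of FILE.
# Run this at backup time and keep the output somewhere other than the tape.
record_checksum() {
    sha256sum "$1" | awk '{print $1}'
}

# verify_restore FILE EXPECTED: succeed only if the restored FILE hashes to EXPECTED.
# An unreadable file yields an empty checksum, which is treated as a failure.
verify_restore() {
    actual=$(record_checksum "$1")
    [ -n "$actual" ] && [ "$actual" = "$2" ]
}
```

After a test restore, something like verify_restore /restore/dbfile "$saved_sum" || echo "restore does not match backup" would flag a file that came back short, such as the one missing 2 GB above.  The path and variable here are placeholders.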

We’ve had a few customers and Open users posting about problems with machine accounts trying to access Samba shares and getting denied with:
smbd/sesssetup.c:reply_spnego_kerberos(439) Username DOM\COMPUTER1$ is invalid on this system
The “$” at the end of the account name means it’s a computer account, not a user. We’re seeing this for Citrix MetaFrame application servers on shared storage, startup scripts not stored on a DC, and several other cases.

On a Samba server joined to AD with winbind, this is easy to deal with because Samba’s winbind can treat the computer accounts just like user accounts, and assign them access to the unix filesystem with whatever backend has been configured. When a Samba server is joined with Likewise, however, the machine accounts are not visible, and the “username is invalid” message comes up.

Fortunately, Samba gives us a method to handle this, in the form of the “username map” directive in /etc/samba/smb.conf.  There are two ways to use this; the first is with the username map file.
In smb.conf, simply add:
[global]
username map = /etc/samba/smbusers

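Then create a file named /etc/samba/smbusers and populate it with localuser = aduser pairs.  The local Unix account goes on the left, and on a domain member the AD name is matched in its fully qualified form, as it appears in the log message above:
compacct = DOM\COMPUTER1$
compacct = DOM\COMPUTER2$
citrxact = DOM\CITRIXFARM1$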

and so on. Lastly, you’ll have to add the local accounts from the pairs above:
useradd -c "Account for AD Computers to use Samba" compacct -G users -u 998
useradd -c "Account for AD Citrix Servers to use Samba" citrxact -G users -u 999

Then, whenever one of the AD computers in the list attempts to access the Samba share, it’ll be mapped to the local account.

The problem with this comes when you have a lot of servers that may be changing frequently, like a Citrix MetaFrame farm or a Windows Server 2008 R2 Remote Desktop Services farm, because keeping that file up to date gets hard.  For this case there is the username map script directive, which is added to smb.conf as:
[global]
username map script = /usr/lib/samba/auth/machine-acct-map.pl

Then download this script and save it in /usr/lib/samba/auth/ and make it executable (chmod +x /usr/lib/samba/auth/machine-acct-map.pl). Then run:
useradd -c "Account for AD Computers to use Samba" compacct -G users -u 998
Now, all computers which access the share will be remapped to the “compacct” user, and you won’t have to update a file every time the server farm changes.

Get the file here.
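
For anyone who just wants to see the idea, here is a rough stand-in written in shell.  It is not the Perl script linked above, only a sketch of the same approach.  It relies on the documented behaviour of username map script: Samba passes the client's username as a single argument and reads the mapped name from one line of standard output.  The function name map_account is made up for this example, and "compacct" is the local account created above.

```shell
#!/bin/sh
# Hypothetical stand-in for machine-acct-map.pl (the original is Perl and is not shown here).
# Samba calls the script with one argument, the username from the authentication
# request, and uses the single line printed on stdout as the mapped name.

map_account() {
    case "$1" in
        # Names ending in "$" are computer accounts (e.g. DOM\COMPUTER1$):
        # send them all to the shared local account.
        *'$') printf '%s\n' "compacct" ;;
        # Anything else is passed through unchanged.
        *)    printf '%s\n' "$1" ;;
    esac
}

# Only print a mapping when a username was actually supplied.
if [ "$#" -gt 0 ]; then
    map_account "$1"
fi
```

Passing ordinary users through untouched matters, because the script sees every login and not just the machine accounts.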

aka: Technology by Voodoo, Information Technology by Voodoo, Troubleshooting by Voodoo, Administration by Voodoo, Troubleshooting by Faith, etc.

The act of “trusting” that a computer will do something every time the same way, only because it did the last 2 times you tried it.

The alternative is to actually learn what the computer is doing, so that you can know it will do the same thing each time, because you’ve controlled all of the appropriate parameters.

Usage: “This sysadmin is performing IT by Voodoo – he just asked if I have faith that my file copy will work.”

Now that it’s defined, can we all stop doing it?  There are enough resources on the internet to figure out how anything works down to the API call at least, and in some cases down to the processor registers, if you care to go that far.

I just finished my upgrade from Kubuntu 8.04 to 8.10 this past week (since I had downtime from work, I could afford to break things for a few days).  The upgrade went great, and I’ll write about it shortly, once I get used to the newness.

Anyway, Workstation 6.5 has been giving me problems.  Because of the newness of KDE4, I initially thought it was a KDE problem, but it turns out it’s something between Workstation 6.5 and Ubuntu 8.10.  I just ran the “adapt --dist-upgrade-devel” command from the Ubuntu wiki to upgrade, and upon reboot, I couldn’t “ctrl-alt-ins” or “ctrl-alt-del” to log into my Windows VM, my “Windows/Start” key on the keyboard wouldn’t respond, and my arrow keys wouldn’t work.  Incredibly, when I’d hit the “down” arrow, I’d get the Windows Start menu pop up!!

The fix is easy: edit /etc/vmware/config and add the line below, like so:

sudo vim /etc/vmware/config
G
o (those are the vi commands for "go to the last line of the file" and "open a new line below it")
xkeymap.nokeycodeMap = true
(then press Esc and type :wq to save and quit)

You have to restart your VMs for this change to take effect. Thanks to Duncan Epping for this fix (he posted it in the forums, where I found it).
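
If you'd rather not open an editor, the same line can be appended from the shell.  This is just a sketch: append_once is a made-up helper name, and it only adds the line when an identical one isn't already in the file, so running it twice is harmless.

```shell
#!/bin/sh
# append_once FILE LINE: add LINE to FILE unless an identical line is already present.
# (Hypothetical helper; grep -x matches whole lines, -F treats LINE as a fixed string.)
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}
```

From a root shell that would be: append_once /etc/vmware/config 'xkeymap.nokeycodeMap = true'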

In the Windows world, tools like Group Policy, System Center Configuration Manager, and DesktopAuthority, among others, have been around for 8 or more years to allow fast, simple deployment of software and updates to remote computers, or to force tasks to be run on remote computers.

For the Unix/Linux world, there doesn’t seem to be as much available.

If you have a pure HP-UX shop, there is HP Systems Insight Manager (SIM) with plug-ins available for software deployment, and I believe IBM Tivoli has a function or sub-product which does the same thing if you have all AIX systems. Red Hat Network has a feature to allow commands to be run on your servers, but only whenever they check in with the RHN or your internal Satellite Server (much like Group Policy, except GPO doesn’t allow “in the middle of the day” script creation without GP-Preferences). So what’s available that’s like SCCM or DesktopAuthority – a “click now and do this thing” tool?

A bunch of my customers just have various levels of logging and processing that come down to being a big for loop that ssh’s into a server and runs a command:
for i in $(cat server-list.txt); do scp scriptname "$i":/root/; ssh "$i" "/root/scriptname" | tee "logfile-$i.log"; done
While this works great for smaller commands, if you have a mixed environment, the “scriptname” script has to be intelligent enough to know what it’s running against, or your “server-list.txt” has to be broken up by class of system. In either case, the loop is serial: if you have 200 systems in the list, and the task takes 5 minutes per server, a single install will run for 16-17 hours (200 x 5 = 1000 minutes).
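
To show what I mean by the timing problem, here is a minimal sketch of the same loop run several hosts at a time.  It leans on the -P option of xargs, which GNU and BSD xargs both have but which is not in POSIX, and run_parallel is a name made up for this example.  It is a stopgap to illustrate the idea, not the tool I'm asking about.

```shell
#!/bin/sh
# run_parallel LIST JOBS COMMAND...: run COMMAND once per host named in LIST
# (one host per line), with at most JOBS copies running at the same time.
# Each host name is appended as the last argument to COMMAND.
run_parallel() {
    list="$1"
    jobs="$2"
    shift 2
    xargs -P "$jobs" -n 1 "$@" < "$list"
}
```

The per-host work from the loop above can be handed over as a small inline script, for example:
run_parallel server-list.txt 10 sh -c 'scp -q scriptname "$1":/root/ && ssh -n "$1" /root/scriptname > "logfile-$1.log" 2>&1' _
With 10 hosts at a time, the 200-server, 5-minutes-each job drops from 1000 minutes to roughly 100, assuming the hosts and the network keep up.  It still does nothing for mixed environments or for a host that hangs.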

Software like Likewise Enterprise, which allows Group Policy management of remote computers, is great, because you can have guaranteed delivery and execution of your script or command in (by default) 30 minutes, but my problem is how to get it there in the first place.

So, administrators out there in companies with 1000, 4000, 10000+ servers (or even desktops), what multi-threaded or multi-process tool are you using to tackle this timing/resource problem? Please post below!
