I was troubleshooting a problem with another vendor’s software tonight on a Red Hat Enterprise Linux 5.3 system. We were able to reproduce the problem in the lab, which was a huge boost and gave us some insight, but we hit a wall when we tried to attach gdb and got this error:

GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-42.el5)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
Reading symbols from /opt/vendor/redacted/process...(no debugging symbols found)...done.
Attaching to program: /opt/vendor/redacted/process, process 16492
ptrace: Operation not permitted.

The weird thing is that we were running gdb as root, and the kernel was 2.6.18. In the latest Ubuntu versions, a security hardening option has been added to the kernel to limit ptrace (the tracing call that gdb relies on) to child processes only. Since this was Red Hat Enterprise Linux 5.3, it didn’t have this option.

Well, it turns out that explaining it to #gdb on Freenode pointed us to the solution: a parent process had already been attached to via “strace -f”. Since only one tracer can be attached to a process at a time, the parent’s “strace”, which follows all forks because of the “-f”, was already attached to the child and blocked our “gdb” from attaching. Simply adding 1 line, the pkill ahead of gdb:

pkill strace
gdb /opt/vendor/redacted/process `pgrep process`

to our debugging solved the mysterious “ptrace: Operation not permitted.” error, which was showing me no results in web searches. FYI: a running strace will block gcore in the same way.
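
One way to spot this situation before reaching for gdb is the TracerPid field in /proc/<pid>/status, which Linux sets to the PID of whatever process is currently tracing the target (0 if nothing is). A minimal sketch; the tracer_of helper and the PROC_ROOT override are made-up names for illustration, not part of gdb or strace:

```shell
# Hypothetical helper: print the PID of the process tracing PID $1 (0 = not traced).
# PROC_ROOT is an illustrative override so the function can be pointed at a copy of /proc.
tracer_of() {
    awk '/^TracerPid:/ { print $2 }' "${PROC_ROOT:-/proc}/$1/status"
}
```

If tracer_of $(pgrep process) prints a non-zero PID, something (here, strace) is already attached, and ps -p on that PID shows what it is.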

(Originally drafted November 2nd, 2007, finally finished and posted much later)
As I posted last night, we built a new Fedora Core 7 box for PHP testing. Whenever at all possible, I leave SELinux enabled on new systems in Enforcing mode. Oracle 10g hasn’t had any issues with it, Oracle E-Business Suite 11i hasn’t had any issues with it, and my NFS and FTP servers run without a hitch. The Oracle systems are RHEL4 (Red Hat Enterprise Linux 4), and the NFS and FTP servers are RHEL5.

However, this new PHP webserver caused a few glitches. I feel a little silly for not catching this as being an SELinux problem earlier, but since it’s caused 0 issues in 9 months of use in production, I didn’t even consider it initially.

What we initially saw was no errors from PHP: all the pages would run without error. php.ini has the following lines:

sendmail_from = from@domain.com
sendmail_path = /usr/sbin/sendmail -t -i

and testing cat mail.txt | /usr/sbin/sendmail -t -i as a non-root user delivered mail properly as well. Combine that with /var/log/maillog being completely empty for every test page loaded, and it was clear that the mail wasn’t getting to postfix (our preferred localhost MTA).

So, I looked at the /var/log/httpd/error_log for apache and found:

sh: /usr/sbin/sendmail: Permission denied
sh: /usr/sbin/sendmail: Permission denied
sh: /usr/sbin/sendmail: Permission denied
sh: /usr/sbin/sendmail: Permission denied
sh: /usr/sbin/sendmail: Permission denied

But I knew that non-root users could access sendmail as defined in php.ini, so I finally decided to tail /var/log/messages and saw:

Nov 2 11:05:41 $(servername) setroubleshoot: SELinux is preventing the sh from using potentially mislabeled files sendmail.postfix (sendmail_exec_t). For complete SELinux messages. run sealert -l c9001c48-5d48-4b7c-9fd7-8400544daa8f

So now to fix it…
This is surprisingly simple, actually. The sad part is, we had this problem, fixed it, forgot about it, had it again, and I blogged it… and lost the post. So this has been sitting in my “drafts” folder for about 10 months now:
setsebool httpd_can_sendmail=true
service httpd restart
service postfix restart

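And retry sending mail. Note that setsebool without -P only changes the running value, so the boolean reverts at the next reboot; add -P to make it persistent. There are a few posts about sendmail and having to change permissions on home directories or on “main.cf”, but I use postfix, and not sendmail, so I don’t know how effective or necessary those changes are.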
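
To check the boolean before and after the change, getsebool prints a line of the form “httpd_can_sendmail --> off”. A small sketch that wraps it; sebool_is_on is a hypothetical helper name, and it assumes the SELinux userland tools are installed:

```shell
# Hypothetical helper: succeed if the SELinux boolean named in $1 is currently "on".
# getsebool prints lines like: httpd_can_sendmail --> off
sebool_is_on() {
    [ "$(getsebool "$1" | awk '{ print $3 }')" = "on" ]
}
```

Something like sebool_is_on httpd_can_sendmail || setsebool -P httpd_can_sendmail on would then only flip the boolean when it is actually off.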


(Edit: repost on 2/23/2012 because of a DB problem losing the original)

I ran into a problem 2 years ago where I couldn’t remember the native packet capture tool for Solaris and couldn’t install tcpdump, so I thought I’d put down as many native packet capture commands as I knew, by OS, in a single place. I’ll update this as I find more, since there are hundreds of operating systems out there.

  • AIX: iptrace: /usr/sbin/iptrace [ -a ] [ -b ][ -e ] [ -u ] [ -PProtocol_list ] [ -iInterface ] [ -pPort_list ] [ -sHost [ -b ] ] [ -dHost ] [ -L Log_size ] [ -B ] [ -T ] [ -S snap_length] LogFile
  • FreeBSD: tcpdump (I think): tcpdump [ -adeflnNOpqRStuvxX ] [ -c count ] [ -C file_size ] [ -F file ] [ -i interface ] [ -m module ] [ -r file ] [ -s snaplen ] [ -T type ] [ -w file ] [ -E algo:secret ] [ expression ]
  • HP-UX: nettl: nettl requires a daemon start, and other setup: /usr/sbin/nettl -traceon kind… -entity subsystem… [-card dev_name…] [-file tracename] [-m bytes] [-size portsize] [-tracemax maxsize] [-n num_files] [-mem init_mem [max_mem]] [-bind cpu_id] [-timer timer_value]
  • Linux 2.4 and higher:
    • tcpdump (some distros): tcpdump [ -AdDefKlLnNOpqRStuUvxX ] [ -c count ] [ -C file_size ] [ -G rotate_seconds ] [ -F file ] [ -i interface ] [ -m module ] [ -M secret ] [ -r file ] [ -s snaplen ] [ -T type ] [ -w file ] [ -W filecount ] [ -E spi@ipaddr algo:secret,… ] [ -y datalinktype ] [ -z postrotate-command ] [ -Z user ] [ expression ]
    • wireshark (some distros, used to be called “ethereal”): GUI-config, no command-line, use tethereal (now tshark) for that
    • tshark: tshark [ -a <capture autostop condition> ] … [ -b <capture ring buffer option>] … [ -B <capture buffer size (Win32 only)> ]  [ -c <capture packet count> ] [ -C <configuration profile> ] [ -d <layer type>==<selector>,<decode-as protocol> ] [ -D ] [ -e <field> ] [ -E <field print option> ] [ -f <capture filter> ] [ -F <file format> ] [ -h ] [ -i <capture interface>|- ] [ -l ] [ -L ] [ -n ] [ -N <name resolving flags> ] [ -o <preference setting> ] … [ -p ] [ -q ] [ -r <infile> ] [ -R <read (display) filter> ] [ -s <capture snaplen> ] [ -S ] [ -t ad|a|r|d|e ] [ -T pdml|psml|ps|text|fields ] [ -v ] [ -V ] [ -w <outfile>|- ] [ -x ] [ -X <eXtension option>] [ -y <capture link type> ] [ -z <statistics> ] [ <capture filter> ]
  • Mac OSX: tcpdump (among others): tcpdump [ -adeflnNOpqRStuvxX ] [ -c count ] [ -C file_size ] [ -F file ] [ -i interface ] [ -m module ] [ -r file ] [ -s snaplen ] [ -T type ] [ -w file ] [ -E algo:secret ] [ expression ]
  • Solaris: snoop: snoop [ -aPDSvVNC ] [ -d device ] [ -s snaplen ] [ -c maxcount ] [ -i filename ] [ -o filename ] [ -n filename ] [ -t [ r | a | d ] ] [ -p first [ , last ] ] [ -x offset [ , length ] ] [ expression ]
  • Windows 2000, XP, 2003, Vista, 2008 and beyond:

Any others anyone wants added (or corrected), just comment or email and I’ll update this.
(Edit 7/29/08 – change tcpdump link)
(Edit 10/13/08 – add tshark info, thanks Jefferson!, and wireshark on Windows)
(Edit 2/23/2012 – repost since a DB problem lost this post.  Thanks wayback machine!)
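
For scripts that have to run on several of these systems, the list above can be turned into a lookup keyed on uname -s. This is only a sketch: capture_tool is a hypothetical helper name, it covers just the Unix-like entries listed, and it picks tcpdump wherever the list says tcpdump is the usual choice:

```shell
# Hypothetical helper: map an OS name (as printed by `uname -s`) to the native
# capture tool from the list above. Defaults to the current system.
capture_tool() {
    case "${1:-$(uname -s)}" in
        AIX)                  echo iptrace ;;
        HP-UX)                echo nettl ;;
        SunOS)                echo snoop ;;    # Solaris reports itself as SunOS
        Linux|FreeBSD|Darwin) echo tcpdump ;;  # Darwin is Mac OS X
        *)                    echo unknown; return 1 ;;
    esac
}
```

Calling capture_tool with no argument reports the tool for the machine you are on; passing a name such as SunOS looks up another system.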

A quick CLI reference for perl people…

perl -e ' my @t=localtime(time() + $ARGV[0]*24*60*60); $t[4]++; $t[5]+=1900; print "$t[4]/$t[3]/$t[5]\n";' XX

I’ve needed this 2x today already, and hope it helps you!

Someone made a comment, as people on the internet are prone to do, so here’s the long-form non-one-liner version:

my $addDays = shift;
my ($second, $minute, $hour, $day, $month, $year, $dayOfWeek, $dayOfYear, $daylightSavings) = localtime(time());
my ($fsecond, $fminute, $fhour, $fday, $fmonth, $fyear, $fdayOfWeek, $fdayOfYear, $fdaylightSavings) = localtime(time() + $addDays*24*60*60);

# Fix the 0-based month and the years-since-1900 offset that localtime() returns:
$month++;  $year  += 1900;
$fmonth++; $fyear += 1900;

print "today is: $month/$day/$year\n";
print "$addDays days from today is: $fmonth/$fday/$fyear\n";

Run it as:

rob@laptop:~$ fdate.pl 50
today is: 1/25/2012
50 days from today is: 3/15/2012
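
On systems with GNU date (most Linux distributions), the same calculation can be sketched in shell. This assumes GNU date’s -d option and its %-m / %-d no-padding format flags, which are not available in every date implementation; days_from is a hypothetical helper name:

```shell
# Hypothetical helper: print the date $1 days from now as M/D/YYYY.
# Optional $2 is a base date (e.g. 2012-01-25) to count from instead of today.
# Assumes GNU date; expects a non-negative number of days.
days_from() {
    date -d "${2:+$2 }+$1 days" +%-m/%-d/%Y
}
```

With the base date from the run above, days_from 50 2012-01-25 should print 3/15/2012, matching the Perl output.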

Because not enough information exists in easy-to-find searches, a simple reminder: SCSI device names (/dev/sdX) can and will change.

A few months ago I hot-added a new disk to an ssh bastion host (a VM on ESX). As these things tend to go, I eventually took a maintenance window and updated firmware/BIOS/OS on the ESX host. When the bastion VM came back online, however, I was presented with an odd error:

[root@bastion ~]: fsck /dev/sdc1
e2fsck 1.39 (29-May-2006)
fsck.ext3: Device or resource busy while trying to open /dev/sdc1
Filesystem mounted or opened exclusively by another program?
[root@oracle1 ~]# cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / ext3 rw,data=ordered 0 0
/dev /dev tmpfs rw 0 0
/proc /proc proc rw 0 0
none /selinux selinuxfs rw 0 0
devpts /dev/pts devpts rw 0 0
tmpfs /dev/shm tmpfs rw 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0
[root@oracle1 ~]# cat /etc/fstab
/dev/main/root / ext3 defaults 1 1
/dev/sdc1 /home ext3 defaults 1 2
/dev/main/var /var ext3 defaults 1 2
/dev/main/tmp /tmp ext3 defaults 1 2
LABEL=/boot /boot ext3 defaults 1 2
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
/dev/main/swap swap swap defaults 0 0
# Beginning of the block added by the VMware software
.host:/ /mnt/hgfs vmhgfs defaults,ttl=5 0 0
# End of the block added by the VMware software

So everything in the fstab is how I left it – /dev/sdc1 is the new disk I added that is giving errors mounting. So I thought to check for corruption on the disk, and found the problem:

[root@oracle1 ~]# fdisk -l
Disk /dev/sda: 42.9 GB, 42949672960 bytes
255 heads, 63 sectors/track, 5221 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 83 Linux
/dev/sda2 14 5221 41833260 8e Linux LVM
Disk /dev/sdb: 42.9 GB, 42949672960 bytes
255 heads, 63 sectors/track, 5221 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 1 5221 41937651 83 Linux
Disk /dev/sdc: 32.2 GB, 32212254720 bytes
255 heads, 63 sectors/track, 3916 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdc1 * 1 3917 31457279+ 8e Linux LVM

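The fdisk output shows that /dev/sdc1 is now an LVM partition, and the plain Linux (type 83) partition I had added has moved to /dev/sdb1. So, a simple fix: change “/dev/sdc1” to “/dev/sdb1” in /etc/fstab (or to LABEL=home, once the filesystem has that label), and boot back up.

It’s not something that’ll probably happen on this server again, but it is something to be aware of, on both VM guests and on physical servers. This is why so many newer Linux distributions use UUID= or LABEL= instead of the device path for SCSI disks.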
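
To find entries that could break the same way, one option is to scan /etc/fstab for filesystems mounted by raw /dev/sdX name. A rough sketch; unstable_fstab_entries is a hypothetical helper name, and it only looks for the /dev/sd prefix:

```shell
# Hypothetical helper: list fstab entries (device and mount point) that refer to
# a disk by its /dev/sdX name, which can change between boots.
# $1 is the fstab file to scan (defaults to /etc/fstab).
unstable_fstab_entries() {
    awk '$1 ~ /^\/dev\/sd/ { print $1, $2 }' "${1:-/etc/fstab}"
}
```

Against the fstab above this would report /dev/sdc1 /home. For each hit, e2label can set a label on an ext2/ext3 filesystem (for a LABEL= entry) and blkid shows the UUID (for a UUID= entry).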
