bash command timeout

Posted by Peter Burkholder Mon, 22 Jun 2009 18:51:00 GMT

At $WORK I maintain a backup system in which our backup server a) starts an SSH process to stop-and-dump our CMS service, then b) SCPs the dump file back to the backup server for writing to tape. I’ve discovered that the stop-and-dump step would hang for 24 hours* when the stop-and-dump Perl script had exited but the initiating OpenSSH sshd process had not, which kept the SCP step from going forward.

I’ve decided to put a command timeout on the SSH process, and here’s how it looks in bash:

# Inspired by:
# http://www.ultranetsolutions.com/BASH-terminate-command-after-timeout.html
cmd_timeout() {
   # die is assumed to be defined elsewhere in the script (it is not shown in this post)
   [ $# -eq 2 ] || die "cmd_timeout takes 2 arguments"
   command=$1
   sleep_time=$2

   # run $command in the background; $! holds the pid of the backgrounded job.
   # $command is deliberately unquoted so the string is split into command and arguments
   $command &
   cmd_pid=$!

   # sleep for our timeout then kill the process if it is still running
   ( sleep "$sleep_time" && kill "$cmd_pid" && echo "ERROR - killed $command due to timeout $sleep_time exceeded" ) &
   killer_pid=$!

   # 'wait' for cmd_pid to complete.  If it finishes before the timeout is reached, the
   # status is the command's own exit status (zero on success).  If the killer terminates
   # it, the status is non-zero (128 + the signal number)
   wait "$cmd_pid" &> /dev/null
   wait_status=$?

   if [ $wait_status -ne 0 ]; then 
      echo "WARNING - command, $command, unclean exit" 
   else
      # Normal exit, detach and clean up the useless killer_pid
      disown $killer_pid
      kill $killer_pid &> /dev/null
   fi

   return $wait_status
}

# second argument is the timeout in seconds (3600 is only an example value)
cmd_timeout "ssh myhost some_long-running_command" 3600
next_command

* I ought to raise this on an OpenSSH mailing list in case it’s a bug, but anyho…
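
One limitation of the function above is that it takes the command as a single string and splits it on whitespace, so arguments containing spaces get broken up. Here is a rough sketch of a variant that takes the timeout first and the command as separate words. The name run_with_timeout and the sleep/true stand-in commands are made up for illustration, and I have only reasoned about this under bash:

```shell
#!/bin/bash
# Hypothetical variant of cmd_timeout: timeout first, then the command as separate words.
run_with_timeout() {
   local seconds=$1
   shift

   # "$@" preserves each argument exactly as the caller quoted it
   "$@" &
   local cmd_pid=$!

   # watchdog: kill the command if it is still running after $seconds
   ( sleep "$seconds" && kill "$cmd_pid" 2>/dev/null ) &
   local killer_pid=$!

   wait "$cmd_pid" 2>/dev/null
   local rc=$?

   # the command is done either way, so stop the watchdog and reap it
   kill "$killer_pid" 2>/dev/null
   wait "$killer_pid" 2>/dev/null
   return $rc
}

run_with_timeout 1 sleep 5     # exceeds the timeout, gets SIGTERM
slow_status=$?
run_with_timeout 2 true        # finishes well within the timeout
fast_status=$?
echo "slow=$slow_status fast=$fast_status"   # prints: slow=143 fast=0
```

The 143 is 128 plus 15, the signal number of SIGTERM, which is how wait reports a process that was killed. As in the original, killing the watchdog subshell can leave its sleep running until the timeout expires; it is harmless but worth knowing about.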

while read scripting trick

Posted by Peter Burkholder Tue, 13 Mar 2007 00:44:00 GMT

I saw this mentioned on the dc-sage email list but missed the particular example, until Sweth Chandramouli posted the following example for testing whether nameservers in /etc/resolv.conf are actually working:

#!/bin/sh
while read TOKEN IP ; do
   case $TOKEN in nameserver )
      echo "Testing DNS query against $IP: `dig -x 127.0.0.1 @$IP | grep ';; ->>'`" |\
      logger -p local3.info -t check_dns ;;
   esac
done < /etc/resolv.conf
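
What makes this work is how read assigns fields: the first word of each line goes to the first variable, and the last variable gets everything that remains. A small sketch of that splitting, using a hypothetical helper named parse_line and made-up sample lines (the here-string syntax assumes bash rather than plain sh):

```shell
#!/bin/bash
# Hypothetical helper showing how read splits a line into TOKEN and the rest.
parse_line() {
   local TOKEN REST
   read -r TOKEN REST <<< "$1"
   echo "token=[$TOKEN] rest=[$REST]"
}

first=$(parse_line "nameserver 192.0.2.1")
second=$(parse_line "options timeout:2 attempts:3")
echo "$first"    # token=[nameserver] rest=[192.0.2.1]
echo "$second"   # token=[options] rest=[timeout:2 attempts:3]
```

So lines that don't start with nameserver are still read, they just fall through the case statement untouched.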

Nice trick, although I’d rather pipe the input into while with cat at the top:

cat /etc/resolv.conf |
while ...
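
One caveat with the pipe form: in bash with default settings, each part of a pipeline runs in a subshell, so variables set inside the loop are gone once it finishes. That doesn't matter for the logger example, but it bites if the loop is meant to collect something. A minimal sketch, using a temporary file with made-up contents in place of /etc/resolv.conf:

```shell
#!/bin/bash
# Hypothetical sample data standing in for /etc/resolv.conf
sample=$(mktemp)
printf 'search example.com\nnameserver 192.0.2.1\nnameserver 192.0.2.2\n' > "$sample"

# Pipe form: the loop runs in a subshell, so the count is lost afterwards
piped_count=0
cat "$sample" | while read -r TOKEN IP; do
   case $TOKEN in nameserver ) piped_count=$((piped_count + 1)) ;; esac
done

# Redirect form: the loop runs in the current shell, so the count survives
redirected_count=0
while read -r TOKEN IP; do
   case $TOKEN in nameserver ) redirected_count=$((redirected_count + 1)) ;; esac
done < "$sample"

rm -f "$sample"
echo "piped=$piped_count redirected=$redirected_count"   # prints: piped=0 redirected=2
```

Other shells differ here (ksh and zsh run the last pipeline stage in the current shell), so treat the piped=0 result as a bash observation.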

Oh yes, I have a new job. Director of System and Network Administration at EchoDitto, which is a topic meriting several blog posts.