Apache redirects with RewriteMap and RewriteCond

Posted by Peter Burkholder Wed, 15 Jul 2009 01:13:00 GMT

At work we’re migrating several thousand research articles from Zope to another CMS. The CMS folks are taking care of moving the content, but when we’re done we’re going to institute a boatload of R=301 redirects from the old URLs to the new URLs.

RewriteMap is the accepted way with mod_rewrite of handling a lot of one-to-one mappings that don’t follow any particular pattern. What I figured out today was that I can use a rewrite map in a RewriteCond statement so I only do the redirect when there’s a match in the rewrite map lookup.

Here are some snippets from my httpd.conf to illustrate:

For testing, we’ll want logging:


RewriteLog /var/log/httpd/rewrite.log
RewriteLogLevel 2

Define the map we’re using. Each line of the map is ‘old_uri new_uri’, with a space separating the two. Use .txt for testing; below we’ll convert it to a DBM.


RewriteMap research_map txt:/etc/httpd/conf/research_map.txt 
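To make the map format concrete, here is a small shell sketch that writes a sample map and checks that every entry has exactly two fields. The two URL pairs and the temporary file are invented for illustration; the real map would live at the path given in the RewriteMap line.

```shell
# Write a small sample map (hypothetical URLs, for illustration only).
map_file=$(mktemp)
cat > "$map_file" <<'EOF'
/research/old-article-1 /articles/new-article-1
/research/old-section/ /articles/new-section/
EOF

# Count non-comment, non-blank lines that do not have exactly two fields.
bad_lines=$(awk '!/^#/ && NF > 0 && NF != 2 { n++ } END { print n + 0 }' "$map_file")
echo "malformed lines: $bad_lines"
```

A stray space inside a URL would show up here as a malformed line before it could silently break a redirect.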

Here, I got a hint from http://www.tunnell.org/blog_posts_view.php?blog_postid=3. Remember that the syntax for a RewriteCond is:

RewriteCond TestString ConditionPattern

TestString will be a map lookup of $1, where $1 is the match string of the following RewriteRule, expressed: ${research_map:$1}

For ConditionPattern, we will test whether the TestString is lexically greater than "", the empty string, which is what the map lookup returns when there’s no match. Expressed: >""


# i.e., if the map result is lexically greater than ""
RewriteCond ${research_map:$1}  >""

So if the URL starts with /research/, use the research_map value for the key $1 to redirect to the new address:


RewriteRule ^(/research/.*$) ${research_map:$1}      [R=301,L]

What we end up with is a few lines of configuration that quickly let me put in place 3342 new redirects. Here’s the whole stanza:



RewriteMap research_map txt:/etc/httpd/conf/research_map.txt 
# i.e., if the map result is lexically greater than ""
RewriteCond ${research_map:$1}  >""
RewriteRule ^(/research/.*$) ${research_map:$1}      [R=301,L]

# Same thing, but lookup with a trailing slash if there isn't one
RewriteCond ${research_map:$1/}  >"" 
RewriteRule ^(/research/.*[^/]$) ${research_map:$1/}      [R=301,L]
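Before loading a large map into Apache, it can help to dry-run the lookup logic against the text file. The sketch below only imitates the two rule pairs in shell; it is not how mod_rewrite itself works. The function names (map_lookup, resolve_redirect) and the sample entries are hypothetical.

```shell
# Sample map with invented entries (hypothetical paths).
map_file=$(mktemp)
cat > "$map_file" <<'EOF'
/research/paper-a /library/paper-a
/research/topic-b/ /library/topic-b/
EOF

# map_lookup: print the value for an exact key match, or nothing.
map_lookup() {
   awk -v key="$1" '$1 == key { print $2; exit }' "$map_file"
}

# resolve_redirect: imitate the two RewriteCond/RewriteRule pairs.
resolve_redirect() {
   local uri=$1 target
   # Both rules only apply to URLs under /research/
   case $uri in
      /research/*) ;;
      *) return 1 ;;
   esac
   # First rule: exact lookup
   target=$(map_lookup "$uri")
   # Second rule: retry with a trailing slash if there isn't one
   if [ -z "$target" ] && [ "${uri%/}" = "$uri" ]; then
      target=$(map_lookup "$uri/")
   fi
   [ -n "$target" ] && echo "$target"
}

resolve_redirect /research/paper-a
resolve_redirect /research/topic-b
resolve_redirect /research/unknown || echo "no redirect"
```

The second call only succeeds through the trailing-slash retry, which is the case the second rule pair exists to handle.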

Lastly, converting the textfile to a dbm speeds up the lookup by at least an order of magnitude.
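The conversion step itself isn’t shown above. One possible way to do it, assuming the httxt2dbm utility that is bundled with Apache httpd is installed, is sketched here. The output file name is an assumption, and depending on the DBM library in use the map may need an explicit type prefix such as dbm=sdbm: instead of plain dbm:.

```shell
# Assumed paths: the text map from the post and a hypothetical output name.
txt_map=/etc/httpd/conf/research_map.txt
dbm_map=/etc/httpd/conf/research_map.map

# Only attempt the conversion when the tool and the source file are present.
if command -v httxt2dbm >/dev/null 2>&1 && [ -r "$txt_map" ]; then
   httxt2dbm -i "$txt_map" -o "$dbm_map"
else
   echo "httxt2dbm or $txt_map not available; skipping conversion" >&2
fi

# Then point the map at the DBM file in httpd.conf, for example:
#   RewriteMap research_map dbm:/etc/httpd/conf/research_map.map
```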

bash command timeout

Posted by Peter Burkholder Mon, 22 Jun 2009 18:51:00 GMT

At $WORK I maintain a backup system in which our backup server a) starts an SSH process to stop-and-dump our CMS service, then b) SCPs the dump file back to the backup servers for writing to tape. I’ve discovered that the stop-and-dump step would hang for 24 hours* whenever the stop-and-dump Perl script exited but the initiating OpenSSH sshd process did not, which kept the SCP step from going forward.

I’ve decided to put a command timeout on the SSH process, and here’s how it looks in bash:

# Inspired by:
# http://www.ultranetsolutions.com/BASH-terminate-command-after-timeout.html
cmd_timeout() {
   if [ $# -ne 2 ]; then
      echo "cmd_timeout takes 2 arguments: command, timeout in seconds" >&2
      return 2
   fi
   command=$1
   sleep_time=$2

   # run $command in the background; $! has the pid of the backgrounded job
   # ($command is deliberately unquoted so it splits into command and arguments)
   $command &
   cmd_pid=$!

   # sleep for our timeout then kill the process if it is still running
   ( sleep "$sleep_time" && kill "$cmd_pid" && echo "ERROR - killed $command due to timeout $sleep_time exceeded" ) &
   killer_pid=$!

   # 'wait' for cmd_pid to complete.  A clean exit before the timeout gives status zero.
   # If the killer terminates it, or the command fails on its own, the status is non-zero.
   wait "$cmd_pid" &> /dev/null
   wait_status=$?

   if [ "$wait_status" -ne 0 ]; then
      echo "WARNING - command, $command, unclean exit"
   else
      # Normal exit, detach and clean up the useless killer_pid
      disown "$killer_pid"
      kill "$killer_pid" &> /dev/null
   fi

   return $wait_status
}

# second argument is the timeout in seconds (3600 is only an example value)
cmd_timeout "ssh myhost some_long-running_command" 3600
next_command
* but I ought to raise this on an openssh mailing list in case it’s a bug, but anyho…
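
As an alternative to the hand-rolled function, systems with GNU coreutils provide a timeout utility that does much the same job. This is a different technique from the one above, shown only as a sketch; the wrapper name run_with_timeout is hypothetical. GNU timeout exits with status 124 when the time limit is hit.

```shell
# run_with_timeout is a hypothetical helper name, not from the original function.
# Usage: run_with_timeout SECONDS COMMAND [ARGS...]
run_with_timeout() {
   local seconds=$1
   shift
   timeout "$seconds" "$@"
   local status=$?
   # GNU timeout reports 124 when it had to kill the command
   if [ "$status" -eq 124 ]; then
      echo "ERROR - killed '$*' after ${seconds}s timeout" >&2
   fi
   return "$status"
}

run_with_timeout 1 sleep 5
echo "exit status: $?"
```

Passing the command as separate arguments rather than one string avoids the word-splitting concerns of the string-based version.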

Building a VirtualBox server lab on OS X with NAT and Internal Networking (intnet)

Posted by Peter Burkholder Wed, 27 May 2009 00:13:00 GMT

Chef and Puppet, Take 2

A few weeks ago I announced my intent to compare Puppet to Chef, then quickly got bogged down with, among other things, needing to get my Typo installation up to snuff.

I’m going to take another shot at this, with the initial goal of getting Puppet and Chef running on CentOS 4.7 inside VirtualBox on an OS X 10.4 system.

VirtualBox

VirtualBox is a semi-free virtual environment from Sun. The open-source version seems suitable for server testing, but there’s also a more fully-featured proprietary version. I’m choosing it because I have more space on my $WORK system than on my personal system, but I don’t have a VMware Fusion license there. Further, my home virtual Linux systems are all Ubuntu, and $WORK is RHEL, so the exercise makes more sense on a RHEL-derived system if I’m going to get some traction on my plan to get a configuration-management system rolling at my workplace.

Creating a new VirtualBox

First I’m going to set up two instances of Damn Small Linux. In VirtualBox:

  • click “New”
  • select “Linux”, “Debian”, “256 MB RAM”, “34 MB hard drive”
  • create
  • in the system description, select CD/DVD-ROM and connect to the .iso image
    • in VirtualBox, you ‘Add’ isos from various places in your filesystem to a list of available ISOs, so add dsl-4.4.10-initrd.iso
  • Press ‘Start’, and an info window tells you which key to use to release your keyboard from the guest OS.
  • DSL (DamnSmallLinux) boots astonishingly fast
  • The process of cloning a system is not worth it for a diskless system. Just make DamnSmall2 following the same steps.

Out of the box, this created two hosts connected to the world via NAT, but both apparently on the same IP address (10.0.2.15). So far, so good.

What I want is:

  • All VMs to have NAT access to the world, and SSH access from the host operating system via port-forwarding.
  • All VMs to communicate with each other over an internal-only network at 192.168.5.0

It seems that the VirtualBox tutorials out there really muddy up the water. It’s not that hard.

NAT and port-forward ssh access

  • VirtualBox configuration
    • Add network adapter 1 as NAT
    • Run the following script

# Number all hosts up from 1
host=DamnSmall1
port=2201

# let eth0/adapter1 be DHCP NAT
VBoxManage setextradata "$host" \
    "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guestssh/Protocol" TCP 
VBoxManage setextradata "$host" \
    "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guestssh/GuestPort" 22 
VBoxManage setextradata "$host" \
    "VBoxInternal/Devices/pcnet/0/LUN#0/Config/guestssh/HostPort" $port
  • Guest Host Configuration: turns out the IP doesn’t matter, because we use port-forwarding to reach the guest
    • Debian /etc/network/interfaces
iface eth0 inet dhcp
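
The port-forwarding script above handles one guest at a time. A rough sketch of extending it to several guests follows. It prints the VBoxManage commands instead of running them, so they can be reviewed first. The function name forward_ssh_commands is hypothetical, and the numbering simply continues the 2201 convention from the script.

```shell
# Print (rather than run) the SSH port-forwarding commands for each guest.
# Ports are assigned upward from 2201, one per guest, in the order given.
forward_ssh_commands() {
   local port=2201 host
   local key="VBoxInternal/Devices/pcnet/0/LUN#0/Config/guestssh"
   for host in "$@"; do
      echo "VBoxManage setextradata \"$host\" \"$key/Protocol\" TCP"
      echo "VBoxManage setextradata \"$host\" \"$key/GuestPort\" 22"
      echo "VBoxManage setextradata \"$host\" \"$key/HostPort\" $port"
      port=$((port + 1))
   done
}

forward_ssh_commands DamnSmall1 DamnSmall2
```

Once the output looks right it can be piped to sh. After the guests are restarted, each one should be reachable from the host with ssh -p and the matching port on localhost.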

Internal Networking on Static IPs

  • VirtualBox Configuration
    • Connect Adapter 2 to ‘intnet’ internal network
  • Guest Host configuration: Set up static IP addresses on eth1 for the internal network, e.g.
    • Debian /etc/network/interfaces
iface eth1 inet static
    address 192.168.5.201
    netmask 255.255.255.0
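
With more than one guest, each needs its own address on the internal network. The sketch below prints an eth1 stanza per guest number. The function name intnet_stanza and the 200+N addressing scheme are assumptions that extend the 192.168.5.201 example; the auto line is added so Debian brings the interface up at boot.

```shell
# Print an eth1 stanza for guest number N on the 192.168.5.0 internal network.
# The 200+N addressing is an assumption based on the 192.168.5.201 example.
intnet_stanza() {
   local n=$1
   printf 'auto eth1\niface eth1 inet static\n    address 192.168.5.%d\n    netmask 255.255.255.0\n' "$((200 + n))"
}

intnet_stanza 1
intnet_stanza 2
```

Each stanza would be appended to /etc/network/interfaces on the corresponding guest.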