apache trick two
Jun 17, 2005, 01:43am EDT
We all know that short URLs are better than longer ones. So how can one make short URLs?
mod_rewrite (take 1)
The most common approach I’ve seen is to use piles and piles of mod_rewrite rules that handle the conversion for you. This allows you to keep your existing URLs and augment them with short URLs. Here’s an example of the rewrite rules that WordPress automatically generates:
RewriteEngine On
RewriteBase /webnote/wordpress/
RewriteCond %{REQUEST_FILENAME} -f [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [S=43]
RewriteRule ^feed/(feed|rdf|rss|rss2|atom)/?$ /webnote/wordpress/index.php?&feed=$1 [QSA,L]
RewriteRule ^(feed|rdf|rss|rss2|atom)/?$ /webnote/wordpress/index.php?&feed=$1 [QSA,L]
RewriteRule ^webnote/page/?([0-9]{1,})/?$ /webnote/wordpress/index.php?&paged=$1 [QSA,L]
RewriteRule ^page/?([0-9]{1,})/?$ /webnote/wordpress/index.php?&paged=$1 [QSA,L]
RewriteRule ^comments/feed/(feed|rdf|rss|rss2|atom)/?$ /webnote/wordpress/index.php?&feed=$1&withcomments=1 [QSA,L]
RewriteRule ^comments/(feed|rdf|rss|rss2|atom)/?$ /webnote/wordpress/index.php?&feed=$1&withcomments=1 [QSA,L]
… (there are about 30 more lines of this)
This works, but it’s ugly and hard to debug.
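To make the mapping concrete, here is a rough Python imitation of three of the rules above. It is only a sketch for reading the patterns, not how Apache evaluates them: the RULES table and the rewrite() function are names I made up, "{0}" stands in for mod_rewrite's $1, and flags such as QSA and L are ignored. The path is assumed to already have the RewriteBase prefix stripped, as it would in a .htaccess file.

```python
import re

# Three patterns copied from the rule set above, paired with their targets.
# "{0}" plays the role of mod_rewrite's $1 (the first captured group).
RULES = [
    (r"^feed/(feed|rdf|rss|rss2|atom)/?$",
     "/webnote/wordpress/index.php?&feed={0}"),
    (r"^page/?([0-9]{1,})/?$",
     "/webnote/wordpress/index.php?&paged={0}"),
    (r"^comments/feed/(feed|rdf|rss|rss2|atom)/?$",
     "/webnote/wordpress/index.php?&feed={0}&withcomments=1"),
]


def rewrite(path):
    """Return the target of the first matching rule, or None if none match."""
    for pattern, target in RULES:
        match = re.match(pattern, path)
        if match:
            return target.format(match.group(1))
    return None
```

Calling rewrite("page/2/") walks the table top to bottom and stops at the first pattern that matches, which is the same first-match-wins behaviour the L flag gives each rule.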
add a pinch of server scripting
Here’s a variation that uses Apache’s ForceType directive and PHP to make clean URLs. I like this approach better because it keeps all the rewrite logic in code (PHP) rather than having to maintain it externally in a .htaccess file. The one downside is that a separate ForceType rule has to be made for each directory prefix. Again, ick.
python is cute
Vellum, a clever blogging program written in Python, uses a similar trick that it refers to as “funky caching” (I would just call it lazy evaluation). It hooks into Apache’s ErrorDocument directive.
When the user goes to an invalid URL, Vellum rebuilds the site and then redirects the user to the correct page. This is clever, but it sends a 404 Not Found status back and leans on more HTTP redirects than it needs to.
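The build-on-miss idea can be sketched in a few lines of Python. This is not Vellum's code, which I haven't reproduced here: handle_missing(), doc_root and render are hypothetical names, and the sketch writes only the one missing page rather than rebuilding the whole site. It assumes the web server calls it only when the requested file doesn't exist yet.

```python
import os


def handle_missing(url_path, doc_root, render):
    """Build the missing page on first request, then redirect to it.

    render is a caller-supplied function (hypothetical) that returns the
    page contents for a URL path. Returns a (status, headers) pair.
    """
    root = os.path.realpath(doc_root)
    target = os.path.realpath(os.path.join(root, url_path.lstrip("/")))

    # Refuse paths that would escape the document root (e.g. "/../x").
    if not target.startswith(root + os.sep):
        return "404 Not Found", []

    # Write the page to disk so the next request is served as a static file.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "w", encoding="utf-8") as page:
        page.write(render(url_path))

    # Send the browser back to the same URL, which now exists.
    return "302 Found", [("Location", url_path)]
```

After the first hit the file is on disk, so the server never reaches this code again for that URL. That is the "lazy evaluation" part, and the redirect on the first hit is the cost.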
by your powers combined…
So what I end up doing on this site is a combination of the above. First, I use mod_rewrite to hand the request to a script when the URL doesn’t match an existing file or directory:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ /bloopy.py/$1 [L]
This says, if there is no file and there is no directory at the requested URL, send the request to /bloopy.py/[requested_filename]. bloopy.py is then responsible for parsing the requested filename and loading the appropriate page. I like this approach better because:
- It’s not a pile of regular expressions (it could be, but it’s not).
- It sends the right HTTP response codes.
- I don’t have to update my .htaccess file when I want to add a new path. I just add a new case in bloopy.py.
- It’s easier to test new paths because Python is easier to debug than Apache directives.
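For a sense of what such a script can look like, here is a minimal sketch. It is not the real bloopy.py: it assumes the script runs as plain CGI, that the part of the URL after /bloopy.py arrives in PATH_INFO, and the "about" and "archive" paths are invented for illustration.

```python
import os
import sys


def route(path_info):
    """Map the path after /bloopy.py to a (status, body) pair."""
    parts = [p for p in path_info.split("/") if p]

    if not parts:
        return "200 OK", "home page"
    if parts == ["about"]:
        return "200 OK", "about page"
    # Adding a new path means adding a new case here, not a new rewrite rule.
    if len(parts) == 2 and parts[0] == "archive" and parts[1].isdigit():
        return "200 OK", "archive for %s" % parts[1]

    # Unknown paths get a real 404, with no redirect involved.
    return "404 Not Found", "no such page: /%s" % "/".join(parts)


def main(environ=os.environ, out=sys.stdout):
    """Write a CGI response for the path found in PATH_INFO."""
    status, body = route(environ.get("PATH_INFO", ""))
    out.write("Status: %s\r\n" % status)
    out.write("Content-Type: text/plain; charset=utf-8\r\n\r\n")
    out.write(body + "\n")


if __name__ == "__main__":
    main()
```

Because route() is an ordinary function, a new path can be tried from a Python prompt, for example route("/archive/2005"), without touching Apache at all.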
There are some downsides, like being a bit slower than apache directives, but overall this works very well for me.