Monthly Archives: May 2010

Making sense of RewriteMaps in Apache

Title: Using RewriteMap to ease site migrations
Date: 2010-05-21 20:43:11

This post outlines the approach we took when migrating content from one site to another on a recent project. The documentation currently available for this could be much clearer, and this is a problem that others working on the web are likely to run into in the near future, as people realise just how much you can do with WordPress and move more sites to it.

As mentioned before, we had to move a fairly large site across from a content management system based around ColdFusion to a modified version of WordPress this month, making sure we didn't break any of the links amassed over the few years the site had been online.

If we were dealing with just a few urls, we could do this by hand using Apache's better known RewriteRule directives, but with more than 5500 links to preserve, this approach just isn't realistic. We need a different tool for this.

Enter RewriteMaps

Fortunately, Apache RewriteMaps are designed specifically for these situations. As you'd expect from the name, they allow you to map one url to another without needing to write the same RewriteRule snippets thousands of times.

Anything that cuts repetition reduces the chances for typos to slip in, and makes things easier to maintain in the long run, so RewriteMaps are a handy tool to add to your repertoire.

How to use a RewriteMap

I find it helps to think of using a rewrite map as a 4 step process:

  • create the rewritemap

  • define the rewritemap inside an apache conf file

  • define what conditions you want to test the map against

  • define what you want to do with the return value of the rewritemap, and redirect accordingly

Now, in more detail....

Create the rewritemap

The mapping file can be as simple as a plain text file with pairs of values, separated by at least one space, like so:

 
/content/press		        22
/content/jobs		        23
/content/contact		    24
/content/accessibility		25

The amount of whitespace between the two values isn't significant, so you can happily line things up for readability here.
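With thousands of lines in a map, it's easy to list the same old path twice by accident. Here's a minimal sketch of a sanity check for that, assuming a plain txt map like the one above. The find_duplicate_keys function and the sample entries are made up for illustration, and the sketch treats lines starting with # as comments:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every key (old path) that appears more than once
# in a txt-style map file. Blank lines and lines starting with # are skipped.
find_duplicate_keys() {
    awk 'NF && $1 !~ /^#/ { seen[$1]++ }
         END { for (k in seen) if (seen[k] > 1) print k }' "$1"
}

# Demo against a throwaway file containing one deliberate duplicate.
map_file=$(mktemp)
printf '%s\n' \
    '/content/press          22' \
    '/content/jobs           23' \
    '/content/press          99' > "$map_file"

duplicates=$(find_duplicate_keys "$map_file")
echo "duplicate keys: $duplicates"
rm -f "$map_file"
```

Running something like this before each deploy is cheap, and saves puzzling over why a url redirects to the wrong post.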

Define the RewriteMap

Now that we have a mapping file, we need to let Apache know that we'd like to use it:

 
    RewriteEngine on
 
    RewriteMap url_rewrite_map txt:/srv/html/domain.com/domain.migration.map

We've told Apache to switch on its url rewriting features (RewriteEngine on), and declared the urls in the text file as the patterns to match against, under the name url_rewrite_map.

Define what conditions you want to test the map against

Now it's time to actually use the rewritemap:

 
    RewriteCond ${url_rewrite_map:$1|NOT_FOUND} !NOT_FOUND

This condition definitely needs some unpacking.

RewriteConds in Apache are designed as tests to see if a RewriteRule should be applied, depending on whether the expression they're testing returns true. RewriteMaps work by passing a value into them, as we have with $1 (whatever the RewriteRule further down captures in its parentheses), but they also take a fallback value in case the value passed in didn't match any of the patterns in the url map file declared earlier, which should explain what the NOT_FOUND is doing inside the curly braces.

So far, we're looking up the captured pattern passed in, getting back the corresponding value if there is one, and getting back NOT_FOUND if there's nothing there.

However, we only want to apply the url rewrite if we have a match in the map, and as it stands the lookup hands us something back either way. Rewriting every request regardless would create an infinite rewriting loop, so we need to add a ! operator to check if the value is NOT_FOUND or not, and only apply the rewrite rule if the result is !NOT_FOUND.

Yes, we really did just test for not NOT_FOUND.

I'm sorry - I have a real problem with this; it's extremely confusing to read, and utterly unintuitive, as well as just being bad English.

Sadly, this seems to have become a fairly common Apache htaccess idiom, and right now, I can't think of another way to express this that works using Apache's syntax. Any other suggestions would be gratefully accepted.
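If the double negative still feels slippery, it can help to mimic the lookup outside Apache. This is only a rough sketch in shell, not how Apache implements it: map_lookup is a made-up function that returns the value for a key or the fallback you hand it, and the test underneath mirrors the !NOT_FOUND condition. The sample paths are for illustration only.

```shell
#!/usr/bin/env bash
# map_lookup FILE KEY FALLBACK
# Mimics ${mapname:KEY|FALLBACK}: print the value paired with KEY in FILE,
# or FALLBACK when KEY has no entry.
map_lookup() {
    awk -v key="$2" -v fallback="$3" '
        $1 == key { print $2; found = 1; exit }
        END { if (!found) print fallback }
    ' "$1"
}

map_file=$(mktemp)
printf '%s\n' '/content/press 22' '/content/jobs 23' > "$map_file"

hit=$(map_lookup "$map_file" /content/jobs NOT_FOUND)
miss=$(map_lookup "$map_file" /no/such/page NOT_FOUND)

# Same shape as: RewriteCond ${url_rewrite_map:$1|NOT_FOUND} !NOT_FOUND
for path in /content/jobs /no/such/page; do
    value=$(map_lookup "$map_file" "$path" NOT_FOUND)
    if [ "$value" != "NOT_FOUND" ]; then
        echo "$path -> redirect to ?p=$value"
    else
        echo "$path -> no entry, leave the request alone"
    fi
done
rm -f "$map_file"
```

Seen this way, the condition is just "only carry on if the lookup gave us a real value rather than the placeholder".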

Define what to do with the results

 
    RewriteRule ^(.*) http://domain.com/?p=${url_rewrite_map:$1} [R=301]

This rule uses a regular expression to match the request sent to us (that's the ^(.*) part), and then passes that through the url map. So a content item with a unique id of "34", and a previous path along the lines of "/press/importantpressrelease", would have its numeric id returned, giving us a rewritten url like "http://domain.com/?p=34".

In our case, the unique numeric id was the only constant we could rely on during migration from one system to another; thankfully WordPress has some handy built-in rules of its own that can convert these numeric id requests into nice friendly urls, along the lines of "http://domain.com/press/2009/06/10/important-press-release".

We're using a permanent 301 redirect flag with this RewriteRule (that's the [R=301]), so that search engine spiders know to follow the link, retaining any search engine mojo we may have had before.

Conclusion

So, using the url mapping here, we now have a sufficiently fast way to preserve the integrity of old urls, without locking ourselves into this structure for the future, and without diving too deep into the Apache directive rabbit hole.

Discovering the Walrus

If you haven't heard of the Canadian magazine The Walrus before, you could do a lot worse than read this article I found via Bruce Sterling's blog, Beyond the Beyond, as an introduction to the quality of its writing. It's one of the most refreshingly hopeful pieces about what I'm referring to (for want of a better phrase) as the Green Renaissance, where decisions by planners and policy makers to build cities around people instead of cars are leading to extraordinarily walkable, efficient and ultimately liveable cities:

Gehl believes urban public space is the lifeblood of democracy, the essence of humanism, and the sine qua non of green-minded livability. “Throughout history,” he told me, “public space had three functions: it’s been the meeting place and the marketplace and the connection space. And what has happened in most cities is that we forgot about the meeting place, we moved the market space to somewhere else, and then we filled all the streets with connection, as if connection was the number one goal in city planning, in public space.”

What he means is that we replaced public squares with parking lots, enclosed and privatized our marketplaces as shopping malls, and then turned over our streets almost exclusively to rapid transportation by private vehicle. In so doing, we enslaved ourselves to oil, choked ourselves on exhaust, and shattered into a million fragments the public realm where civil society once flourished.

Copenhagen’s great lesson for the New Grand Tourist is that the essential first step, maybe the only critical one, in reassembling these shards and building the urban foundation of the Green Enlightenment is to put people ahead of their cars and public spaces ahead of private ones in the planning priorities of the city — of any city.

In this shining, upbeat version of Europe, power comes from decentralised microgeneration in people's houses, or majestic power sources like the Grand Spires of Solùcar in Andalusia, that use focussed sunlight on steam turbines to generate electricity:

Solùcar looks like a sci-fi movie set, but it also comes off as ageless and permanent, almost obvious after a while. I was reminded of a one-liner I once heard the sustainable design guru William McDonough deliver. Whenever he meets skepticism about how far we can go with this Green Enlightenment, he said, he likes to point out that it took us 5,000 years to put wheels on our luggage. It took us only a couple of hundred to employ heat-concentrating mirrors in industrial power generation. The trail ahead is thick with low-hanging fruit. And the reason all of this is happening on the plain of Andalusia is not strictly nor even primarily because it is a very hot and sunny place; rather, it is because Spain was one of the first countries to pass a conscious imitation of Germany’s feed-in tariff.

In all, it's about as attractive a version of our current society as I can see that still revolves around economic growth as a way to fix the world's ills. I'm not convinced this is how things will pan out, and I'd like to know what the less upbeat scenarios are too, which is why I'll be heading to the Dark Mountain next weekend.

That said, I don't buy into all of the apocalyptic narratives that are associated with the Dark Mountain camp; I think there's a middle ground between them and the Green Renaissance Utopia described above, and I think I might get a better idea of what that might be after the weekend.

Maybe I'll see you there.

The ForkBomb Tattoo

This doesn't count for much, but I think this is one of the cleverest, geekiest, most elegant tattoos I've ever seen. Carving an Apple, Cisco or Nike Logo onto your body? That's really quite sad. But this is something different:

Why do I like it?

I think it's about as attractive as a code type tattoo is going to get

The proportions are well balanced, and there's a nice visual rhythm in there - this is largely down to the choice of Bitstream Sans Mono, the typeface used for this. It's a well loved font in programmer circles, among people who spend hours staring at monospace fonts - a lovely touch.

It's actually executable code

This particular incantation has its own story too - this cryptic combination of symbols is a working example of a forkbomb, and was presented as a piece of open source art back in 2002, by Denis Jaromil Rojo, an Italian Rastafarian developer and media activist now residing in Amsterdam. A forkbomb is a piece of self replicating code that, when called, starts forking itself relentlessly, until it consumes all the resources on a computer system, rendering it unusable.

For the terminally curious, here's how it works:

 
:()      # define ':' -- whenever we say ':', do this:
{        # beginning of what to do when we say ':'
    :    # load another copy of the ':' function into memory...
    |    # ...and pipe its output to...
    :    # ...another copy of ':' function, which has to be loaded into memory
         # (therefore, ':|:' simply gets two copies of ':' loaded whenever ':' is called)
    &    # run the whole pipeline in the background -- so if the first ':' is killed,
         #     all of the functions that it has started are NOT killed along with it
}        # end of what to do when we say ':'
;        # Having defined ':', we should now...
:        # ...call ':', initiating a chain-reaction: each ':' will start two more.

More on Forkbombs here.

RVM and Textmate in harmony

One side effect of Ruby's popularity is the proliferation of interpreters that can now execute Ruby code, which is generally seen as a good thing, and a sign of a healthy community. However, keeping track of all these versions of Ruby, especially when testing, gets harder as each new version is released, so to help alleviate this problem Wayne E. Seguin released Ruby Version Manager, or RVM to its friends, last year.

RVM does some clever voodoo with symbolic links and suchlike on your box to let you switch between versions of Ruby very easily on the command line, simply by typing rvm use ruby-version when you want to use a particular flavour of Ruby. So if I wanted to run MacRuby instead of the usual version of Ruby 1.8.7 that uses the MRI interpreter, I'd simply check what versions of Ruby I had installed on my box like so:

 
chrisadams@edam[/usr/local/Library]
[7:52]:rvm list
 
   jruby-1.4.0 [ [x86_64-java] ]
   macruby-nightly [ ]
   ree-1.8.7-2010.01 [ ]
=> ruby-1.8.7-tv1_8_7_249 [ x86_64 ]
=> (default) ruby-1.8.7-tv1_8_7_249 [ x86_64 ]
   system [ x86_64 i386 ppc ]

And then say which Ruby I want to use for the rest of the session I have open:

[7:52]:rvm use macruby-nightly
 
  Now using macruby nightly

And if I wanted to switch back, I'd just type

chrisadams@edam[/usr/local/Library]
[7:54]:rvm use default        
 
Now using default ruby.

This is extremely handy, except if you're using Textmate, which, by default, will happily use a version of Ruby that pays no attention to your version switching japery, making testing and development rather less fun.

Fortunately, there are now some tools to make RVM work with everyone's favourite Mac-only editor. Here's how to make the two work together easily:

First of all, tell RVM to set up your symlink:

    rvm 1.8.7 --symlink textmate

Now, you'll need to tell Textmate to use this version of Ruby instead by a) setting a shell variable, and then b) forcing Textmate to use this Ruby, by moving the Builder class that normally sets up its Ruby shell environment out of the way:

Here's where you should set your shell variable in Textmate:

Once you have that, swap out the Builder as mentioned before, then restart Textmate.

    cd /Applications/TextMate.app/Contents/SharedSupport/Support/lib/ ; mv Builder.rb Builder.rb.backup

And that should be about it. You can easily test that this worked in Textmate by opening a new file containing only the following code snippet:

    puts RUBY_DESCRIPTION

... then either saving the file with a .rb extension, or setting the syntax colouring to Ruby, then hitting command +

If it worked, you should see something like this:

If not, don't despair - there are lots of useful docs on the RVM site itself, and the irc channel, #RVM, is full of wonderfully helpful types.

Now go, armed with this knowledge, and make Textmate and RVM play nicely together again.

Enjoying BankSimple’s rhetoric

If ever there was a sector that needs to be disrupted, it would be the banking sector, so I'm quite enthused by both the rhetoric, and the calibre of the team who are amassing around BankSimple.

That doesn't happen in banking. The huge barriers to entry mean that the big guys can continue to screw you over, secure in the knowledge that no startup can compete for your business. And of course the government will bail them out if they get into trouble. Here at BankSimple, we are hoping to shake things up a little. While we sit atop existing banking systems, we've got some great technology that hides much of the underlying nastiness and lets us bank in a more real-time fashion.

Moreover, their blog is already chock-full of great content, and actual useful analysis of the business now, and how the system currently works.

I have the same feeling about them as I do about Dropbox - but sadly, I can't see the UK getting any clever financial services companies like this launching just yet (the closest would be Smile, but while their customer service is exemplary, they have the same creaking infrastructure that forces me to use wesabe and a load of other tools to make sense of my money).

Check them out - I really think when they launch properly, you'll see some proper disruption in the financial sector at last.

The phone tariff that should exist, but doesn’t

I'm coming to the end of an 18 month contract with O2, where I ended up spending £45 per month, to essentially have more minutes and texts than God at my disposal, when really, all I cared about was having a handset designed by people who understand that user experience is more important than features, and having a dataplan that let me use it.

Now O2 (I've been with them for more than 5 years, easy) haven't been terrible, but I've had calls drop on me, and where I work in Tower Bridge, reception is atrocious. The market has changed since I bought my phone, though, and I think there's a gap in it that mobile operators are either ignoring, or don't want to publicise too much.

I don't need a new handset (and when I say 'need' I'm using the first world definition of 'need' here) - nothing out now is better than the iPhone 3G to the extent that the iPhone was better than the rest of the market when I bought my first phone, and I can't see that changing anytime soon.

Now I'm going to outline a plan that I think should exist, but I can't see anything that fits this description:

  1. I don't use the phone too much for actual phone conversations (no more than say 300 minutes tops probably)
  2. I don't need visual voicemail or anything like this (Hullomail gives me everything I'd need now, and Google Voice is another alternative)
  3. I do use my phone for watching videos, using twitter, and reading my choice of content online.
  4. I'd like to use any data on my plan as data I can use when my laptop is tethered to it; I paid for it after all, right?
  5. I'm okay with data not being 'unlimited' here, because I understand how geeks are basically subsidised by lighter users anyway, but if I pay for a couple of gigs of data, _I want to be able to use it._

Generally, paying more than £40 per month feels like subsidised phone territory here; for £5 more, O2 were throwing in a whole iPhone on a previous contract 18 months ago for example, so paying that much for just bits leaves a terrible taste in my mouth.

The closest thing I see are Vodafone's sim only tariffs, but they're not clear about tethering with iPhones (point 4 above).

I can't be alone here in being an iPhone user who isn't slavishly devoted to the newest shiny from Apple, but doesn't want to pay over the odds for using my phone in a different way to how most operators expect me to, and would prefer to use what they have in smarter ways, rather than buying more gadgets and dongles that'll end up in landfill in about 18 months' time.

Is there anything else like this hypothetical tariff around? I really think there should be, and if it did exist, I'd join it in a heartbeat.

Tar is not zip

Today, I had to package up some code that was residing on a remote server that I had ssh access to, to send to someone who was probably on a Windows box.

This meant I had to use the zip tool, which I can never remember how to use, hence this memory jogging post.

Now if someone's using a Mac or is on Linux, the normal command line approach to tar up a directory to send to someone goes like this:

tar -cvzf my_tarfile.tar.gz a_directory_to_tar and_otherone_here

We're calling tar here with the create (c), verbose (v), compress (z) and file (f) flags. Think of that string like saying to the computer: "create the tarfile my_tarfile.tar.gz, verbosely, while compressing it, from a_directory_to_tar and and_otherone_here".

As commands go, the order feels a bit weird at first, but you get the hang of it eventually.
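If you'd rather convince yourself of what ends up in the tarball before sending it, here's a small sketch that builds two throwaway directories, tars them up and lists the result. The directory names match the example above, but the files inside them are invented for the demo, and I've left the v flag off to keep the output quiet:

```shell
#!/usr/bin/env bash
# Build two throwaway directories, tar them up, then list what the archive holds.
workdir=$(mktemp -d)
mkdir -p "$workdir/a_directory_to_tar" "$workdir/and_otherone_here"
echo "hello" > "$workdir/a_directory_to_tar/readme.txt"
echo "world" > "$workdir/and_otherone_here/notes.txt"

# c = create, z = gzip compress, f = write to the named file.
# -C makes tar change into $workdir first, so the stored paths stay relative.
tar -C "$workdir" -czf "$workdir/my_tarfile.tar.gz" a_directory_to_tar and_otherone_here

# t = list instead of create; note that tar walked into the directories by itself.
tar_listing=$(tar -tzf "$workdir/my_tarfile.tar.gz")
echo "$tar_listing"
rm -rf "$workdir"
```

The listing should include the files inside each directory, without any extra flag asking tar to recurse.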

Where zip is different

Using zip isn't too different really, but you need to remember to tell it to zip recursively (why is this not default behaviour?). If we call the command:

zip archive a_directory_to_tar and_otherone_here

We'll end up with a rather useless archive.zip that holds the directory entries but none of the files inside them. We need to pass in the recursive flag:

zip -r archive a_directory_to_tar and_otherone_here

There's also no need to add the .zip file extension - this is added for you as a courtesy. Well, after that ridiculous non-recursive default shenanigans, frankly that's the least it could do...
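To see the difference for yourself, here's a hedged sketch that zips the same throwaway directory with and without -r and compares the listings. It assumes the zip and unzip tools are installed (it skips itself if they aren't), and the file inside the directory is made up for the demo:

```shell
#!/usr/bin/env bash
# Compare zip with and without -r on a throwaway directory tree.
if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    workdir=$(mktemp -d)
    mkdir -p "$workdir/a_directory_to_tar"
    echo "hello" > "$workdir/a_directory_to_tar/readme.txt"

    (
        cd "$workdir" || exit 1
        zip -q flat a_directory_to_tar          # no -r: the file inside is left out
        zip -q -r recursive a_directory_to_tar  # -r: walks into the directory
    )

    # zip added the .zip extension to both archive names for us.
    flat_listing=$(unzip -l "$workdir/flat.zip")
    recursive_listing=$(unzip -l "$workdir/recursive.zip")
    echo "$flat_listing"
    echo "$recursive_listing"
    rm -rf "$workdir"
else
    echo "zip/unzip not installed; skipping the demo"
fi
```

Only the second listing should mention readme.txt, which is the whole reason for remembering -r.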

Anyway, there you have it: the differences between them, if you don't want to dive into the manpages, or look at this page on linux.about.com, just to package up a folder for somebody.

How to fix a WordPress site when the database corrupts on you.

Phew!

... and as soon as I say this, the database powering this site corrupts on me - charming!

For one awful moment I thought I had lost everything (I hadn't set up a regular mysql backup task for this blog), but thankfully, bringing this blog back from the dead was surprisingly straightforward.

Here's what happened, and what I did to fix it.

Yesterday I was writing a post about how there's a missing mobile tariff that should exist but doesn't, and when I posted, WordPress wouldn't accept my post; those pretty gif spinners just kept spinning on the text page.

I didn't think too much of it, at the time, as I was heading out to meet James to work on some top sekrit project, based around node.js, mongodb, and doing something strange yet hopefully useful with wifi, so I just saved it locally on my mac, and headed out on my bike.

This evening though, I tried again, and I had the same error. "Ah," I thought, "maybe I just need to restart the database on this box - it's been going for ages anyway". So I ssh'd into the server running the system, and called the usual CentOS restart command

  service mysql restart

This took nearly two minutes to stop, then spat out this response when restarting:

[chris@stemcaa2 ~]# service mysql restart
Shutting down MySQL........................................[  OK  ].......
Starting MySQL.........................................................................................................................................................................................................................................../sbin/service: line 66: 27651 Terminated              env -i LANG="$LANG" PATH="$PATH" TERM="$TERM" "${SERVICEDIR}/${SERVICE}" ${OPTIONS}

Okay, slight panic now - let's check the disk, to see if there's space for the database to write, with df -h

 
[chris@stemcaa2 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             9.9G  9.4G     0 100% /
tmpfs                 129M     0  129M   0% /dev/shm
/usr/tmpDSK           485M   11M  449M   3% /tmp

Ah, that's not good. Disk space is like oxygen for servers, and without it, things break quickly.
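In hindsight, this is the kind of thing worth catching before the database falls over. Here's a minimal sketch of a check that reads df-style output and prints any mount point at or above a given percentage. The full_filesystems function and the 90% threshold are my own inventions for illustration, and the demo runs against the output pasted above rather than a live disk:

```shell
#!/usr/bin/env bash
# full_filesystems THRESHOLD
# Reads `df -P` style output on stdin and prints the mount point of every
# filesystem whose Use% is at or above THRESHOLD.
full_filesystems() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)
        if (use + 0 >= limit) print $6
    }'
}

# Canned input copied from the df output in this post, so the demo is repeatable.
sample='Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             9.9G  9.4G     0 100% /
tmpfs                 129M     0  129M   0% /dev/shm
/usr/tmpDSK           485M   11M  449M   3% /tmp'

nearly_full=$(printf '%s\n' "$sample" | full_filesystems 90)
echo "at or above 90%: $nearly_full"

# Against a live system you would pipe real output in instead:
# df -P | full_filesystems 90
```

Dropped into a cron job, something along these lines would have given me a warning before MySQL ran out of room to write.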

The quickest way to make some breathing space is to clear the yum cache, like so:

yum clean headers packages

Okay, let's try restarting again:

[chris@stemcaa2 ~]# service mysql restart
Shutting down MySQL.                                       [  OK  ]
Starting MySQL.                                            [  OK  ]

Sweet! It worked! But hang on... now WordPress isn't showing any posts! This really isn't good.

So let's check the database:

[root@stemcaa2 ~]# mysql
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 11
Server version: 5.0.90-community MySQL Community Edition (GPL)
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema | 
| antia_wordpress    | 
| chris_wordpress    | 
| cphulkd            | 
| deadonim_wordpress | 
| dov_wordpress      | 
| eximstats          | 
| horde              | 
| leechprotect       | 
| modsec             | 
| mysql              | 
| roundcube          | 
+--------------------+
12 rows in set (0.16 sec)

Okay the database is there. What about the tables though?

 
mysql> use chris_wordpress;
Database changed
mysql> show tables;
+---------------------------+
| Tables_in_chris_wordpress |
+---------------------------+
| wp_ak_twitter             | 
| wp_commentmeta            | 
| wp_comments               | 
| wp_links                  | 
| wp_options                | 
| wp_postmeta               | 
| wp_posts                  | 
| wp_term_relationships     | 
| wp_term_taxonomy          | 
| wp_terms                  | 
| wp_usermeta               | 
| wp_users                  | 
+---------------------------+
12 rows in set (0.00 sec)

Hmm... they're there too. Maybe the data in the tables?

 
mysql> select * from wp_posts
    -> ;
ERROR 145 (HY000): Table './chris_wordpress/wp_posts' is marked as crashed and should be repaired

Uh-oh. I think this is going to be ugly. And yet...

The magic fix

...a quick check here on google gleaned the fix - MySQL's got a tool for just this occasion, and you call it like mysqlcheck broken_database from the command line:

[root@stemcaa2 ~]# mysqlcheck chris_wordpress
chris_wordpress.wp_ak_twitter                      OK
chris_wordpress.wp_commentmeta                     OK
chris_wordpress.wp_comments                        OK
chris_wordpress.wp_links                           OK
chris_wordpress.wp_options                         OK
chris_wordpress.wp_postmeta                        OK
chris_wordpress.wp_posts
warning  : Table is marked as crashed
error    : Size of datafile is: 745472         Should be: 745692
error    : Corrupt
chris_wordpress.wp_term_relationships              OK
chris_wordpress.wp_term_taxonomy                   OK
chris_wordpress.wp_terms                           OK
chris_wordpress.wp_usermeta                        OK
chris_wordpress.wp_users                           OK

The fix was trivial from here - just pass in an auto-repair flag:

 
[root@stemcaa2 ~]# mysqlcheck chris_wordpress --auto-repair
chris_wordpress.wp_ak_twitter                      OK
chris_wordpress.wp_commentmeta                     OK
chris_wordpress.wp_comments                        OK
chris_wordpress.wp_links                           OK
chris_wordpress.wp_options                         OK
chris_wordpress.wp_postmeta                        OK
chris_wordpress.wp_posts
warning  : Table is marked as crashed
error    : Size of datafile is: 745472         Should be: 745692
error    : Corrupt
chris_wordpress.wp_term_relationships              OK
chris_wordpress.wp_term_taxonomy                   OK
chris_wordpress.wp_terms                           OK
chris_wordpress.wp_usermeta                        OK
chris_wordpress.wp_users                           OK
 
Repairing tables
chris_wordpress.wp_posts
info     : Found block that points outside data file at 742932
status   : OK

And then all was well again! Why can't technology always be this easy to fix? A huge, huge heartfelt thanks goes to Felipe Cruz for adding such simple instructions on his site explaining how to use that insanely handy mysqlcheck command.
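On a server with several WordPress databases, the one broken table is easy to miss in a wall of OKs. As a rough sketch, this filters mysqlcheck output down to the tables that weren't reported as OK. The broken_tables function is hypothetical, and the sample is a trimmed copy of the output above, so nothing here touches a real database:

```shell
#!/usr/bin/env bash
# broken_tables: read mysqlcheck output on stdin and print every
# database.table line that does not end in OK.
broken_tables() {
    awk '/^[A-Za-z0-9_]+\.[A-Za-z0-9_]+/ && $NF != "OK" { print $1 }'
}

# Trimmed copy of the mysqlcheck output shown earlier in this post.
sample='chris_wordpress.wp_postmeta                        OK
chris_wordpress.wp_posts
warning  : Table is marked as crashed
error    : Size of datafile is: 745472         Should be: 745692
error    : Corrupt
chris_wordpress.wp_term_relationships              OK'

needs_repair=$(printf '%s\n' "$sample" | broken_tables)
echo "needs repair: $needs_repair"

# Against a live server you would pipe the real command in instead:
# mysqlcheck chris_wordpress | broken_tables
```

The warning and error lines are skipped because they don't look like a database.table name, which leaves just the table that needs attention.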

Not quite sure what I’m doing with this site

I'm currently trying to work out what this site is for at the mo, after looking at the sites of a few of my friends and work colleagues, and finding a degree of inspiration to finally get this site sorted out, so it's something I'd be prepared to actively tell people about.

Since discovering that I can happily write in markdown in WordPress, and seeing that writing is coming a bit more easily to me now, I'm going to just write without thinking too hard about what comes onto here, and in the next month, use that to inform an actual redesign.

Bear with me, please, it'll all settle down in a bit.

Making the WordPress source code something you’d want to read

If you develop with WordPress at all, you're likely to spend a fair chunk of your time wading through source code, and poring over the WordPress Codex when something isn't working the way you expected, or when you're coding new features.

If you're working with Ruby (and assuming the projects you're working on have some documentation in the first place), tools like RailsAPI or rdoc.info have made browsing source code, and getting an idea of how various classes or objects make up a project, a fairly simple process now.

Thing is, moving from there to Drupal, with its insistence on not having any useful examples in its documentation, or WordPress, where the surrounding Codex docs are great, but where the source code is less easy to browse, has always felt clunky by comparison.

Thankfully, Harry and Tom at the Dextrous Web have worked out a clever hack that makes using the docs on WordPress just as nice an experience as it is on a normal Ruby project, by passing all the source code through Doxygen, then passing that through the same all singing, all dancing, autocompletin' sdoc templates as used on RailsAPI, giving us all the slickness we've come to associate with a well written Ruby project, but on a workhorse platform like WordPress.

Here's how to get your own browsable docs like these, local to your machine for speedy offline reference.

Install Doxygen

Doxygen is a tool you can use to generate actual documentation from the comments in WordPress's existing code base. You can install it with your package manager of choice, like so (you may need to sudo install these, depending on how your system is set up):

# if you're on a mac...
port install doxygen # if you're using macports
brew install doxygen # if you're using homebrew

# or if you're using linux...
yum install doxygen # if you're using CentOS/Red Hat
apt-get install doxygen # if you're using Debian/Ubuntu

Fetch the WordPress source code.

We need to pull down this code so when we call the next step we have everything we need to create the documentation, then pass it through sdoc to create the docs.

 
git clone git://github.com/dxw/wordpressapi.git
cd wordpressapi
gem install wpdoc
git clone git://github.com/dxw/geshi.git
git clone git://github.com/dxw/wordpress.git
ln -s wordpress/wp-includes
ruby update_codex.rb

Generate your own docs

This step takes a long time (as in more than half an hour on my new black macbook), and uses a fair old chunk of CPU power too, so don't be alarmed if this seems to take longer than you're used to.

ruby doxydoc.rb

Bookmark the document index page

The docs you're after will be created in the docs/ directory in the wordpressapi project; you'll find the index page at docs/, presenting a page of all the classes that make up WordPress, with an autocomplete listing of every class, method and file used in the project, with links to the source code on github.com, locally, and where relevant, to the WordPress Codex.

Browse the source

Have a quick browse - you also have keyboard shortcuts for navigating around the code. I guarantee that the easier access to the source code will help you learn something new about using WordPress without even trying, and you'll be navigating through the guts of the WordPress source like a pro in no time.