Wikipedia domain redirect problem

Looks like we had a problem this morning with our domain redirection configuration, which broke access to the site for at least some people for a while.

Wikimedia has a lot of non-default domains registered, which we set up as redirects to the various primary domains — for instance www.wikipedia.com redirects to www.wikipedia.org, the standard location for Wikipedia’s multilingual entry portal.

This is handled by a special Apache web server virtual host configuration that accepts connections for all the domains we don’t actually host wikis on. This virtual host has a set of mod_rewrite rules that decide which domain each request should be sent on to. It returns an HTTP redirect response to the browser, which then goes on to the correct site.
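To make the redirect decision concrete, here is a minimal Python sketch of the same idea. It is not our actual mod_rewrite configuration: the `REDIRECTS` table and the `redirect_for` function are hypothetical names made up for illustration, and only the www.wikipedia.com entry is taken from the example above.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table of non-primary domains and where they should end up.
# Only the www.wikipedia.com entry comes from the post; the bare-domain
# entry is an assumption added for illustration.
REDIRECTS = {
    "www.wikipedia.com": "www.wikipedia.org",
    "wikipedia.com": "www.wikipedia.org",
}


def redirect_for(url):
    """Return (status, location) if the host is redirected, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    target = REDIRECTS.get(host)
    if target is None:
        # Not a redirect-only domain, so no redirect should be issued.
        return None
    # Keep the scheme, path and query; swap in the primary host.
    location = urlunsplit((parts.scheme, target, parts.path or "/", parts.query, ""))
    return 301, location


print(redirect_for("http://www.wikipedia.com/wiki/Main_Page"))
# (301, 'http://www.wikipedia.org/wiki/Main_Page')
```

The important property is the `None` branch: a host that isn’t in the table must never get a redirect at all.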

For efficiency, many of these responses are declared to be cacheable (“301 Moved Permanently”), since they always send on to the same spot. This means that multiple hits to the same redirected URL will make use of our Squid proxy caching layer, reducing traffic to our backend servers.

The unfortunate thing is that if the configuration gets messed up and people are sent to the *wrong* URL, that’s also cached. This morning the redirect config file was accidentally broken during maintenance, creating redirect loops for some URLs which weren’t supposed to redirect in the first place.

To fix it, we’ve been restarting the Squid proxies and clearing their caches to ensure that all bad redirects are flushed out of the system.
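Why fixing the config alone wasn’t enough can be shown with a toy model. The sketch below is a simplified stand-in for a caching proxy, not how Squid actually works (a real cache also honors cache headers and expiry times); `RedirectCache`, `backend`, and the example.org URL are all hypothetical.

```python
class RedirectCache:
    """Toy caching proxy: remembers 301 responses, keyed by URL."""

    def __init__(self, backend):
        self.backend = backend
        self.store = {}

    def get(self, url):
        if url in self.store:
            # Cache hit: the backend is never asked again.
            return self.store[url]
        status, location = self.backend(url)
        if status == 301:
            self.store[url] = (status, location)
        return status, location

    def purge(self):
        self.store.clear()


# Broken config: this URL redirects to itself, which is a loop.
config = {"http://example.org/a": "http://example.org/a"}


def backend(url):
    target = config.get(url)
    return (301, target) if target is not None else (200, None)


cache = RedirectCache(backend)
cache.get("http://example.org/a")          # the bad 301 is now cached
config.clear()                             # the config is fixed...
stale = cache.get("http://example.org/a")  # ...but the cache still answers
cache.purge()                              # flush the cache
fresh = cache.get("http://example.org/a")  # the backend is consulted again

print(stale)  # (301, 'http://example.org/a')
print(fresh)  # (200, None)
```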

As part of our ongoing mission to create permanent fixes to known site maintenance problems, we’re moving up some improvements that were already on our list but that we hadn’t gotten to yet:

  • Proper version control for the relevant config files
  • Staging server for web server configuration changes — something we can test against in the live environment but which doesn’t pollute the primary web caches if it breaks while we’re testing it
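
One kind of check that could run before a config change reaches the live caches is a simple loop detector. This is only a sketch under the assumption that the redirects can be expressed as a host-to-host table; `find_redirect_loops` is a hypothetical helper, not part of any existing tooling.

```python
def find_redirect_loops(redirects):
    """Return, sorted, every host whose redirect chain never terminates."""
    looping = []
    for start in redirects:
        seen = set()
        host = start
        # Follow the chain until we reach a host that doesn't redirect.
        while host in redirects:
            if host in seen:
                # We came back to a host already visited: a loop.
                looping.append(start)
                break
            seen.add(host)
            host = redirects[host]
    return sorted(looping)


good = {"www.wikipedia.com": "www.wikipedia.org"}
bad = {"a.example": "b.example", "b.example": "a.example", "c.example": "a.example"}

print(find_redirect_loops(good))  # []
print(find_redirect_loops(bad))   # ['a.example', 'b.example', 'c.example']
```

A check like this only catches loops within the table itself; it doesn’t replace testing against a real staging server.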

3 thoughts on “Wikipedia domain redirect problem”

  1. Instead of redirecting certain pages, like wikipedia.com to .org, they could also be used to show a basic page telling users that they have used the wrong URL and that the correct one is .org.

    This educates the user to use the correct one in the future, and it is not as if the WMF must care about loss of page views.

  2. Walter, that wouldn’t really help for the core case:

    1) It would disrupt old or “journalist mistake” links to wikipedia.com for no good reason. We have an interstitial bug-you page for bogus links like en.wikipedia.org/Some_page but that’s because people don’t link to them already. :)

    2) It wouldn’t help with this issue — you’d still have lots of broken stuff if we got the config wrong in this way.
