Leverage your synergy

On Rob’s advice, I set up Synergy to share my keyboard and mouse between my Linux and Mac boxes at the office.

Pretty straightforward to set up (if you’re a *nix geek); I had just one nasty surprise. If you’re sharing a keyboard from a PC server to a Mac OS X client, it switches the alt and command keys for you.

That might be a cute option if you’re using a PC keyboard, where Alt and Windows keys appear in the opposite order from the Mac’s Alt/Option and Command keys. Not so cute if you’re using a Mac keyboard and want things to remain sensible.

Luckily, it’s pretty easy to switch them back. In the screen section of the config file for the Mac client, add these options:

		super = alt
		alt = super

It seems to consider ‘super’ and ‘meta’ to be almost the same, but if you say ‘meta’ here it gets confused — you get two option keys and no command key.
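
For context, here is a rough sketch of where those two lines sit in a complete screens section. The screen names `linuxbox` and `macbox` are made up for illustration; substitute whatever names your own config already uses.

```
section: screens
	linuxbox:
	macbox:
		super = alt
		alt = super
end
```

The options apply only to the screen they are indented under, so the Linux side is left alone.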

Wikimedia in Google Summer of Code

Wikimedia’s been accepted as a mentoring organization for the 2007 Google Summer of Code program.

Here’s our organization page, and I put up an initial project list on meta.

The list is semi-protected so it won’t be too vandalized ;) but additional suggestions are welcome. I’d like to ask that people who aren’t directly involved in development not add too much to the main page directly, though; last year we ended up with lots of project submissions for things that weren’t really considered high priority, so I’d like to keep the list a little more ordered this time.

We don’t know for sure how many projects we’ll get assigned, so we’ll see. :) At least Tim and I will serve as mentors for the student projects; if a couple more experienced developers would like to help out with that too that would be super.

Last year’s projects went really well up to the public demo stage but never quite got integrated into the mainline; I’m hoping that this year we can stick with projects that will be easier to slip in and take live much earlier in the process, which should help keep the students interested and the projects active.

I hate^H^H^H^Hlove you, Subclipse

Inspired by river’s addition of PostgreSQL support, I was gonna make a few quick changes to mwdumper. I figured I’d get Eclipse and the Subclipse SVN plugin set up on my 64-bit Linux workstation so I’d have a decent Java IDE to work on it in.

Well… no.

Neither with the version of Eclipse that ships with Ubuntu Feisty, nor with a fresh copy of it from eclipse.org… when I try to check out from SVN, and get to the final stage, it just… stops. No error message, no explanation. Just the wizard’s done and I’ve got no project.

I… hate… computers.

Update: Mark Phippard explained the secret in a comment — you can check out a project from the SVN Repository browse view, and it works! Thanks, Mark!

Fun with mb_strlen

I noticed the fallback implementation for mb_strlen() that we had in GlobalFunctions.php sucked:

	function mb_strlen( $str, $enc = "" ) {
		preg_match_all( '/./us', $str, $matches );
		return count($matches);
	}

There are two things to note about this code:

  1. It doesn’t actually work: count() is applied to the outer $matches array, which holds a single element (the list of full matches), so it always returns 1
  2. Even if you fix it to return the matches, it’s extremely slow and will eat lots of memory by creating a giant array of every character in the (potentially quite long) string

I’m replacing this with a new version which uses PHP’s count_chars() function to count up the ASCII-compatible bytes and multibyte sequence head bytes. It’s still a smidge slower than mb_strlen but it’s… much better than the old one.

	/**
	 * Fallback implementation of mb_strlen, hardcoded to UTF-8.
	 * @param string $str
	 * @param string $enc optional encoding; ignored
	 * @return int
	 */
	function new_mb_strlen( $str, $enc="" ) {
		$counts = count_chars( $str );
		$total = 0;

		// Count ASCII bytes
		for( $i = 0; $i < 0x80; $i++ ) {
			$total += $counts[$i];
		}

		// Count multibyte sequence heads
		for( $i = 0xc0; $i < 0xff; $i++ ) {
			$total += $counts[$i];
		}
		return $total;
	}
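
The counting rule works because, in well-formed UTF-8, every character contributes exactly one byte that is not a continuation byte (0x80 to 0xBF). Here is a minimal sketch that checks the same rule one byte at a time. The helper name `count_non_continuation_bytes` and the sample strings are my own, purely for illustration, and it assumes valid UTF-8 input.

```php
<?php
// Illustrative reference: walk the string one byte at a time and count
// every byte that is NOT a UTF-8 continuation byte (binary 10xxxxxx).
// Slow, but easy to check by eye. Assumes well-formed UTF-8.
function count_non_continuation_bytes( $str ) {
	$total = 0;
	$len = strlen( $str );
	for ( $i = 0; $i < $len; $i++ ) {
		if ( ( ord( $str[$i] ) & 0xc0 ) != 0x80 ) {
			$total++;
		}
	}
	return $total;
}

// Sample strings written as raw bytes so the byte counts are visible.
$samples = array(
	'ASCII'    => "hello",                    // 5 bytes, 5 characters
	'Latin'    => "Stra\xc3\x9fe",            // "Straße": 7 bytes, 6 characters
	'Cyrillic' => "\xd0\x9c\xd0\xb8\xd1\x80", // "Мир": 6 bytes, 3 characters
	'CJK'      => "\xe6\x9d\xb1\xe4\xba\xac", // "東京": 6 bytes, 2 characters
);
foreach ( $samples as $label => $text ) {
	printf( "%-10s %d bytes, %d chars\n", $label, strlen( $text ),
		count_non_continuation_bytes( $text ) );
}
```

The count_chars() version should give the same answers on valid input; it just gets there by tallying byte values in bulk instead of looping over the string in PHP.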

Some quick benchmarks using the UTF-8 normalization benchmark pages (code):

Testing washington.txt:
              strlen      31526 chars    0.007ms
           mb_strlen      31526 chars    0.114ms
       old_mb_strlen      31526 chars 4813.686ms
       new_mb_strlen      31526 chars    0.132ms

Testing berlin.txt:
              strlen      36320 chars    0.001ms
           mb_strlen      35899 chars    0.129ms
       old_mb_strlen      35899 chars 6328.748ms
       new_mb_strlen      35899 chars    0.127ms

Testing bulgakov.txt:
              strlen      36849 chars    0.001ms
           mb_strlen      20418 chars    0.076ms
       old_mb_strlen      20418 chars 3003.042ms
       new_mb_strlen      20418 chars    0.133ms

Testing tokyo.txt:
              strlen      36244 chars    0.001ms
           mb_strlen      19936 chars    0.071ms
       old_mb_strlen      19936 chars 2623.109ms
       new_mb_strlen      19936 chars    0.131ms

Testing young.txt:
              strlen      36694 chars    0.001ms
           mb_strlen      16676 chars    0.063ms
       old_mb_strlen      16676 chars 2246.179ms
       new_mb_strlen      16676 chars    0.125ms
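
If you want to try this sort of comparison yourself, a timing loop along these lines should do. This is only a sketch and not the script behind the numbers above: the `time_call` helper, the iteration count, and the generated sample text are all assumptions of mine.

```php
<?php
// Hypothetical timing helper: a sketch, not the original benchmark script.
// Runs $func on $text $iterations times (assumed to be at least 1) and
// returns array( last result, average milliseconds per call ).
function time_call( $func, $text, $iterations = 10 ) {
	$result = null;
	$start = microtime( true );
	for ( $i = 0; $i < $iterations; $i++ ) {
		$result = call_user_func( $func, $text );
	}
	$elapsedMs = ( microtime( true ) - $start ) * 1000 / $iterations;
	return array( $result, $elapsedMs );
}

// Made-up sample text; the real tests read files such as washington.txt.
$text = str_repeat( "Stra\xc3\x9fe ", 5000 );

// mb_strlen is only timed when the mbstring extension is loaded; called
// with one argument it uses PHP's configured internal encoding.
$candidates = array( 'strlen' );
if ( function_exists( 'mb_strlen' ) ) {
	$candidates[] = 'mb_strlen';
}
foreach ( $candidates as $func ) {
	list( $chars, $ms ) = time_call( $func, $text );
	printf( "%20s %10d chars %10.3fms\n", $func, $chars, $ms );
}
```

Averaging over several iterations smooths out timer jitter a little, which matters when a single call takes only a fraction of a millisecond.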

Font fun in Gimp on Mac

While whipping up a theme for the nascent Planet Wikimedia, I needed to use the standard font for Wikimedia logos, Gill Sans.

Gill Sans seems to come conveniently preinstalled on Macs, so I opened up my MacBook, symlinked the Mac fonts from /Library/Fonts to /usr/lib/x11/fonts/TTF, and whipped out Gimp… Unfortunately I ran into an old problem I’d forgotten about:

Fonts o doom.png

For some fonts, only the bold-italic version seems to come up in the font list. This time I decided to get to the bottom of it. Poking at the font files, I found that Gill Sans comes as a single .dfont file instead of the more traditional bundle of .ttfs. While Gimp/Freetype was happy to read the file, it appears to only pick up one of the style variants — in this case bold italic.

Some googling turned up this page which included a hint that you can extract .ttf files out of a .dfont using the utility fondu. Conveniently this is available as a fink package, so in a couple minutes I was able to replace the .dfont symlinks in /usr/lib/x11/fonts/TTF with separated .ttf files. Restart Gimp, and presto!

Fonts o fun.png

Virtualization on Mac

We’ve got some new machines in for the Wikimedia staff, and among them a shiny Core 2 Duo iMac has found its way to my desk as my in-office development workstation. Yum!

Doing web development, I need to have access to a number of operating systems for testing purposes: Linux servers, Windows clients, Windows servers, Linux clients, Mac clients, and the occasional other oddity.

In theory, at least, an Intel-based Mac should be the ideal environment to run this: test the Mac clients on the main OS, and everything else running in virtualization at full speed. The new Core 2 Duo boxes are further capable of running both i386 and x86_64 guest OSs, for full coverage.

With this in mind I’ve fiddled around for a while with the main desktop-level virtualization packages on the Mac to get a feel for what’s available… unfortunately the field isn’t very thick.

Basically there’s Parallels and the beta of VMware Fusion. There are also some QEMU-based packages, but last I tried them they were very unsatisfactory, both slow and unstable.

Parallels

The good:

  • A real shipping product!
  • Relatively inexpensive
  • Good Windows integration (drivers, keyboard and mouse handling, filesystem integration)

The bad:

  • No 64-bit guest support — 32-bit only
  • Guests can only use one CPU core
  • No guest tools support for Linux; GUI desktops are slow and awkward to use
  • No snapshots

Snapshotting is the ability to save the state of the virtual machine, run it further, then return to the saved state. You can use this to roll back installation of experimental software, for instance. *Very* useful when developing and testing software, for obvious reasons.

Since this has been part of VMware Workstation for some time, I had hoped to find it also in…

VMware Fusion

The good:

  • Based on the mature VMware engine
  • Portability of VMs to and from VMware Workstation and Player on other platforms
  • 64-bit guest support
  • Dual-processor guest support
  • Guest tools & drivers for Linux and some other Unix clients as well as Windows
  • Limited support for snapshots

The bad:

  • Still in beta; there’s no shipping product and you can expect problems.
  • Last I checked, networking was horribly broken, but that may be better on beta 2 (need to try it more)
  • As of beta 2 only allows a single snapshot per VM

That single snapshot limitation is *horrible* from my perspective; it’s totally arbitrary and wrecks much of the usefulness of it.

An example of one of my prime uses for snapshots on VMware Workstation was maintaining a single copy of Windows XP in both IE 6 and IE 7 states; I could switch back and forth between them at will, while still using the snapshotting for more local changes. That’s something I couldn’t do with only a single snapshot available; I’d have to install two separate copies, which would imply a second license. And then I’d still be stuck with only a single snapshot for all my debugging uses!

The quick fix

For now I’ve wiped the disk and installed Ubuntu Linux, so I can run VMware Workstation for Linux. I’ve got the full range of snapshotting features available, and can still use my laptop for Mac client testing and all the other happy shiny Mac OS X goodness.

Of course there were some installation issues… ;)

iMac vs Ubuntu

  • Distorted screen at native resolution with VESA video driver (proprietary ATI driver works once fiddled with a bit)
  • Installation fails on setup of GRUB bootloader with a Boot Camp dual-boot configuration; you have to wipe the disk and install a DOS partition map
  • Sound doesn’t work
  • Doesn’t seem to wake from Suspend
  • … and probably others ;)

Hopefully Parallels will catch up or they’ll get proper snapshotting into Fusion and I can someday reinstall Tiger (or perhaps Leopard by then), but in the meantime it looks pretty rockin’ on my desk and VMware actually works!

Cardee’s Jr

I grew up in California and have lived pretty much all my life there. The Carl’s Jr. burger chain, itself born in Southern California, has always been a cultural fixture for me in the western United States. Knowing that they didn’t have Carl’s back east was one of those sad little things about moving to Florida that I was going to have to get used to.

Driving down I-10 in the middle of the South somewhere, though, I started seeing a familiar logo on the Gas-Food-Lodging signs. Perhaps fatigue from the long drive had made me hallucinate? But no, I got a good up-close look at one next to a gas station in northern Florida:

Oh look, a Carl'... wha?

Even the web sites are the same… Carl’s vs Hardee’s

Well, a quick peek through the company history pages indicates that Carl Karcher Enterprises bought out the Hardee’s chain in the 1990s, and started rebranding it in the 2000s with the “new look” (apparently the Carl’s logo) and new menu items. It’s kind of… creepy nonetheless. I guess they didn’t want to lose local brand recognition by changing the name, but adopting a different logo seems kind of weird.

Combining the maps from the store locators, it seems there’s no territorial overlap between the Carl’s and Hardee’s chains:

Carls and Hardees maps combined

Only Oklahoma has both; there’s one Hardee’s way out East, but it’s miles away from the nearest Carl’s. Both chains are missing in the Northeast, which is probably why I haven’t stumbled on any Hardee’s back east before…

Looks like there is a Hardee’s in town here in St. Pete, I’ll have to track it down and see if the inside is as eerily familiar as the outside.