Gotta wonder who greenlighted a skin lotion commercial to run during Silence of the Lambs…
Month: September 2007
GNU Mailman kinda sucks, but like democracy I’ve yet to come across something better. ;)
A few things I’d really like to see improved:
- Web archive links in footer
I find I fairly commonly read something on a list and then want to discuss it with other folks in chat. To point them at the same message I was reading, I have to pull up the web archives, then poke around to find it, then copy the link.
In an ideal world, the message footer could include a link to the same message on the web archives, and I could just copy and paste.
- Thread moderation
It should be easy to place an out-of-control thread on moderation. I’ll be honest, I can’t figure out how to do it right now. There’s _spam_ filtering, but we discard that. There’s whole-list moderation. There’s per-user moderation. But how do you moderate a particular thread?
In some cases, a simple time-delay throttle can help calm things down without actually forcing a moderator to sit there and approve messages. It can feel “fairer” too, since you’re not singling out That One Guy Who Keeps Posting In That One Thread.
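Here's a rough sketch of what a per-thread time-delay throttle could look like, in Python. None of this is Mailman's actual API: `ThreadThrottle` and `release_time` are made-up names, and the thread id is assumed to be something like the root Message-ID pulled from the References header.

```python
from datetime import datetime, timedelta


class ThreadThrottle:
    """Space out deliveries so each thread gets at most one post per min_gap."""

    def __init__(self, min_gap):
        self.min_gap = min_gap
        self._next_slot = {}  # thread id -> earliest time the next post may go out

    def release_time(self, thread_id, arrival):
        """Return when a post that arrived at `arrival` should be delivered."""
        slot = max(arrival, self._next_slot.get(thread_id, arrival))
        self._next_slot[thread_id] = slot + self.min_gap
        return slot


throttle = ThreadThrottle(min_gap=timedelta(minutes=10))
noon = datetime(2007, 9, 1, 12, 0)
for minutes in (0, 1, 2):
    # Three rapid-fire replies go out at 12:00, 12:10 and 12:20.
    print(throttle.release_time("hypothetical-thread-id", noon + timedelta(minutes=minutes)))
```

Posts in other threads are not delayed at all, and nobody has to approve anything by hand; the hot thread just slows down for everyone equally.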
- Easy archive excision
On public mailing lists, sometimes people post private information accidentally (phone numbers in the signature, private follow-up accidentally sent to list, etc) which they then ask to be removed from the archives. People can understandably get a bit worked up over privacy issues, particularly when the Google-juice of a wikimedia.org domain bumps the message to the first Google hit for their name. ;)
Unfortunately it’s a huge pain in the ass to excise a message from the archives in Mailman. You have to shut down the mailing list service, carefully edit a multi-megabyte text file holding all of that month’s messages without disturbing the message numbering, rebuild the HTML archive pages from the raw files, and then, finally, restart the Mailman service. That’s a lot of manual work and some service outage, balanced against having people scream at you, arguably justifiably. For now we’ve ended up simply disabling crawler access to the archives to keep them out of high-ranked global search indexes. (I know you disagree with this, Timwi. It’s okay, we still love you!)
If we could strike a message from the archives in a one-touch operation, the way we can unsubscribe someone who can’t figure out the unsubscribe directions, we could switch those crawlers back on and make it easier to search the list.
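The fiddly "edit the raw file without disturbing the numbering" step could in principle be scripted. The sketch below assumes the month's raw archive is an ordinary mbox file that Python's standard `mailbox` module can read, and that the list service is already stopped, as in the manual procedure. `redact_message` is a hypothetical helper, not part of Mailman. It only handles simple plain-text messages, and rebuilding the HTML pages afterwards is still a separate step.

```python
import mailbox

PLACEHOLDER = "[Message removed from the archive at the poster's request.]\n"


def redact_message(mbox_path, message_id):
    """Blank the body of every message whose Message-ID matches.

    The message is overwritten in place rather than deleted, so the number
    and order of messages in the file stay the same. Returns how many
    messages were redacted.
    """
    box = mailbox.mbox(mbox_path)
    redacted = 0
    try:
        for key, msg in box.items():  # items() is a list, safe to modify the box
            if (msg.get("Message-ID") or "").strip() == message_id:
                msg.set_payload(PLACEHOLDER)  # headers are kept, body replaced
                box[key] = msg
                redacted += 1
        box.flush()  # rewrite the file on disk
    finally:
        box.close()
    return redacted
```

Keeping the headers and replacing only the body is a design choice for the sketch; if the private information sits in a header such as the subject line, that would need blanking too.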
- Integrated archive search
We’ve experimented with the htdig integration patch, but the search results are not terribly good and the indexing performance is too slow on our large archives. Even if we get Google etc going again, it’d be nice to have an integrated search that’s a little more polished.
Did some upgrades on my girlfriend’s Windows PC today… The techs who originally set up her computer gave her an unconscionably small C: drive, a tiny 10 gig slice of an already-modest 40 gig drive. Even with careful discipline about putting things on the D: partition, 10 gigs doesn’t go very far. Shared DLL installs, gobs of temporary files, cached updaters for all manner of software, etc. all fill it up, and it was constantly running out of room.
My secret weapon to fix this was to be an Ubuntu Linux live CD, which conveniently comes with GParted.
I took a 200 gig drive left over from my dear departed Linux box and hooked it up, figuring I could back up the old data over the network, overwrite it with a raw disk image from the 40 gig drive, and then resize the NTFS partitions to a livable size.
Well, sort of. :)
It turns out I could have saved myself some trouble at the command line by copying the partitions across drives with GParted itself instead of goin’ at it all old-school with dd. (Neat!)
I had two sticking points, though.
First, it didn’t seem to let me move the extended (D:) partition to a different place on the drive. That meant there was no room to expand the C: partition, which was the point of the exercise.
I ended up having to create a copy of the D: partition, which it let me put in the middle of the drive, and then delete the old partitions. Kind of roundabout, and it changed the partition type from extended to primary, but Windows doesn’t seem to care about that so keep those fingers crossed…
My second snag was due to Ubuntu’s user-friendliness. As soon as the new partition was created, the system mounted it — which caused the NTFS cloning process to abort, warning that it can’t work on a mounted filesystem.
Had to go into the system settings and disable automatic mounting of removable media… luckily that’s easy to find in the menus. If you know it’s going to be there, at least. :)
Distributed time zones
Working with a distributed team, such as Wikimedia’s tech team, has its advantages and disadvantages. One irksome yet useful aspect is the different time zones people live in.
In the early days, our time zone distribution looked roughly like this:
With Tim in Australia, Mark and others in Europe, and me in California, our timezones were nearly evenly spaced. If we all worked the same hours (local 9-to-5s for June are marked above), we’d almost never be online at the same time. Of course we all worked irregular hours, so there tended to be some overlap.
For most of 2007 though we’ve had something more like this:
Tim moved to England, I moved to Florida, and suddenly our time zones are much more compressed, with a much larger overlap.
On the one hand this is nice — we have more “face time” for real-time interaction in the chat channels.
On the other hand this leaves a big portion in the day when none of the core tech team is “on duty”, which reduces our ability to respond quickly to crises. Luckily we’ve had a lot fewer problems this year since we’ve gotten a lot of old problems fixed up and our hardware capacity has generally stayed at or ahead of the growth curve.
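The trade-off between overlap and coverage can be checked with a bit of arithmetic. This is only a sketch under assumed June offsets from UTC: +10 for eastern Australia, +2 for central Europe, +1 for England, -4 for Florida and -7 for California. It also pretends everyone works a strict local 9-to-5, which nobody actually did.

```python
def utc_work_hours(utc_offset, start=9, end=17):
    """UTC hours (0-23) covered by a local start-to-end workday."""
    return {(hour - utc_offset) % 24 for hour in range(start, end)}


def coverage(team):
    """Return (hours everyone shares, hours nobody covers) for a team."""
    windows = [utc_work_hours(offset) for offset in team.values()]
    shared = set.intersection(*windows)
    uncovered = set(range(24)) - set.union(*windows)
    return len(shared), len(uncovered)


# Assumed June offsets from UTC, in hours.
early_days = {"Australia": 10, "Europe": 2, "California": -7}
in_2007 = {"England": 1, "Europe": 2, "Florida": -4}

print(coverage(early_days))  # (0, 1): no shared hours, almost no gap
print(coverage(in_2007))     # (2, 10): some shared time, a ten-hour gap
```

Under those assumptions the spread-out team never has all three people working at once but leaves only one hour of the day unattended, while the compressed team shares two hours and leaves ten uncovered.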
For 2008 it looks like we’ll be going back to a more spread out team:
Tim’s moving back to Australia, and I’ll be heading back to California when the Wikimedia Foundation sets up its new offices in the San Francisco Bay Area. We’ll also have Rob still active with the servers in Tampa, filling in some holes in coverage in the middle.
There’s some concern that this’ll reduce our ability to work directly with each other by IRC, but that’s not necessarily a bad thing. Relying too much on chat introduces problems of its own:
- Those who aren’t available online constantly get marginalized…
When important decisions are made in chat, you don’t get to participate if you dare to sleep, have a day job, go to class, have a life… :)
- Records are poorer compared with a mailing list or wiki — not only did you miss the boat, you don’t get to see what the boat looked like. You may not even know there was a boat…
We try to combat this by keeping a detailed server admin log and announcing details of big outages or updates on the lists.
Putting more emphasis on mailing list and wiki communication could make it easier to embrace new developers who can’t all be online at the same time… and paying more attention to our own wikis might help with dogfooding. ;)
Updated: Corrected Melbourne to Sydney in 2008 time zone map.
So you wanna be a MediaWiki coder?
Some easy bugs to cut your teeth on…
- Bug 1600 – clean up accidental == header markup == in new sections. (Note — there’s an unrelated patch which got posted on this bug by mistake ages ago, just ignore it. :)
- Bug 11389 – current diff views probably should clear watchlist update notifications generally, as they do for talk page notifications.
- Bug 11380 – the ‘Go’ search shortcut needs some namespace option lovin’…
Or maybe you’d prefer to clean up an old patch and get it ready to go?
Bug 900 – Fix category column spacing. Since letter headers take up more space than individual lines, we get oddly balanced columns if some letters are better represented than others…
Age: 2 years, 7 months. Ouch! :)
Patch status: Applied with only minor cleanup, this function hasn’t changed much! There seems to be something wrong with the algorithm here; while it seems to balance a bit better, I see items dropped off the end of the list sometimes. Needs more work.
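To make the balancing problem concrete, here is a small Python sketch of one way to split a sorted list into columns by rendered height instead of item count. It is not the patch from the bug and not MediaWiki code (which is PHP): `balance_columns` and the `header_weight` of two extra lines per letter header are assumptions made up for the illustration. Every title is appended to exactly one column, so nothing can fall off the end of the list.

```python
def balance_columns(titles, columns=3, header_weight=2):
    """Split titles into columns of roughly equal height.

    Each title costs one line; the first title under a new letter also
    pays `header_weight` lines for the letter header.
    """
    titles = sorted(titles, key=str.casefold)

    def cost(i):
        new_letter = i == 0 or titles[i][:1].upper() != titles[i - 1][:1].upper()
        return 1 + (header_weight if new_letter else 0)

    total = sum(cost(i) for i in range(len(titles)))
    target = total / columns  # ideal height of one column

    result = [[] for _ in range(columns)]
    col, height = 0, 0.0
    for i, title in enumerate(titles):
        c = cost(i)
        # Start the next column once this entry's midpoint would fall past
        # the cumulative boundary for the current column.
        if col < columns - 1 and result[col] and height + c / 2 > target * (col + 1):
            col += 1
        result[col].append(title)
        height += c
    return result


pages = ["A1", "A2", "A3", "A4", "A5", "A6", "B1", "C1", "D1"]
print(balance_columns(pages))
# [['A1', 'A2', 'A3', 'A4'], ['A5', 'A6', 'B1'], ['C1', 'D1']]
```

A plain three-per-column split of the same list would put B, C and D (three headers) in the last column, making it almost twice as tall as the others. The sketch ignores the extra "continued" header a real category page shows when a letter spills into the next column.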
Bug 1433 – HTML meta info for interlanguage links.
Age: 2 years, 7 months.
Patch status: Applied after minor changes, but doesn’t seem compatible. Provided an alternate version which seems to work with SeaMonkey and Lynx. Is this an appropriate thing, and how do we i18nize the link text?
Gave a little talk on Wikipedia’s scalability architecture at GatorJUG in Gainesville the other day… I should probably blog these things before they happen, right?
Will also be at OrlandoJUG in a couple weeks; Thursday, September 27. Whee!
Some more fiddling with Motion: