So what’s in the job queue anyway?

Here’s what is in en.wikipedia.org’s job queue at the moment, broken down by job type:

job_cmd count(*)
htmlCacheUpdate 31,147
refreshLinks 10,106,739
renameUser 119

Note that the current system allows duplicate entries to be put in the queue; the dupes are removed when the first matching entry gets run. This makes the raw number of refreshLinks entries much higher than it “really” is: Talk:Union Station (Louisville) is listed 9 times, presumably once for each template edit that triggered an “update me!” job.
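To make the raw-versus-“real” distinction concrete, here is a minimal Python sketch that counts queue entries per job type both ways. The sample rows, the (job type, page title) layout, and the helper names `raw_counts` and `distinct_counts` are made up for illustration; this is not the actual queue code or real queue data.

```python
from collections import Counter

# Hypothetical sample of queue rows as (job_cmd, page title) pairs.
# Illustrative only; not taken from the real queue.
queue = [
    ("refreshLinks", "Talk:Union Station (Louisville)"),
    ("refreshLinks", "Talk:Union Station (Louisville)"),
    ("refreshLinks", "Talk:Union Station (Louisville)"),
    ("refreshLinks", "Main Page"),
    ("htmlCacheUpdate", "Main Page"),
]


def raw_counts(rows):
    """Count every row per job type, duplicates included."""
    return Counter(cmd for cmd, _title in rows)


def distinct_counts(rows):
    """Count each (job type, title) pair once per job type."""
    return Counter(cmd for cmd, _title in set(rows))


raw = raw_counts(queue)
distinct = distinct_counts(queue)
for cmd in sorted(raw):
    print(cmd, raw[cmd], distinct[cmd])
```

The gap between the two numbers for refreshLinks is the share of the backlog that will disappear as duplicates once the first copy of each job runs.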

Update: Figured out why the queues have been growing so big over the last few days. The system clock was 7 seconds slow on the database master, which made the replication lag detection read a minimum lag of 7 seconds on every slave. The job queue batch runners were all sitting and waiting for the lag to resolve. :)

Resynced the clock (it presumably drifted during the period when some IPs were broken), and things are moving again.
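For the curious, the clock problem can be sketched in a few lines of Python. This assumes lag is estimated by comparing the slave’s own clock with a timestamp written using the master’s clock, and that runners hold off above some threshold. The function names and the 5-second threshold are hypothetical and not taken from the real lag check.

```python
def apparent_lag(slave_now, master_event_timestamp):
    """Lag as the slave sees it: its own clock minus the master's timestamp."""
    return max(0.0, slave_now - master_event_timestamp)


def should_run_jobs(lag_seconds, max_lag=5.0):
    """Hypothetical runner policy: only work when lag is under the threshold."""
    return lag_seconds <= max_lag


true_time = 1000.0   # the "real" time, which the slave's clock reports correctly
master_skew = -7.0   # master clock is 7 seconds slow

# The master stamps an event with its slow clock; the slave applies it instantly.
event_timestamp = true_time + master_skew
lag = apparent_lag(true_time, event_timestamp)

print(lag)                   # apparent lag, although the slave is fully caught up
print(should_run_jobs(lag))  # runners keep waiting

# After resyncing the master clock, the skew is gone and so is the phantom lag.
lag_after_resync = apparent_lag(true_time, true_time)
print(should_run_jobs(lag_after_resync))
```

With a skew larger than the threshold, the apparent lag can never fall low enough for the runners to start, no matter how quickly the slaves actually replicate.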

One thought on “So what’s in the job queue anyway?”

  1. Are there any statistics about which edits cause the most queue load? We may want to protect heavily used templates to avoid putting extra load on the servers from edits that aren’t really needed.
