XMPP output fix for StatusNet

Yay, it’s big commit time!

I’ve landed my XMPP output queuing work in 0.9.x, running on my public test site.  Do let me know if XMPP subscription or i/o is flaky on it!

Should be ready to merge to testing & master and deploy when we’re content to do so… there is a database change necessary for the DB-based queueing system, so I want to confirm that’s not a problem before pushing out.

Big thanks to Craig Andrews for his work on generalizing the DB queues, which I’ve started integrating; we should be able to land the rest, including the IM pluginization, for 1.0.

Commit summary:

XMPP queued output & initial retooling of DB queue manager to support non-Notice objects.

Queue handlers for XMPP individual & firehose output now send their XML stanzas to another output queue instead of connecting directly to the chat server. This lets us have as many general processing threads as we need, while all actual XMPP input and output go through a single daemon with a single connection open.

This avoids problems with multiple connected resources:

  • multiple windows shown in some chat clients (psi, gajim, kopete)
  • extra load on server
  • incoming message delivery forwarding issues
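
As a rough illustration of that single-writer pattern (not the actual StatusNet code, which is PHP), here is a minimal Python sketch. The names FakeConnection, handler_worker and output_daemon are made up for the example, and an in-process queue stands in for the real output queue. Several general workers build stanzas and enqueue them, while one daemon thread is the only thing that touches the connection.

```python
import queue
import threading
from xml.sax.saxutils import escape


class FakeConnection:
    """Hypothetical stand-in for the one open XMPP connection."""

    def __init__(self):
        self.sent = []

    def send(self, stanza):
        # A real daemon would write to the socket here.
        self.sent.append(stanza)


def handler_worker(texts, out_queue):
    """General processing thread: builds stanzas, never opens a connection."""
    for text in texts:
        out_queue.put("<message><body>%s</body></message>" % escape(text))


def output_daemon(out_queue, conn):
    """The only consumer: drains the output queue over a single connection."""
    while True:
        stanza = out_queue.get()
        if stanza is None:  # sentinel value asks the daemon to stop
            break
        conn.send(stanza)


def run_demo():
    out_queue = queue.Queue()
    conn = FakeConnection()
    daemon = threading.Thread(target=output_daemon, args=(out_queue, conn))
    daemon.start()
    workers = [
        threading.Thread(target=handler_worker, args=(["a", "b"], out_queue)),
        threading.Thread(target=handler_worker, args=(["c & d"], out_queue)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    out_queue.put(None)  # all producers are done, so stop the daemon
    daemon.join()
    return conn.sent
```

However many workers are started, the chat server only ever sees the one connection owned by the output daemon.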

Database changes:

  • queue_item drops ‘notice_id’ in favor of a ‘frame’ blob. This is based on Craig Andrews’ work branch to generalize queues to take any object, but conservatively leaving out the serialization for now. The table updater, which preserves any existing queued items, is in db/rc3to09.sql

Code changes to watch out for:

  • Queue handlers should now define a handle() method instead of handle_notice()
  • QueueDaemon and XmppDaemon now share common i/o (IoMaster) and respawning thread management (RespawningDaemon) infrastructure.
  • The polling XmppConfirmManager has been dropped, as the message is queued directly when saving IM settings.
  • Enable $config['queue']['debug_memory'] to output current memory usage at each run through the event loop to watch for memory leaks
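
To make the handle() change concrete, here is a small language-neutral sketch in Python (StatusNet itself is PHP, so this is not the real interface). The class bodies, the "xmppout" transport name and the dispatch() method are assumptions for illustration. The point is only that a handler receives whatever item was queued, such as an XML stanza string, rather than always a Notice.

```python
class QueueHandler:
    """Hypothetical base class with one generic entry point per queued item."""

    def transport(self):
        raise NotImplementedError

    def handle(self, item):
        """Return True when the item is done, False to ask for a retry."""
        raise NotImplementedError


class XmppOutHandler(QueueHandler):
    """Illustrative handler whose items are stanza strings, not notices."""

    def __init__(self):
        self.delivered = []

    def transport(self):
        return "xmppout"  # assumed queue name, for the example only

    def handle(self, item):
        if not isinstance(item, str) or not item:
            return False
        self.delivered.append(item)
        return True


class QueueManager:
    """Toy manager that routes each item to the handler for its transport."""

    def __init__(self):
        self.handlers = {}

    def connect(self, handler):
        self.handlers[handler.transport()] = handler

    def dispatch(self, transport, item):
        handler = self.handlers.get(transport)
        if handler is None:
            return False
        return handler.handle(item)
```

A handler written this way does not care whether the item came from the database-backed queue or some other backend.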

To do:

  • Adapt XMPP i/o to component connection mode for multi-site support.
  • XMPP input can also be broken out to a queue, which would allow the actual notice save etc to be handled by general queue threads.
  • Make sure there are no problems with simply pushing serialized Notice objects to queues.
  • Find a way to improve interactive performance of the database-backed queue handler; polling is pretty painful to XMPP.
  • Possibly redo the way QueueHandlers are injected into a QueueManager. The grouping used to split out the XMPP output queue is a bit awkward.

StatusNet queue refactoring landed

Woohoo! After a couple months off and on adjusting the architecture to something that seems to meet our needs, I’ve merged my refactoring of StatusNet’s background queue processing to 0.9.x.

Some of my design notes are up on the wiki, with a couple updates based on tweaks I made from my original plans.

Key items from the commit summary:

Major refactoring of queue handlers to support running multiple sites in one daemon.

Key changes:

  • Initialization code moved from common.php to StatusNet class; can now switch configurations during runtime.
  • As a consequence, configuration files must now be idempotent… Be careful with constant, function or class definitions.
  • Control structure for daemons/QueueManager/QueueHandler has been refactored; the run loop is now managed by IoMaster, run via scripts/queuedaemon.php. IoManager subclasses are woken to handle socket input or polling, and may cover multiple sites.
  • Plugins can implement notice queue handlers more easily by registering a QueueHandler class; no more need to add a daemon.
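
The IoMaster/IoManager split can be pictured with a short Python sketch. This is an illustration under assumptions, not the real PHP classes: the method names get_sockets, handle_input and poll, and the two example managers, are invented here. One master loop waits on every manager's sockets, wakes the owner of any readable socket, then gives each manager a chance to poll.

```python
import select
import socket


class IoManager:
    """Hypothetical base: a manager may expose sockets, poll, or both."""

    def get_sockets(self):
        return []

    def handle_input(self, sock):
        pass

    def poll(self):
        pass


class SocketManager(IoManager):
    """Example manager that is woken when its socket has data."""

    def __init__(self, sock):
        self.sock = sock
        self.received = []

    def get_sockets(self):
        return [self.sock]

    def handle_input(self, sock):
        self.received.append(sock.recv(1024))


class PollingManager(IoManager):
    """Example manager with no sockets; it only counts polls."""

    def __init__(self):
        self.polls = 0

    def poll(self):
        self.polls += 1


class IoMaster:
    def __init__(self, managers, timeout=0.1):
        self.managers = managers
        self.timeout = timeout

    def run_once(self):
        # Map every socket back to the manager that owns it.
        owners = {}
        for manager in self.managers:
            for sock in manager.get_sockets():
                owners[sock] = manager
        readable = []
        if owners:
            readable, _, _ = select.select(list(owners), [], [], self.timeout)
        for sock in readable:
            owners[sock].handle_input(sock)
        # After socket input, every manager gets a polling turn.
        for manager in self.managers:
            manager.poll()


def run_demo():
    a, b = socket.socketpair()
    sock_mgr = SocketManager(b)
    poll_mgr = PollingManager()
    master = IoMaster([sock_mgr, poll_mgr])
    a.sendall(b"ping")
    master.run_once()  # socket is readable, so SocketManager is woken
    master.run_once()  # nothing to read; only polling happens
    a.close()
    b.close()
    return sock_mgr.received, poll_mgr.polls
```

Because the loop lives in one place, adding another kind of background work means adding a manager rather than another daemon script.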

The new QueueDaemon runs from scripts/queuedaemon.php:

  • This replaces most of the old *handler.php scripts; they’ve been refactored to the bare handler classes.
  • Spawns multiple child processes to spread load; defaults to CPU count on Linux and Mac OS X systems, or override with --threads=N
  • When multithreaded, child processes are automatically respawned on failure.
  • Threads gracefully shut down and restart when passing a soft memory limit (defaults to 90% of memory_limit), limiting damage from memory leaks.
  • Support for UDP-based monitoring: http://www.gitorious.org/snqmon
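
The soft memory limit check is simple arithmetic, sketched below in Python as an illustration only (the daemon itself is PHP, and these function names are made up). It assumes the limit is given in PHP's shorthand form such as "128M", with "-1" meaning no limit, and uses the 90% default mentioned above.

```python
def parse_memory_limit(value):
    """Parse a PHP-style shorthand size ('128M', '1G', '65536') into bytes.

    Returns None for '-1', which PHP uses to mean unlimited.
    """
    value = value.strip()
    if value == "-1":
        return None
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = value[-1].upper()
    if suffix in units:
        return int(value[:-1]) * units[suffix]
    return int(value)


def soft_limit(memory_limit, percent=90):
    """Soft limit in bytes, as a percentage of the hard limit."""
    hard = parse_memory_limit(memory_limit)
    if hard is None:
        return None
    return hard * percent // 100


def should_restart(current_usage, memory_limit, percent=90):
    """True when a child has passed the soft limit and should shut down."""
    limit = soft_limit(memory_limit, percent)
    return limit is not None and current_usage > limit
```

With a 128M hard limit, a child would be asked to shut down and be respawned once its usage passes roughly 115 MiB, well before PHP would kill it outright.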

Rough control flow diagram:

QueueDaemon -> IoMaster -> IoManager
                           QueueManager [listen or poll] -> QueueHandler
                           XmppManager [ping & keepalive]
                           XmppConfirmManager [poll updates]

Todo:

  • Respawning features are not currently available when running single-threaded.
  • When running single-site, configuration changes aren’t picked up.
  • New sites or config changes affecting queue subscriptions are not yet handled without a daemon restart.
  • SNMP monitoring output to integrate with general tools (nagios, ganglia)
  • Convert XMPP confirmation message sends to use stomp queue instead of polling
  • Convert xmppdaemon.php to IoManager?
  • Convert Twitter status, friends import polling daemons to IoManager
  • Clean up some error reporting and failure modes
  • May need to adjust queue priorities for best perf in backlog/flood cases

Detailed code history available in my daemon-work branch.