Parallelizing PNG compression, part 2: bilingual GNOME Shell

In the first part of our adventure, I regaled you with the tale of slow screenshots on a Linux GNOME desktop with the extremes of a slow-but-multicore CPU and a very high resolution display.

The problem was twofold:

GNOME Shell’s screenshot tool didn’t show audiovisual feedback until after the 7680×2160 image was saved as a PNG file, which was surprisingly slow.
Saving PNG files is single-threaded, wasting 7/8 of available CPU power.

With a solution at hand for each:

Split the screenshot into two asynchronous operations: capture and compression. Show audiovisual feedback after capture and run the effects in parallel with the compression.
Parallelize the filter and compression steps of PNG image writing across multiple threads.

Simply moving the audiovisual feedback earlier in the screenshot operation in GNOME Shell will eliminate the “feel” of the delay, solving most of my specific problem! But PNG writing can totally be parallelized too, and we’ll get to that in later blog posts. :)

GNOME Shell: How Does It Work?

GNOME Shell is a strange beast that takes a central role in the 3.x generation of the GNOME Desktop experience. It serves as a desktop compositor / window manager, a sort of intermediary for drawing all your windows, and it runs the interactive title bars, system menus, taskbar/sidebar, application selector, etc. It’s written in a combination of C using glib’s GObject system and JavaScript. And it can have plugins and stuff.

But the part we care about is its screenshot service, which is what’s triggered when we hit PrintScreen!

This is divided into two parts: a C component which uses the low-level libraries to read a portion of the screen and save it to a PNG, and a JavaScript component which exports a D-bus service wrapping a call to the C component and adding the “camera flash” effect for audiovisual feedback to the user.

Break it down

The C ScreenshotService object’s various methods (for full screen, window, area…) schedule an actual screen capture for the next paint event in the compositor, and then return control to JS, using a GTask to manage the asynchronous return callback.

On the paint event, the pixbuf is captured on the main thread (maybe 0.1s on this case), then a task thread is started to do PNG compression and writing to a file. Main thread continues on, keeping the slow PNG stuff (1.25s in this extreme!) from blocking the UI!

When the file is written out, the task calls back to the main thread and into the JavaScript callback, where it asks the service for the final filename and starts the audiovisual feedback.

Build it back up

For fun, I’ve started on a provisional patch which separates the single capture+compress operation into two separate operations. Now, the JS first calls the appropriate screenshot method to do capture.

When the callback comes back, we’ve got a capture but nothing has been compressed or written to disk — the perfect time to start modifying the screen by starting a visual flash effect! If the capture succeeded, it’s also the time to call a new method to compress & write the previously-captured image to disk.

This seems relatively straightforward to do by untangling and re-tangling some code. Everything’s already wrapped in an asynchronous GTask, and it’s mostly separating out and duplicating and de-duplicating a couple bits.

Patch is in progress, but I haven’t built or tested it yet because I’m still figuring out how to test a custom version of the latest development GNOME Shell on a desktop running a current release OS. :)

And that adventure, dear readers, will be the subject of our next post!