A peek at ogv.js updates

Between other projects, I’ve done a little maintenance on our ogv.js WebM player shim for Safari and IE. This should go out next week as 1.6.0 and should remain compatible with 1.5.x but has a lot of internals refactoring.

Experimental features

This release include an experimental AV1 video decoder using the dav1d library, based on ePirat’s initial work getting it to build with emscripten. This is an initial test, and needs to be brought back up to date with upstream, optimized, etc.

Also included are multithreaded VP8 and VP9 decoders, brought back from the past thanks to emscripten landing updated support and browsers getting closer to re-enabling SharedArrayBuffer. Performance on 2-core and 4+-core machines is encouraging in Firefox and Chrome with flags enabled, but cannot yet be tested in Safari.

SD videos scale over 2 cores reasonably well, with a solid 50% or more decoding speed boost visible. HD can scale over 4 cores, with about 200% speed boost.

Threading must be manually opted in. The AV1 decoder is not yet threaded.

Low-end performance work

In Internet Explorer 11, performance is weaker across the board versus other browsers because its JS engine is an old, frozen version that’s not been optimized for asm.js or WebAssembly. In addition, machines where people are running IE 11 are probably more likely to be older, slower machines.

I did some optimization work which greatly improves perceived playback performance of a 120p WebM VP9/Opus file on my slowest test machine, a 1.67 GHz Atom netbook tablet from 2012 or so.

  • more aggressively selecting software YCbCr-RGB conversion when WebGL is going to be worse
  • Microoptimizations to conversion and extraction of YCbCr data from heap
  • Replaced timer-based audio packet consolidation with buffer-size-based one
  • Tuned audio communication with Flash to reduce calls, per-call overhead
  • Increased number of buffered audio and video frames to survive short pauses better
  • Fix for dropping of individual frames when an adjacent decoded frame is available

I don’t think there’s a lot left to be done in optimizing VP9 decoding for IE; the largest components in profiling at low resolutions are now things like vpx_convolve8_c which really would benefit from integer multiplication and just not being so darn slow… I tried a replacement function micro-optimizing to factor out common additions and bit-shifts in the generated emscripten code but it didn’t seem to make any difference; either that’s the one bit the IE optimizer catches anyway or the bit-shifts are so cheap compared to the memory accesses and multiplications that it just makes no measurable difference. :P

Ah well! This is why I started producing files down to 120p, to handle the super-slow cases.

What is important is making sure that it plays audio smoothly, which it’s now better at than before. And in production we’ll probably add a notice for slow decoding recommending picking another browser…

Refactoring

I’ve also started refactoring work in preparation for future changes to support MSE-style buffering, needed for adaptive bitrate streaming.

Instead of a hodge-podge of closure functions and local variables, I’ve transitioned to using ES6 modules and classes, with a babel transform step for ES5 compatibility.

This helps distinguish retained state vs local variables, makes it easier to find state when debugging, and generally makes things easier to work with in the source. More cleanup still needs to be done in the larger processing functions and the various state vars, but it’s an improvement so far.