October 2016 – brionv

A few months ago I made a quick test transcoding video from MP4 (or whatever else the browser can play) into WebM using the in-browser MediaRecorder API.

I’ve updated it to work in Chrome, using a <canvas> element as an intermediary recording surface as captureStream() isn’t available on <video> elements yet there.
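
A minimal sketch of the canvas workaround described above, assuming a <video> that has already loaded its metadata and is playing. The name startCanvasRecording and its options object are made up for illustration; captureStream(), MediaRecorder, and drawImage() are the standard browser calls. This is not the demo's actual code, and it records the picture only, not the audio.

```javascript
// Paint each video frame onto a canvas and record the canvas stream.
// MediaRecorder and the frame scheduler are injectable (hypothetical options)
// so the function can also be exercised outside a browser.
function startCanvasRecording(video, canvas, options = {}) {
  const Recorder = options.MediaRecorder || globalThis.MediaRecorder;
  const schedule = options.requestFrame || ((cb) => requestAnimationFrame(cb));
  const mimeType = options.mimeType || 'video/webm';

  // Match the canvas to the source video's pixel size.
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const ctx = canvas.getContext('2d');

  const stream = canvas.captureStream(options.frameRate || 30);
  const recorder = new Recorder(stream, { mimeType });
  const chunks = [];
  recorder.ondataavailable = (event) => {
    if (event.data && event.data.size > 0) chunks.push(event.data);
  };

  let running = true;
  function paint() {
    if (!running) return;
    ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
    schedule(paint);
  }

  recorder.start();
  paint();

  return {
    chunks,
    // Stops painting and resolves with the recorded chunks once the
    // recorder reports that it has stopped.
    stop() {
      running = false;
      return new Promise((resolve) => {
        recorder.onstop = () => resolve(chunks);
        recorder.stop();
      });
    },
  };
}
```

In a page, something like session.stop().then((chunks) => new Blob(chunks, { type: 'video/webm' })) would turn the result into a downloadable file.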

Live demo: https://brionv.com/misc/browser-transcode-test/capture.html

There are a couple of advantages to re-encoding a file this way versus trying to do all the encoding in JavaScript, but also some disadvantages…

Pros

actual encoding should use much less CPU than JavaScript cross-compile
less code to maintain!
don’t have to jump through hoops to get at raw video or audio data

Cons

MediaRecorder is realtime-oriented:
- will never decode or encode faster than realtime
- if encoding is slower than realtime, lots of frames are dropped
- on my MacBook Pro, realtime encoding tops out around 720p30, but phone camera videos, for example, are often 1080p30 these days.
browser must actually support WebM encoding or it won’t work (e.g., it won’t work in Edge unless they add it in the future, and there is no support at all in Safari)
Firefox and Chrome both seem to be missing Vorbis audio recording needed for base-level WebM (but do let you mix Opus with VP8, which works…)
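
The codec-support points in the list above can be probed at runtime. A small sketch: the helper name pickRecordingType and the candidate list are assumptions for illustration, while MediaRecorder.isTypeSupported() is the standard static method a browser would supply as the predicate.

```javascript
// Return the first container/codec string the recorder claims to support,
// or null if none is usable. `isSupported` is passed in so the same logic
// works with MediaRecorder.isTypeSupported in a browser or a stub elsewhere.
function pickRecordingType(isSupported, candidates = [
  'video/webm;codecs=vp8,vorbis', // base-level WebM
  'video/webm;codecs=vp8,opus',   // the mix the post reports as working
  'video/webm',                   // let the browser choose codecs
]) {
  for (const type of candidates) {
    if (isSupported(type)) return type;
  }
  return null;
}
```

In a browser this would be called as pickRecordingType((t) => MediaRecorder.isTypeSupported(t)); a null result means recording to WebM isn't available at all.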

So to get frame-rate-accurate transcoding, and to support higher resolutions, it may be necessary to jump through further hoops and try JS encoding.

I know this can be done (there are some projects compiling the entire ffmpeg package in emscripten and wrapping it in a converter tool), but we’d have to avoid shipping an H.264 or AAC decoder for patent reasons.

So we’d have to draw the source <video> to a <canvas>, pull the RGB bits out, convert to YUV, and run through lower-level encoding and muxing… oh did I forget to mention audio? Audio data can be pulled via Web Audio, but only in realtime.
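
The "pull the RGB bits out, convert to YUV" step might look roughly like this. It's a sketch under assumptions the post doesn't state: BT.601 limited-range coefficients, 4:2:0 chroma subsampling by averaging each 2x2 block, and even width and height. The function name rgbaToYuv420 is made up; the input is the RGBA byte layout that getImageData() returns.

```javascript
// Convert an RGBA pixel buffer to planar YUV 4:2:0 (BT.601, limited range).
// Assumes width and height are even.
function rgbaToYuv420(rgba, width, height) {
  const chromaWidth = width >> 1;
  const chromaHeight = height >> 1;
  const yPlane = new Uint8ClampedArray(width * height);
  const uPlane = new Uint8ClampedArray(chromaWidth * chromaHeight);
  const vPlane = new Uint8ClampedArray(chromaWidth * chromaHeight);

  // Full-range luma (0..255) from the BT.601 weights.
  const luma = (r, g, b) => 0.299 * r + 0.587 * g + 0.114 * b;

  // One luma sample per pixel, scaled into 16..235.
  for (let i = 0; i < width * height; i++) {
    const l = luma(rgba[i * 4], rgba[i * 4 + 1], rgba[i * 4 + 2]);
    yPlane[i] = Math.round(16 + (l * 219) / 255);
  }

  // One chroma pair per 2x2 block, computed from the block's average color
  // and scaled into 16..240 around a midpoint of 128.
  for (let cy = 0; cy < chromaHeight; cy++) {
    for (let cx = 0; cx < chromaWidth; cx++) {
      let r = 0, g = 0, b = 0;
      for (let dy = 0; dy < 2; dy++) {
        for (let dx = 0; dx < 2; dx++) {
          const p = ((cy * 2 + dy) * width + (cx * 2 + dx)) * 4;
          r += rgba[p];
          g += rgba[p + 1];
          b += rgba[p + 2];
        }
      }
      r /= 4; g /= 4; b /= 4;
      const l = luma(r, g, b);
      const c = cy * chromaWidth + cx;
      uPlane[c] = Math.round(128 + (112 * (b - l)) / (0.886 * 255));
      vPlane[c] = Math.round(128 + (112 * (r - l)) / (0.701 * 255));
    }
  }
  return { yPlane, uPlane, vPlane };
}
```

In a page the input would come from something like ctx.getImageData(0, 0, width, height).data after drawing the <video> frame to the canvas.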

So it may be necessary to do separate audio (realtime) and video (non-realtime) capture/encode passes, then combine into a muxed stream.
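
Combining the two passes mostly comes down to interleaving their encoded packets in timestamp order before handing them to a muxer. A hedged sketch of just that merge step: the packet shape ({ timestamp, ... }) and the name interleaveByTimestamp are assumptions for illustration, and the actual WebM muxing is not shown.

```javascript
// Merge two timestamp-sorted packet lists into one interleaved list.
// Each output packet is tagged with the track it came from; on a timestamp
// tie the video packet goes first.
function interleaveByTimestamp(videoPackets, audioPackets) {
  const out = [];
  let v = 0;
  let a = 0;
  while (v < videoPackets.length || a < audioPackets.length) {
    const takeVideo =
      a >= audioPackets.length ||
      (v < videoPackets.length &&
        videoPackets[v].timestamp <= audioPackets[a].timestamp);
    if (takeVideo) {
      out.push({ track: 'video', ...videoPackets[v++] });
    } else {
      out.push({ track: 'audio', ...audioPackets[a++] });
    }
  }
  return out;
}
```

Because both inputs only need to be sorted within themselves, the audio pass can run in realtime and the video pass at whatever speed the encoder manages.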

I’ve often wished that for ogv.js I could send my raw video and audio output directly to a “real” <video> element for rendering instead of drawing on a <canvas> and playing sound separately to a Web Audio context.

In particular, things I want:

Not having to convert YUV to RGB myself
Not having to replicate the behavior of a <video> element’s sizing!
The warm fuzzy feeling of semantic correctness
Making use of browser extensions like control buttons for an active video element
Being able to use browser extensions like sending output to ChromeCast or AirPlay
Disabling screen dimming/lock during playback

This last is especially important for videos of non-trivial length, especially on phones which often have very aggressive screen dimming timeouts.

Well, in some browsers (Chrome and Firefox) now you can do at least some of this. :)

I’ve done a quick experiment using the <canvas> element’s captureStream() method to capture the video output — plus a capture node on the Web Audio graph — combining the two separate streams into a single MediaStream, and then piping that into a <video> for playback. Still have to do YUV to RGB conversion myself, but final output goes into an honest-to-gosh <video> element.
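
Roughly, the plumbing for that experiment could be sketched as below. The function name attachCombinedStream and its options argument are hypothetical; captureStream(), createMediaStreamDestination(), the MediaStream constructor, and srcObject are the standard browser interfaces. It assumes the player already draws frames to the canvas and can route its audio nodes to the returned sink instead of the context's default destination.

```javascript
// Combine a canvas's video track and a Web Audio capture node's audio track
// into one MediaStream and play it through a real <video> element.
// The MediaStream constructor is injectable (hypothetical option) so the
// wiring can be checked outside a browser.
function attachCombinedStream(canvas, audioContext, video, options = {}) {
  const StreamCtor = options.MediaStream || globalThis.MediaStream;

  // Video side: frames painted on the canvas become a video track.
  const canvasStream = canvas.captureStream();

  // Audio side: anything connected to this node becomes an audio track.
  const audioSink = audioContext.createMediaStreamDestination();

  const combined = new StreamCtor([
    ...canvasStream.getVideoTracks(),
    ...audioSink.stream.getAudioTracks(),
  ]);
  video.srcObject = combined;

  return { combined, audioSink };
}
```

The caller would then connect its output node with something like gainNode.connect(audioSink) and call video.play().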

To my great pleasure it works! In Firefox, though, I see some flickering that may be a bug; I’ll have to track it down.

Some issues:

Flickering on Firefox. Might just be my GPU, might be something else.
The <video> doesn’t have insight into things like duration or seeking, so its native controls and API can’t be relied on to behave like a <video> with a file source.
Pretty sure there are inefficiencies. Have not tested performance or checked if there’s double YUV->RGB->YUV->RGB going on.

Of course, Chrome and Firefox are the browsers where I don’t need ogv.js for Wikipedia’s current usage, since they play WebM and Ogg natively already. But if Safari and Edge adopt the necessary interfaces and WebRTC-related infrastructure for MediaStreams, it might become possible to use Safari’s full screen view, AirPlay mirroring, and picture-in-picture with ogv.js-driven playback of Ogg, WebM, and potentially other custom or legacy or niche formats.

Unfortunately I can’t test whether casting to a ChromeCast works in Chrome as I’m traveling and don’t have one handy just now. Hoping to find out soon! :D


Testing in-browser video transcoding with MediaRecorder

Canvas, Web Audio, MediaStream oh my!