somewhere to talk about random ideas and projects like everyone else

stuff

#webm

Whammy: A Real Time Javascript WebM Encoder 19 August 2012

This is sort of a conceptual reversal (or not, this might just be making the description needlessly confusing) of one of my older projects,Weppy. First, what Weppy did was it added support for WebP in browsers which didn’t support it by converting it into a single-frame video. This is instead predicated on the assumption that the browser already has support for WebP (at this point, it means it only works on Chrome since it’s the only browser which actually supports WebP), not only decoding WebP but encoding it as well.

The cool thing about WebP which was exploited in Weppy is that it’s actually based on the same codec as WebM, On2’s VP8. That means the actual image data, when the container formats are ignored, are virtually interchangable. With a catch: it’s intraframe only.

So it’s a video encoder in that it generates .webm files which should play in just about any program or device which supports the WebM format. But interframe compression is actually a fairly important thing which could reduce the file size by an order of magnitude or more.

But, there isn’t too much you can do on the client side in the ways of encoding stuff. And whatever you do, you basically can’t do interframe compression (aside from some really rudimentary delta encoding). More or less, when your only alternative is to maintain an array of DataURL encoded frames or encoding it (rather slowly) as a GIF, a fast but inefficient WebM encoder stops looking too bad.

This was actually Kevin Geng‘s idea, and he contributed some code too, but in the end most of the code was just leftovers from Weppy.

Demo

http://antimatter15.github.com/whammy/clock.html

Basic Usage

First, let’s include the JS file. It’s self contained and basically namespaced, which is pretty good I guess. And it’s not too big, minified it’s only about 4KB and gzipped, it’s under 2KB. That’s like really really tiny.

<script src="whammy.js"></script>

The API isn’t terrible either (at least, that’s what I’d like to hope)

var encoder = new Whammy.Video(15); 

That 15 over there is the frame rate. There’s a way to set the individual duration of each frame manually, but you can look in the code for that.

encoder.add(context or canvas or dataURL); 

Here, you can add a frame, this happens fairly quickly because basically all it’s doing is running .toDataURL() on the canvas (which isn’t exactly a speed-demon either, but it’s acceptable enough most of the time) and plopping the result onto an array (no computation or anything). The actual encoding only happens when you call .compile()

var output = encoder.compile(); 

Here, output is set to a Blob. In order to get a nice URL which you can use to stick in a &lt;video&gt; element, you need to send it over tocreateObjectURL

var url = (window.webkitURL || window.URL).createObjectURL(output); 

And you’re done. Awesome.

Documentation

Weppy.fromImageArray(image[], fps) this is a simple function that takes a list of DataURL encoded frames and returns a WebM video. Note that the images have to all be encoded with WebP.

new Weppy.Video(optional fps, optional quality) this is the constructor for the main API. quality only applies if you’re sending in contexts or canvas objects and doesn’t matter if you’re sending in encoded stuff

.add(canvas or context or dataURL, optional duration) if fps isn’t specified in the constructor, you can stick a duration (in milliseconds) here.

Todo

This pretty much works as well as it possibly could at this point. Maybe one day it should support WebWorkers or something, but unlike the GIF Encoder, it doesn’t actually require much real computation. So doing that probably wouldn’t net any performance benefits, especially since it can stitch together a 120-frame animation in like 20 milliseconds already.

But one of the sad things about it is that now it uses Blobs instead of strings, which is great and all except that blobs are actually slower than strings because it still has to do the DataURL conversion from string to Blob. That’s pretty lame. Firefox supports the canvas toBlob thing, but for some reason Chrome doesn’t, but eventually it probably might, and that might be useful to add.

Also, if someone ever makes a Javascript Vorbis encoder, it would be nice to integrate that in, since this currently only does the video part, but audio’s also a pretty big part.


The Ambiguity of "Open" and VP8 vs. H.264 13 January 2011

Google has recently announced their intention to remove the H.264 video codec from its Chrome browser. This decision has been smeared as an evil campaign for controlling video on the web, akin to not-invented-here syndrome. It’s also been lauded as the push that the web needs to remain open and free. Mostly, it’s been marked as inconsistent, due to the bundling of Adobe’s proprietary Flash player.

Richard Stallman doesn’t like the term Open Source because it fails to embody the true meaning of “free software”, and the one thing that’s worse is the word “open”. This debate can’t simply be labeled as one for or against openness (even ignoring the technical details).

H.264 is an open standard. It was developed by a committee, standardized, reviewed by many engineers and developers for multiple companies and has been standardized for use with a multitude of containers and devices. However, H.264 is not royalty free. Software patents in many countries restrict the distribution of software that utilizes the codec to those who pay the MPEG-LA.

VP8 is not a standard. It was developed secretly by a single company, and until recently, had only a single working implementation. The public wasn’t open to collaboration on the specification until the bitstream spec was frozen, including the bugs that existed within. Now, the source code and reference implementation are available under liberal licenses, and all the related patents are irrevocably royalty-free.

Adobe Flash, while not synonymous with a video codec (unless you mean .flv flash videos, which are either VP6 or Sorenson Spark), is going here anyway because everyone feels like comparing it. Flash’s SWF format is not standard, but it is open. There are a few implementations (Swfdec, Gnash, GameSWF, Gordon, etc), but none of them are as complete as the official proprietary implementation. There’s a bitstream specification that anyone can read to create an independent implementation of the player. Implementing and using the Flash player is still royalty-free (Since the Flash VM can decode H.264, obviously that part, not controlled by Adobe still has royalties), anyone can make software that can export SWF animations without paying Adobe.

Implementation vs. Distribution

Or: How Bundling Flash doesn’t violate the Novikov self-consistency principle

Name Standardization Implementation Distribution Dev History
Theora Standardized? Open Source Royalty Free Mostly Open
VP8 Not Standardized Open Source Royalty Free Mixed
H.264 Standardized Open Source Royalties Open
Flash* Not Standardized Proprietary Royalty Free Proprietary
* Adobe Flash isn’t a video codec

Each column of the chart can be treated as one part of “open”. The <video> debate now primarily concerns VP8 and H.264, as Theora is inferior for technical reasons and Flash isn’t a video codec. I don’t know if Theora’s actually standardized, but it has multiple implementations and was developed by the Xiph.Org foundation, and is the closest thing to an actual standard, short of being a product of MPEG. Theora was created from On2’s VP3 code, but was changed significantly by the Xiph.org Foundation before release, so the development history can be considered mostly open. VP8 as announced, and the bitstream was frozen shortly afterwards, leaving little community involvement. For the years before its debut as part of WebM, the format was secret, patent encumbered, and proprietary. At least now, the source code is open and people are free to do whatever they please, but the open source video community probably had very little say in the development of the codec. H.264 has an open software reference implementation, as does VP8 and Theora. Flash’s de-facto implementation is Adobe’s, which is proprietary.

The one that concerns Mozilla, Opera and Google is the distribution rights; whether or not royalties have to be paid. Notably, this is where the parallel drawn by most critics of Google’s move with H.264 and Flash comes into question. Many people believe that Flash is the epitome of evil, and proprietary software, and embodies everything wrong with proprietary software. The distribution column is arguably the most important, and it’s the one that fundamentally determines whether or not something acts as a detriment to “open innovation”. Flash animations and applications can be created and distributed without paying royalty fees to Adobe, regardless of the viewership. Innovation is still permitted as long as the distribution is free (except with changing the inner workings of the proprietary implementation).

The Ambiguity of “Plugin”

TODO: Use this subtitle to mention random tangential thoughts.

It seems there are lots of misconceptions about WebM, and new ones as well, because of the use of the word “plugin” by Google in their “More about Chrome HTML Codec Change“. Here’s what the relevant part of the post says (it’s buried into the last paragraph).

This is why we’re joining others in the community to invest in WebM and encouraging every browser vendor to adopt it for the emerging HTML video platform (the WebM Project team will soon release plugins that enable WebM support in Safari and IE9 via the HTML standard <video> tag). Microsoft and Apple through IE9 and Safari, respectively, rely on the underlying operating system’s multimedia frameworks to handle <video> decoding. Chrome and Chromium bundle a customized version of the ffmpeg framework, while Opera uses gstreamer and Firefox has its own framework built on various open source codec libraries.

Browser WebM H.264 Theora Anything Imaginable
Firefox Version 4 Never Version 3.5 Never
Chrome Version 6 Removed in Version 10 Version 3 Never
Opera Version 10.60 Never Version 10.50 Never
Safari QuickTime Default QuickTime QuickTime Default QuickTime*
IE9 Windows Media Default Windows Media Windows Media Default Windows Media
_ Well, not really anything imaginable, but a lot of them_

QuickTime and Windows Media are pluggable multimedia frameworks. And since Safari and IE9 use them for <video> support, any codec or container format supported by the respective frameworks, works in the browser. For QuickTime, the list for videos goes along the lines of 3GP, Apple Video, AVI, DV, Cinepak, H.261, H.262, H.263, H.264, Microsoft Video 1, MPEG-1, MPEG-4 Part 2, Motion JPEG, Pixlet, Planar RGB, Sorenson Video, Qtch, QuickTime Movie, and QuickTime VR. But these media frameworks are pluggable, which means they play whatever codecs are installed. It just happens that those codecs listed above are plugins that are installed by default, somewhat analogous to how Google Chrome bundles Flash. The plugin frameworks for multimedia don’t treat plugins any different from “native” codecs. There’s absolutely no way to tell the difference from a user perspective. Right clicking a video will give you the same ordinary context menu.

Codec packs are often installed by users anyway. A codec pack consists of a set of plugins for the operating system’s multimedia framework. Once one of those codecs are installed to the system, support is available to the applications that rely on the OS’s framework: QuickTime Player, Windows Media Player, IE9 and Safari.

Notably absent are the Opera, Chrome and Firefox browsers that ignore whatever is installed, to use their own bundled decoders. There’s a reason for this, and it’s pretty easy to spot. Firefox, Chrome and Opera are also the only ones on the list that aren’t tied to a specific operating system, and bundling your own media framework lets you keep the functionality consistent. IE and Safari aren’t cross platform. Well, Safari works on Windows, but it still requires QuickTime for Windows to be installed to play the video content.

You also don’t want all the other video codecs thrown into the mix. Standards are about standardization, where you have a limited number of codecs instead of an entire forest of them. This brings us to a bit of History, I’ll begin with the olden days. I can’t tell you too much, as much of the first section happened when I was 6 years old- and certainly wasn’t into the negative user experience of platform-specific multimedia plugins.

A Brief History of Stuff Named “A Brief History…”

Uh… I mean, Video on the Web.

Before the popularity of HTML5 Video, or event Flash, video was viewed on the web using the RealVideo, MPlayer, VLC, Windows Media Player, and Quicktime Plugins. Note that pretty little plugins aren’t plugins to the underlying multimedia frameworks: these are the nasty type of plugins that Flash is among. They’re the plugins that take a long time to load, have interfaces that never look like the surrounding page or browser, only work on certain operating systems, at best, inconsistently. This is the epitome of what standardization is meant to prevent, and the distillation of everything that’s wrong with plugin based video.

Flash’s popularity (probably thanks to YouTube), brought a semblance of standardization to the industry. Video on the web was delivered through the flv container, encoded either with Sorenson Spark or, in later versions of Flash, On2’s VP6. A while later, H.264/AVC which was later added to Flash, and as the superior codec, most people switched to that and FLV is slowly fading away.

HTML5 video became popular recently. I’ll say it’s probably because of iOS, since that’s the only way to get web video to play there and the rest of the aggressive marketing that Apple does. This brings us to today (or at least the general time period of early January 2011 in which I’m writing this post in). Google has just announced it’s intention to remove H.264 from an upcoming revision of the Chrome browser (much like how the Chromium browser never supported H.264). And everyone on the internet that cares enough to say a word is going insane. Which brings us back to the last section, about how Safari and IE9 use the operating system’s underlying multimedia frameworks and how all the other browsers (Firefox/Chrome/Opera) that work on my beloved operating system (Linux) bundle their own.

If Firefox, Chrome and Opera used the OS’s media frameworks, we would be set back into the dark ages when people used every video codec imaginable and nobody would be happy. Standards exist for consistency, and it would be terrible if people started to go on a cost saving measure to never again transcode video, and serve it straight from the server’s filesystem as AVI files containing DV or god forbid MJPEG (the same format the users’s uploaded!) because all the Safaris and IE9s could play it fine. The fact Firefox, Chrome and Opera bundle their own multimedia frameworks forbids the possibility of this, because those browsers (in the forseeable future) will never support anything other than WebM, Theora and potentially H.264.

There’s more to H.264 than just H.264

This part doesn’t make much sense.

Often, the argument against VP8 is that it’s inferior to the H.264 codec, and to me, this seems like the most ideologically valid concern. But in a lot of cases, it stems from a misunderstanding of how H.264 works. H.264 is not a single video codec (not even to mention multimedia container formats), but rather has several Profiles that work on different devices and implementations. Rarely are videos encoded only in MPEG-4/AVC H.264 Extended Profile at 1080p and 60fps. Go on the Apple store and you’ll see that every iPod device you can find (including the iPhone), includes what video codecs are supported. And it doesn’t just say H.264, but rather, something much more wordy like

Video formats supported: H.264 video up to 720p, 30 frames per second, Main Profile level 3.1 with AAC-LC audio up to 160 Kbps, 48kHz, stereo audio in .m4v, .mp4, and .mov file formats For the insanely great iPhone 4. Google is meaner and doesn’t make the information about profile support on the Nexus S quite as accessible (though it’s not the only reason I’m an iPhone user), but it should be safe to assume it’s something along the lines of what the iPhone 4 supports. The iPod Classic has a nice, even longer string, that represents even less support for H.264: H.264 video, up to 1.5 Mbps, 640 by 480 pixels, 30 frames per second, Low-Complexity version of the H.264 Baseline Profile with AAC-LC audio up to 160 Kbps, 48kHz, stereo audio in .m4v, .mp4, and .mov file formats; H.264 video, up to 2.5 Mbps, 640 by 480 pixels, 30 frames per second, Baseline Profile up to Level 3.0 with AAC-LC audio up to 160 Kbps, 48kHz, stereo audio in .m4v, .mp4, and .mov file formats; This is because very few devices can actually utilize all the great features that H.264 defines. Wikipedia has a nice pretty chart. So the point of all of this is to say, even though VP8 is inferior to H.264 from a purely technical standpoint, you probably can’t just use the Main or Extended profiles to support all the devices that “support H.264”. Does this invalidate the inferiority argument? Nope. Dark Shikari said “I expect VP8 to be more comparable to VC-1 or H.264 Baseline Profile than with H.264”. But the large number of devices that support H.264 might actually only support the baseline profiles.

My Hopeless Ideals

Better be blunt with what will probably never happen.

VP8 is a bit worse than H.264, and had it been a patent encumbered video format, there would be almost no reason to prefer it over AVC. The <video> part of the HTML5 specification states:

It would be helpful for interoperability if all browsers could support the same codecs. However, there are no known codecs that satisfy all the current players: we need a codec that is known to not require per-unit or per-distributor licensing, that is compatible with the open source development model, that is of sufficient quality as to be usable, and that is not an additional submarine patent risk for large companies. This is an ongoing issue and this section will be updated once more information is available The only major codecs that are royalty free are Theora and VP8, and the former probably isn’t of sufficient quality. Both of them come with patent risks (or at least that’s what MPEG-LA wants people to believe), leaving a set of zero acceptable codecs. For this to work at all, something needs to be compromised.

H.264 defines a baseline profile that all decoders could be reasonably expected to handle. Such a profile exists so that the video can be viewed, albeit with inferior compression on a variety of devices and platforms with limited computational ability. The internet was built on interoperability, and HTML5 needs an equivalent “baseline codec” for the web. Something that compresses video at sufficient, though not bleeding-edge quality. Something that can be implemented and distributed openly on all platforms.

For HTML5 <audio>, nearly all browsers implement the basic WAV codec. It serves as an acceptable baseline for small snippets of audio to be played in a cross-brower manner. It’s usually uncompressed, and for most applications, inferior to MP3 and Vorbis. <video> needs similar treatment. That niche is, at this time, best filled by WebM/VP8. If Apple and Microsoft were to add support for the WebM format, it would only improve the environment for open innovation. H.264 can be considered the “bleeding-edge” codec, heck, they could even add HEVC/H.265 to set off an environment of useful competition (or not). But a “baseline codec” should be established first. Something that publishers can encode their videos in, so that any modern browser can view.

Right now, the discussion has been polarized between free software advocates who often seek to eradicate proprietary or patent-encumbered ideas from the face of reality and those who hold a disregard for the open source values. This way of thinking, and of treating innovation is profoundly dangerous for both the free software community and the patent-creators. The free software community can not afford to be constantly twenty years behind the times in terms of innovation (especially if you subscribe to the law of accelerating returns), and the patent-creators can’t afford for the free software community to involved in actively foiling their innovations. It’s truly great what MPEG has done, and it’s terrible that such a controversy exists around its adoption. The best scenario would be for MPEG or the ISO standards bodies to eliminate royalties. The fact multiple industries can agree upon a single, high quality, interoperable codec should be enough of a market and innovation advantage to waive the relatively nominal money from licensing.

Will this actually happen? I doubt it. Apple seems firmly invested in the success of H.264. Microsoft may be more willing to add VP8 support if and when the format gains popularity (and especially if Google adds patent indemnification just so they can see a lawsuit made). Mozilla will likely never compromise on principle and Opera might add AVC one day. Given the huge backlash, Google’s the most likely to revert its opinion and add AVC back into Chrome in a future release. Yay for hopeless idealism! And since I’m espousing hypothetical and impossible ideas anyway, why not vouch for reform of the U.S. patent system as well? If you think of it, the fact it’s such a controversy and that people need to use an inferior product in order to innovate, that smacks in the face of the very Article 1, Section 8, Clause 8 of the United States Constitution:

To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries. And this ends what may be the longest blog post I’ve ever written, revised three times. If you’ve actually read this through, please consider commenting or sending me a tweet at @antimatter15 (or following me, that would be great!). Also, if I’m misinformed, please inform me and I might revise this yet again.


Weppy Updates Opera, Chrome and Firefox support and simpler usage 09 October 2010

With help from @Frenzie and @paul_irish, the latest not-yet-versioned release of Weppy, my Javascript WebP to WebM conversion library, or something of a polyfill for a format that is yet to be part of any specification (HTML5 seems to specifically reference the image src attribute are examples such as PNG, GIF, JPEG, APNG, PDF, XML, SVG, SMIL, and MNG). The new release brings some awesome new features that really don’t change much and shouldn’t really be used in the real world because most browsers in the world still aren’t Firefox, Chrome or Opera - that is, a large portion of the browser market includes browsers like Safari and IE, either suffering from antiquity (IE6! aah!) or just liking h264 (IE9 + Safari).

The new release supports Opera. I never bothered debugging Opera, I figured it was another huge issue that would demand a rewrite (as supporting Firefox had needed, because the order of the object keys isn’t preserved and breaks the EBML result, or at least for Firefox’s parser which seems to be somewhat stricter than Chrome’s, is that ffmpeg?). And after premature optimization (stripping “unnecessary” EBML tags), my code didn’t work in chrome, so I had to revert to an earlier revision. All my testing code was based on file drag-drop stuff, and Opera doesn’t support that. Until I saw this mozillazine topic, I didn’t care, but it was a lot easier to fix than I feared.

Part of the solution was getting rid of the canvas stage. Admittedly, the canvas stage was pretty useless once the toDataURL() stage was removed before the first public release, but I didn’t feel like deleting code, so it stayed there. Also, I noticed that the global variable that gets introduced was accidentally named “WebM”, which is wrong, it should be “WebP”, but because of the uncreative format naming and similarities, I didn’t notice. Not sure, but it seems to be more stable now.

Chrome probably will add WebP soon, and it needs to be future proof, detecting whether or not a browser supports the WebP format. To do that, it creates an Image, sets the src to a data url of a 4x4 webp image and listens to the onload and onerror events, checking if the size is correct and there were no errors loading it. The routine is expected to error and totally untested as there aren’t any browsers that support the feature yet for me to try.

Another change, is that by default, it will automatically load all the same-origin (because of the limitations of XHR) webp images (from <img> tags), on the DOMContentLoaded event, so the library is practically drop-in now. In any web page, you can pretty much add <script src=”http://antimatter15.github.com/weppy/weppy.js“></script> and on the supported browsers, it should automatically load and replace all WebP images, though not something I would really recommended.

The demo is the same place it always was: http://antimatter15.github.com/weppy/demo.html

There is also this nifty hack that uses <canvas> to add an alpha channel to the WebP image (adapted from the original JPEG one): http://antimatter15.github.com/weppy/alpha/alpha.html

Also, please follow me on twitter.