somewhere to talk about random ideas and projects like everyone else


Protobowl: Real-Time Multiplayer Online Quizbowl 28 December 2014

I guess it’s a trivia site and I’m never actually going to legitimately finish this README, as it’s substantially over 3,000 words and I haven’t even finished the introduction to the prequel. I might as well distill the whole thing down to what actually matters: the little bits of trivia that really don’t matter but might be amusing to some people with particularly deranged senses of amusement.

  • The official about page says that Protobowl is a portmanteau of “Prototype Quizbowl Application”, but this is in fact a lie. It is, and has always been, an acronym for Promoting Racism Online To Obstruct Better Organized World Leadership. It was always designed to be nothing less than a “literal cancer of the quizbowl community”.

  • Protobowl’s live-as-you-type real-time chat was inspired by Google Wave, which Kevin is quite embarrassed to have spent the first half of high school playing with, almost as embarrassed as he feels right now writing about himself in the third person.

  • Protobowl was going to have a social component (called Socialbowl) so that users could log in and view their scores and detailed performance stats with graphs depicting buzz rates per category. Ben stopped working on it before it launched, so it was never added.

  • Protobowl runs primarily on Nodejitsu, but falls back to a Bitcable VPS, and then to an MIT XVM if all hell breaks loose. In late 2012, Ben and I had a bet over whether or not he would get into the Science Honor Society, and whoever lost would have to pay for the operational costs of Protobowl for the rest of time. He got accepted and lost the bet.

  • For a long time (and probably still today, albeit hopefully less so), Protobowl suffered from a mysterious lag problem that we could never quite figure out. In the process of debugging it, we instituted a bunch of voodoo superstitious policies under semi-credible conjectures. On the possibility that it might be caused by memory pressure from a small memory leak, Protobowl restarts every day at 4am EST. That time was chosen because it represents the time of day with the fewest online users.

  • Protobowl has a built-in chat bot which is the remnant of a chat bot I had built in middle school for Omegle. It can be triggered by going into any empty room and saying (in chat) “I’m lonely”. It’s mostly reflective of the hornier, angstier and shallower side of online chat rooms, so it’s unlikely to pass a Turing test (even if it decides to answer “asl” with “18 f uk”).

  • Protobowl was made using Twitter Bootstrap, which for quite a while was the blandest theme you could find on the internet because everyone who couldn’t design thought it looked beautiful. But now Bootstrap has itself become enamored with flat design, and Protobowl looks clean and original again with its soft gradients and gentle rounded corners.

  • Protobowl stores active rooms in memory as simple JavaScript objects. Inactive rooms are saved in a MongoDB database so that they can be unfrozen when requested. Questions are stored in an entirely separate MongoDB database.
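
As a rough sketch of how that freeze/unfreeze cycle might look (the collection name, callbacks and function names here are my own invention for illustration, not Protobowl’s actual code):

    // hypothetical sketch of the freeze/unfreeze cycle
    var rooms = {}; // active rooms live in memory as plain objects

    function freezeRoom(name, db, done) {
        // persist the serializable state, then evict the room from memory
        db.collection('rooms').update(
            { _id: name },
            { $set: { state: rooms[name] } },
            { upsert: true },
            function (err) {
                if (!err) delete rooms[name];
                done(err);
            });
    }

    function unfreezeRoom(name, db, done) {
        if (rooms[name]) return done(null, rooms[name]); // already active
        db.collection('rooms').findOne({ _id: name }, function (err, doc) {
            rooms[name] = (doc && doc.state) || {}; // thaw, or start fresh
            done(err, rooms[name]);
        });
    }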

Version 3

The current version of Protobowl which lies before you is the “third” version. By that, I mean these version numbers don’t signify much beyond significant changes to the core of protobowl. Version 3 constituted more or less an entire rewrite of the codebase, with specific attention paid to offline-first design and abstracting away the socket layer. The result is something which is, at least architecturally, pretty cool. A rather large amount of code is shared between client and server, such that the transition between online and offline and back is quite seamless.
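
To illustrate the idea (a sketch of the pattern, not the actual Protobowl source), the game logic can be written against a tiny transport interface, so that a real socket and an in-page loopback become interchangeable:

    // sketch: game code talks to an abstract transport, so swapping the
    // socket for a local loopback is what makes going offline seamless
    function SocketTransport(socket) {
        this.send = function (type, data) { socket.emit(type, data); };
        this.on = function (type, fn) { socket.on(type, fn); };
    }

    function LoopbackTransport(localServer) {
        var handlers = {};
        this.on = function (type, fn) {
            (handlers[type] = handlers[type] || []).push(fn);
        };
        this.send = function (type, data) {
            // hand the message straight to server code running in the page
            localServer(type, data, function reply(t, d) {
                (handlers[t] || []).forEach(function (fn) { fn(d); });
            });
        };
    }

    // the same room logic runs against either transport unchanged
    function attachGame(transport) {
        transport.on('question', function (q) { console.log('new question', q); });
        transport.send('join', { room: 'lobby' });
    }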

README & Anachronisms

Okay, so I think NPM would complain if I didn’t have a readme, so I guess I’ll start writing something which might be mistaken for a readme given a certain number of prior conditions. As you might figure out in the “Prototype Quizbowl Application” section, the core of protobowl has evolved significantly during the life of this readme, and especially because I’m so insistent on writing pseudo-literary fluff in lieu of helpful concise manuals and bullets, there’s a good chance that the vast majority of the readme will be rife with anachronisms, statements that once were true, were falsified, returned to truthfulness and then fell into a stasis of untruth.

Manual

Okay, so this application probably ranks among the largest things I’ve ever done, which actually does say something about its scope. However, it’s still designed to be fast and responsive and whatnot.

But that somewhat large scope makes me believe it may be useful to have a sort of overview, not of how things work (that’s what the code is for), but a somewhat deeper look into what it does: what kinds of features and little tricks it has.

Responsive Design

Yeah, yeah, buzzwords galore. It’s responsively designed, which means that it should, at least in theory, work on all screen sizes, from gigantic monitor walls of 60 HD displays (I’ve only tested the application on six WQXGA monitors) to an iPhone screen (I won’t venture any smaller, because running on watches would represent a significant effort beyond what I’ve already put in).

It’s been tested on a few browsers: Chrome (on Linux, Windows, the iPad and the Galaxy Nexus), plus Firefox, and Safari on the iPad.

It’s built with Twitter Bootstrap, sans fugly black bar on top, so it should inherit most of the responsiveness. Also, it uses Font Awesome for all glyphs and that means everything’s vector and smooth even at absurd pixel densities.

Offline

Protobowl was designed first with a flexible sync architecture. Regrettably, however, it wasn’t designed with the idea of offline-first. Don’t get me wrong, offline works great, but it’s implemented with an offline.coffee which is largely (I’d say 80%) a reimplementation of methods in web.coffee. That’s not a great showing for Don’t Repeat Yourself, or at least it’s hardly ideal.

However, by virtue of running on Node, a single language (JavaScript) runs both client and server, so there’s a natural ability to reuse some code; in this case, all the supporting libraries are shared between client and server, including the answer checking (which is surprisingly sophisticated, albeit likely unnecessarily so).

Offline was built with AppCache in mind; that’s pretty obvious, because you more or less need AppCache to make things work offline. The offline code is loaded asynchronously, and there isn’t any fundamental difference between offline-start and disconnect behavior. That means there isn’t any fumbling between multiple copies of the code or any limitation on the functionality of offline mode. You can disconnect from the server in the midst of a game, perhaps because of a flaky connection, and continue without interruption. It even tries to automatically reconnect, picks up the state and resumes (albeit probably losing what you’ve done offline).
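
The reconnect dance looks roughly like this (the event names and URL are placeholders, not Protobowl’s real protocol):

    // sketch of the disconnect/reconnect behavior described above
    var currentRoom = 'lobby';

    function connect() {
        var sock = new WebSocket('ws://example.com/game'); // placeholder URL
        sock.onopen = function () {
            // ask the server for its copy of the state on (re)connect
            sock.send(JSON.stringify({ type: 'resume', room: currentRoom }));
        };
        sock.onclose = function () {
            setTimeout(connect, 2000); // keep retrying with a small delay
        };
    }

    window.addEventListener('online', connect); // the network came back
    connect();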

I have a sort of wordsy packrat syndrome, so I’ll leave all the text above untarnished and state plainly that none of the words above are true, not anymore at least. Version 3 was a rewrite of the entire core of protobowl: client, server, heaven and earth. And to make the architecture cleaner and more godly, I endeavored to share code between client and server in a pretty cool way.

I don’t think this is meant to be a technical exposé of the artistic symmetries in the yet-unmade UML diagram of the universe, but I’ll talk about it a bit anyway, at least long enough for this document to hit the 3000 word mark.

Interface

This is what I really think matters: how to actually use it, and the little interaction features.

Keyboard

The primary interface is meant to be the keyboard. In fact, early design sketches didn’t even include a button bar for the desktop UI. Eventually, I relented, and there are now buttons, but still: use the keyboard.

The first key is space, which makes sense because it’s the biggest key and also probably the most important. Space generally means “buzz”; however, there’s another very small thing it does: when you open the app and see that big green button saying “start the game”, you can also press space to trigger it (no ambiguity, since buzzing is disabled in such circumstances).

Next, or skipping, as it was referred to in earlier iterations, is also a fairly commonplace operation. It can be accessed with not one but three keys: S, N and J. S and N are probably pretty obvious, referring to “Skip” and “Next”. J is just convenient because it’s on the home row (well, so is S, technically) and usually refers to “down” for people who use Vi or Vim; it’s also commonplace for websites to respect j meaning down and k meaning up. (See: Facebook, all Google products, etc.)

You can pause and resume with P and R. Both are equivalent, so you can technically pause with R, and resume with P, though that would be metaphorically confusing.

Since it’s designed to be “social”, if you don’t mind me tossing around more loaded buzzwords, chat was one of the first features added (also one of the easiest, but that doesn’t help me rewrite history). Chat is accessed through the / key (that is, the forward slash) or by pressing enter while not buzzed. For those who don’t like characters which aren’t in the alphabet, it’s also accessible through the letter C.
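
Put together, the dispatch looks something like this (the handler names are placeholders for whatever the real code calls them):

    // sketch of the keybindings described above
    function buzzOrStart() {} // placeholder handlers
    function next() {}
    function togglePause() {}
    function openChat() {}

    document.addEventListener('keydown', function (e) {
        if (/input|textarea/i.test(e.target.tagName)) return; // don't eat chat typing
        switch (e.key.toLowerCase()) {
            case ' ': buzzOrStart(); break;            // buzz, or start the game
            case 's': case 'n': case 'j': next(); break;
            case 'p': case 'r': togglePause(); break;  // P and R are equivalent
            case 'enter':                              // enter opens chat when not buzzed
            case '/': case 'c': openChat(); e.preventDefault(); break;
        }
    });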

Buttons

Yeah, buttons, they exist. But they’re only meant to be used on mobile. Seriously, use your keyboard.

Other things you can click on

There are a number of non-button things which can be clicked on as well.

The “breadcrumb”, as Bootstrap calls it, is the little row which precedes every question which includes the category, difficulty and tournament to which the question belongs. The one on the top isn’t clickable, but all the other ones are. Clicking on those expands the question readout which gets collapsed below them.

Within the breadcrumb is a slightly grayed out word “Report”, which can be clicked on to bring up a (currently non-functional) modal dialog to submit a question for manual review in case there was something wrong with the question. For instance, maybe the question was formatted wrong or truncated, or you’re exceedingly pedantic and think that there is a typo which hinders your ability to participate, or something.

Also, on the far right is a blue star which can be clicked on to “bookmark” a question. Right now, bookmarking does little other than filling in that blue star and preventing the question from getting deleted (questions that drop far enough down get deleted unless they’re bookmarked). However, in the future, it may be imaginable that the bookmarked questions would be added to some kind of interface which could be managed by the server.

The breadcrumb also reveals the answer to the question whenever the question is over.

Leaderboard/Statistics

In multiplayer mode, there is a leaderboard showing a ranking of all the users who have participated in the room. In single player, it’s just a single grid giving your statistics.

Multiplayer

It’s a grid, which is pretty cool. The first row holds the ranking and the score, the latter of which sits inside a colored bubble. The score is at the moment computed from the number of “early” answers (correct answers before the asterisk in applicable packets), the number of correct answers, and the number of interrupts. The current weights are 15 for an early answer, 10 for a correct answer and -5 for an interrupt (see the sketch just after this list). The colors of the bubble denote the online status of the player:

  • gray: the user is offline and has no connected sessions at that moment
  • orange: the user has not interacted with the game in the past 10 minutes, but has at least one session open
  • green: the user is online and has interacted with the room in the past 10 minutes
  • blue: this row corresponds to you
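
And here’s that scoring rule as a function (the weights come straight from the text above; I’m assuming the 10-point “correct” bucket counts non-early correct answers separately):

    // the current scoring weights, as described
    function score(stats) {
        return 15 * stats.early       // correct before the asterisk (power)
             + 10 * stats.correct     // correct after the power mark
             -  5 * stats.interrupts; // buzzed mid-question and missed
    }

    // e.g. score({early: 2, correct: 3, interrupts: 1}) === 55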

In the adjacent column is the user’s name, and there is another column for the number of interrupts. There aren’t any more columns because there isn’t much space.

Clicking on any of the rows brings up a popover which gives a more complete breakdown of the user and his or her performance. That dialog can be dismissed by clicking (or tapping, for touch devices) again on the same row.

Single Player/Offline

The single player mode has aesthetic similarities to the detailed popover of the multiplayer mode and this is hardly a coincidence. The default view for single player mode is just an abridged version of the popover’s view including only the score, the number correct, the number of interrupts and the total number of guesses. Clicking on the single player interface anywhere will reveal the more detailed interface (with purty sliding effects courtesy of jQuery).

On the single player header the “offline” badge lights up, well, quite obviously, when the user is offline.

It also has shiny transitions between single player and multiplayer modes.

Debugging

The debugging panel is the furthest down and takes the form of what looks like a table. The first two rows sound vaguely technical (with an underscore, too!): latency and sync_offset. And you’d be right; they’re both just little metrics which indicate the health of the network connection.

The titling of “Debugging” is a tad disingenuous; it’s mostly networking. But then again, networking is the vast majority of what can actually fail.

The very last item in the menu is a link that quite plainly says “Disconnect”, and opposite it is a piece of text representing the application cache’s status. Disconnect is nice for when the network’s flaky and you know your connection won’t last long anyway. Or if you’re, like, antisocial and need to make your own little corner to cry in. Or something like that. Sure. Okay. Next.

The Main Interface

I think I’m going to start over with this manual.

Blog Post

For the past few weeks, I’ve been working on a project which was tentatively titled “Protobowl”, short for “Prototype Quizbowl Application”. And that probably suffices as some sort of cursory description of what it is, it’s an app which tries to mimic the experience of a quiz bowl competition online. Most importantly, it was designed for multiplayer from day one.

Backstory

The github page for this project was created two weeks ago at time of writing, but the history of the project actually extends much further back. For the past two years, I’ve been a member of my school’s quiz bowl team; not a particularly illustrious team, but like any non-dysfunctional organization, we do have a sense of enthusiasm for the game. I’ve wanted to build something related to it for a while, but never found a sufficient impetus until late in May.

On May 25th of this year, I was introduced to the website quizbowldb.com, which has a pretty cool question reader. My friend, who belongs to a team which is significantly better (as an understatement), was using that tool to prepare for a tournament. I spent a few minutes looking at the site and felt mildly disappointed that multiplayer didn’t work, but at least I had found a niche worth catering to.

Bayesian Classifier

The first thing which needed to be done was getting some set of labeled questions for the database. I looked around and found a few sources, but most of them weren’t really large or labeled. I had finished the online AI class a few months earlier and felt like applying it in a fairly obvious scenario, so over the course of the next few hours I built a simple naive Bayesian classifier to assign categories to questions.

But in order to do that, I first had to have some manually labeled corpus. I certainly wasn’t up for the tedium, so I decided to pull some questions from the quizbowldb.com site to seed the algorithm. At that point in time, for some reason, I was using some preview release of Windows 8 and rather regrettably couldn’t use wget. But I coped by writing a short Python script (rather than the more reasonable solution of rebooting into another operating system) which would incrementally clone the site’s database.

Because of rate limiting or something to that effect, the script only pulled about 10 questions every 40 seconds, and given that there were 30,000 questions in the database, the process wasn’t terribly fast. I eventually booted back into Linux and rewrote the script so that it was resumable (in the event that it got interrupted for some reason during its scheduled week-long run).
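
The resumable version boiled down to checkpointing an offset to disk between batches, something like this sketch (the endpoint and page size are invented, and the original was Python rather than Node):

    // sketch of a resumable, rate-limited database clone
    var fs = require('fs');
    var STATE_FILE = 'progress.json';
    var offset = fs.existsSync(STATE_FILE)
        ? JSON.parse(fs.readFileSync(STATE_FILE, 'utf8')).offset : 0;

    function pullBatch() {
        // hypothetical endpoint; the real site's API looked nothing like this
        fetch('http://example.com/api/questions?offset=' + offset + '&limit=10')
            .then(function (res) { return res.json(); })
            .then(function (batch) {
                if (!batch.length) return console.log('done');
                fs.appendFileSync('questions.jsonl',
                    batch.map(function (q) { return JSON.stringify(q); }).join('\n') + '\n');
                offset += batch.length;
                // checkpoint, so an interrupted run picks up where it left off
                fs.writeFileSync(STATE_FILE, JSON.stringify({ offset: offset }));
                setTimeout(pullBatch, 40000); // stay under the ~10 questions per 40s pace
            });
    }
    pullBatch();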

However, by the next day, some number of questions had been downloaded, and that small set was enough to train the classifier. The first test was running the classifier on the question set itself and comparing how well it fared. The results were actually surprisingly good, and after carefully combing through the exceptions, it appeared that most of the errors stemmed from manual mislabeling in the original corpus.
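
The technique itself fits in a page; here’s a toy version of that kind of classifier (the general naive Bayes recipe, not the code I actually wrote):

    // toy naive Bayes text classifier: train on labeled questions,
    // then pick the category with the highest log-probability
    function tokenize(text) {
        return text.toLowerCase().match(/[a-z']+/g) || [];
    }

    function train(examples) { // examples: [{text: ..., category: ...}]
        var model = { totalDocs: 0, cats: {} };
        examples.forEach(function (ex) {
            var cat = model.cats[ex.category] =
                model.cats[ex.category] || { docs: 0, words: {}, wordTotal: 0 };
            cat.docs++;
            model.totalDocs++;
            tokenize(ex.text).forEach(function (w) {
                cat.words[w] = (cat.words[w] || 0) + 1;
                cat.wordTotal++;
            });
        });
        return model;
    }

    function classify(model, text) {
        var tokens = tokenize(text), best = null, bestScore = -Infinity;
        Object.keys(model.cats).forEach(function (name) {
            var cat = model.cats[name];
            var s = Math.log(cat.docs / model.totalDocs); // log prior
            tokens.forEach(function (w) {
                // crude add-one smoothing (a real version would add the
                // vocabulary size to the denominator)
                s += Math.log(((cat.words[w] || 0) + 1) / (cat.wordTotal + 1));
            });
            if (s > bestScore) { bestScore = s; best = name; }
        });
        return best;
    }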

After that initial successful result, I did a bit of work on a question extraction algorithm which would use the command-line version of AbiWord to convert the .doc and .docx files that packets are usually distributed as into some machine-readable format for feeding into the classifier. However, I never quite finished that, as school work caught up with me while I waited for the database to finish downloading.
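
For the curious, driving AbiWord from a script is just shelling out to its command-line interface; roughly this (assuming abiword is on the PATH and its --to flag behaves as documented):

    // sketch: convert a packet .doc/.docx to plain text via AbiWord's CLI
    var execFile = require('child_process').execFile;

    function docToText(path, done) {
        // `abiword --to=txt packet.doc` writes packet.txt next to the input
        execFile('abiword', ['--to=txt', path], function (err) {
            if (err) return done(err);
            done(null, path.replace(/\.docx?$/i, '.txt'));
        });
    }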

Hiatus

On June 7th, about two weeks after I built the first promising prototype of the question parser, I noticed that the background task had finished. Actually, it had finished something like a week earlier, but I hadn’t noticed. Lurking around Wikipedia, I discovered that Roger Craig, the famed Jeopardy! contestant who wrote a program to help himself study, had actually attended my high school.

But nothing really happened after that. The school year was nearing a close and the after school clubs were having their last meetings of the year. There was the standardized testing and the anxiety about the imminent end of the year, and I had other projects which seemed more enticing at the time. Narrower in scope and more doable in shorter periods of time. This app was always on the radar, it just gradually slipped further and further back.

It’s Ac Attack

I wasn’t the only person in my school’s quiz bowl team who was interested in pursuing the idea. In fact, it was Ben who actually started the app first. I didn’t write code for the actual project until two weeks ago.

Early in July (or possibly late in June), Ben wanted to build that quiz bowl application while I was still working on a few other projects as well as an internship. I gave him the big JSON file of questions, and he toiled away for the next few days, experimenting first with Ruby on Rails and later settling on Node.JS (and the accompanying popular stacks).

Within a few days, he had put out a pretty impressive application using Mongoose, Jade, Twitter Bootstrap, Express, and Node running on Heroku at the (now defunct) its-ac-attack URL. Soon afterwards search was fairly functional and I started to take an interest in the project again (having almost finished another project, though at time of writing, I have yet to begin writing the blog post for that project).

I felt like exploiting this opportunity to get somewhat acquainted with popular NodeJS frameworks and tools. I hadn’t done much with Node for over two years, and the landscape had changed considerably in the intervening time. It’s a fast-growing and developing community. However, I never really found interest in building the entire app. Managing users and doing search has always struck me as somewhat boring, in part because it’s hardly untreaded territory.

I wanted to skip straight into the low latency websocket-driven synchronized multiplayer. Ben was still acclimating to the multimodal (if I’m using these buzzwords right) paradigm of writing code meant to be executed on both the client and the server, so I decided to prototype a multiplayer environment.

Prototype Quizbowl Application

Chronologically, this places us at about two weeks ago. That’s actually quite confusing because I wrote that sentence a bit over a month ago, where “a month ago” means August 8th, and “now” means September 12th. That is, you can imagine the gap between the first sentence in this paragraph and the sentence immediately succeeding it as joined by that scene from Monty Python where the animator dies of some fatal heart attack, in which case that animator is me, and I just sort of gave up on writing the readme.

And just like that, in the blink of a sentence, it’s been another month.

Question Set Management via Github

Folder Structure:

qb/
    2013/
        ACF Nationals/
            Round 1
            Round 2
            Round 3
            Round 4

Here’s what the JSON is like from Mongo

{
    "__v": 0,
    "_id": { "$oid": "506e3e938b3d831d6a000001" },
    "answer": "The {Bronze Horseman}: A Petersburg Tale [accept The Copper Horseman or Mednyy vsadnik: Peterburgskaya povest’]",
    "category": "Literature",
    "difficulty": "Open",
    "fixed": 1,
    "inc_random": 53.99425264270977,
    "num": 1,
    "question": "",
    "round": "2011-ACFNationals-BrownFinal.doc",
    "seen": 36,
    "tournament": "ACF Nationals",
    "type": "qb",
    "year": 2011
}

The question thingy

###
tournament: ACF Nationals
year: 2011
###


---
difficulty: Open
num: 1
category: Literature
round: 2011-ACFNationals-BrownFinal.doc
answer: The {Bronze Horseman}: A Petersburg Tale [accept The Copper Horseman or Mednyy vsadnik: Peterburgskaya povest’]

In a footnote to this work, the author claims that Mickiewicz’s depiction of the same event as this work, while beautiful, was inaccurate. This poem opens with a man standing on the shore imagining a time when “all flags will visit us/ and we will celebrate in freedom,” after thinking that Nature intends for him to carve a window. This work contains a famous passage expressing the poet’s love for a city possessed of an “Admiralty needle,” and this poem’s protagonist is alleged to have the name he does because the poet is comfortable with it, a reference to that poet’s earlier work. The protagonist of this work sits on a marble sentry lion looking at his lover’s house instead of the title character behind him. After the death of his lover Parasha in a flood, this poem’s protagonist goes mad and wanders the streets before confronting the title figure. For ten points, identify this poem in which Yevgeniy imagines the title equestrian statue of Peter the Great chasing him through the streets of St. Petersburg, a poem by Alexander Pushkin.
---

Meta Analytics 17 August 2012

I’ve been maintaining this blog, or at least the content inside it, for about five years now. It’s been through a handful of incarnations, often paired with significant changes in web hosting. I’ve had a blog for a little bit longer, but I don’t think I have the medium figured out. The structure and style of the posts have changed over the past few years, but I can’t at this point call it evolution, a positive progression. Part of the power which lies in analyzing data is the ability to recognize patterns, often at a different scale from human observation (spans of months or years), which are equally if not more insightful.

That’s been my personal attraction to data science. I’ve had a couple of personal experiments involving collecting data about my daily activities, my old writing and code, in hopes of distilling the changes that I’m too conceited to admit without the infallible hand of statistics. For nearly two years now, I’ve logged my entire life to a precision of approximately 30 minutes in Google Calendar (or the Calendar app on iPad, which syncs to Google Calendar). Actually, that’s slightly off: I quite often dedicate large spans of time to more or less useless labels like “not productive”. But this temporal information falls apart in terms of richness, for my schedule is dictated more by the mandatory rhythms of school life than by the drifting cadence of other behavior.

But I digress. This isn’t about why I collect data so much as “I have this data, now what?”. In this case, I had a hypothesis, a rather simple albeit morbid one at that: “my blog is dying”. It’s not hard to see how I came to that conclusion. I’m pretty much struggling at this point to meet my goal of one post per month (itself not a particularly difficult goal, but as my posts have become more infrequent, I feel more compelled to write obscenely long posts to compensate, which of course leads to big posts sitting there unfinished for long durations, losing the sort of one post = one sitting mentality). But before I ramble for too long, I’ll cut to the chase and answer the question posed at the beginning of this paragraph: “Graphs.” (You could imagine those haunting glyphs levitating in midair, caught in the invisible grasp of Giorgio A. Tsoukalos, or better yet, I can spare your cognitive abilities by making it real.)

Here’s a pretty little graph I made in R (sorry for the mess on the horizontal axis; I just realized I have no idea how to interpret the dates, though I’m assuming they’re linear and it’s just some odd aliasing issue that makes even-numbered years repeat twice). It’s a histogram of the dates of posts that I’ve made to this blog (extracted with a simple Python script and Wordpress’s built-in Export button). You can probably tell that the blog’s demise has been quite a long time coming. Every annual peak ends up shallower than the last, and the first real gaps appeared this fateful year, 2012.

It’s actually sort of interesting that these peaks exist, but I can’t really tell during which months they happened (since these axes are labeled so terribly; it’d be nice if I knew some nice interactive graph engine that worked with histograms, something like that cool time series viewer Google has had on Finance for, like, ever, but for histograms. I guess that just shows how much of a non-scientist I am, to have no idea how to fluently articulate in a statistical or graphical language of my choice).

For more graph fun, here’s a scatter plot of word counts as a function of year. I wasn’t dedicated enough to figure out how to get NLTK to tell me the Gunning-Fog, Flesch-Kincaid or ARI value for individual posts, and I doubt that would have shown anything particularly insightful. But yeah, so here it is. Charts. Charts of words. Note that the thing sticking out, clocking in at around 3,724 words, is my first MusicAlpha post.

Actually, I do mind that Wordpress isn’t yet self-aware (’ello Skynet) and still sends trackbacks and pings (whatever they are) to me when I link to myself. Seriously, you don’t actually need a self-aware artificial intelligence in order to learn not to spam me with emails about something I’m quite probably, as in super definitely, aware of. But anyway, I guess I’ll stomach the lurching pain of a thousand emails (I’m using hyperbole here, in case your rudimentary artificial intelligence algorithms can’t quite distinguish it, but I’m also pretty sure your algorithms wouldn’t be able to handle n-th degrees of meta, so this excruciatingly useless parenthetical wouldn’t be much other than that: excruciatingly useless) and post the last part of the list here.

1340133957.0 , 2012-06-19 19:25:57 , 1178 [http://antimatter15.com/wp/2012/06/pinball/](http://antimatter15.com/wp/2012/06/pinball/)

1333025085.0 , 2012-03-29 12:44:45 , 1302 [http://antimatter15.com/wp/2012/03/musicalpha-v2-0/](http://antimatter15.com/wp/2012/03/musicalpha-v2-0/)

1293394934.0 , 2010-12-26 20:22:14 , 1409 [http://antimatter15.com/wp/2010/12/drag2up-v2-drag-and-drop-uploading-for-all-sites/](http://antimatter15.com/wp/2010/12/drag2up-v2-drag-and-drop-uploading-for-all-sites/)

1317686582.0 , 2011-10-04 00:03:02 , 1565 [Haven't actually published this yet, hmm]

1341591648.0 , 2012-07-06 16:20:48 , 2117 [http://antimatter15.com/wp/2012/07/cloudfall-a-text-editor/](http://antimatter15.com/wp/2012/07/cloudfall-a-text-editor/)

1307064165.0 , 2011-06-03 01:22:45 , 2180 [http://antimatter15.com/wp/2011/06/why-the-chrome-web-store-is-bad-for-the-web/](http://antimatter15.com/wp/2011/06/why-the-chrome-web-store-is-bad-for-the-web/)

1277922545.0 , 2010-06-30 18:29:05 , 2319 [http://antimatter15.com/wp/2010/06/wave-embed-api/](http://antimatter15.com/wp/2010/06/wave-embed-api/)

1294958307.0 , 2011-01-13 22:38:27 , 2762 [http://antimatter15.com/wp/2011/01/the-ambiguity-of-open-and-vp8-vs-h-264/](http://antimatter15.com/wp/2011/01/the-ambiguity-of-open-and-vp8-vs-h-264/)

1308832860.0 , 2011-06-23 12:41:00 , 2872 [http://antimatter15.com/wp/2011/06/samsung-series-5-chromebook/](http://antimatter15.com/wp/2011/06/samsung-series-5-chromebook/)

1305426252.0 , 2011-05-15 02:24:12 , 3724 [http://antimatter15.com/wp/2011/05/uploading-mp3s-to-google-music-beta-from-linux-chrome-os-win-and-mac/](http://antimatter15.com/wp/2011/05/uploading-mp3s-to-google-music-beta-from-linux-chrome-os-win-and-mac/)

That list was compiled by the command cat blogtimes.csv | sort -t',' -k3n | tail, and that’s quite an accomplishment because I had to look up the arguments for the sort command in order to figure it out. Of course, blogtimes.csv is the output of my magical six-line Python script (which uses BeautifulSoup to extract all the wp:post_dates).

So, of the 10 blog posts in that list, 8 were published after 2011 and 3 of them in 2012. Considering that there were 10 things published in 2012 (according to my dataset) and 21 in 2011, that’s a rather significant fraction of recent writing that has turned out insanely long.

Wordpress tells me this post is now at 948 words, so I guess I’ll add a bit of concluding at the end to push it over the magical power-of-ten barrier, so presumably you should brace for the terrible boom which occurs at this point (oh, what’s that? I think that’s my imaginary telephone operator who informs me when I make a factual error, apparently those kinds of booms only happen with waves, and apparently words flowing through word count orders of magnitude don’t count).

The original title of this post was “Meta Analytics & Upcoming Changes”, but in the spirit of the upcoming changes, I’ve moved the “Upcoming Changes” part into its own post (tentatively titled “Upcoming Changes”). You can probably at this point guess that “Upcoming Changes” involves something to tackle the excessive verbosity and to mitigate the absurdly infrequent posts. This probably doesn’t sound nearly as heroic to you as it does to me, because I’m listening to The Avengers soundtrack right now, and “A Promise” is pretty dramatic.


Chrome Multitask Mode for Real with Multi-Pointer Xorg 04 April 2012

Google’s multitask mode was only an April Fools’ joke, but the inordinately amused among us (read: me) might even venture into attempting such an irrational feat. Watching the video evoked some memories of something I had stumbled across a few years ago, called Multi-Pointer X. The rationale for creating it wasn’t nearly as insane as the Chrome Multitask video made it out to appear; I think it was just the framework to support multitouch and other sorts of alluring interactions on the Linux platform.

So I decided to look into it again, and within a few minutes of Googling, it appeared that Multi-Pointer X had been incorporated into the X.org mainline as part of XInput2. So what exactly does it take to conjure and insult the multitasking gods?

The first step is quite logically to plug in another mouse. I happen to have three mice plugged into my computer anyway (there’s a perfectly logical rationale for how this came to be: my keyboard is some wireless Microsoft keyboard+mouse set, but I really hate the scroll wheel on the Microsoft mouse, so I used another mouse for the longest time until it started failing, and sheer lethargy prevailed over disconnecting the obsoleted peripheral), but regrettably, I don’t have an additional prehensile appendage to operate the third mouse.

Right now I’m using a pretty much unmodified version of the latest Ubuntu, 11.10. I just opened up a terminal window and typed in “xinput list”:

antimatter15@antimatter15-desktop:~/online$ xinput list
⎡ Virtual core pointer id=2 [master pointer (3)]
⎜ ↳ Virtual core XTEST pointer id=4 [slave pointer (2)]
⎜ ↳ A4Tech PS/2+USB Mouse id=8 [slave pointer (2)]
⎜ ↳ Microsft Microsoft Wireless Optical Desktop® 2.20 id=11 [slave pointer (2)]
⎜ ↳ USB Laser Wheel Mouse id=9 [slave pointer (2)]
⎣ Virtual core keyboard id=3 [master keyboard (2)]
↳ Virtual core XTEST keyboard id=5 [slave keyboard (3)]
↳ Power Button id=6 [slave keyboard (3)]
↳ Power Button id=7 [slave keyboard (3)]
↳ Microsft Microsoft Wireless Optical Desktop® 2.20 id=10 [slave keyboard (3)]

Use the command “xinput create-master Auxiliary” in order to create a new input device, and try running “xinput list” again to confirm that the command has done your bidding.

antimatter15@antimatter15-desktop:~/online$ xinput create-master Auxiliary

Now it’s time for the third (or is it fourth?) and final step: reattaching one of the slave pointers to the auxiliary master pointer. To do that, pick out some mouse. I picked my “A4Tech PS/2+USB Mouse”, which was the mouse with the broken scroll wheel. You can see that it’s been given id=8, so that’s the number I’ll be using for the next step.

antimatter15@antimatter15-desktop:~/online$ xinput reattach 8 "Auxiliary pointer"

And then I can now use two mice at the same time for whatever ungodly purpose I desire.

It’s hard to take screenshots, since every screenshotting or screencasting app I can find makes the wholly unwarranted assumption that a desktop computer only has one cursor, but it does actually work. Though I don’t think I’ll ever be able to multitask through this scheme; it’s mentally jarring in far too many ways.


MusicAlpha v2.0 29 March 2012

MusicAlpha works again, and this wasn’t some minor update that fixed a small hole, the entire app has been rewritten. There’s a new user interface which has really quite beautiful animated polar progress bars, but the most significant part is that the mystical protobufs-over-https protocol has been decoded and documented.

It really started on January 30th, when Simon Weber contacted me for information on the upload process because his Unofficial-Google-Music-API had received various user requests to implement uploading. I remembered a message sent a little while earlier offering help decoding the protobufs and tried out the method described, which involved using a disassembler to patch the curl_easy_setopt method to disable checking of SSL certificates. This turned out to be the trick, and I used Fiddler to collect a few dumps of data for subsequent analysis. That was really quite a magical moment, because everything afterwards was relatively straightforward. The huge stumbling block which had prohibited progress for the past months had been lifted, and the task which was left, of reverse engineering the protocol, seemed eminently viable. With only minor hiccups over the next few days, I built a Python prototype of the program. I created a .proto file which was somewhat human readable, with field names based on guesses. We realized some pretty cool things, or at least got a much better understanding of how Google Music seems to work. (This blog post was started in the early part of February, but I’m finishing it up halfway through March.)

For instance, the reason Music Manager doesn’t work on virtual machines is that it uses the computer’s MAC address as a system for identifying individual devices. There’s a cap on the number of devices which can be authorized to manage your music, currently set to 10. This caused a bit of confusion when I hit that device cap while testing (at one point I assumed that the MAC-esque field was actually a random string, so after fifteen or so tests everything started silently failing).

However, it occurs to me that it isn’t particularly interesting to read the narrative of how exactly the current version of the application came to exist. The technical overview of how the actual protocol works can be found on the google-music-protocol project page on Github. It includes a brief overview of the mechanism, sections which detail the specific messages sent to the server, and a simple reference implementation. Hopefully it’s enough information for anyone who is actually interested in doing something related to it (though I am curious whether anyone could create something interesting out of it; please contact me if you decide to do anything at all with it).

So at this point, I’d like to skip over to the application-specific aspect: MusicAlpha itself. This is the story of how the rewrite came to be, and of the rather lengthy time it took to write this blog post describing it.

One of the first things I realized was that the entire application would essentially need to be rewritten. The code dealt primarily with altering the attributes of a song by mimicking the web client’s interface, something which wouldn’t be very useful now that all the metadata is sent via the magical protobufs scheme. Since that code gets scrapped, I figured why not scrap the rest. I wrote the first version while experimenting with darker color schemes and obscenely prominent border radii and gradients. I’m not sure that I can say it’s objectively better, but it does look a bit more interesting, especially that big two-ringed polar chart.

There’s some text on the right side, but it’s not particularly of interest. There’s a big file selection button, but that’s pretty uninteresting. Embroidered on the solid gray backdrop is the logo, a simple music symbol, unchanged from the last version. I used, for the first time, custom web typography: just some fancy-looking sans-serif fonts I picked from Google’s Font Library. The old application had two progress bars to show upload progress, one for individual songs and the other for overall progress. I wanted to portray upload progress in a manner which was somewhat smoother and more aesthetically alluring than a mere linear bar.

In trying to create that better progress bar, I created a polar progress bar in canvas. All updates are smoothly interpolated, and watching the bars spin around and invert is somewhat reminiscent of staring at an old record spin on a turntable, something which I’ve only seen a few times in my (rather short, at this moment) life, but magical enough to still inspire a sense of nostalgia.

Central to making an aesthetically appealing polar chart is making the animation smooth. This might be doable through various esoteric CSS transforms, but I’m not particularly knowledgeable in that area, and I’m not a huge fan of crafting elaborate animations with CSS. It lacks a certain amount of control (which manually updating and redrawing affords) and tends to feel extremely sluggish on computers which lack modern graphics cards (such as mine). Since things like uploading a file involve a lot of guesswork, and various aspects of the upload process are very much stochastic (i.e., waiting for the server to respond), the smoothing system needs to be able to create a nice smooth animation out of highly erratic information, while simultaneously relaying that information without an excessive amount of lag.

In order to strike a sort of balance, it uses an interpolation algorithm which was employed in Steve Wittens’ js1k entry, and you can see my code here. To minimize the CPU tax of running such an animation, it uses the HTML5 requestAnimationFrame function. This has the nice side effect that when you leave it uploading in the background for a while, it automatically stops updating the animation and quickly slides into position when you return.

There are two rings, the outer one dark gray and the inner one a somewhat lighter shade of gray (fitting the mostly unsaturated theme of the interface). The outer ring represents the total upload progress, while the inner ring represents the progress of an individual file. The outer ring “spins”, in a sense, in a clockwise manner, while the inner ring rotates in the opposite direction. Every time a ring completes a revolution, it “inverts”: for instance, when the outer ring fills up and becomes a solid circle, the ring begins to open from the opposite direction so that the imaginary end of the rotation continues unimpeded. The animation is somewhat difficult to describe in words, so here’s a demo of the progress animation.
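
In code, the heart of it is just an eased value driving a canvas arc whose direction flips each revolution. A condensed sketch (simple exponential easing stands in here for the fancier js1k interpolation):

    // sketch of the smoothed, self-inverting polar ring
    var ctx = document.querySelector('canvas').getContext('2d');
    var target = 0; // e.g. filesDone + currentFileFraction; jumps erratically
    var shown = 0;  // what we actually draw, eased toward the target

    function frame() {
        shown += (target - shown) * 0.1; // ease toward the real value
        var frac = shown % 1;                 // fraction of the current revolution
        var inverted = Math.floor(shown) % 2; // odd revolutions "open" instead
        ctx.clearRect(0, 0, 300, 300);
        ctx.beginPath();
        if (inverted) {
            ctx.arc(150, 150, 100, frac * 2 * Math.PI, 2 * Math.PI);
        } else {
            ctx.arc(150, 150, 100, 0, frac * 2 * Math.PI);
        }
        ctx.lineWidth = 20;
        ctx.stroke();
        requestAnimationFrame(frame); // stops updating when the tab is backgrounded
    }
    requestAnimationFrame(frame);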

When the page first loads, the polar plot exists in the background, behind the prominent logo. It represents the song quota, the percentage out of the 20,000 song quota which Google Music imposes.

While implementing the algorithm, I had a little trouble because of SSL certificates. The domain which accepts all the protobufs encoded information has a certificate signed for the wrong domain, and browsers have a tendency to care a lot about certificate validity. To get around it, those requests are proxied by a simple server which is made with Google App Engine. All communication with that server itself is signed as well, and it doesn’t do any logging.

When I was about to publish this post and first announced that it works again, there was a bit of trouble which people had getting it to work. This was actually because Google Music allows each MAC address to be associated with a few accounts but forbids any additional ones. The first released version used a single MAC address for all clients, but it has since been replaced with a sort of unique identifier which is stored in localStorage.
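
The replacement identifier is basically a random MAC-shaped string minted once per browser and then reused; something like this (the storage key is my guess):

    // sketch: a persistent, random, MAC-address-shaped identifier
    function getUploaderId() {
        var id = localStorage.getItem('uploader_id');
        if (!id) {
            var parts = [];
            for (var i = 0; i < 6; i++) {
                parts.push(('0' + Math.floor(Math.random() * 256).toString(16)).slice(-2));
            }
            id = parts.join(':');
            localStorage.setItem('uploader_id', id);
        }
        return id;
    }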

Apparently a browser based uploader may be something coming soon to Google Music. Enjoy this app while it lasts.



Google Circles was Launched on Tau Day 13 January 2012

I realized this a while ago, on June 29th, and started this blog post but never got around to filling it with content; still, I’m pretty sure the title can stand on its own. For those unfamiliar with tau, there’s this movement that says pi is a bad circle constant: since more often than not, mathematical equations use 2pi rather than pi, for purity’s sake that should be the value of the circle constant.

Google+ was launched on June 28, 2011 (or at least that was the day the limited field testing began, but I was part of that, so that’s the date I consider). In other words, 6/28, or 6.28.


Surplus 4 29 October 2011

Recently, Surplus stopped working. Well, it hadn’t been working for a lot of people for a long time already, but that’s beside the point: it stopped working entirely. Surplus has always been a rather fragile creature. It operates like a kid on a high-speed scooter attempting to carry a house of cards between two strangers. That house of cards is part of a delicate system of frames inside frames inside frames inside frames that move between frames. Surplus is a fairly atrocious mess of frames.

Framing things works out fine until you discover that whatever you’re framing is trying to break out. Meet the X-Frame-Options header, the source of Surplus’s recent predicament. It has well-meaning motives: to prevent Google from suffering from evil attacks like clickjacking, XSRF and other nasty things. Incidentally, security-wise, Surplus would probably belong closer to something of that nature than to a legitimate application. It doesn’t use an API, because applications generally wouldn’t find what it does useful.

Recently, all Google properties started including that X-Frame-Options header, and now they can’t be embedded in frames. It wasn’t an absolutely unprecedented move: just a few weeks earlier, Google Video had started sending out the header (which led to an update that moved Surplus away from a Google Video host frame). But now it was across all Google sites, and there was no short-term hack to be done.

The solution was to take a random Google page which didn’t send out the header and mimic all the postMessage messages that are sent from the Google+ notifications frame. Consequently, the entire frame signaling and attachment system had to be rewritten, and that system was so deeply tied into everything else that Surplus 4 ended up being almost an entire rewrite (the inner frame actions, the options page and the notifications parser did not change).
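
The general shape of that workaround, heavily simplified (the message formats here are invented; the real Google+ frame protocol was much messier):

    // sketch: frame a Google page that still allows embedding, then mimic
    // the postMessage handshake its host frame would normally perform
    var frame = document.createElement('iframe');
    frame.src = 'https://www.google.com/...'; // some page without X-Frame-Options
    document.body.appendChild(frame);

    window.addEventListener('message', function (e) {
        if (e.origin !== 'https://www.google.com') return; // trust only that frame
        console.log('notification payload', e.data);
    });

    frame.addEventListener('load', function () {
        // send the same messages the real host frame would send (invented here)
        frame.contentWindow.postMessage('handshake', 'https://www.google.com');
    });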

https://chrome.google.com/webstore/detail/pfphgaimeghgekhncbkfblhdhfaiaipf


Surplus 19 August 2011

In a continuation of my rather unhelpful habit of documenting my activities on this blog long after you probably already know about them, I guess it’s time for me to discuss Surplus, my wildly popular (at time of writing) Chrome extension which integrates Google+ notifications into Chrome.

Even more impressive, the name, which is a fairly common word, is actually on the first page of a Google search for that word (around the eighth result). It peaked at around 53,000 users and at one point made me the 329th most followed person on Google+.



Google+ Profile Suspended 24 July 2011

My Google+ profile was suspended on grounds that were not wholly unexpected, but still shocking. I’m likely violating many of their conditions, as I’m 16 (under 18 at time of writing), though when my Google Account was registered, I probably lied and said that I was 30 (at that time, when I was 11, I tended to do that with all services to evade the nonsense COPPA restrictions on Neopets). I’m not really the victim here (though I’d like to pretend that I’m pretending that I am, so I have an ostensible excuse for narcissism without admitting it by admitting it). I hardly ever share anything on Google+, so I lose nothing except the ability to make an occasional uninsightful post and the ability to brag about my 3.4k+ followers.

I assume it’s the fault of every social network to obsess over its user base, to define a subset of humanity and designate those as members. To praise individualism, differences and variance, yet rely so heavily on the homogeneity and uniformity of the users. A network cannot be inclusive and exclusive at the same time; to assume that’s possible is the definition of insanity.

It’s a pity how a minor issue can inflate so much. A minor annoyance in the legalese (which every average, legally inept and time-starved human skips) that we all assumed Google would turn a blind eye to in order to focus on user adoption. “Use the name which people know you by”. It’s a startlingly obvious notion for social networks: if a software system is attempting to duplicate the nuanced structure of physical and digital relationships in a unified and coherent implementation, it makes sense to retain the basic handles which we rely upon.

But rather than draw upon this logic, that sentence is followed by the possibly inconsistent idea to “Use your real name”. What if your “real name” isn’t what you’re known as? What if you don’t feel comfortable revealing this information? This attempt at recognizing nuance fails blatantly at the most basic principle of consistency and disregards the considerations of a rather significant portion of web users.

Some say that this principle is good, that it filters out those who should not belong in the network anyway. That those who tend to use pseudonyms do not belong on a good-faith network which depends on the cooperation and trust of every individual. That pseudonyms are employed by criminals and those with malignant intent, and therefore all precautions must be taken to eliminate that risk.

A network has a different role in dealing with its users than a specific group does. There’s a rather interesting notion presented by Google’s terming of the rules as “Community Standards”: the idea that everyone who uses Google+ is part of some sort of community. Not a loosely associated network of individuals connected through transparent circles, but a community, as if everyone had to occasionally encounter and take offense to each other.

Postel’s law, the idea that a software framework should be liberal in what it accepts and conservative in what it sends, should apply to the massive-social-network hopeful known as Google+. At the “web scale” Google operates at, it should be clear that humans often have very few attributes in common. The slightest interpersonal misalignments turn into gaping canyons at the magnitude of billions.

Pseudonyms mean, quite literally, fake names. Now, why would someone want a fake name? The simplest answer, which by no means encompasses all or even most uses, is that someone has something to hide. As the Voldemort-controlled Ministry of Magic leader said in Harry Potter and the Deathly Hallows: Part 1 (the movie), “You have nothing to be afraid of if you have nothing to hide”, which I found shockingly similar to Eric Schmidt’s “If you have something you don’t want shared online, maybe you shouldn’t be doing it in the first place”.

Does pseudonymity necessarily produce a culture incompatible with civilized society? I would say no, and even if the answer were yes, it wouldn’t matter. The internet is a big and diverse place, and it doesn’t matter if people with questionable motives exist, as they aren’t forcing their ideas on others.

In Romeo and Juliet, Shakespeare (through Juliet) asks “What’s in a name?”, expressing the notion that a name is a meaningless and artificial convention which only serves to create an equally artificial and antagonistic gap between the protagonists. We can choose to subscribe to these existing and equally artificial distinctions, and we should be able to choose not to.


Chrome OS, the delusion of native, and perfection 22 May 2011

Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.

Antoine de Saint-Exupery

Warning: This is half a post about how I really want Google to send me a Cr-48, and half a product of my inability to work on an essay at three in the morning. Also, parts of this post were written on a Logitech Revue Google TV, which was just sent to me today. So, as this post is pretty broad and tangential anyway, I’ll cram a concise review of the Google TV system in here as well. The initial impression isn’t that great. Setup is longer than it should be, and the immediate uses aren’t really clear. Initially, it seems like an unnecessarily complex yet oversimplified (somewhat an oxymoron, but that’s what Google TV is) media center PC. But after playing with it for a few hours, it does seem much nicer. The keyboard is nice and the touchpad is surprisingly usable (my only gripe is that the click button is too close to the back button). Finding videos and watching them is quite nice, the Twitter client is surprisingly usable, and the browser is great most of the time.

The delusion of native

Critics of Chrome OS often say that Chrome OS is useless for its lack of “native” applications. But “native” is really just another word that has lost its intrinsic meaning, like HTML5, Ajax and Web 2.0. Most people consider Android’s App Marketplace apps “native”, despite the fact that they’re Dalvik VM bytecode and not raw binary ARM code. Even in iOS, a native app usually refers not to the binary that comes out of Xcode, but to the application’s usage of the Cocoa Touch graphics framework. The only case where the word “native” poses any real meaning is in gaming, where frames need to be drawn as quickly as possible and the core loops are often hand-written, optimized assembly. But iOS, Android and Chrome OS support this in the form of the NDK and Native Client. That leaves a single primary connotation of native applications: a consistent interface look and feel.

The web as a platform is relatively low level. The only user interface widgets provided are form components such as checkboxes, buttons and text fields. This low-level interface forces people, for the most part, to implement their own extended widget sets, and in many cases, to reimplement the native controls to introduce consistency across browsers.


MusicAlpha Upload to Google Music Beta from Linux and Chrome OS 14 May 2011

Rather pleasant update: This app works again. After some epic hackery, it now works again. Install it now, in its fully functional, redesigned glory.


Rather depressing update: This sadly no longer works. It was fun and somewhat useful while it lasted. I have no idea why it doesn’t work, and I would really appreciate it if someone could find out why it fails. However, it might be a good idea to install it anyway, I guess.



Google Music Beta is a pretty cool web app, but the Music Manager sadly only works on Windows and Mac. As a Linux user who unfortunately feels neglected by the service, but still appreciative of being invited a mere day after its announcement, I decided to do something to remedy the situation. Not to mention the irony that one can’t even upload music to the service from Google’s own Chrome OS. So MusicAlpha was born.

This project was a pretty interesting hack, so I guess I’ll try to document the process of how I made it and how it currently works. And also, for any prospective filmmakers, this story might make for an interesting abstract action movie. But for any of you who just want to install the app, click here.

Inception

I started this only days after finishing the new revision of Cloud Save, the extension that I never quite understood, but everyone else seemed to get. One part of the newest revision was adding support for Amazon’s Cloud Drive service, which does not have an established API. However, the interactions between the JavaScript web interface and the server are pretty simple to understand, thanks to the almighty web inspector. The only weird thing was that the file upload itself was conducted through Flash, which felt unnecessary but wasn’t an insurmountable hurdle. But it was all built on carefully learning the ways of the web client.

There was one feature request for Cloud Save to enable saving to Google Music, and that’s when the idea sort of started.

The first step was to get a Google Music account, nothing surprising there. A few hours after the announcement was the first opportunity for me to access a computer, and also when I hit the “request invite” button. In a shockingly short amount of time of a mere day, I received a nice email saying that I had been invited to the service. Yay. Now the letdown of not being able to upload from Linux begins, as well as the scheming to reverse engineer it.

A few months ago, I had the idea to do something similar, but with another Google product: Google Goggles. But after hooking up the Android emulator with an HTTP proxy (Charles) and looking at the results, I was rather dismayed, though not particularly surprised: all the communication was encoded using Protobuf. Protobuf, or Protocol Buffers, is “Google’s data interchange format”, according to its Google Code project page. It’s a structured system for encoding binary data where a .proto template is already known and compiled. Protobufs are bad in much the same way minification is. There are totally justifiable reasons to minify source code, but the elimination of a useful View Source detracts from the open ideal of the web. Though protocol buffers aren’t encrypted, they are not comprehensible without a .proto file to decode them. With a lot of work, one might be able to reverse engineer a .proto file, but it’s certainly much harder than with a JSON protocol.

Before I began, this was my worst fear: that everything was encoded with protocol buffers, and my attempts to mimic it would be rendered futile.

Packets

I borrowed a computer which ran Windows 7 and installed the Music Manager in hopes of deciphering the secret enigma of the great Google. I needed a simple packet sniffer, so I used NirSoft’s SmartSniff. I didn’t use Wireshark because it wasn’t my computer, and I didn’t want to download and install a huge app with an intimidating user interface. SmartSniff worked pretty well, and I thought I was almost there. Packet sniffers generally can’t decode HTTPS data without some MITM-esque certificate faking, and I have reason to believe that wouldn’t have worked either, since running the strings command on MusicManager.exe turns up something that resembles a public key.

Anyway, I found two raw unencrypted HTTP requests, and sort of fixated myself on those two.

POST /uploadsj/rupio HTTP/1.1
User-Agent: Music Manager (1, 0, 12, 3443 - Windows)
Host: uploadsj.clients.google.com
Accept: */*
Cookie: SID=[redacted] domain=.google.com path=/
Expect: None
Content-Length: 851
Content-Type: application/x-www-form-urlencoded

{"clientId":"Jumper Uploader","createSessionRequest":{"fields":[
{"inlined":{"content":"jumper-uploader-title-19Redacted","contentType":"text/plain","name":"title"}},
{"external":{"filename":"Redacted.mp3","name":"C:\Users\Redacted.mp3","put":{},"size":610Redacted}},
{"inlined":{"content":"0","name":"AlbumArtLength"}},
{"inlined":{"content":"0","name":"AlbumArtStart"}},
{"inlined":{"content":"3rBe9dCa%c2tBeJdD","name":"ClientId"}},
{"inlined":{"content":"00:1F:R3:DA:C7:ED","name":"MachineIdentifier"}},
{"inlined":{"content":"5a5de21sd-54dsfe-Redacted","name":"ServerId"}},
{"inlined":{"content":"true","name":"SyncNow"}},
{"inlined":{"content":"153","name":"TrackBitRate"}},
{"inlined":{"content":"false","name":"TrackDoNotRematch"}}
]},"protocolVersion":"0.8"}

I noticed the second request was rather huge: 3556KB. Just enough to fit a normal-sized MP3. I knew what was in there.

PUT /uploadsj/rupio?upload_id=REDACTED&file_id=000 HTTP/1.1
User-Agent: Music Manager (1, 0, 12, 3443 - Windows)
Host: uploadsj.clients.google.com
Accept: */*
Cookie: SID=REDACTED domain=.google.com path=/
Content-Type: audio/mpeg
Expect: None
Content-Length: 3141REDACTED

ID3…..%vTENC….@..WXXX……..TIT2……. Redacted

At this point, I was relatively ecstatic. Or at least, I will pretend that I was, because everything looked elegantly and remarkably simple. I had no idea where the upload_id came from, but I figured that it was part of the response to the first POST request. Part of the problem was that SmartSniff didn’t give me the actual responses to the HTTP requests, only the request data. Or at least it didn’t in the five minutes that I bothered using it. The cookie looked like a generic cookie that would exist in a normal browser session.

Prototype

Since it used generic browser cookies, the simplest way to start would be to make a browser extension. Specifically, a Chrome App. That way, I could get the necessary permissions to commit all the cross-domain security protocol violations and probe around a bunch of requests. Since XHRs already include cookies for a specific domain, there was nothing I needed to do to set the cookies. I hoped and suspected that the User-Agent, Accept, and Expect headers were basically ignored, and the Content-Type would be rather trivial to set.

A lesson in premature optimization was the huge block of JSON which got sent with the first POST request. I had no idea what was really essential, so I just deleted almost everything there except for what I thought might be really important: the file name and size. There were no errors in the POST request, so I figured it was okay to delete all of that stuff. I ran the POST and, just as I suspected, the JSON that resulted included the upload_id; in fact, it included the entire PUT URL, which was pretty nice.

{"sessionStatus":{"state":"OPEN","externalFieldTransfers":[{"name":"REDACTED.mp3","status":"IN_PROGRESS","bytesTransferred":0,"bytesTotal":314REDACTED,"putInfo":{"url":"http://uploadsj.clients.google.com/uploadsj/rupio?upload_id=AEdREDACTED&file_id=000"},"content_type":"audio/mpeg"}],"upload_id":"AEdREDACTED"}}

Then I executed the subsequent PUT request. I was scared.

{"errorMessage":{"reason":"REQUEST_REJECTED","additionalInfo":{"uploader_service.GoogleRupioAdditionalInfo":{"completionInfo":{"status":"REJECTED","customerSpecificInfo":{"ResponseCode":404}},"requestRejectedInfo":{"reasonDescription":"agent_rejected"}}},"upload_id":"AEdREDACTED"}}

Scary. The rejection reason was “agent_rejected”, which made me think about the User-Agent header, and I wondered if that was supposed to matter. If it did matter, then I would have to prototype in a different language, since XMLHttpRequests forbid setting the User-Agent and certain other headers for security reasons (even when operating in a privileged environment!).

But thankfully, before attempting the drastic route, I pasted back in the alleged cruft: the jumper-uploader-title, TrackDoNotRematch, TrackBitRate, SyncNow, ServerId, MachineIdentifier, ClientId, AlbumArtStart (hey, that rhymes!), and AlbumArtLength. Magically, now it worked. Yay.
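Pieced together, the whole dance looks something like the following sketch. The endpoint, headers, and JSON shape come from the sniffed traffic above; the function, its arguments, and the trimmed-down field list are illustrative guesses on my part, not Google’s documented API (there is none). Inside an extension with the right permissions, the SID cookie rides along automatically.

    // Sketch: create an upload session, then PUT the raw MP3 bytes.
    function uploadSong(file, serverId, done) {
      var session = {
        clientId: 'Jumper Uploader',
        createSessionRequest: {fields: [
          {external: {filename: file.name, name: file.name, put: {}, size: file.size}},
          {inlined: {content: serverId, name: 'ServerId'}},
          {inlined: {content: 'true', name: 'SyncNow'}}
        ]},
        protocolVersion: '0.8'
      };
      var post = new XMLHttpRequest();
      post.open('POST', 'http://uploadsj.clients.google.com/uploadsj/rupio', true);
      post.onload = function () {
        // the response hands back the entire PUT url, upload_id included
        var putUrl = JSON.parse(post.responseText)
          .sessionStatus.externalFieldTransfers[0].putInfo.url;
        var put = new XMLHttpRequest();
        put.open('PUT', putUrl, true);
        put.setRequestHeader('Content-Type', 'audio/mpeg');
        put.onload = function () { done(put.responseText); };
        put.send(file); // the raw MP3 bytes
      };
      post.send(JSON.stringify(session));
    }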

Rupio

During my quest to understand the bizarre agent_rejected error (because I totally fear rejection, they should have used a nicer word), I tried googling GoogleRupioAdditionalInfo. (I had googled “Rupio” before, since that was part of the POST URL, but that didn’t yield any relevant results; it just ended up being a bunch of people’s names.) Searching GoogleRupioAdditionalInfo yielded a more limited subset of results which happened to be more interesting.

The first result, from the SMEStorage Blog, described an error emitted by the Google Docs platform. The next result was on the Picasa help forums, about “Upload video results in Error”. Then came a French-language forum which mentioned an error including GoogleRupioAdditionalInfo, this time on the URL upload.youtube.com.

So, it seems “Rupio” is the codename (or just the name, but it’s much more fun imagining that there are subliminal codes everywhere) of a unified Google data upload/storage platform. That’s pretty cool. It probably makes sense that they have a sort of unified file storage system across Google Docs, Picasa, and YouTube, since it would probably be easier to maintain a system which is internally consistent. But to inject a sensationalist twist where Occam’s razor shows it’s not justified: this is a hint at an upcoming unified Google file storage service. A sort of file-browser dashboard which links all media and document files together in an accessible and uniform manner.

Shiny

It worked. Or at least, it seemed to work. Vaguely. At this point, I was reasonably satisfied and began sculpting a nice pretty interface for it. Following my usual way, I just stole the Tango icon for an audio file MIME type and built the entire interface around it. Earlier that day, I had discovered something called Layer Styles, a Photoshop-style gradient editor that generates CSS3 gradients instead. I thought that was cool, so I played around with it and made a centered rectangle. Gray on gray.

I decided on a name for it: MusicAlpha. It’s a sort of play on how this Google product features “Beta” much more prominently than usual, with the gray “beta” the same size as the actual “music”. The “Alpha” says something about how it will almost eternally be stuck in this unstable and incomplete form. It also reminded me of one of my favorite websites, Wolfram Alpha, and how the logo is stylized WolframAlpha or Wolfram|Alpha.

I wanted it to be minimal, but I’ve since adopted the belief that drag-and-drop file selection was only a fad. Sure, it’s often really useful, but it’s not great to restrict the interface to that. A hybrid approach is much better, and if you have to have only one, then the standard browse button is better, since you can drag and drop files onto that button (on Chrome, anyway) without any special code. Drag and drop is often quite terrible on laptops, and I’m not sure you can even drag and drop files with Chrome OS, since the file browser might be modal.

There’s not much to the interface, it’s probably as minimal as it can get. Two links, one to the service, the other to me. A button. That’s it.

Dismay

At some point in every good story (not implying that this is a good story), there’s false hope, where the protagonist (not implying that I’m the protagonist here, either) believes that he or she (not implying that I’m a she) is closer to the end than is factually warranted (I’m actually implying that that’s exactly what happened). I checked the Google Music song list, and lo and behold, the song was there.

Sort of. Not really. Something was. Not sure it was a song.

There was a blank row. Double click it, and something does play. And it is the song. But there’s no way to seek, because it doesn’t know how long the track is, so there’s no way to render the position in the song.

It was late, and this was frustrating, so I just posted a terse blog post announcing the beginning of the project:

> So pretty soon, I hacked together something that almost sort of worked, with one rather significant caveat: It doesn’t pick up any tag data, name, time, artist, album, etc. I can’t figure out why. I guess I’ll try some more tomorrow.

And by the way, I totally lied. I didn’t try at all the next day.

Yesterday, that is, May 13th, the delightful Friday the Thirteenth, I tried again. I installed Music Manager on a VMware installation of Windows XP and sniffed the traffic with Wireshark running on the host computer. I didn’t notice anything new in the plethora of data which happened to get exchanged. I tried running SmartSniff from the VM, but every time I tried to start sniffing, VMware magically crashed.

SkyJam

At this point, I thought of looking at the MusicManager.exe binary. Usually, binaries have some random strings in them, which you can look at for some hint at what the program does without actually decompiling it. I’m now going to anachronistically mention something about Android, because of a certain URL that I found out about later in this chronology: android.clients.google.com. So I ran strings MusicManager.exe | grep android, and found some rather interesting things:

2-.wireless_android_skyjam.CopyrightStatus.Type
2".wireless_android_skyjam.ProductId
wireless_android_skyjam.Uits
Metadata.ParentalAdvisoryType

There appear to be 214 mentions of the phrase “skyjam” in the executable, which I assume is not some new solar powered Google sandwich filling distribution mechanism powered by artificial intelligence driven aeronautically mobile condiment dispensers (obviously using Arduinos).

I have no idea what Skyjam is, but that’s totally a much cooler name than Google Music, regardless of how big you write “beta”. And what’s wireless_android_skyjam? Why is skyjam so deeply tied with wireless and android? Maybe it’s got something to do with flying robot sandwich machines. Or not.

Maybe it has something to do with that Simplify Media company which Google bought last year. Or maybe I’m just a little sad that the product ended. According to TechCrunch,

> Google VP Vic Gundotra said that Simplify’s technology will be used to offer a desktop app that will give you access to all of your (DRM-free) media on your Android devices remotely, using Google’s new iTunes competitor on the web.

Desktop App: Check. Android: Check. New iTunes Competitor on the Web: Check. Simplify’s Technology: Not so much.

For those who aren’t familiar with what Simplify did (which I assume is nigh everyone), it streamed your music by making your desktop into something of a server, so your huge cache of music can be accessible by any mobile device remotely.

Fishtank

I had a random false lead. I thought that the server still read the ID3 data itself, and I noticed that the HTTP dump was somehow slightly different from the original file. It was the same length. The exact same length, but some of the bytes were different. Most were the same, and they were all in the same place, but, for example, all 0x00’s seemed to have been swapped with 0xe4’s. Some other bytes were also different, and in retrospect this was probably because the program which created the HTTP dump sucked and encoded all the bytes wrong or something. But I tried opening it up in GHex and swapped every 0xe4 with a 0x00, and magically the broken file now played, with one exception: the audio now sounded like a fishtank. It’s like everything was just water, majestically moving around with exquisite sublimity.

Sniffing

I borrowed another Windows computer this time, and tried again. This time, I also tried the HTTP proxy called Fiddler, which seemed to capture more information, or at least more of the information I would find useful and less of what I wouldn’t. I noticed that for every file which was uploaded, there were in fact three requests, not two as I had previously observed.

So where had this covert HTTP request been hiding? In the magical realm of HTTPS/TLS encryption. I couldn’t peek into the actual content of the HTTPS requests because they were encrypted. And as the strings command had revealed earlier, there was a huge block of base64 encoded text which appeared to be a sort of public key. It suddenly clicked: the Music Manager probably has its own list of trusted certificate authorities, and that’s why adding a mere certificate to Windows wouldn’t do a thing.

But the unencrypted CONNECT request, which initiates an HTTPS connection, said enough.

There was a request to android.clients.google.com and it was encoded with, gasp: application/x-protobufs. If this were some sort of action movie, this would be the scene where the antagonist from the previous movie reveals that he is still alive (with some facial scarring) and has been the person responsible for all the disappearing bodies (of data).

I was about to give up at this point, but then a new idea struck.

A New Hope

This is the second movie-title article sub-title that I’m aware of, unless there’s a movie called “Packet Sniffing”. For the prospective filmmaker, this is when the protagonist has the flashback and remembers that time, long long ago (though here it’s only like four days, you can take creative liberties), when, while refining the art of hackery, a task involved crafting an implementation of a storage system from the web interface of a certain cloud drive service.

So, I noticed that I had spent so much time exploring what enigma lay beyond the surface of Google Music that I totally forgot to actually use the service. And when I did, I realized that there was an “Edit Song Info” context button. Trying the feature out, it turned out to be a simple JSON POST to a certain modifyentries URL.

{"entries":[{
  "genre":"REDACTED",
  "beatsPerMinute":0,
  "albumArtistNorm":"",
  "album":"REDACTED",
  "artistNorm":"",
  "lastPlayed":1305419744984200,
  "durationMillis":192168,
  "url":"",
  "id":"564ads4f8e4-REDACTED",
  "composer":"",
  "creationDate":130541974874500,
  "title":"REDACTED",
  "albumArtist":"REDACTED",
  "playCount":0,
  "name":"REDACTED",
  "artist":"REDACTED",
  "titleNorm":"",
  "rating":0,
  "comment":"REDACTED",
  "albumNorm":"",
  "year":"2004",
  "track":"04",
  "totalTracks":"",
  "disc":"",
  "totalDiscs":""
}]}
Amazing. I had just stumbled on the magical code which would allow me to finish the project. Yay. Now I had to figure out what the xt= parameter means (after determining its necessity). My first hunch was to search the HTML source of the page, because the magical access codes for Amazon Cloud Drive were put inside a hidden input element. But no, it wasn’t there. Now the fear started settling in: what if it’s a strategically computed magical super string? But why would they ever do that?

Then I noticed that every time the page loads, the network logs indicate a Set-Cookie: xt=AM-REDACTED header. Now I knew how to get it: just send an XHR to the cloud player and copy the cookie.
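In extension-land, that amounts to something like this sketch (getXt is my name, not anything official; it assumes a Chrome extension with the cookies permission):

    // Ping the cloud player so the Set-Cookie lands in the cookie jar,
    // then read the xt value back out with the chrome.cookies API.
    function getXt(callback) {
      var xhr = new XMLHttpRequest();
      xhr.open('GET', 'https://music.google.com/music/listen', true);
      xhr.onload = function () {
        chrome.cookies.get({url: 'https://music.google.com/', name: 'xt'},
          function (cookie) { callback(cookie && cookie.value); });
      };
      xhr.send();
    }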

ServerID

Now, the important part of all the items in the entries object is probably the id. Everything else seems somewhat generic, and probably exists for any song that you throw at the service, but id clearly has a definite, important, non-random, predetermined value. I thought that there was no correlation between the data passed during the upload and the id which was here, since the upload_id was way too long and had nothing in common with this song id.

Since the songs that get uploaded have no title, they always appear first on the alphabetical list of songs, so I figured I would just pull a list of the songs, alphabetically, and then take the first one to get the ID. Then I discovered the auto-playlists, which included one for most recently added.

Unless you have the power of omniscience, you don’t know that the format of the IDs here was like das8f7-adf0w8f-adf87we-mwere, because I slapped a huge REDACTED on the latter half. The random string generating code that I’ve used for years out of convenience is Math.random().toString(36).substr(2), which basically generates a random float, converts the fractional part to base 36 (0-9 a-z) and strips the leading “0.” from the string. This incidentally ends up being quite short (especially on Chrome; it’s much longer on Firefox), and lacks the dashes.

So I looked at the ID for one of the streams and noticed that the id string was unusually short. I started wondering why something like that would happen, and then it struck me, not quite like lightning because I’m still alive (though whether or not I am when you’re reading this is a totally different question). This file ID was the same thing as the ServerId which I had thought was just a random useless value that was arbitrarily required for it to work. I deleted the whole getRecent routine and suddenly everything started to make sense.

Fish

You probably don’t remember my old blog post, almost a year ago, titled A Bright Coloured Fish: Parsing ID3v2 Tags in Javascript and ExtensionFM. Recently, by fate, I happen to have used that code more times than I’m comfortable with, and it seems that once again I must revive this old memory. It’s fairly outdated, and some things are very nasty. Not being able to parse ID3v1 is pretty terrible, though I don’t think I have any songs using v1.

I hoped that Google’s magical server farm would have parsed out the ID3 data, but that doesn’t seem to be the case. I have to parse the tags manually in the client.

I had to change the id3v2 library a slight bit: the picture parsing code now includes a blob attribute, since the album art must be uploaded to Google’s servers in order to be included in the metadata.
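The first step of that kind of parsing is small enough to sketch here. This is a simplified illustration of reading the ID3v2 header, not the library’s actual code:

    // Read the ID3v2 header from the first 10 bytes of an MP3 file.
    function parseID3v2Header(buffer) {
      var b = new Uint8Array(buffer);
      // bytes 0-2 must spell "ID3"
      if (b[0] !== 0x49 || b[1] !== 0x44 || b[2] !== 0x33) return null;
      // the tag size is "syncsafe": four bytes, seven bits each
      var size = (b[6] << 21) | (b[7] << 14) | (b[8] << 7) | b[9];
      return {version: '2.' + b[3] + '.' + b[4], flags: b[5], tagSize: size};
    }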

I guess that fishtank sound was an omen. I’m sure the prospective filmmaker could use it in some way like this.

Time

Rather auspiciously, the modifyentries command included durationMillis. It had become evident that the music player had no way of knowing the length of a song without some tag data, and without knowing the length it’s impossible to seek to a certain portion of the song, so this was really important. And since this was part of the API for a user-facing operation, and the length of a song isn’t something that a user can or should be able to edit, it was pretty fortunate that it was included at all.

This wasn’t without problems, though. It’s not that easy to measure the length of a song. The Length tag of an MP3 file is actually seldom included, so that clearly can’t work. I looked into how to measure the length, and it was pretty scary; it doesn’t seem like there’s any way other than parsing out all the frames and summing up the durations implied by each frame’s bitrate.

But this time, HTML5 came to the rescue. I never thought I would be so happy to use <audio>. I just take the file, convert it to a blob URL through that nest of if/else statements that handles different versions of Chrome, open it with new Audio, and wait for the loadedmetadata event, indicating that I now know the length of the song to thirteen places after the decimal. Because of course, I would hate to miss the last picosecond of my song.

I felt sad for the little femtoseconds and attoseconds who were neglected by Google when they decided that the song lengths should be milliseconds. Not quantized divisions of Planck time.
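As code, the trick looks something like this sketch (the helper name is mine; the URL-creation call is the part that hides behind the if/else nest across Chrome versions):

    // Measure a song's duration by letting the browser's decoder do the work.
    function getDurationMillis(file, callback) {
      var URL_ = window.URL || window.webkitURL; // the if/else nest, abbreviated
      var url = URL_.createObjectURL(file);
      var audio = new Audio();
      audio.addEventListener('loadedmetadata', function () {
        URL_.revokeObjectURL(url);
        callback(Math.round(audio.duration * 1000)); // durationMillis
      }, false);
      audio.src = url;
    }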

Epilogue

At this point, nothing interesting really happened. I fixed up the interface, added queueing, and other insignificant things. I published it and spent the rest of day writing a hopelessly long document describing it.

Protocol

For those who want to build their own programs which upload to Google Music, I guess I’ll write a summary of how it works.

  1. Get Song Metadata (ID3 data, Duration, primarily)
  2. Login to Google
  3. POST to uploadsj.clients.google.com/uploadsj/rupio
  4. PUT to putUrl as part of the response of previous operation
  5. Load music.google.com/music/listen and capture xt value from Set-Cookie response header
  6. (Optional) Upload Album Art to music.google.com/music/services/imageupload?u=0&xt=
  7. Set file metadata via music.google.com/music/services/modifyentries?u=0&xt= (sketched below)
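As a rough illustration of step 7, here is a sketch under two assumptions of mine: that the JSON rides in a json= form parameter, the way the form-encoded web client requests seemed to work, and that the ServerId sent during the upload doubles as the song id (which is what the ServerID section found):

    // Attach the parsed tag data to the uploaded song (step 7).
    function setMetadata(serverId, tags, xt) {
      var entry = {
        id: serverId, // the ServerId doubles as the song id
        title: tags.title,
        artist: tags.artist,
        album: tags.album,
        durationMillis: tags.durationMillis
      };
      var xhr = new XMLHttpRequest();
      xhr.open('POST', 'https://music.google.com/music/services/modifyentries?u=0&xt=' + xt, true);
      xhr.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded');
      xhr.send('json=' + encodeURIComponent(JSON.stringify({entries: [entry]})));
    }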

Friendly Reminder

To get the app described in this lengthy narrative, click here.


The Ambiguity of "Open" and VP8 vs. H.264 13 January 2011

Google has recently announced their intention to remove the H.264 video codec from its Chrome browser. This decision has been smeared as an evil campaign for controlling video on the web, akin to not-invented-here syndrome. It’s also been lauded as the push that the web needs to remain open and free. Mostly, it’s been marked as inconsistent, due to the bundling of Adobe’s proprietary Flash player.

Richard Stallman doesn’t like the term Open Source because it fails to embody the true meaning of “free software”, and the one thing that’s worse is the word “open”. This debate can’t simply be labeled as one for or against openness (even ignoring the technical details).

H.264 is an open standard. It was developed by a committee, standardized, reviewed by many engineers and developers for multiple companies and has been standardized for use with a multitude of containers and devices. However, H.264 is not royalty free. Software patents in many countries restrict the distribution of software that utilizes the codec to those who pay the MPEG-LA.

VP8 is not a standard. It was developed secretly by a single company, and until recently had only a single working implementation. The specification wasn’t open to public collaboration until the bitstream spec was frozen, including whatever bugs existed within. Now, the source code and reference implementation are available under liberal licenses, and all the related patents are irrevocably royalty-free.

Adobe Flash, while not synonymous with a video codec (unless you mean .flv flash videos, which are either VP6 or Sorenson Spark), is going here anyway because everyone feels like comparing it. Flash’s SWF format is not standard, but it is open. There are a few implementations (Swfdec, Gnash, GameSWF, Gordon, etc), but none of them are as complete as the official proprietary implementation. There’s a bitstream specification that anyone can read to create an independent implementation of the player. Implementing and using the Flash player is still royalty-free (since the Flash VM can decode H.264, that part, which Adobe doesn’t control, obviously still carries royalties), and anyone can make software that exports SWF animations without paying Adobe.

Implementation vs. Distribution

Or: How Bundling Flash doesn’t violate the Novikov self-consistency principle

| Name   | Standardization  | Implementation | Distribution | Dev History |
|--------|------------------|----------------|--------------|-------------|
| Theora | Standardized?    | Open Source    | Royalty Free | Mostly Open |
| VP8    | Not Standardized | Open Source    | Royalty Free | Mixed       |
| H.264  | Standardized     | Open Source    | Royalties    | Open        |
| Flash* | Not Standardized | Proprietary    | Royalty Free | Proprietary |

* Adobe Flash isn’t a video codec

Each column of the chart can be treated as one part of “open”. The <video> debate now primarily concerns VP8 and H.264, as Theora is inferior for technical reasons and Flash isn’t a video codec. I don’t know if Theora’s actually standardized, but it has multiple implementations, was developed by the Xiph.Org Foundation, and is the closest thing to an actual standard short of being a product of MPEG. Theora was created from On2’s VP3 code, but was changed significantly by the Xiph.Org Foundation before release, so the development history can be considered mostly open. VP8 was announced, and the bitstream was frozen shortly afterwards, leaving little room for community involvement. For the years before its debut as part of WebM, the format was secret, patent encumbered, and proprietary. At least now the source code is open and people are free to do whatever they please, but the open source video community probably had very little say in the development of the codec. H.264 has an open software reference implementation, as do VP8 and Theora. Flash’s de-facto implementation is Adobe’s, which is proprietary.

The one that concerns Mozilla, Opera and Google is the distribution rights: whether or not royalties have to be paid. Notably, this is where the parallel that most critics draw between Google’s move on H.264 and its bundling of Flash comes into question. Many people believe that Flash is the epitome of evil and embodies everything wrong with proprietary software. The distribution column is arguably the most important, and it’s the one that fundamentally determines whether or not something acts as a detriment to “open innovation”. Flash animations and applications can be created and distributed without paying royalty fees to Adobe, regardless of the viewership. Innovation is still permitted as long as the distribution is free (except for changing the inner workings of the proprietary implementation).

The Ambiguity of “Plugin”

TODO: Use this subtitle to mention random tangential thoughts.

It seems there are lots of misconceptions about WebM, and new ones as well, because of Google’s use of the word “plugin” in their “More about Chrome HTML Codec Change” post. Here’s what the relevant part says (it’s buried in the last paragraph).

> This is why we’re joining others in the community to invest in WebM and encouraging every browser vendor to adopt it for the emerging HTML video platform (the WebM Project team will soon release plugins that enable WebM support in Safari and IE9 via the HTML standard <video> tag).

Microsoft and Apple, through IE9 and Safari respectively, rely on the underlying operating system’s multimedia frameworks to handle <video> decoding. Chrome and Chromium bundle a customized version of the ffmpeg framework, while Opera uses gstreamer and Firefox has its own framework built on various open source codec libraries.

| Browser | WebM          | H.264                 | Theora        | Anything Imaginable    |
|---------|---------------|-----------------------|---------------|------------------------|
| Firefox | Version 4     | Never                 | Version 3.5   | Never                  |
| Chrome  | Version 6     | Removed in Version 10 | Version 3     | Never                  |
| Opera   | Version 10.60 | Never                 | Version 10.50 | Never                  |
| Safari  | QuickTime     | Default QuickTime     | QuickTime     | Default QuickTime*     |
| IE9     | Windows Media | Default Windows Media | Windows Media | Default Windows Media  |

* Well, not really anything imaginable, but a lot of them

QuickTime and Windows Media are pluggable multimedia frameworks. And since Safari and IE9 use them for <video> support, any codec or container format supported by the respective frameworks, works in the browser. For QuickTime, the list for videos goes along the lines of 3GP, Apple Video, AVI, DV, Cinepak, H.261, H.262, H.263, H.264, Microsoft Video 1, MPEG-1, MPEG-4 Part 2, Motion JPEG, Pixlet, Planar RGB, Sorenson Video, Qtch, QuickTime Movie, and QuickTime VR. But these media frameworks are pluggable, which means they play whatever codecs are installed. It just happens that those codecs listed above are plugins that are installed by default, somewhat analogous to how Google Chrome bundles Flash. The plugin frameworks for multimedia don’t treat plugins any different from “native” codecs. There’s absolutely no way to tell the difference from a user perspective. Right clicking a video will give you the same ordinary context menu.

Codec packs are often installed by users anyway. A codec pack consists of a set of plugins for the operating system’s multimedia framework. Once one of those codecs is installed to the system, support is available to the applications that rely on the OS’s framework: QuickTime Player, Windows Media Player, IE9 and Safari.

Notably absent are Opera, Chrome and Firefox, the browsers that ignore whatever is installed and use their own bundled decoders. There’s a reason for this, and it’s pretty easy to spot. Firefox, Chrome and Opera are also the only ones on the list that aren’t tied to a specific operating system, and bundling your own media framework lets you keep the functionality consistent. IE and Safari aren’t cross platform. Well, Safari works on Windows, but it still requires QuickTime for Windows to be installed to play the video content.

You also don’t want all the other video codecs thrown into the mix. Standards are about standardization, where you have a limited number of codecs instead of an entire forest of them. This brings us to a bit of history; I’ll begin with the olden days. I can’t tell you too much, as much of the first section happened when I was 6 years old, and I certainly wasn’t into the negative user experience of platform-specific multimedia plugins back then.

A Brief History of Stuff Named “A Brief History…”

Uh… I mean, Video on the Web.

Before the popularity of HTML5 video, or even Flash, video was viewed on the web using the RealVideo, MPlayer, VLC, Windows Media Player, and QuickTime plugins. Note that these aren’t plugins to the underlying multimedia frameworks: these are the nasty type of plugins that Flash is among. They’re the plugins that take a long time to load, have interfaces that never look like the surrounding page or browser, and only work on certain operating systems, at best inconsistently. This is the epitome of what standardization is meant to prevent, and the distillation of everything that’s wrong with plugin based video.

Flash’s popularity (probably thanks to YouTube) brought a semblance of standardization to the industry. Video on the web was delivered through the FLV container, encoded either with Sorenson Spark or, in later versions of Flash, On2’s VP6. A while later, H.264/AVC was added to Flash, and as the superior codec, most people switched to it; FLV is slowly fading away.

HTML5 video became popular recently. I’ll say it’s probably because of iOS, since that’s the only way to get web video to play there, plus the rest of the aggressive marketing that Apple does. This brings us to today (or at least the general time period of early January 2011 in which I’m writing this post). Google has just announced its intention to remove H.264 from an upcoming revision of the Chrome browser (much like how the Chromium browser never supported H.264), and everyone on the internet who cares enough to say a word is going insane. Which brings us back to the last section, about how Safari and IE9 use the operating system’s underlying multimedia frameworks and how all the other browsers (Firefox/Chrome/Opera) that work on my beloved operating system (Linux) bundle their own.

If Firefox, Chrome and Opera used the OS’s media frameworks, we would be set back into the dark ages when people used every video codec imaginable and nobody was happy. Standards exist for consistency, and it would be terrible if people, as a cost-saving measure, stopped transcoding video altogether and served it straight from the server’s filesystem as AVI files containing DV or, god forbid, MJPEG (the same format the users uploaded!) because all the Safaris and IE9s could play it fine. The fact that Firefox, Chrome and Opera bundle their own multimedia frameworks forbids this possibility, because those browsers (in the foreseeable future) will never support anything other than WebM, Theora and potentially H.264.

There’s more to H.264 than just H.264

This part doesn’t make much sense.

Often, the argument against VP8 is that it’s inferior to the H.264 codec, and to me this seems like the most ideologically valid concern. But in a lot of cases it stems from a misunderstanding of how H.264 works. H.264 is not a single video codec (not even to mention multimedia container formats), but rather has several Profiles that work on different devices and implementations. Rarely are videos encoded only in MPEG-4/AVC H.264 Extended Profile at 1080p and 60fps. Go on the Apple Store and you’ll see that every iPod device you can find (including the iPhone) lists which video codecs are supported. And it doesn’t just say H.264, but rather something much more wordy, like:

> Video formats supported: H.264 video up to 720p, 30 frames per second, Main Profile level 3.1 with AAC-LC audio up to 160 Kbps, 48kHz, stereo audio in .m4v, .mp4, and .mov file formats

That’s for the insanely great iPhone 4. Google is meaner and doesn’t make the information about profile support on the Nexus S quite as accessible (though it’s not the only reason I’m an iPhone user), but it should be safe to assume it’s something along the lines of what the iPhone 4 supports. The iPod Classic has a nice, even longer string, that represents even less support for H.264:

> H.264 video, up to 1.5 Mbps, 640 by 480 pixels, 30 frames per second, Low-Complexity version of the H.264 Baseline Profile with AAC-LC audio up to 160 Kbps, 48kHz, stereo audio in .m4v, .mp4, and .mov file formats; H.264 video, up to 2.5 Mbps, 640 by 480 pixels, 30 frames per second, Baseline Profile up to Level 3.0 with AAC-LC audio up to 160 Kbps, 48kHz, stereo audio in .m4v, .mp4, and .mov file formats

This is because very few devices can actually utilize all the great features that H.264 defines. Wikipedia has a nice pretty chart. So the point of all of this is to say: even though VP8 is inferior to H.264 from a purely technical standpoint, you probably can’t just use the Main or Extended profiles to support all the devices that “support H.264”. Does this invalidate the inferiority argument? Nope. Dark Shikari said “I expect VP8 to be more comparable to VC-1 or H.264 Baseline Profile than with H.264”. But the large number of devices that support H.264 might actually only support the baseline profiles.

My Hopeless Ideals

Better to be blunt about what will probably never happen.

VP8 is a bit worse than H.264, and had it been a patent encumbered video format, there would be almost no reason to prefer it over AVC. The <video> part of the HTML5 specification states:

> It would be helpful for interoperability if all browsers could support the same codecs. However, there are no known codecs that satisfy all the current players: we need a codec that is known to not require per-unit or per-distributor licensing, that is compatible with the open source development model, that is of sufficient quality as to be usable, and that is not an additional submarine patent risk for large companies. This is an ongoing issue and this section will be updated once more information is available.

The only major codecs that are royalty free are Theora and VP8, and the former probably isn’t of sufficient quality. Both of them come with patent risks (or at least that’s what the MPEG-LA wants people to believe), leaving a set of zero acceptable codecs. For this to work at all, something needs to be compromised.

H.264 defines a baseline profile that all decoders could be reasonably expected to handle. Such a profile exists so that the video can be viewed, albeit with inferior compression on a variety of devices and platforms with limited computational ability. The internet was built on interoperability, and HTML5 needs an equivalent “baseline codec” for the web. Something that compresses video at sufficient, though not bleeding-edge quality. Something that can be implemented and distributed openly on all platforms.

For HTML5 <audio>, nearly all browsers implement basic WAV support. It serves as an acceptable baseline for small snippets of audio to be played in a cross-browser manner. It’s usually uncompressed, and for most applications, inferior to MP3 and Vorbis. <video> needs similar treatment. That niche is, at this time, best filled by WebM/VP8. If Apple and Microsoft were to add support for the WebM format, it would only improve the environment for open innovation. H.264 can be considered the “bleeding-edge” codec; heck, they could even add HEVC/H.265 to set off an environment of useful competition (or not). But a “baseline codec” should be established first. Something that publishers can encode their videos in, so that any modern browser can view them.

Right now, the discussion has been polarized between free software advocates, who often seek to eradicate proprietary or patent-encumbered ideas from the face of reality, and those who hold a disregard for open source values. This way of thinking about, and of treating, innovation is profoundly dangerous for both the free software community and the patent-creators. The free software community can not afford to be constantly twenty years behind the times in terms of innovation (especially if you subscribe to the law of accelerating returns), and the patent-creators can’t afford for the free software community to be actively foiling their innovations. It’s truly great what MPEG has done, and it’s terrible that such a controversy exists around its adoption. The best scenario would be for MPEG or the ISO standards bodies to eliminate royalties. The fact that multiple industries can agree upon a single, high quality, interoperable codec should be enough of a market and innovation advantage to waive the relatively nominal licensing money.

Will this actually happen? I doubt it. Apple seems firmly invested in the success of H.264. Microsoft may be more willing to add VP8 support if and when the format gains popularity (and especially if Google adds patent indemnification, just so they can see a lawsuit made). Mozilla will likely never compromise on principle, and Opera might add AVC one day. Given the huge backlash, Google’s the most likely to revert its opinion and add AVC back into Chrome in a future release. Yay for hopeless idealism! And since I’m espousing hypothetical and impossible ideas anyway, why not vouch for reform of the U.S. patent system as well? If you think about it, the fact that it’s such a controversy and that people need to use an inferior product in order to innovate smacks in the face of the very Article 1, Section 8, Clause 8 of the United States Constitution:

> To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.

And this ends what may be the longest blog post I’ve ever written, revised three times. If you’ve actually read this through, please consider commenting or sending me a tweet at @antimatter15 (or following me, that would be great!). Also, if I’m misinformed, please inform me and I might revise this yet again.


drag2up 2.0 - drag and drop uploading for all sites 26 December 2010

drag2up was a browser extension I built a few months ago, and recently bumped up to 2.0. The basic idea is to use the HTML5 File API and the drag/drop integration that Chrome and Firefox implement to enable uploads to any website by simply dragging and dropping a file from your computer onto the respective site. Instead of going through the trouble of opening a new tab, navigating to your favorite file provider, waiting for it to load, pressing the browse button, navigating to the folder with your image, pressing “Open”, hitting the submit button, waiting for the upload to finish, copying the link, finding the original tab among the mess of tabs that fills your tab bar, and finally scrolling down, selecting the box and pasting the URL (all to share a three megabyte file), drag2up streamlines the process into a single, swift gesture where you drag the file onto the text entry field. The link is instantly added while the file is uploaded in the background. If someone navigates to the link before it’s done uploading, the page uses the Google App Engine Channel API to stream real time upload status.

It still has a pretty simple user interface that works with zero initial setup. In addition to using gist and imgur for text files and image uploading respectively, it includes many additional services, configurable through a simple drag and drop interface. The new version also sports Firefox support.

New Features

  • Firefox + Chrome
  • Background Uploading
  • imgur, imm.io, ImageShack
  • Flickr, Picasa
  • gist, pastebin, mysticpaste
  • Chemical Servers, DAFK, Hotfile
  • Dropbox, CloudApp
  • Built in URL shorteners

Right now, the only service that doesn’t quite work is Dropbox, as the application hasn’t been approved for production status yet. And a number of the hosts don’t work with Chrome 8 because of typed arrays and binary XHR issues.

Try out the extension now :) I would really appreciate any and all feedback.

Technical Information

There’s some pretty cool stuff going on in the new release; it’s really close to the bleeding edge of what’s possible on the web and in browsers. On Chrome, to upload files, the multipart/form-data XHR is pieced together with array buffers and typed arrays, stuff from the WebGL specification. The Firefox version is based on the latest beta release of Mozilla Jetpack (hacked slightly so that the resulting xpi works with 3.6 as well as 4.0). Transferring data between the background page and the individual content scripts (or pageMods in Jetpack’s terms) is done with the createBlobURL (also called createObjectURL) function and binary XMLHttpRequest. On older versions of Firefox, FileReader is used to load files into base64 encoded data urls. Inter-frame communication is done with postMessage and the native JSON serialization and parsing functions. Picasa and Dropbox support are built on Javascript implementations of OAuth (based on Clip It Good and Chromepad). So, yeah, there’s lots of new and super cool stuff in there.
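To give a flavor of the typed-array trick, here is roughly what hand-rolling a multipart/form-data body looks like. This is a sketch with names and structure of my own, not drag2up’s actual code:

    // Build a multipart/form-data body out of typed arrays and POST it.
    function multipartUpload(url, fieldName, file, bytes, callback) {
      var boundary = '----d2up' + Math.random().toString(36).substr(2);
      var head = '--' + boundary + '\r\n' +
        'Content-Disposition: form-data; name="' + fieldName +
        '"; filename="' + file.name + '"\r\n' +
        'Content-Type: ' + (file.type || 'application/octet-stream') + '\r\n\r\n';
      var tail = '\r\n--' + boundary + '--\r\n';
      function toBytes(s) { // ASCII string -> Uint8Array
        var u = new Uint8Array(s.length);
        for (var i = 0; i < s.length; i++) u[i] = s.charCodeAt(i) & 0xff;
        return u;
      }
      var h = toBytes(head), t = toBytes(tail);
      var body = new Uint8Array(h.length + bytes.length + t.length);
      body.set(h, 0);
      body.set(bytes, h.length); // `bytes` is a Uint8Array of the file contents
      body.set(t, h.length + bytes.length);
      var xhr = new XMLHttpRequest();
      xhr.open('POST', url, true);
      xhr.setRequestHeader('Content-Type', 'multipart/form-data; boundary=' + boundary);
      xhr.onload = function () { callback(xhr.responseText); };
      xhr.send(body.buffer); // send the raw ArrayBuffer
    }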

The content script was more or less rewritten in order to support Gmail, which led to some interesting design changes. Firstly, the new version establishes a sort of hierarchy, a difference between the root frame and the most subordinate one. Each drag event is sent to the top using postMessage, where the root frame decides whether to render the targets or to remove them. Once the decision is made, it trickles down to each of its immediate frames and from there to each subordinate frame. Whenever a frame is created, a loop detects that one has been added and attempts to access its DOM in order to insert a script element that initializes the code. Interestingly, content scripts (in Chrome) are unable to access child frames, so a script needs to be inserted into the immediate page in order to insert a script into a child page. The scripts also set document.__drag2up to ensure there aren’t any frames that load themselves twice (interestingly, creating an event, dispatching it, and seeing if there’s an event listener that calls preventDefault isn’t a reliable indicator. I wanted to ideally create a system that was mostly undetectable by the parent page, but eventually settled for this simpler and more reliable implementation).
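A stripped-down sketch of that hierarchy (the trickle_reactivate message name is real; the drag_entered_frame name and the renderDropTargets stub are my paraphrase):

    // Runs in every frame. Decisions flow down from the root frame.
    window.addEventListener('message', function (e) {
      if (e.data !== 'trickle_reactivate') return;
      // pass the decision on to our immediate child frames...
      for (var i = 0; i < window.frames.length; i++) {
        window.frames[i].postMessage('trickle_reactivate', '*');
      }
      renderDropTargets(); // ...and rebuild drop targets in this frame (stub)
    }, false);

    // On a drag event, tell the root frame; only it decides what to do.
    document.addEventListener('dragenter', function () {
      window.top.postMessage('drag_entered_frame', '*');
    }, false);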

Once the “trickle_reactivate” message (every message happens to be eighteen characters, because of certain really weird design ideas) is received by all the frames, they loop through all the elements on the page, searching for elements that meet the isDroppable criteria (sketched below). It doesn’t trigger on elements whose width*height < 100, so no really tiny boxes. The element needs to be an input of type text, or a textarea that isn’t read only or disabled, or an element whose contents are editable. Then it checks the positioning of the element using the document.elementFromPoint function twice (once accounting for scrolling and once not, to differentiate fixed positioning from the rest). The drop target is a div with a massive style attribute. It has rounded corners and CSS transitions, and is positioned either absolutely or fixed based on the positioning of the associated input element. It’s centered around the element, or completely covers it for smaller areas. When the user hovers over it, the opacity changes (the behavior is actually reversed from the last version; the original version faded out slightly when hovered over, which, after some thought, I realized made no sense). Similar to the drop target is a purple settings box in the lower right corner. I’m actually quite displeased with it. It’s not a great user experience, but at least it’s noticeable. Primarily, it was added because I have no idea how to get the Preferences button to work with Jetpack, and Google manages to bury the Options button somewhere deep within the depths of Chrome’s single menu.
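Roughly, that test looks like this (a sketch; the real criteria live in the content script):

    // Does this element deserve a drop target?
    function isDroppable(el) {
      var rect = el.getBoundingClientRect();
      if (rect.width * rect.height < 100) return false; // no really tiny boxes
      if (el.isContentEditable) return true;
      var tag = el.tagName.toLowerCase();
      return (tag === 'textarea' || (tag === 'input' && el.type === 'text')) &&
             !el.readOnly && !el.disabled;
    }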

Once the file is dropped, the content script decides whether it was a file or an image dragged from some other page. If it’s an image (because that’s the simpler case to deal with), it does something that’s actually slightly counterintuitive. Using getData(‘text/uri-list’) isn’t reliable, as in many cases the image is wrapped in a link tag, which gives the link URL instead of the image. So instead, it reads text/html, inserts it into a temporary div, and pulls out the source attribute, which is then sent to the background page. If the thing that was dropped is a file, it first checks whether the browser supports one of the Blob URI creation methods, and if one is available, it uses that to create a URL. If not, the file is read using FileReader as a base64 encoded data URL and sent to the background page. But not immediately, because often the script is running in an unprivileged environment, so it does a postMessage to the topmost parent, which should be privileged. The privileged content script sends the message to the even more privileged background page, which initiates the actual uploading process.

Before going into what the background page actually does, there’s the settings page. I’m actually pretty proud of how it turned out. With the large number of hosts that are supported, I needed a way for the user to select hosts for different types of files. I could have used select boxes, but they’re not generally great when dealing with image based concepts, because a company’s logo is often more recognizable than its name (odd that I’m saying that, seeing as this blog has neither a distinctive name nor even a favicon). So instead, I built this little grid where the user can drag hosts into boxes on the right. It’s made with jQuery UI, because I don’t particularly like the APIs provided in HTML5.

The background page handles the uploading. Basically, there’s a bunch of javascript files that are loaded into it. The hardest part was, by far, OAuth. If anything, this project has only refined my dislike of OAuth. And because this post is already insanely long, I’ll go ahead and rant about the problems of OAuth anyway. There’s already a great article on how terrible an idea it is for OAuth to be used in any application that runs code on the client. But Photobucket, Picasa and Dropbox still don’t understand that (or don’t care) and only provide OAuth. That’s not really the problem, though; I just hate OAuth because even though it’s a standard, each service provider has some little quirks that take hours to debug and end up being extremely frustrating. Maybe the OAuth gods are angry with me or something, but it’s incredibly troublesome.

Try out the extension now :) I would really appreciate any and all feedback.



Weppy Updates: Opera, Chrome and Firefox support and simpler usage 09 October 2010

With help from @Frenzie and @paul_irish comes the latest not-yet-versioned release of Weppy, my Javascript WebP to WebM conversion library, or something of a polyfill for a format that is yet to be part of any specification (where HTML5 references the image src attribute, the examples given are formats such as PNG, GIF, JPEG, APNG, PDF, XML, SVG, SMIL, and MNG). The new release brings some awesome new features that really don’t change much and shouldn’t really be used in the real world, because most browsers in the world still aren’t Firefox, Chrome or Opera; that is, a large portion of the browser market includes browsers like Safari and IE, either suffering from antiquity (IE6! aah!) or just liking H.264 (IE9 + Safari).

The new release supports Opera. I had never bothered debugging Opera; I figured it was another huge issue that would demand a rewrite (as supporting Firefox had, because the order of object keys isn’t preserved, which breaks the EBML result, or at least breaks it for Firefox’s parser, which seems to be somewhat stricter than Chrome’s; is that ffmpeg?). And after a premature optimization (stripping “unnecessary” EBML tags), my code didn’t work in Chrome, so I had to revert to an earlier revision. All my testing code was based on file drag-drop stuff, and Opera doesn’t support that. Until I saw this MozillaZine topic, I didn’t care, but it turned out to be a lot easier to fix than I feared.

Part of the solution was getting rid of the canvas stage. Admittedly, the canvas stage was pretty useless once the toDataURL() stage was removed before the first public release, but I didn’t feel like deleting code, so it stayed there. Also, I noticed that the global variable that gets introduced was accidentally named “WebM”, which is wrong; it should be “WebP”, but because of the uncreative format naming and similarities, I didn’t notice. Not sure, but it seems to be more stable now.

Chrome will probably add WebP support soon, and the library needs to be future proof, detecting whether or not a browser supports the WebP format. To do that, it creates an Image, sets the src to a data URL of a 4x4 WebP image, and listens to the onload and onerror events, checking whether the size is correct and whether there were errors loading it. The routine is expected to error, and it’s totally untested, as there aren’t yet any browsers that support the format for me to try.
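That routine amounts to something like this sketch (the real library embeds an actual 4x4 WebP test image where the data URL payload is elided here):

    // Detect native WebP support by test-loading a tiny image.
    function supportsWebP(callback) {
      var img = new Image();
      img.onload = function () {
        callback(img.width === 4 && img.height === 4);
      };
      img.onerror = function () { callback(false); };
      img.src = 'data:image/webp;base64,...'; // 4x4 test image, elided
    }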

Another change is that, by default, it will automatically load all the same-origin WebP images (same-origin because of the limitations of XHR) from <img> tags on the DOMContentLoaded event, so the library is practically drop-in now. On any web page, you can pretty much add <script src="http://antimatter15.github.com/weppy/weppy.js"></script> and, on the supported browsers, it should automatically load and replace all WebP images, though that’s not something I would really recommend.

The demo is the same place it always was: http://antimatter15.github.com/weppy/demo.html

There is also this nifty hack that uses <canvas> to add an alpha channel to the WebP image (adapted from the original JPEG one): http://antimatter15.github.com/weppy/alpha/alpha.html

Also, please follow me on twitter.


Anony-bot 30 June 2010

A little robot using the Robots V2 API to expose a wave to a URL. If the URL is shared, then anyone can post anonymously on the wave.


μwave 30 June 2010

microwave-screen630
microwave-screen630-wave
μwave is the first true third-party wave client which is compatible with Google Wave. It’s free to use at http://micro-wave.appspot.com/ and works great on mobile devices. It supports searching for waves, opening them and writing replies. The source code for the server component is open-source and can be found on github (though it’s slightly outdated, the important stuff is there). It’s fairly simple (it’s based on the original example code, so I’m keeping the same MIT license), but it’s one of the few python scripts which can do authentication with Google and pass commands to the data API. It relies on the Python OAuth Library. A list of posts related to this can be found here.

Wave Reader 30 June 2010

Information 2 - Chromium_080
Wave Reader started around early December of 2009 to fill a somewhat significant void: embedded waves could not be accessed by people who were not logged into their Google accounts, there was no way to link to a specific blip, and the official wave client was insanely slow. Most of these issues have since been mitigated, so the project is virtually defunct.

μwave updates again 30 June 2010

microwave-screen630-wave

Edit mode is no longer experimental, a new implementation includes a tiny diff engine which allows editing a post without necessarily destroying layout. Root blip editing is now possible. There is a new tag list on the bottom of each wave, also including an “Add Tag” button. Search results are now formatted with modification date, number of blips, number of unread blips and read/unread state. There is a new settings panel when you click the logo. Added support for the internet exploder browser starting at version six. Owner_utils is a setting which adds utilities like “set everyone as read only”. The New Wave feature no longer creates pop-up prompts, but rather silently creates and opens an empty wave. It renders the live-editing cursors. There is a new multipane interface for desktop. Gadget support has greatly improved. It handles rotation on a mobile webkit device better. It now uses Wave Data/Robots Protocol 0.22 and renders using the newly exposed conversation model.


μwave updates 03 June 2010

Microwave June 2 Search

Over a few days, things can change fairly quickly. There have been several speed improvements, and a new Forum-Style blip rendering option which arranges blips linearly by the time edited, with each containing a formatted quote of the parent to establish context. Attachments are now fully supported, including thumbnails and download links. The operations engine was totally rewritten; it uses asynchronous XMLHttpRequest, a new callback based system, and supports batch operations (which means fewer requests and faster responses). A wavelet header containing a list of all participants in the entire wave has been added, as well as an Add Participant button. A specialized, extremely fast gadget viewer was added, which allows for blazingly fast rendering of two popular gadgets (and more will come); it works by bypassing the entire gadget infrastructure and loading trusted code directly inline with the DOM. There is a “New Wave” button which allows people to create new waves directly from the client. The OAuth backend was authenticated with Google, for more secure login transactions. Blips have a new context menu which allows for features such as Delete Blip, Edit Blip and Change Title. A full changelog can be found here.

Try it out.


μwave lightweight mobile wave client 29 May 2010

Microwave Search-View May 29

I like bragging when I do something nobody else has done before. And μwave is the first true third-party wave client which is compatible with Google Wave. It’s free to use at http://micro-wave.appspot.com/ and works great on mobile devices. It supports searching for waves, opening them and writing replies.

Currently it doesn’t know the read/unread state of waves from the search panel and doesn’t know read/unread blips, but as of the time of writing, that’s a limitation and flaw in the current version of the Google Wave Data API (introduced just ten days ago at Google I/O). Expect this to be resolved in the near future with upcoming versions of the API and this application.

The source code for the server component is open-source and can be found on github (though it’s slightly outdated, the important stuff is there). It’s fairly simple (it’s based on the original example code, so I’m keeping the same MIT license), but it’s one of the few python scripts which can do authentication with Google and pass commands to the data API. It relies on the Python OAuth Library.

The Blip renderer component of this is licensed under the MIT license and can be found on the old microwave repo. The only part left is the interface, which is going to be the usual GPLv3.

For a little bit of history, this isn’t exactly a new project. The Google Code project has existed since January 9th, 2010. The purpose was to create a mobile-friendly version of Wave Reader. But it goes even deeper: I can trace it back to the original Static-Bot, dated October 18, 2009. Back then, the Google Wave embed API allowed people to view waves only if they had a Google Wave account and were logged in at the time. This was quite problematic, as Wave was still a limited preview which not many people had, and it probably hampered adoption.

Another separate but eventually convergent issue which led to the microwave project was the “Desktop Wave Reader + GWave Client/Server Protocol” post which I made on October 29th of 2009.

During late October of last year, I reverse-engineered some of the features of the Google Wave client. Up until then, the only published specs were the federation protocols, which dealt with how multiple wave servers would use a common protocol to allow multiple users without a central authority, and the gadget and robot APIs. Notably missing was a client/server API for a user of, specifically, the Google Wave client, which did not yet support federation (and to date, the preview still does not), to browse and view the waves in one’s inbox without needing to switch to an entirely new provider. The first component was the ability to read waves. After that was accomplished, I tried to reverse engineer a more complex aspect of the protocol: the ability to search waves. I eventually realized that that component, search, was part of a larger puzzle, the real-time BrowserChannel wire protocol on which virtually all of wave was based. I made some progress, but near the end I gave up in frustration. Luckily, someone else became interested in the same thing, and Silicon Dragon basically got search working.

That was in early December. I started a project called Wave Reader, which merged the ideas of Static-Bot with the desktop wave reader and a new functional blip rendering engine. At that time, the Google Wave client was still horrendously slow, sometimes taking several minutes to load large waves.

In January, I began a project to merge Wave Reader and the wire protocol (search). I thought an awesome name would be microwave (or μwave) and started the code repo on January 9th. I worked on it a bit, until it was mostly complete, with search and loading both working and one missing component: login. Eventually I got bored, and the project lay abandoned for a few months.

That gets us to basically 4 days ago, when I started working on a renaissance of the μwave project, based on the recently released Google Wave Data API. The first component was creating a new blip renderer specifically designed for parsing the new (much cleaner) JSON format which is part of the robots API. Then I created a client around that, and a Python backend to make it work on App Engine.

The future is always awesome to prophesize about. In the coming weeks or days, Google will probably update the Data API to expose information like read/unread state. So, while http://micro-wave.appspot.com will likely remain free and maintained for the foreseeable future, I do plan on making a paid iPhone/iPad app. The iPhone app may have some extra features like offline/caching support.



CSS3 Sideways Google 08 December 2009

I was surprised to find that one of the main referrers to my site recently has been Twitter. Looking into it, it seems to all be for my quickly hacked together CSS3 Sideways Google, which uses the new CSS3 transforms supported by WebKit and Gecko (Firefox, Chrome and Safari) as well as the freaky DirectX stuff Microsoft has (the only major browser that isn’t supported is Opera, which is lagging a bit).
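The whole trick fits in a few lines. Here’s a minimal sketch of the general technique (not the exact code the page uses; the rotatePage helper is my own invention):

// Rotate the whole page with vendor-prefixed CSS transforms, falling
// back to IE's proprietary DirectX BasicImage filter, which can only
// rotate in 90-degree steps (rotation=3 means 270°, i.e. -90°).
// The IE fallback here is hardcoded to -90° for simplicity.
function rotatePage(deg){
    var s = document.body.style;
    s.WebkitTransform = 'rotate(' + deg + 'deg)'; // Safari, Chrome
    s.MozTransform = 'rotate(' + deg + 'deg)';    // Firefox 3.5+
    s.filter = 'progid:DXImageTransform.Microsoft.BasicImage(rotation=3)';
}
rotatePage(-90);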

And in the spirit of CSS3 Sideways Google, I present CSS3 elgooG

It also appears that the awesome rotateme.org website was built entirely off of the original CSS3 Sideways Google and actually got 1500 diggs :)


Ajax Animator Wave Gadget 18 September 2009

https://wave.google.com/a/wavesandbox.com/#minimized:search,restored:wave:wavesandbox.com!w%252B3JUS0MHA%2525A.1

If you have a Google Wave developer account, you can visit the above link to use the gadget. It supports almost all the features of the full normal Ajax Animator and many more: better text, images, rotation, resizing (still needs work), layer visibility, stability, platform support, export options, etc. However, it notably lacks the entire old right (east) panel, which also means no undo or redo. Also, since it uses a different graphics editor (VectorEditor), it doesn’t have all the transform options that were present in OnlyPaths. It also supports the real-time editing that Wave is so famous for: two people can concurrently edit the same frame, or one user can watch the animation while the others edit it and see the animation develop.

If you aren’t fortunate enough to use wave, you can use it without the collaborative features at http://antimatter15.com/ajaxanimator/wave/


Ajax Animator Storage 11 September 2009

So I decided to mix up an old project which I was almost about to migrate to GitHub but is still on Google Code (http://code.google.com/p/datastore-service/): basically a free service that allows easy prototyping of things by providing basic persistence (using JSONP). I mixed it up with the Ajax Animator standalone player, so the Google Wave version of the Ajax Animator will have a Publish button which uploads the animation to a server and gives you a shareable URL such as http://antimatter15.com/ajaxanimator/player/player.htm?1e025941543678. While it would be great if a) the URL were shorter and b) the URL were more customizable, it basically works.
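For anyone unfamiliar, persistence over JSONP boils down to injecting a script tag and letting the server call you back. A minimal sketch (the endpoint, parameter names, and response shape are invented for illustration; they’re not the actual datastore-service API):

// Minimal JSONP helper: the server is expected to reply with
// <callbackName>({...}), which the injected script tag executes.
function jsonp(url, params, callback){
    var name = 'cb' + Math.floor(Math.random() * 1e9);
    var script = document.createElement('script');
    window[name] = function(data){
        delete window[name];
        document.body.removeChild(script);
        callback(data);
    }
    var query = [];
    for(var key in params){
        query.push(key + '=' + encodeURIComponent(params[key]));
    }
    query.push('callback=' + name);
    script.src = url + '?' + query.join('&');
    document.body.appendChild(script);
}

// Publishing an animation might then look something like:
var serializedAnimation = '<animation>...</animation>'; // whatever the editor exports
jsonp('http://example.appspot.com/save', {data: serializedAnimation}, function(result){
    alert('http://antimatter15.com/ajaxanimator/player/player.htm?' + result.id);
});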


Ajax Animator + Wave 07 September 2009

sadly it led to this

So I think that the Ajax Animator Wave thingy is almost done. I think it’s really awesome; there’s some new stuff in there that may help with collaboration. There is still a bit of dogfooding left (VectorEditor needs to be updated: while the new version of Raphael being used eliminates a lot of the old hacks, the change in path APIs means that lines, polylines and paths all fail). So after a bit more bug testing, I think it’s going to be pretty cool. It will definitely be out by the time the 100,000 new people join Google Wave (I can’t wait!). But I’m not sure if that will be today, next week, or the week after.

So to show some cool stuff I can do, here’s the first animation this blog has ever really seen from me (it uses the Wave, center and flip plugins, which you can access in the script executor, but someone could easily and tediously do this by hand too):

Try ignoring the probable trademark infringement.

The cool stuff being used here: first, the ability to draw a rectangle, then multi-select, multi-copy, multi-paste and manual repetition. After that, everything is multi-dragged down, and on the next frame the Ax.plugins["center"]() plugin is called, which (obviously) centers all of the shapes on the Y axis, preserving the X axis. Then it goes to the next keyframe and calls Ax.plugins["Sine"](100,0.01) (the first argument is a multiplier for the Y axis, and the latter is something I forgot, I think a multiplier for the current X axis). Then the same function with (100,0.02), and then Ax.plugins.flip() to make it look like the Wave logo. Do some multi-select and set the color. After that cool stuff, it gets saved as text and opened in the OnlyPaths Ajax Animator (which also demos a really cool feature called forwards-compatibility). It gets saved as a GIF and uploaded to my blog after that.
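Spelled out as you’d type it into the script executor, the sequence is roughly this (reconstructed from the description above, with the argument meanings exactly as fuzzy as my memory of them):

// The plugin calls described above, in order, one keyframe at a time.
Ax.plugins["center"]();        // center shapes on the Y axis, preserving X
Ax.plugins["Sine"](100, 0.01); // 100 = Y multiplier; 0.01 = (I think) X multiplier
Ax.plugins["Sine"](100, 0.02); // same plugin, tighter wave
Ax.plugins.flip();             // mirror everything to match the Wave logo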


Wave Gadget API issues (again) 07 September 2009

So I’ve been having issues with the Wave Gadget API (again). This one is simple: Wave isn’t really real time. Right after doing a submitDelta, you can’t get() the data and expect to instantly have the new value.

It’s been giving me some problems, but I’m getting around it by using my awesome wave gadget library, which now magically applies the changes locally and immediately, so you can access them even before they actually happen.
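The idea is just an optimistic overlay on top of the gadget state. A minimal sketch (submitDelta() and get() are the real Wave Gadget API calls; the overlay wrapper is a hypothetical layer of mine, not the library’s actual code):

// Optimistically apply deltas to a local overlay so reads see them
// immediately, before the server round-trip completes.
var overlay = {};

function submit(delta){
    for(var key in delta){
        overlay[key] = delta[key]; // visible to reads right away
    }
    wave.getState().submitDelta(delta); // the real (eventual) update
}

function read(key){
    if(key in overlay) return overlay[key];
    return wave.getState().get(key);
}

// Once the server echoes a key back with our value, drop it from the overlay.
function onStateUpdate(){
    var state = wave.getState();
    for(var key in overlay){
        if(state.get(key) === overlay[key]) delete overlay[key];
    }
}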

Also, partially due to this, things would be far more useful if you could get the date of each insertion or deletion.


Migrating small stuff to GitHub 03 September 2009

I’ll take the liberty of plagiarizing this, as who wouldn’t recognize its source?

So I’m moving over the small, totally unknown, 1-2 file things that I’ve previously spammed Google Code with. Small things like js-xdb, mental-interpreter, js-tpl-engine, js-xdomain, subleq2, vxjs-ajax (a whole project for a single function? crazy stuff).

So I’m shrinking my Google Code profile to reduce my spamminess, because I used to feel like it would be awesome to have a project for everything I spent more than 2 minutes on, in hopes that someone would eventually find it interesting.

Hopefully someone will find it interesting on GitHub.

I’m also adding some lost projects, like my backups of stuff that got lost when AppJet shut down, a substitution code cracker, a few jetpacks, and I’m still adding more.

http://github.com/antimatter15/antimatter15/tree/master

Those are all the tiny projects not big enough to deserve an actual repo or Google Code project page.


JS vs Python 27 August 2009

I sorta expected it, due to the new V8, TraceMonkey, Nitro, and SquirrelFish engines. But I’m thinking of making a port of ShinyTouch to JS, and I was looking into what the differences might end up being.

I have to say I’m really quite surprised. It’s a simple piece of code:

setTimeout(function(){
    // wait 5 seconds for the page to settle, then time a tight loop
    var start = (new Date).getTime();
    var n = 0;
    for(var i = 0; i < 10000000; i++){ // ten million additions
        n += i;
    }
    var end = (new Date).getTime();
    alert(end - start); // elapsed time in milliseconds
}, 5000)

Just doing a loop a huge number of times and adding some numbers. But the unscientific results (times in milliseconds) are quite amazing:

Python: 2640, 2110, 2000, 2190

Firefox 3.0 Spidermonkey: 777, 672, 685, 665

Firefox 3.5 TraceMonkey: 659, 365, 629, 629

Chromium Nightly: 146, 150, 147, 152

While this only tests basic arithmetic in a loop, the browser being 15 times faster than Python just feels quite incredible.


Totally failed mini-project svg-edit + svgweb 24 August 2009

So I tried to get svg-edit to work with svgweb’s flash shim (IE support). Sadly, it’s a total failure!

http://antimatter15.com/misc/svgweb-svgedit/svg-editor.html?svg.render.forceflash=true

I only got far enough to get drawing lines, paths, polygons and rectangles working. Ellipses and text do not work. Selecting a shape shows the tracker, but if you drag it, it just flies over to (0,0) and dies. That’s partly because I didn’t spend more than an hour on the port, but the issues I encountered were these: svgweb doesn’t implement shape.getBBox() (though it does provide shape._getX(), shape._getY(), shape._getWidth(), and shape._getHeight(), so it wasn’t hard to shim that). And since it’s using a Flash shim, the events that svg-edit hooks on the container with jQuery only report evt.target as the <embed> the Flash resides in. I got a strange hack working where you have 2 copies of the event listeners and kill the selects from the jQuery one, but doing that makes unselecting impossible.
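The getBBox shim is about as simple as it sounds; given those private accessors, it’s just something like this (a sketch of the workaround described above, not the exact code from the port):

// Build the getBBox() rectangle out of svgweb's private accessors.
function getBBox(shape){
    return {
        x: shape._getX(),
        y: shape._getY(),
        width: shape._getWidth(),
        height: shape._getHeight()
    };
}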


How I Would Design The Browser 2 Addons 21 August 2009

So I was watching Aza Raskin’s TechTalk on Jetpack, and I was thinking about how I would design an extension system. I would have to say: don’t have one. It’s just too complex. And why restrict the sound recording functionality to a taskbar? Even worse, why fragment the API and require someone to use Flash or <audio> in the page space while having a nice jetpack.future.import("audio") for the taskbar?

I think a good idea would be to expose the power to web pages. The page could request special capabilities through a magical button dropdown, or a bouncy annoying notifier in a corner of the page saying “permissions”, populated by checkboxes for whatever features the page wants to be able to use.
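As a sketch of what that might look like from the page’s side (everything here is hypothetical; no such API exists, I’m just imagining one):

// Entirely hypothetical permissions API sketching the idea above:
// the page declares what it wants, the browser draws the checkbox
// panel, and the callback receives whatever the user actually granted.
browser.permissions.request(
    ['audio-recording', 'persist-in-background', 'cross-domain-xhr'],
    function(granted){
        if(granted['audio-recording']){
            startRecorder();     // full functionality
        }else{
            showFlashFallback(); // gracefully degrade
        }
    }
);
function startRecorder(){ /* use the granted capability */ }
function showFlashFallback(){ /* partial functionality without it */ }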

I think bookmarklets are almost perfect. Adding some more Greasemonkey-like features would make them just about perfect. Scripts can run with the same permissions as the page, and the page’s permissions can be granted easily by the user (and persist through refreshes and browser restarts). Again, if functionality is not supported, things can gracefully degrade to partial functionality.

After that is the idea of background tabs, or alternatively, merging the statusbar-type widgets into the tab bar. This is logical with everything merged into the page, and allows things to gracefully degrade if the feature isn’t supported. You also get the benefits of being able to reorder, remove, get info (which would be the contents of an extension page), etc. I think the interface for a plugin that operates in the background (like a Gmail notifier) would just be a small tab that only has an icon, with a special flag that makes it run on browser start (I think this could be one of the things in the permissions panel).

So one problem I see in the way Jetpack works is that it doesn’t easily allow you to make a jetpack that hacks another running jetpack. Sure, you can “fork” it, but that defeats the purpose of extensions; rather than having extensions only one level deep, make it work all the way down. The easy way I see is to just use the bookmarklet philosophy: everything can mess around with anything within the page. So if you have a Gmail notifier that came out before the tab-persist feature existed, you could just add a simple bookmarklet-type Greasemonkey thing that adds a “Persist Page” entry to the permissions box, and then the user could check that to make a background Gmail notifier that runs on browser startup.

Malware would be easy to fight under this model. Imagine if every application were forced to have an icon in the Windows taskbar at all times: finding malware would be as easy as looking for things you don’t want running and closing them. And even if some tab-bar autohide were implemented, only quite experienced people would run 10+ extension/notifier pages, and a rogue one would still be easier to recognize than some strange wcultns.exe or whatever, when half of the system processes look like that.

With these features, the browser as an OS would really make sense. I wouldn’t be surprised if Google Chrome OS implements some things similar to what I’ve listed here.


I GOT A GOOGLE WAVE INVITE! 13 August 2009

IM STILL WAITING TO GET MY ACCOUNT BUT I JUST GOT AN INVITE AND IM SOOO HAPPY!!!!!!!!!!! SEE? LOOK I’M USING CAPS LOCK TO EXPRESS MY HAPPYNESS! IT’S REALLY REALLY REALLY AWESOME EVEN THOUGH I HAVENT ACTUALLY TRIED IT OUT YET! BUT ITS SOOOO COOL! AND LETS BE HONEST, AT LEAST THIS POST ISNT IN COMIC SANS MS.

EDIT: I got my wave account. It’s cool, but there’s nobody to talk to.


VectorEditor on Wave 12 August 2009

So I don’t actually have wave yet, but for all 2 of you (likely fewer) who have used PyGoWave, an open source third-party implementation of the wave protocol: that’s how I developed this. I read through the APIs and tested on PyGoWave. So what does VectorEditor do that svg-edit and… erm… svg-edit don’t do?

First, VectorEditor for Wave is really, really real time. Waaay more real time than svg-edit (not really). VectorEdit (VectorWave might be a nice name… I’m going to try using that name from now on in this post) transmits the data even while a shape is still being drawn, rotated, or moved (rotation needs work). Another nice feature is that the transitions are animated, so things are even more seamless.

Another important feature is shape locking. When someone selects a shape, it gets locked and can’t be edited by anyone else. If anyone tries, an alert box appears saying “Shape shape:5sdfwef98dfe3ssdf is locked by user antimatter15@pygowave.p2k-network.org”. svg-edit (the latter) doesn’t support moving things after they’re created, so locking doesn’t really matter there, and I’m quite certain the former doesn’t do any type of locking.
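Locking over the gadget state boils down to claiming a per-shape key. A minimal sketch of the idea (the "lock:" key scheme and helper functions are hypothetical; getState(), get(), submitDelta() and getViewer() are the real gadget API):

// Claim a per-shape lock key in the shared gadget state.
function tryLock(shapeId){
    var state = wave.getState();
    var key = 'lock:' + shapeId;
    var owner = state.get(key);
    var me = wave.getViewer().getId();
    if(owner && owner != me){
        alert('Shape ' + shapeId + ' is locked by user ' + owner);
        return false;
    }
    var delta = {};
    delta[key] = me;
    state.submitDelta(delta); // claim (or renew) the lock
    return true;
}

function unlock(shapeId){
    var delta = {};
    delta['lock:' + shapeId] = null; // a null value deletes the key
    wave.getState().submitDelta(delta);
}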

And let’s not forget the likely most important, yet totally untested feature that separates VectorEditor from the rest: IE support, which is inherent since it uses Raphael for rendering. It may turn out to be unnecessary, though, since Google may be making some shim-type system for hacking SVG awesomeness onto IE, which would make the whole VectorEditor project useless.

So if you want to try it out, go to pygowave, sign up, create a wave and add the


Google Wave Gadget API "Flaws" 12 August 2009

Edit: This post is mostly crap. I figured out how to solve my problems while writing it, but I’m posting it anyway because I feel obligated to spam the internet with my outdated thoughts.

One of the main features of Google Wave is the ability to do live concurrent real-time editing. Sadly, this functionality isn’t easy, or as far as I know even possible, within a Google Wave gadget. Most of the time it doesn’t matter; the only time it would matter is if you’re doing live concurrent text editing within the gadget. Of course, that’s what I tried doing, and that’s why I’m writing this blog post. I still haven’t gotten my Google Wave invite (hopefully soon!), so I’m experimenting with the PyGoWave project, a third-party open source implementation of the Wave protocol. Its interface is missing something quite important to wave: multi-user text editing. So I decided to try implementing it as a Wave gadget.

I really did underestimate its complexity. While implementing it isn’t usually too hard, the structure of the Wave Gadget API makes it more difficult than it could or should be. What the Wave Gadget API gives you is a real-time updating key-value table. It’s quite flexible and usable most of the time, but for real-time text editing it’s not quite as useful, because you can’t tell what exactly changed.

For instance, if 2 people are editing the same key, then whoever submits last wins, and the first person’s edits are ignored. Very rarely do 2 people need to edit the exact same thing, but when they do, it’s not easy to merge the changes. A more chat-like system could work.

But while implementing that chat-type system on top of Wave is possible, it feels very inefficient, partly because everything is cached at all states (to support Playback), and you have to worry about something akin to garbage collection to delete entries after everyone has patched their running copy.
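For concreteness, the chat-like approach means appending each edit under its own unique key and replaying the log, rather than overwriting one shared value. A minimal sketch (the "edit:" key scheme and helpers are hypothetical; getState(), getKeys(), get() and submitDelta() are the real gadget API):

// Append-only edit log over the key-value table: each edit gets a
// unique key, so concurrent writers never clobber each other, and
// every client replays the log in order.
var counter = 0;

function appendEdit(patch){
    var delta = {};
    // timestamp + viewer id + counter makes key collisions (nearly) impossible
    var key = 'edit:' + (new Date).getTime() + ':' +
        wave.getViewer().getId() + ':' + (counter++);
    delta[key] = JSON.stringify(patch);
    wave.getState().submitDelta(delta);
}

function replayLog(){
    var state = wave.getState();
    var all = state.getKeys();
    var keys = [];
    for(var i = 0; i < all.length; i++){
        if(all[i].indexOf('edit:') == 0) keys.push(all[i]);
    }
    keys.sort(); // same-width timestamps make this roughly chronological
    var patches = [];
    for(var i = 0; i < keys.length; i++){
        patches.push(JSON.parse(state.get(keys[i])));
    }
    return patches;
}

The garbage collection problem mentioned above is exactly the downside: this log only ever grows unless someone deletes keys after every participant has applied them.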


Google Chrome OS 04 August 2009

The idea of a web browser as the OS is nothing new, and most people know it. The current attempts are basically very restricted normal operating systems. Google wouldn’t do something like that without tons of innovation and killer features.

I’ve been watching some Google TechTalks in the past few days, and I thought it was interesting how the one on Wuala (a distributed filesystem) started with the Google host saying “this talk is being recorded so please refrain from mentioning any google sensitive information in your questions”. While that may have been referring to the Google File System, I don’t think it’s too similar to Wuala. Still, the idea of cloud storage distributed among peers is attractive: what else would you actually use the hard disk that most netbooks do have for? Keeping all your data locally doesn’t really fit the whole idea of cloud computing.


Google and Microsoft 13 July 2009

Microsoft and Google are fundamentally different in their business models. Google relies on advertising and search, with around 98% of its revenue coming from advertising. Microsoft owns a monopoly on the operating system business. Especially with the recently announced Google Chrome OS, it really raises the question of their positions. Can Google really take Microsoft down? What kind of financial prowess, consumer brand loyalty, or user lock-in would it really take to take on Microsoft?

Google is on much shakier territory. I could leave Google just by typing “bing.com” or “yahoo.com” into the URL bar. Simple as that. Totally intuitive, something that (hopefully) nobody needs to call tech support to be guided through. Typing 8 letters into the URL bar and pressing Enter is all it takes to destroy the Google empire.

However, what about Microsoft? They own a monopoly on the operating system market. How easy is it to install another operating system? Well, you need about 1-4GB of data for a modern OS, you need to either burn it to a disc or buy it from a store (I figure there are probably tons of tech support calls at this point), and then likely reconfigure the BIOS, go through menus, fill out several forms and select the specific partition to install the OS to. That alone is already unfathomable for a great majority of users.

Google is very different from Microsoft, down to their core business models. The Windows monopoly isn’t going anywhere in the near future. Google could be gone tomorrow.


Action Limit Exceeded 30 December 2008

Google
 

Action Limit Exceeded

What happened?

You have performed the requested action too many times in a 24-hour time period. Or, you have performed the requested action too many times since the creation of your account.

We place limits on the number of actions that can be performed by each user in order to reduce the potential for abuse. We feel that we have set these limits high enough that legitimate use will very rarely reach them. Without these limits, a few abusive users could degrade the quality of this site for everyone.

Your options:

  • Wait 24 hours and then try this action again.
  • Ask another member of your project to perform the action for you.
  • Contact us for further assistance.
 

Scary.


Google Chrome 02 September 2008

It’s awesome btw. The tab bar location is like Opera, the new tab button is like IE, the password manager is like Firefox, and the buttons are like IE 8. I’m actually posting from Chrome, and it (being based on WebKit) works with the Ajax Animator.


Login System 16 August 2008

What kind of login system do you want? OpenID? Google? Custom Ajax Animator user-management code (like in previous versions)?