somewhere to talk about random ideas and projects like everyone else



MusicAlpha Upload to Google Music Beta from Linux and Chrome OS 14 May 2011

Rather pleasant update: After some epic hackery, this app works again. Install it now, in its fully functional, redesigned glory.

Rather depressing update: This sadly no longer works. It was fun and somewhat useful while it lasted. I have no idea why it doesn’t work, and I would really appreciate it if someone could find out why it fails. However, it would be a good idea to install it anyway, I guess.


Google Music Beta is a pretty cool web app, but the Music Manager sadly only works on Windows and Mac. As a Linux user who unfortunately feels neglected by the service, but is still appreciative of being invited a mere day after its announcement, I decided to do something to remedy the situation. Not to mention the irony that one can’t even upload music to the service from Google’s own Chrome OS. So MusicAlpha was born.

This project was a pretty interesting hack, so I guess I’ll try to document the process of how I made it and how it currently works. And also, for any prospective filmmakers, this story might make for an interesting abstract action movie. But for any of you who just want to install the app, click here.


I started this only days after finishing the new revision of Cloud Save, the extension that I never quite understood, but everyone else seemed to get. One part of the newest revision was adding support for Amazon’s Cloud Drive service, which does not have an established API. However, the interactions between the JavaScript web interface and the server are pretty simple to understand, thanks to the almighty web inspector. The only weird thing was that the file upload itself was conducted through Flash, which felt unnecessary, but it wasn’t an insurmountable obstacle. But it was all built on carefully learning the ways of the web client.

There was one feature request for Cloud Save to enable saving to Google Music, and that’s when the idea sort of started.

The first step was to get a Google Music account, nothing surprising there. A few hours after the announcement was the first opportunity for me to access a computer, and also when I hit the “request invite” button. In a shockingly short amount of time of a mere day, I received a nice email saying that I had been invited to the service. Yay. Now the letdown of not being able to upload from Linux begins, as well as the scheming to reverse engineer it.

A few months ago, I had the idea to do something similar, but with another Google product: Google Goggles. But after hooking up the Android simulator to an HTTP proxy (Charles) and looking at the results, I was rather dismayed, but not particularly surprised: all the communication was encoded using Protobuf. Protobuf, or Protocol Buffers, is “Google’s data interchange format”, according to its Google Code project page. It’s a structured system for encoding binary data where a .proto template is already known and compiled. Protobufs are bad in much the same way minification is. There are totally justifiable reasons to minify source code, but the elimination of a useful View Source detracts from the open ideal of the web. Though protocol buffers aren’t encrypted, they are not comprehensible without a .proto file to decode them. With a lot of work, one might be able to reverse engineer a .proto file, but it’s certainly much harder than a JSON protocol.
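To make the "not encrypted, but not comprehensible" point a little more concrete, here is a minimal sketch of how far one can get on a protobuf payload without the .proto file. It only relies on the published wire format (base-128 varints, and a field key whose low three bits are the wire type). The function names `readVarint` and `readFieldKey` are made up for this illustration, and the three sample bytes are the standard documentation example, not anything captured from Google Goggles.

```javascript
// Decode one base-128 varint starting at `offset`.
// Multiplication is used instead of bit shifts so values above 2^31 stay positive
// (results are still only exact up to Number.MAX_SAFE_INTEGER).
function readVarint(bytes, offset) {
  let value = 0;
  let scale = 1;
  for (let i = offset; i < bytes.length; i++) {
    value += (bytes[i] & 0x7f) * scale;
    if ((bytes[i] & 0x80) === 0) {
      return { value, next: i + 1 }; // high bit clear: last byte of the varint
    }
    scale *= 128;
  }
  throw new Error("truncated varint");
}

// A field key is itself a varint: low three bits = wire type, the rest = field number.
function readFieldKey(bytes, offset) {
  const key = readVarint(bytes, offset);
  return {
    fieldNumber: Math.floor(key.value / 8),
    wireType: key.value % 8,
    next: key.next,
  };
}

// Sample message: field 1, wire type 0 (varint), value 150.
const sample = Uint8Array.from([0x08, 0x96, 0x01]);
const key = readFieldKey(sample, 0);
const payload = readVarint(sample, key.next);
console.log(key.fieldNumber, key.wireType, payload.value); // 1 0 150
```

This recovers field numbers and raw values, but not what any field means, which is exactly the part the .proto file would have supplied.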

Before I began, this was the worst fear: that everything was encoded with protocol buffers, and my attempts to mimic it would be rendered futile.


I borrowed a computer which ran Windows 7, and installed the Music Manager in hopes of deciphering the secret enigma of the great google. I needed a simple packet sniffer, and so I used NirSoft’s SmartSniff. I didn’t use Wireshark because it wasn’t my computer, and I didn’t want to download and install a huge app with an intimidating user interface. SmartSniff worked pretty well, and I thought I was almost there. Packet sniffers generally can’t decode HTTPS data without some MITM-esque certificate faking, and I have reason to believe that that wouldn’t work either, since running the strings command on the MusicManager.exe file turns up something that resembles a public key.

Anyway, I found two raw unencrypted HTTP requests, and sort of fixated myself on those two.

POST /uploadsj/rupio HTTP/1.1
User-Agent: Music Manager (1, 0, 12, 3443 - Windows)
Accept: */*
Cookie: SID=[redacted] path=/
Expect: None
Content-Length: 851
Content-Type: application/x-www-form-urlencoded

{"clientId":"Jumper Uploader","createSessionRequest":{"fields":[{"inlined":{"content":"jumper-uploader-title-19Redacted","contentType":"text/plain","name":"title"}},
]},"protocolVersion":"0.8"}

I noticed the second request was rather huge, 3556KB. Just enough to fit a normal-sized MP3. I knew what was in there.

PUT /uploadsj/rupio?upload_id=REDACTED&file_id=000 HTTP/1.1
User-Agent: Music Manager (1, 0, 12, 3443 - Windows)
Accept: */*
Cookie: SID=REDACTED path=/
Content-Type: audio/mpeg
Expect: None
Content-Length: 3141REDACTED

ID3…..%vTENC….@..WXXX……..TIT2……. Redacted

At this point, I was relatively ecstatic. Or at least, I will pretend that I was, because everything looked elegantly and remarkably simple. I had no idea where the upload_id came from, but I figured that it was part of the response from the first POST request. Part of the problem was that SmartSniff didn’t give me the actual responses to the HTTP requests, only the request data. Or at least not in the five minutes that I bothered using it. The cookie looked like a generic cookie that would exist in a normal browser session.


Since it used generic browser cookies, the simplest way to start would be to make a browser extension. Specifically, a Chrome App. That way, I could get the necessary permissions to do all the cross domain security protocol violations and probe around a bunch of requests. Since XHRs already include cookies for a specific domain, there’s nothing I need to do to set the cookies. I hoped and suspected that the User-Agent, Accept, and Expect headers were basically ignored, and the content-type would be rather trivial to set.

A lesson in premature optimization was the huge block of JSON which got sent with the first POST request. I had no idea what was really essential, so basically, I just deleted almost everything there except for what I thought might be really important: the file name and size. There were no errors in the POST request, so I figured it was okay to delete all of that stuff. I ran the POST and just as I suspected, the JSON that resulted included the upload_id, in fact, it included the entire PUT url, which was pretty nice.

{"sessionStatus":{"state":"OPEN","externalFieldTransfers":[{"name":"REDACTED.mp3","status":"IN_PROGRESS","bytesTransferred":0,"bytesTotal":314REDACTED,"putInfo":{"url":""},"content_type":"audio/mpeg"}],"upload_id":"AEdREDACTED"}}

Then, when I executed the subsequent PUT request, I was scared.

{"errorMessage":{"reason":"REQUEST_REJECTED","additionalInfo":{"uploader_service.GoogleRupioAdditionalInfo":{"completionInfo":{"status":"REJECTED","customerSpecificInfo":{"ResponseCode":404}},"requestRejectedInfo":{"reasonDescription":"agent_rejected"}}},"upload_id":"AEdREDACTED"}}

Scary. The rejection reason was “agent_rejected”, which made me think about the User-Agent header, and I wondered if that was supposed to matter. If that did matter, then I would have to prototype it in a different language, since XMLHttpRequest forbids setting the User-Agent and certain other headers for security reasons (even when operating in a privileged environment!).
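For anyone following along, the two response shapes above can be told apart with a few lines of code. This is only a sketch built from the two JSON samples shown in this post: the function name `parseSessionResponse` is invented, and the sample strings below use made-up placeholder values (including the `example.invalid` URL), since the real ones are redacted.

```javascript
// Pull the useful bits out of a session response, or the rejection reason out of an error.
function parseSessionResponse(text) {
  const data = JSON.parse(text);
  if (data.errorMessage) {
    const info = data.errorMessage.additionalInfo || {};
    const rupio = info["uploader_service.GoogleRupioAdditionalInfo"] || {};
    const rejected = rupio.requestRejectedInfo || {};
    return {
      ok: false,
      reason: data.errorMessage.reason,
      description: rejected.reasonDescription || null,
      uploadId: data.errorMessage.upload_id || null,
    };
  }
  const session = data.sessionStatus || {};
  const transfers = session.externalFieldTransfers || [];
  const first = transfers[0] || {};
  return {
    ok: true,
    uploadId: session.upload_id || null,
    putUrl: first.putInfo ? first.putInfo.url : null,
  };
}

// Placeholder samples with the same structure as the responses above.
const okText = '{"sessionStatus":{"state":"OPEN","externalFieldTransfers":[{"name":"song.mp3","status":"IN_PROGRESS","bytesTransferred":0,"bytesTotal":1000,"putInfo":{"url":"https://upload.example.invalid/put"},"content_type":"audio/mpeg"}],"upload_id":"EXAMPLE_ID"}}';
const errorText = '{"errorMessage":{"reason":"REQUEST_REJECTED","additionalInfo":{"uploader_service.GoogleRupioAdditionalInfo":{"completionInfo":{"status":"REJECTED","customerSpecificInfo":{"ResponseCode":404}},"requestRejectedInfo":{"reasonDescription":"agent_rejected"}}},"upload_id":"EXAMPLE_ID"}}';

console.log(parseSessionResponse(okText).putUrl);
console.log(parseSessionResponse(errorText).description);
```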

But thankfully, before attempting the drastic route, I pasted back in the alleged cruft, the jumper-uploader-title, TrackDoNotRematch, TrackBitRate, SyncNow, ServerId, MachineIdentifier, ClientID, AlbumArtStart (hey that rhymes!), and AlbumArtLength. Magically now, it worked. Yay.
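As a rough sketch of what "pasting the cruft back in" amounts to, the request body can be assembled like this. The `clientId`, `protocolVersion` and the shape of the title field come straight from the captured POST above. Everything else is an assumption: the helper names are invented, the sample values are made up, and in particular I am guessing that the other fields use the same inlined text/plain shape as the title field, which the capture shown here does not confirm.

```javascript
// One "inlined" field, in the shape the captured title field uses.
function inlinedField(name, content) {
  return { inlined: { content: String(content), contentType: "text/plain", name } };
}

// Build the JSON body for the session-creating POST.
// ASSUMPTION: extra fields share the title field's inlined text/plain shape.
function buildCreateSessionBody(title, extraFields) {
  const fields = [inlinedField("title", title)];
  for (const [name, content] of Object.entries(extraFields || {})) {
    fields.push(inlinedField(name, content));
  }
  return JSON.stringify({
    clientId: "Jumper Uploader",
    createSessionRequest: { fields },
    protocolVersion: "0.8",
  });
}

// Hypothetical values, only to show the call.
const exampleBody = buildCreateSessionBody("jumper-uploader-title-example", {
  TrackBitRate: 128,
  ServerId: "example-server-id",
});
console.log(exampleBody);
```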


During my quest to understand the bizarre agent_rejected error (because I totally fear rejection, they should have used a nicer word), I tried googling GoogleRupioAdditionalInfo. (I had googled “Rupio” before, since that was part of the POST url, but that didn’t yield any relevant results, it just ended up being a bunch of people’s names). Searching GoogleRupioAdditionalInfo yielded a more limited subset which happened to be more interesting.

The first result, from the SMEStorage Blog, described an error emitted by the Google Docs platform. The next result was on the Picasa help forums, about “Upload video results in Error”. Then there was a French language forum which mentioned an error which included GoogleRupioAdditionalInfo, this time on the url “”.

So, it seems “Rupio” is the codename (or just the name, but it’s much more fun imagining that there are subliminal codes everywhere) of a unified Google data upload/storage platform. That’s pretty cool. It probably makes sense that they have a sort of unified file storage system across Google Docs, Picasa, and YouTube, since it would probably be easier to maintain a system which is internally consistent. But to inject a sensationalist twist where Occam’s razor shows that it’s not justified, this is a hint at an upcoming unified Google file storage service. A sort of file-browser dashboard which links all media and document files together in an accessible and uniform manner.


It worked. Or at least, it seemed to work. Vaguely. At this point, I was reasonably satisfied and began sculpting a nice pretty interface for it. Following my usual way, I just stole the Tango icon for an audio file mime type and built the entire interface around it. I remembered that earlier that day, I had discovered something called Layer Styles, which is a photoshop gradient/style clone that instead generates CSS3 gradients. I thought that was cool and I remembered it, so I played around with it and made a centered rectangle. Gray on Gray.

I decided on a name for it: MusicAlpha. It’s a sort of play on how this Google product features Beta much more prominently than usual, with the gray “beta” the same size as the actual “music”. The Alpha sort of says something about how it will almost eternally be stuck in this unstable and incomplete form. It also reminded me of one of my favorite websites, Wolfram Alpha, and how the logo is stylized WolframAlpha or Wolfram|Alpha.

I wanted it to be minimal, but I’ve since adopted the belief that drag-and-drop file selection was only a fad. Sure, it’s often really useful, but it’s not great to restrict to that. A hybrid approach is much better, and if you have to have only one, then the standard browse button is better since you can drag and drop files to that button (on chrome, anyway) without any special code. Drag and Drop is often quite terrible on laptops, and I’m not sure if you can even drag and drop files with Chrome OS, since the file browser might be modal.

There’s not much to the interface, it’s probably as minimal as it can get. Two links, one to the service, the other to me. A button. That’s it.


At some point in every good story (not implying that this is a good story), there’s false hope, where the protagonist (not implying that I’m the protagonist here, either) believes that he or she (not implying that I’m a she) is closer to the end than is factually warranted (I’m actually implying that that’s exactly what happened). I checked the Google Music song list, and lo and behold, the song was there.

Sort of. Not really. Something was. Not sure it was a song.

There was a blank row. Double click it, and something does play. And it is the song. But, there’s no way to seek because incidentally, it doesn’t know how long the track is, so there’s no way to render the position of the song.

It was late, and this was frustrating, so I just posted a terse blog post announcing the beginning of the project:

> So pretty soon, I hacked together something that almost sort of worked, with one rather significant caveat: It doesn’t pick up any tag data, name, time, artist, album, etc. I can’t figure out why. I guess I’ll try some more tomorrow.

And by the way, I totally lied. I didn’t try at all the next day.

Yesterday, that is, May 13th, the delightful Friday the Thirteenth, I tried again. I installed Music Manager on a VMware installation of Windows XP, and sniffed the traffic with Wireshark running on the host computer. I didn’t notice anything new in the plethora of data which happened to get exchanged. I tried running SmartSniff from the VM, but every time I tried to start sniffing, VMware magically crashed.


At this point, I thought of looking at the MusicManager.exe binary. Usually, binaries have some random strings in them, which you can look at to get some hint of what the program does, without actually decompiling anything. I’m now going to proceed to anachronistically mention something about Android, because of a certain URL that I find out about later in this chronology: I ran strings MusicManager.exe | grep android, and found some rather interesting things:




Metadata.ParentalAdvisoryType

There appear to be 214 mentions of the phrase “skyjam” in the executable, which I assume is not some new solar powered Google sandwich filling distribution mechanism powered by artificial intelligence driven aeronautically mobile condiment dispensers (obviously using Arduinos).

I have no idea what Skyjam is, but that’s totally a much cooler name than Google Music, regardless of how big you write “beta”. And what’s wireless_android_skyjam? Why is skyjam so deeply tied with wireless and android? Maybe it’s got something to do with flying robot sandwich machines. Or not.

Maybe it has something to do with that Simplify Media company which Google bought last year. Or maybe I’m just a little sad that the product ended. According to TechCrunch,

> Google VP Vic Gundotra said that Simplify’s technology will be used to offer a desktop app that will give you access to all of your (DRM-free) media on your Android devices remotely, using Google’s new iTunes competitor on the web.

Desktop App: Check. Android: Check. New iTunes Competitor on the Web: Check. Simplify’s Technology: Not so much.

For those who aren’t familiar with what Simplify did (which I assume is nigh everyone), it streamed your music by making your desktop into something of a server, so your huge cache of music can be accessible by any mobile device remotely.


I had a random false lead. I thought that the server still read ID3 data by itself, and noticed that the HTTP dump was somehow slightly different from the original file. It was the same length. The exact same length, but some of the bytes were different. Most were the same, and they were all in the same place, but for example, all 0x00’s seemed to have been swapped with 0xe4’s. Some other bytes were also different, and in retrospect this was probably because the program which created the HTTP dump sucked and encoded all the bytes wrong or something. But I tried opening it up in Ghex and I swapped every 0xe4 with a 0x00, and magically the broken file now played, with one exception: the audio now sounded like a fishtank. It’s like everything’s just water, majestically moving around with exquisite sublimity.


I borrowed another Windows computer this time, and tried again. This time, I also tried the HTTP Proxy called Fiddler, which seemed to be able to capture more information, or at least give me more of the information that I would find useful and less of what I wouldn’t find useful. I noticed that for every file which was uploaded, there were in fact three requests, not two as I had previously observed.

So where had this covert HTTP request been hiding? Behind the magical realm of HTTPS/TLS encryption. I couldn’t peek into the actual content of the HTTPS requests because it was encrypted. And as the strings command had revealed earlier, there was a huge block of base64 encoded text which looked like a sort of public key. It suddenly connected: the Music Manager probably has its own list of trusted certificate authorities, and that’s why adding a mere certificate to Windows wouldn’t do a thing.

But the unencrypted CONNECT request, which initiates an HTTPS connection, said enough.

There was a request to and it was encoded with gasp: application/x-protobufs. If this was some sort of action movie, here would be the scene where the antagonist from the previous movie reveals that he is still alive (with some facial scarring) and the person responsible for all the disappearing bodies (of data).

I was about to give up at this point, but then a new idea struck.

A New Hope

This is the second movie-title article sub-title that I’m aware of, unless there’s a movie called “Packet Sniffing”. For the prospective filmmaker, this is when the protagonist has the flashback and remembers that time, long long ago (though here it’s only like four days, you can take creative liberties), when refining the art of hackery, a task involved crafting an implementation of a storage system from the web interface of a specific cloud drive service.

So, I noticed that I had spent so much time exploring what enigma lay beyond the surface of Google Music, that I totally forgot to actually use the service. And when I did, I realized that there was an Edit Song Info context button. Trying the feature out, it seems to be a simple JSON post to a certain modifyentries URL.


Amazing. I just stumbled on the magical code which would allow me to finish the project. Yay. Now I had to figure out what the xt= parameter means (after determining its necessity). My first hunch was to search the HTML source of the page, because the magical access codes for Amazon Cloud Drive were put inside a hidden input element. But no, it wasn’t there. Now the fear starts settling in: what if it’s a strategically computed magical super string? But why would they ever do that?

Somehow I noticed that every time the page is loaded, the network logs indicate a Set-Cookie: xt=AM-REDACTED. Now I knew how to get it: just send an XHR to the cloud player and copy the cookie.
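Once a cookie string is in hand, pulling the xt value out of it is simple enough. The sketch below is only the string-handling half: `cookieValue` is an invented helper and the sample value is a placeholder. How the string is obtained is left open, because page scripts generally cannot read the Set-Cookie header of an XHR response, so an extension would more likely read the stored cookie after the request completes.

```javascript
// Return the value of the named cookie from a "name=value; other=value" string,
// or null if it is not present. Works on a Cookie header or a single Set-Cookie line.
function cookieValue(header, name) {
  for (const part of header.split(";")) {
    const trimmed = part.trim();
    const eq = trimmed.indexOf("=");
    if (eq > 0 && trimmed.slice(0, eq) === name) {
      return trimmed.slice(eq + 1);
    }
  }
  return null;
}

// Placeholder value standing in for the redacted one.
console.log(cookieValue("xt=AM-example; Path=/", "xt")); // AM-example
```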


Now, the important part of all the items in the entries object is probably the id. Everything else seems somewhat generic, and probably exists for any song that you throw at the service, but id clearly has a definite, important, non-random, predetermined value. I thought that there was no correlation between the data passed with the upload and the id which was here, since the upload_id was way too long and had nothing in common with this song id.

Since the songs that get uploaded have no title, they always appear first on the alphabetical list of songs, so I figured I would just pull a list of the songs, alphabetically, and then take the first one to get the ID. Then I discovered the auto-playlists, which included one for most recently added.

Unless you have the power of omniscience, you don’t know that the format of the IDs here was like das8f7-adf0w8f-adf87we-mwere, because I slapped a huge REDACTED on the latter half. The random string generating code that I’ve used for years out of convenience is Math.random().toString(36).substr(2), which basically generates a random float, converts it to base 36 (0-9 a-z) and strips the leading “0.” from the string. This incidentally ends up being quite short (especially on chrome, it’s much longer on firefox), and lacks the dashes.
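Spelled out, that one-liner looks like this. The wrapper name `randomId` is just for illustration, and `slice(2)` is used in place of `substr(2)`, which does the same thing here.

```javascript
// Random float in [0, 1) -> base-36 string such as "0.k3j9x2..." -> drop the leading "0."
function randomId() {
  return Math.random().toString(36).slice(2);
}

const exampleId = randomId();
console.log(exampleId, exampleId.length);
```

The result only ever contains 0-9 and a-z, never a dash, and its length varies from call to call and between JavaScript engines, which is what made these IDs recognizable.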

So I looked at the ID for one of the streams and noticed that the id string was unusually short. I started wondering why something like that would happen, and then it struck me, not quite like lightning because I’m still alive (though whether or not I am when you’re reading this is a totally different question). This file ID was the same thing as the ServerID which I thought was just a random useless value that was arbitrarily required for it to work. I deleted the whole getRecent routine and suddenly everything started to make sense.


You probably don’t remember my old blog post, almost a year ago, titled A Bright Coloured Fish: Parsing ID3v2 Tags in Javascript and ExtensionFM. Recently, by fate, I happen to have used that code more times than I’m comfortable with, and it seems that once again I must revive this old memory. It’s fairly outdated, and some things are very nasty. Not being able to parse ID3v1 is pretty terrible, though I don’t think I have any songs using v1.

I hoped that Google’s magical server farm would have parsed out the ID3 data, but that doesn’t seem to be the case. I have to manually parse them in the client.

I had to change the id3v2 library a slight bit: the picture parsing code now includes a blob attribute, since the album art must be uploaded to Google’s servers in order to be included in the metadata.

I guess that fishtank sound was an omen. I’m sure the prospective filmmaker could use it in some way like this.


Rather auspiciously, the modifyentries command included durationMillis. Since it had become evident that the music player app had no way of knowing the length of a song without some tag data, and that it would be impossible to seek to a certain portion of the song without it, this was really important. But since this was from the API for a user-facing operation, and the length of a song isn’t something that a user can or should be able to edit, it was pretty fortunate that it was included as part of the API.

This wasn’t without problems though. It’s not that easy to measure the length of a song. The Length tag of an MP3 file is actually seldom included, so that clearly can’t work. I looked at how to measure the length, and it was pretty scary, since it doesn’t seem like there’s any way other than parsing all the frames out and multiplying the bitrates.
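For a sense of the arithmetic involved, here is the crude shortcut that only works for constant-bitrate files: divide the number of audio bits by the bitrate. This is not what the app ended up doing, and the function name and the 128 kbps figure are assumptions for the example; the 3556 KB size is the upload mentioned earlier.

```javascript
// Estimate duration for a constant-bitrate stream.
// 1 kbps is 1000 bits per second, which is exactly 1 bit per millisecond,
// so bits / kbps gives milliseconds directly.
function estimateDurationMillis(audioBytes, bitrateKbps) {
  if (!(bitrateKbps > 0)) {
    throw new RangeError("bitrate must be a positive number");
  }
  return Math.round((audioBytes * 8) / bitrateKbps);
}

// A 3556 KB file, assuming (hypothetically) 128 kbps throughout.
console.log(estimateDurationMillis(3556 * 1024, 128)); // 227584
```

The estimate drifts for variable-bitrate files and is inflated by tag and album art bytes, which is why walking the frames (or letting the browser do it) is the more reliable route.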

But this time, HTML5 comes to the rescue. I never thought I would be so happy to use . I just take the file, convert it to a blob URL through that nest of if/else statements that handles different versions of chrome. I open it with new Audio and wait for the metadata event, indicating that I now know the length of the song to thirteen places after the decimal. Because of course, I would hate to miss the last picosecond of my song.
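In today’s browsers the approach described above can be sketched roughly as follows. This is not the original code: it assumes the unprefixed `URL.createObjectURL` (so none of the if/else nest is needed), the function names are invented, and `measureDurationMillis` only works where `Audio` and `URL` exist, meaning in a page or extension rather than a plain script runner.

```javascript
// Convert the element's duration (seconds, with far too many decimals) to whole milliseconds.
function toDurationMillis(seconds) {
  return Math.round(seconds * 1000);
}

// Browser-only: resolve with the duration of a File or Blob in milliseconds.
function measureDurationMillis(file) {
  return new Promise((resolve, reject) => {
    const url = URL.createObjectURL(file);
    const audio = new Audio();
    audio.preload = "metadata"; // the duration is all that is needed, not the audio data
    audio.addEventListener("loadedmetadata", () => {
      URL.revokeObjectURL(url);
      resolve(toDurationMillis(audio.duration));
    });
    audio.addEventListener("error", () => {
      URL.revokeObjectURL(url);
      reject(new Error("could not read audio metadata"));
    });
    audio.src = url;
  });
}

console.log(toDurationMillis(227.5840000000001)); // 227584
```

In a page, `measureDurationMillis(fileInput.files[0]).then(...)` would hand back a value ready to use as durationMillis, where `fileInput` is whatever file input element the page has.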

I felt sad for the little femtoseconds and attoseconds who were neglected by Google when they decided that the song lengths should be milliseconds. Not quantized divisions of Planck time.


At this point, nothing interesting really happened. I fixed up the interface, added queueing, and other insignificant things. I published it and spent the rest of the day writing a hopelessly long document describing it.


For those who want to build their own programs which upload to Google Music, I guess I’ll write a summary of how it works.

  1. Get Song Metadata (ID3 data, Duration, primarily)
  2. Login to Google
  3. POST to
  4. PUT to the putUrl returned in the response to the previous operation
  5. Load and capture xt value from Set-Cookie response header
  6. (Optional) Upload Album Art to
  7. Set file metadata via

Friendly Reminder

To get the app described in this lengthy narrative, click here.

Cloud Save 1.3 10 May 2011

So, I finally updated my most popular browser extension: Cloud Save. Its adoption has always sort of puzzled me, as it was originally simply a 30 minute hack on drag2up. So I’m trying to be careful not to introduce any huge changes, for fear of alienating that magical use case that everyone else seems to have stumbled upon.

Nevertheless, 1.3 introduces lots of useful features. First and foremost is the introduction of a progress bar. I really can’t imagine using it without the progress bar or some other indicator - like how it was in the last version (yet how people still found it useful absolutely bewilders me). The progress dialog is awesome and probably the most significant feature, and after that was just a few backend changes which enabled this.

Published retroactively on 10/22/11

MP3 Player 26 March 2011

This is my take on an MP3 player. I would consider this one of my better designs (I’m no designer or artist, and that sort of shows). It’s fairly minimalistic, a product both of my design and the fact it was created in a few hours. But this isn’t about design. No, it’s a music player that operates entirely in your browser with files stored on your hard drive.

Here’s basically how it works. There’s an <input type=file webkitdirectory> so you can go and select your music folder. It gets a list of all files, reads the first 128 kilobytes of each mp3 file and parses it for ID3 tags, constructs a library and makes it searchable.
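The first step of that parsing can be sketched in a few lines: take the first 128 kilobytes and check for the ten-byte ID3v2 header. The header layout ("ID3", two version bytes, a flags byte, and a four-byte size with seven useful bits per byte) is from the ID3v2 spec. The function name and the returned object shape are invented for this example and are not taken from the player’s code.

```javascript
const HEAD_BYTES = 128 * 1024; // only the first 128 KB of each file is examined

// Read the ID3v2 header from the start of a file's bytes, or return null if there is none.
function readId3v2Header(fileBytes) {
  const head = fileBytes.subarray(0, HEAD_BYTES);
  const hasMagic = head.length >= 10 &&
    head[0] === 0x49 && head[1] === 0x44 && head[2] === 0x33; // "ID3"
  if (!hasMagic) {
    return null;
  }
  // Synchsafe integer: four bytes, top bit of each unused.
  const tagSize =
    (head[6] & 0x7f) * 0x200000 +
    (head[7] & 0x7f) * 0x4000 +
    (head[8] & 0x7f) * 0x80 +
    (head[9] & 0x7f);
  return {
    version: `2.${head[3]}.${head[4]}`,
    flags: head[5],
    tagSize, // size of the tag body, excluding the 10-byte header
    fitsInHead: 10 + tagSize <= head.length, // false if the tag runs past the bytes we have
  };
}

// Tiny fake file: an ID3v2.3 header declaring a 257-byte tag, padded with zeros.
const fakeFile = new Uint8Array(300);
fakeFile.set([0x49, 0x44, 0x33, 3, 0, 0, 0, 0, 0x02, 0x01], 0);
console.log(readId3v2Header(fakeFile));
```

The `fitsInHead` flag is there because a tag with large embedded album art can easily exceed 128 KB, in which case some frames would be cut off.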

The interface is composed of four main items: a huge search bar, the music library, a playlist and the audio controls. It’s pretty self explanatory.

Here it is:

It should work on Chrome, it might work on Safari, and will partially work on Firefox 4 (No directory select, no MP3). Definitely won’t work on IE9 and probably won’t work on Opera.

Cloud Save 07 March 2011


I didn’t really want to write a post about yet another chrome extension, as the last five posts have somehow or another related to Google Chrome. Actually, the post I was planning to write before this was “Why the Chrome Webstore is broken”, which would be sort of less fanboy-ish. Anyway, this extension is rather simple, so I’ll probably go into the reasoning as to why I made it, where it might be headed and how I made it. There probably won’t be too much interesting information here.

I wanted a Cr-48. Why? I’m not sure, partially because I don’t actually have a laptop of my own, though my brother’s Macbook Pro (which I’m typing this post on) is pretty awesome. Plus, the platform is new, non-intimidating, more or less open, and there’s such a lack of the most basic tools that I could probably get a few twitter followers by creating web apps which did things that are really basic yet which the web is somehow lacking. Things like an offline dictionary or wikipedia dump reader. So, Chrome OS seemed cool, and probably guaranteed bragging rights, at least more so than a Google TV probably would. Due partially to my age, I’m pretty scared of using money and have this feeling that I shouldn’t spend anything on anything more than a can of soda. I guess I’ve gotten off-topic enough, and so, I wanted a Cr-48 (for free, of course).

In my opinion, Google’s pretty good at copying Apple. I don’t mean that in a bad way. I wouldn’t say it’s the intention, but at least they can recognize a good feature and can copy the essence of it in a pretty functional way while for the most part, distancing from the less good parts. Unlike my feelings of what Microsoft would do, which would be to copy most of the bad parts wholesale and add some pretty fascinating and novel parts. So, if any company were to give me a free laptop, that would be awesome, but Apple certainly isn’t going to give things away, and Google’s the only company I think can properly copy the trackpad (though it appears they can’t even do that, from what I’ve read).

So, that’s probably random enough, and you’re wondering how this relates to anything at all. Well, part of my quest to attain a Cr-48 involved building some pretty interesting pieces of software targeted at chrome os (but not by any means exclusive to chrome os). This included the offline dictionary and wikipedia reader. That way, if I didn’t get a Cr-48, I could have an excuse to hate Google and I might be less frequently arguing in their support. But this backup plan failed (fortunately), and I won the LucidChart Cr-48 competition by drawing a picture of a Cr48 out of flow chart components.

I started using Google Groups because I could. I wasn’t spammed by google in the great spamming of some time in February, which means Google hadn’t magically picked me to have one of those devices (I think this was before I won the LucidChart competition). So I later joined the non-involuntary group for chrome notebook pilots so I could eagerly await the knock on my door from UPS and be prepared for what to do when that happened.

I skimmed through tons of random posts and eventually I noticed a pattern. People hated the file system and wanted a way to basically get rid of it. The irony is that this new Cloud that is being created is a static collection of walled gardens. So much for progress. There’s no standard for interoperability and it hasn’t really been too important, but somehow, because of Chrome OS’s probably bad file system, people are recognizing that it isn’t right to have an intermediary step to get data from one application to another.

I’ve always held that Browsers are to improve the user experience as much as possible while keeping all of the internet on a balanced and equal platform. I felt that Extensions were the means to trigger change to a specific group of websites or a general heuristic in order to make a more perfect experience. I thought of that while making drag2up, which creates a novel and useful feature which should be used by everybody. As part of building it, I ended up with a sort of OCD toward creating an implementation of every imaginable file host.

Cloud Save’s heritage is probably as much owed to drag2up as it is to Clip It Good, the latter of which I’ve never actually used, but found it inspiring nonetheless (and I ripped the Picasa implementation out of it too). Clip It Good was the general idea for Cloud Save, except that Cloud Save had more hosts. I made Cloud Save in thirty one minutes and thirty nine seconds, give or take a minute or so. The fact it was made in a mere half hour shows how the idea isn’t novel at all. In fact, most of those minutes were spent setting up the directory structure, manifest, installing inkscape, downloading the tango icon set and unzipping the icons to steal the save action icon (much like how I stole the up arrow icon used as drag2up’s logo). Nearly all of Cloud Save was the code needed to create the context menu. The downloading from URL, authentication and upload stuff was already in drag2up (I myself was pretty impressed about this. Evidently, I forgot how many features I had put into drag2up.).

Cloud Save wasn’t meant as a sort of glorified bookmark system. Or as a means to politely reshare images without hotlinking. I thought the need was to bypass the physical filesystem. That’s why the application is targeted primarily toward services which provide a virtual file system: a directory structure, files, privacy, etc. It wasn’t ever really intended as a means to share files, but I guess this is what people want it to do, so I’ll probably make the extension more sharing oriented in the future.

I realized I just forgot the rest of what I was about to write about, so I guess I’ll end it rather abruptly here. This morning, at 11 AM (though I don’t know if this date has been adjusted for my time zone) when I was still probably in school, Lifehacker posted about it and now people are using it. Awesome. I didn’t expect this to be that significant.

drag2up 2.0 - drag and drop uploading for all sites 26 December 2010

drag2up is a browser extension I built a few months ago, and I recently bumped it up to 2.0. The basic idea is to use the HTML5 File API and the drag/drop integration that Chrome and Firefox implement to enable uploads to any website by simply dragging and dropping the file from your computer onto the respective site. Compare that with the usual trouble: opening a new tab, navigating to your favorite file provider, waiting for it to load, pressing the browse button, navigating to the folder with your image, pressing “Open”, then hitting the submit button, waiting for the upload to finish, copying the link, finding the original tab among the mess of tabs that fills your tab bar and finally scrolling down, selecting the box and pasting the URL. All to share a three megabyte file. drag2up streamlines the process into a single, swift gesture where you drag the file onto the text entry field. The link is instantly added while the file is being uploaded in the background. If someone navigates to the link before it’s done uploading, the page uses the Google App Engine Channel API to stream real time upload status.

It still has a pretty simple user interface that works with zero initial setup. In addition to using gist and imgur for text files and image uploading respectively, it includes many additional services, configurable through a simple drag and drop interface. The new version also sports Firefox support.

New Features

  • Firefox + Chrome
  • Background Uploading
  • imgur, ImageShack
  • Flickr, Picasa
  • gist, pastebin, mysticpaste
  • Chemical Servers, DAFK, Hotfile
  • Dropbox, CloudApp
  • Built in URL shorteners

Right now, the only service that doesn’t quite work is Dropbox. The application hasn’t been approved for production status yet. And a number of the hosts do not work with Chrome 8 because of typed arrays and binary XHR issues.

Try out the extension now :) I would really appreciate any and all feedback.

Technical Information

There’s some pretty cool stuff going on in the new release. This release is really close to the bleeding edge of what’s possible on the web and with browsers. On Chrome, to upload files, the multipart/form-data XHR is being pieced together with array buffers and typed arrays, stuff from the WebGL specification. The Firefox version is based on the latest beta release of Mozilla Jetpack (hacked slightly so that the resulting xpi works with 3.6 as well as 4.0). Transferring data between the background page and the individual content scripts (or pageMods in Jetpack’s terms) is done with the createBlobURL (also called createObjectURL) function and binary XMLHttpRequest. On older versions of Firefox, FileReader is used to load files into base64 encoded data URLs. Inter-frame communication is done with postMessage and native JSON serialization and parsing functions. Picasa and Dropbox support are built on Javascript implementations of OAuth (based on Clip It Good and Chromepad). So, yeah, there’s lots of new and super cool stuff in there.
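For the curious, the typed-array trick for multipart uploads looks roughly like the following. It's a minimal sketch under a few assumptions, not the code drag2up actually ships: buildMultipartBody, the field name and the boundary string are all made up for illustration, and TextEncoder is a modern convenience that wasn't around in Chrome 8.

```javascript
// Hypothetical helper: assembles a multipart/form-data body as raw bytes.
// The field name, file name and boundary are illustrative, not drag2up's own.
function buildMultipartBody(fieldName, fileName, mimeType, fileBytes, boundary) {
  const encoder = new TextEncoder();

  // Part header: boundary line, disposition, content type, blank line.
  const head = encoder.encode(
    '--' + boundary + '\r\n' +
    'Content-Disposition: form-data; name="' + fieldName +
    '"; filename="' + fileName + '"\r\n' +
    'Content-Type: ' + mimeType + '\r\n\r\n'
  );

  // Closing boundary that terminates the whole body.
  const tail = encoder.encode('\r\n--' + boundary + '--\r\n');

  // Copy header, file bytes and trailer into one contiguous buffer,
  // so binary data never has to pass through a string.
  const body = new Uint8Array(head.length + fileBytes.length + tail.length);
  body.set(head, 0);
  body.set(fileBytes, head.length);
  body.set(tail, head.length + fileBytes.length);
  return body;
}
```

In a browser you would set the Content-Type request header to multipart/form-data with the same boundary and hand body.buffer to xhr.send().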

The content script was more or less rewritten in order to support Gmail, which led to some interesting design changes. Firstly, the new version establishes a sort of hierarchy, a difference between a root frame and the most subordinated one. Each drag event is sent to the top using postMessage, where the root frame decides whether to render the targets or to remove them. Once the decision is made, it trickles downward to each of its immediate frames and then on down to each subordinate frame. Whenever a frame is created, a loop detects that one has been added and attempts to access its DOM in order to insert a script element that initializes the code. Interestingly, content scripts (in Chrome) are unable to access the child frames, so a script needs to be inserted into the immediate page in order to insert a script into a child page. The scripts also set document.__drag2up to ensure there aren’t any frames that load themselves twice. (Interestingly, creating an event, dispatching it and seeing whether an event listener calls preventDefault isn’t a reliable indicator. Ideally I wanted a system that was mostly undetectable by the parent page, but I eventually settled for this simpler and more reliable implementation.)

Once the “trickle_reactivate” message (every message happens to be eighteen characters, because of certain really weird design ideas) is received by all the frames, they loop through all the elements on the page, searching for elements that meet the isDroppable criteria. It does not trigger on elements that have a width*height < 100, so no really tiny boxes. It needs to be an input element of type text or a textarea that isn’t read only or disabled, or it needs to be an element whose contents are editable. Then it checks the positioning of the element using the document.elementFromPoint function twice (once accounting for scrolling and once not, to differentiate fixed positioning from others). The drop target is a div with a massive style attribute. It has rounded corners and CSS transitions, and is positioned either absolutely or fixed based on the positioning of the associated input element. It’s centered around the element or completely covers it for smaller areas. When the user hovers over it, the opacity changes (the behavior is actually reversed from the last version). The original version faded out slightly when hovered over, which, after some thought, I realized made no sense. Similar to the drop target is a purple settings box on the lower right corner. I’m actually quite displeased with it. It’s not a great user experience, but at least it’s noticeable. Primarily, it was added because I have no idea how to get the Preferences button to work with Jetpack and Google manages to bury the Options button somewhere deep within the depths of Chrome’s single menu.
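The size and element-type rules above boil down to something like this. It's only a sketch of the criteria as described, not the extension's real isDroppable: it assumes the element exposes offsetWidth and offsetHeight, and it leaves out the elementFromPoint positioning check entirely.

```javascript
// Sketch of the droppability rules described above. `el` is assumed to be a
// DOM element, or a plain object with the same properties (handy for testing).
function isDroppable(el) {
  // No really tiny boxes: skip anything with an area under 100 square pixels.
  if (el.offsetWidth * el.offsetHeight < 100) return false;

  const tag = (el.tagName || '').toLowerCase();
  // An <input> with no type attribute behaves as a text input.
  const isTextInput = tag === 'input' && (el.type || 'text').toLowerCase() === 'text';
  const isTextarea = tag === 'textarea';

  if (isTextInput || isTextarea) {
    // Text fields qualify only if the user can actually type into them.
    return !el.readOnly && !el.disabled;
  }

  // Anything else must be an element whose contents are editable.
  return el.isContentEditable === true;
}
```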

Once the file is dropped, the content script decides whether it was a file that was dropped or an image that was dropped from some other page. If it’s an image (because that’s the simpler case to deal with), it does something that’s actually slightly counterintuitive. Using getData('text/uri-list') isn’t reliable, as in many cases the image is wrapped in a link tag and that only gives the link URL instead of the image’s. So instead, it reads text/html, inserts it into a temporary div and pulls out the source attribute, which is then sent to the background page. If the thing that was dropped is a file, it first checks if the browser supports one of the Blob URI creation methods and, if that’s available, uses that to create a URL. If not, the file is read using FileReader as a base64 encoded data URL and sent to the background page. But not immediately, because often the script is running in an unprivileged environment, so it does a postMessage to the topmost parent, which should be privileged. The privileged content script sends the message to the even more privileged background page that initiates the actual uploading process.
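To make the dropped-image case concrete, here's a rough sketch of pulling the image source out of text/html. Note the swap: the extension inserts the markup into a temporary div, while this version uses a regular expression so it can run without a DOM. The function name imageSourceFromDrop is hypothetical, and entity decoding (like &amp;amp; in URLs) is not handled.

```javascript
// Hypothetical helper: finds the first <img> src in a dropped HTML fragment.
// A regex stands in for the temporary-div approach so this runs without a DOM.
function imageSourceFromDrop(dataTransfer) {
  const html = dataTransfer.getData('text/html') || '';

  // Match src="...", src='...' or an unquoted src value on the first <img> tag.
  const match = /<img\b[^>]*?\bsrc\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/i.exec(html);
  if (!match) return null;

  // Exactly one of the three capture groups is filled in.
  return match.slice(1).find((value) => value !== undefined);
}
```

For an image wrapped in a link, this returns the image's URL where text/uri-list would have given the link's.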

Before going into what the background page actually does, there’s the settings page. I’m actually pretty proud of how it turned out. With the large number of hosts that are supported, I needed a way for the user to select hosts for different types of files. I could have used select boxes, but they’re not generally great when dealing with image based concepts, because a company’s logo is often more recognizable than the name (odd that I’m saying that, seeing as this blog has neither a distinctive name nor even a favicon). So instead, I built this little grid where the user can drag hosts into boxes on the right. It’s made with jQuery UI, because I don’t particularly like the APIs provided in HTML5.

The background page handles the uploading. Basically, there’s a bunch of javascript files that are loaded into it. The hardest part was, by far, OAuth. If anything, this project has only refined my dislike of OAuth. And because this post is already insanely long, I’ll go and rant about the problems of OAuth anyway. There’s already a great article on how terrible of an idea it is for OAuth to be used in any application that runs code on the client. But Photobucket, Picasa and Dropbox still don’t understand that (or don’t care) and only provide OAuth. But that’s not really a problem. I just hate OAuth because even though it’s a standard, each service provider has some little quirks that take hours to debug and end up being extremely frustrating. Maybe the OAuth gods are angry with me or something, but it’s incredibly troublesome.

Try out the extension now :) I would really appreciate any and all feedback.

drag2up – Drag files onto any site 01 October 2010

Drag2Up is an app named horribly, as I suck at naming things. It’s very simple in terms of capabilities - no buttons, no interface, which is pretty much the ideal type of user experience, something that’s intuitive and lends focus to content, not chrome. Really, this post isn’t useful in any way. What you should do is install it and try it out for yourself (Chrome only. It would be possible to make a Firefox version, but I have no experience with Firefox extensions, and the new Jetpack isn’t quite out yet, though you can probably adapt the code relatively easily. Maybe a Greasemonkey script by using GM_XHR).

Drag and drop is probably one of the greatest forms of interaction ever; it’s intuitive and exemplifies the usefulness of a GUI. Recently, browsers have implemented drag and drop APIs (actually a feature of IE5.5+ but then codified into part of HTML5). Some browsers, namely Chrome and Firefox, implement a dataTransfer property on the drop events that allows one to access files dragged from the user’s desktop (or file browser) onto a web application. The application can read the file through the File API, and do whatever it chooses. Probably most notable is Gmail’s utilization of the feature, which allows dragging and dropping files from the desktop onto a message to upload them as attachments. When it was announced, some people remarked how excited everyone was about a feature that had been present in desktop clients for several decades already.
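As a small illustration of the dataTransfer property, a drop handler can pull the dropped files out like this. It's a minimal sketch: filesFromDrop is a made-up name, not something from drag2up, and a real page also has to cancel the dragover event for the drop to fire at all.

```javascript
// Hypothetical helper: returns the files carried by a drop event as an array.
function filesFromDrop(event) {
  // Stop the browser's default behavior of navigating to the dropped file.
  event.preventDefault();

  // dataTransfer.files is an array-like FileList; it is empty (or the
  // property is missing) when something other than files was dropped.
  const list = event.dataTransfer && event.dataTransfer.files;
  return list ? Array.from(list) : [];
}
```

In a page this would be wired up with something like element.addEventListener('drop', function (e) { var files = filesFromDrop(e); }), after which each file can be read with FileReader.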

Browser vendors have the role of enabling developers to do cool stuff. It’s up to a web developer or service to maintain an application, to update it and add new features. Undoubtedly, many do, living on the bleeding edge of innovation, but most don’t, and stay locked in a certain state for years if not longer. Extension developers have no obligation to be entirely objective in terms of how they deal with websites, and many extensions are site-specific, adding features to sites that lack them, integrated into the interface, utilizing product-specific APIs or document structure. Browser vendors enable innovation, site developers integrate standards and browser extensions add features to sites, because the developer is unwilling or because of restrictions on the functionality of web apps (x-domain, persistence, etc).

Uploading images is a fairly common task, and the standard routine is a chore of go-to-file-host-site, click-the-attach-button, browse-button-hitting, navigation-through-folders, waiting-for-page-to-reload-after-uploading, link-copying, tab-switching and pasting. Some applications like CloudApp look promising in alleviating this needlessly tedious process, but only reduce the interaction steps a little. The ideal would be making that third-party file upload service integrate seamlessly into all web applications, with a level of integration akin to Gmail’s drag and drop.

As awesome as it sounds, it’s not as easy to implement as one might like. The basic idea was to detect when a drag event started, and to iterate over all the elements, adding an absolutely positioned, semitransparent mask to indicate that it’s a drop target. A few issues occur with that; I have a little writeup on them.

A lot of WYSIWYG editors use iframes, and creating content scripts with all_frames: true doesn’t necessarily mean it runs on all frames (for some very odd reason). And also for some reason, there’s no way to access frames from window.frames in content scripts. So I have this hack where, if there are frames, it injects some javascript into the real page, executed in the page’s context, to run it within the iframe (if it’s the same domain, as it always is in WYSIWYG editors). Then events get propagated from the sub-frame into the parent window through postMessage, which is subsequently handled by the content script and forwarded to a background page using chrome.extension.sendRequest, and it all happens in reverse once the server gives a response. As such, it’s a really bad idea to use this for large files, since the data needs to be passed around several times over several pages, and apparently V8 isn’t the best with big strings.

Aside from that, uploading to imgur for some odd reason takes a long time. I don’t like sites that don’t have simple JSON POST APIs, and there really aren’t many that fit that criterion. So with the initial release, images are uploaded with imgur and text files are all uploaded to a github gist.

stick figure animator 22 June 2010

One thing the ajax animator’s pretty bad at is stick figures. Sure it’s not impossible, but it can’t really compare with the ol-fashioned frame-by-frame joint-manipulation likeness of Pivot. It’s called stick2 because the original experiment with stickfigures was named stick.html, and when I went to extend it and didn’t feel like setting up a git/svn repo, I copied the file and named it stick2.html, and with no good project-naming skills, it stayed that way.

Anyway, this was a project that got pretty close to completion in early March, but I never bothered to blog about it until now. It should work pretty not-bad on an iPad :) (except the color picker), though honestly, I’ve never tried it.

The interface is pure jQuery/html/css. The graphics are done with Raphael, but the player actually uses <canvas> for no particular reason.

Basically, it’s organized into two panels, the left-side figures-box and the bottom timeline. The figures-box contains figures (amazing!) and clicking on them adds them to your canvas. The two defaults are the pivot-style stickman, and something called “blank” which is a root node with no additional nodes. Though it shows up as an orange dot, unless you add something to it, it doesn’t have any actual look when viewed in the player.

On top are the context-sensitive buttons. Well, the buttons in my screenshot aren’t context sensitive, they’re permanent. But when you click on a node, a new set of buttons (and words too!) appears. One is a line and the other is a circle. Click them to add a new segment or circle to the currently selected node. Then there are various settings for the current segment (each node other than the root one is associated with a segment). Clicking those allows you to modify them. Also, a red X appears on the right, and that basically means remove the node and the child nodes.

So, now you have some extra nodes, how do you change them? Simply hold one down and drag, and the segments move as well. But note that the length of the segment doesn’t change as you move it. That’s because by default, it locks the length of the segments. There are two ways to get around it. The first is to hold shift while dragging. The second is to tap the little lock icon on the top left.
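The length lock is just a bit of geometry: the dragged node gets pulled back onto a circle around its parent joint. Here's a rough sketch of that idea, assuming simple {x, y} points. The dragNode function and its parameters are hypothetical and not lifted from stick2's source.

```javascript
// Hypothetical sketch: where a dragged node lands relative to its parent joint.
// With the lock on, the node stays exactly `length` away from the parent and
// only the angle follows the pointer; with it off, the node follows freely.
function dragNode(parent, pointer, length, lockLength) {
  if (!lockLength) return { x: pointer.x, y: pointer.y };

  const dx = pointer.x - parent.x;
  const dy = pointer.y - parent.y;
  const dist = Math.hypot(dx, dy);

  // Pointer sits exactly on the parent: the direction is undefined,
  // so arbitrarily point the segment to the right.
  if (dist === 0) return { x: parent.x + length, y: parent.y };

  // Rescale the parent-to-pointer vector to the segment's fixed length.
  const scale = length / dist;
  return { x: parent.x + dx * scale, y: parent.y + dy * scale };
}
```

Holding shift or toggling the lock icon would then amount to passing false for lockLength.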

On the bottom is the timeline, with live previews of your frames over a semitransparent gray backdrop of numbers. Switch between frames by clicking on them and add one at the end by hitting the green “Add new frame” button.

On the canvas, there are two yellow squares; those allow you to resize the canvas.

On the very left of the top toolbar is the play button. Hit it and the figures toolbox minimizes and it plays out your animation. Click it again to get back. Next is a little upload button. Hit it and a little box pops up with a link where you can find your animation, both to share and to edit (not actually edit, but more like fork, as each save is given a unique id). After that is the download button, which brings up a big prompt-box where you paste in the ID of an animation you (or someone else) have saved, so you can edit it. Most of the time that’s useless, as when you send a link with the player, it has a button which says “Edit”.

Sample animation:

Try the application out: