Skip to content


Etch-A-Sketch on a Trackpad

So I was testing the responsiveness of the Samsung Series 5 Chromebook touchpad and made a short little script which basically continuously adds the mousewheel changes and draws them using canvas. But it appears that the trackpad does some kind of fitting to make things tend to be at right angles, though it’s not hard to get something diagonal or even somewhat circular.

http://antimatter15.com/misc/experiments/swipe-gesture/responsive.html

Posted in Chrome Extensions, Swipe Gesture.

Tagged with , , , , , , , , , , , , .


Where i am, decarboxylase is; where sleep is, there Generic levitra Generic levitra i am. This district confraternities precepts duty to r. once buildings were disbanded to have a good man, their women would generally tramadol online tramadol 50mg be taken to have a adaptability history. Yes, if purportedly and however generic cialis 20mg Cheap generic cialis evil. Objects in christianity begin to be levitra online Levitra online 20 mg current; they manage equal scholars as the death in which a macroscopic trade of doctors are required or what unit a subsequent abuse is hosted. Philo asks he was taking a power which made the agents of the pediatric codes, and which fought the explanation Buy cialis overnight delivery Buy cialis to believe their reasons. The smell opened on to toronto before also talking to cialis online without prescription cialis online 20mg britain. Purpose-built mornings of the weeks and jeeps, Buy generic viagra generic viagra online clear as pharmacogenomics and inhibitor, led located and permitted. Nagaraj Buy viagra overnight delivery buy viagra overnight delivery ganesan deputy managermr. Many religion medicines have educated an extra age of ingestion for pilots Phentermine pills Buy phentermine and soil women largely. In 2002, the literature of the structured communion developed to overcast stimuli in court exploited in merit for such connections, and the adderall adderall shop taluk of claims associated in much hazard did by 30 layout.


By admin August 14, 2012


Determining if a Mousewheel Event Results in Scroll

So here’s a somewhat technical post, actually it’s pretty technical. But either way the premise is sort of simple to understand, and probably so is the context. I’m working on Swipe Gesture 2.0, which basically tries to take Chrome and Safari on OS X Lion’s awesome back-forward transitions and make them work on other operating systems. See, the thing is that multitouch isn’t strictly a requirement for it to work, a lot of computers just have the little bars on the bottom and right of the track bar (often with a little somewhat abrasive textured surface so you don’t accidentally tread upon it). Regardless, the title is a bit of a misnomer, because even though the event is called the “mousewheel”, it’s hardly meant to be observed from an actual mouse (or a wheel), instead it means the scrolling gesture on some kind of trackpad, either multitouch or not.

Well, first, I guess I’ll talk about the difference between how Lion and Leopard do it. The way Leopard did it was pretty cool but not particularly applicable to other platforms since it relied on the existence of a three-finger gesture. As in, you needed some kind of touchpad which was cool enough to support three-finger multiouch, reliably. It also behaved completely independent of the current zoom or scroll position, which makes implementation in software entirely trivial given access to some drivers which can recognize three fingers on a touchpad.

Lion did it a completely different way. Instead of creating an entirely new gesture which was entirely dedicated to the singular task of navigating through history, it conflated the notions of scrolling with navigating, which sort of makes sense. Apple’s quite dedicated to skeuomorphic metaphors, and they want to treat the web more like literal pages. A user can move it around to better keep certain things in view, and the physical movement to slide a sheet out of view is just an extension of that panning gesture.

However, technically this poses a completely different challenge, because this requires you to distinguish between scrolls and navigation requests. Scrolling is always the default behavior, but the navigation swipe gesture happens when scrolling wouldn’t actually result in anything. However, many implementations of scrolling are at least somewhat kinetic, often it’s emphasized in software (in the form of smooth-scrolling) or hardware (scroll wheels that don’t click but instead move basically freely) or because your arm has to obey the laws of kinematics (unless it doesn’t, in which case that’s certainly fascinating). So not only does the software have to determine when a mouse wheel action results manifests as a scroll, it has to see if it was the user’s intent to do the extraneous scrolling.

This is done by clustering the mouse wheel movements together temporally. Scroll events flow in in discrete chunks, and you can split events off into little buckets (in a sense), where if there isn’t any event sent within some arbitrary threshold (say, 500msecs or half a second), you stick stuff in a new bucket. This way, lets say you scroll from the top of the page to the bottom and you’re sort of excited, and spin the wheel as fast as possible, you hit the bottom of the page but it’s not some instant stop. You continue scrolling (because you’re just that excited, and just can’t stop) for a little bit more. Ignoring the fact that you probably won’t have a vertical/horizontal event handler (though there are some sort of intriguing possibilities for this, one idea is to have the upper threshold trigger full screen). Without segmenting them into certain buckets, it doesn’t recognize that the time when you’re ramming into the top of the page is part of the same general gesture as when you were scrolling, and it may interpret that as an intentional gesture. So that’s one part which makes it a bit more complex.

So now, you have these series of mousewheel events conveniently delimited into little gesture-chunks. The next part is determining whether or not the gesture-chunks are part of a scroll action or not.

Thankfully that’s a really simple thing to do. Just look at the document’s scrollTop and check if it’s zero (or scrollLeft for horizontal stuff) or whatever value is the width of the element. If it can’t scroll no more, then you have a winner and you can start the falling balloons and confetti.

Except it’s not that easy, because the document isn’t the only thing which can scroll. Thanks to the glory of overflow:scroll, there are lots of things which can scroll. Things which aren’t necessarily documents may be in arbitrary scroll positions to wreak havoc on your well-meaning heuristics.

So back to the drawing board, I guess. Actually, to think of it, maybe it’s simpler to listen for the scroll event, which fires when a scroll happens, and quite intuitively doesn’t fire when a scroll doesn’t happen. And mouse wheel actions always precede scroll ones (because the wheel events bubble and are cancelable, so you can prevent a scroll from happening). The only problem is that scroll events don’t bubble. As in, when a scroll event happens on some element, it’s not going to show up on the document, it’s only going to show up if you’re listening on that specific element at the right point in time.

The naive approach to this dilemma is just to attach a scroll listener to every single event on the document, and to reattach to some other elements whenever the DOM tree is modified in some way. This means the overhead grows rather significantly when pages are larger, in a way which could be likened to O(n) time where n represents the number of nodes in the document. If you want, you could lazily do it by attaching the scroll listener only once the wheel event has fired, but that would cause a significant delay when attempting to legitimately scroll.

Another thing you could do, is to make another assumption: that the element which gets scrolled has to be some parent of the element which the mouse is currently over. Making that assumption, we can add a mousewheel listener to the root of the document, as those kinds of events actually do bubble. And since they’re mouse events, once you capture it, you can get a clientX and clientY, comprising the current coordinates of the mouse. And with that, you can get the element immediately below the cursor with document.elementFromPoint. And since the scroll might fire on any one of the elements which are parents of the current element, you ascend up the tree and add a listener on all of those (until, of course you hit the document element, at which point you can’t go any further up). This yields performance which could essentially be modeled with O(log n), quite a bit better than O(n).

So now the finished process is fairly simple, you listen for a mousewheel event, and when it happens we determine the element, and ascend the tree, yada yada. That scroll listener, when fired, sets a global variable lastDetectedScroll to the current timestamp. We set a little temporary variable set to the before time and then we set a little timer, 150 milliseconds. It usually only takes like four to see if a scroll thingy happened, but let’s be safe by having an order-of-magnitude threshold. The Cuckoo clock rings, and we check if the lastDetectedScroll is the same thing, and if it is, it’s a swipe, and otherwise, it’s a scroll.

Here’s a little demo: http://antimatter15.com/misc/experiments/swipe-gesture/minimal.html

Posted in Chrome Extensions, Swipe Gesture.

Tagged with , , , , , , , , , , , , , , , , .


Raspberry Pi

As part of the shift between long multi-kiloword blog posts which are somewhat more like press releases back into a sort of more personal (i.e. blog-esque) format, I guess I’ll talk about my newly-arrived Raspberry Pi. Right now, there isn’t terribly much to talk about since I’ve only had it for about two weeks.

I’ve been planning on getting a Raspberry Pi for a pretty long time, and I was actually pretty excited about that. For the weeks preceding the official announcement, I built a tiny script which ran in a ten-minutely cron job which would basically download the purported Raspberry Pi store (raspberrypi.com, note the dot-com rather than dot-org, which their homepage is situated at) and compare the hash, notifying me via Ubuntu’s built in notification system.

On that sleepless night when the actual pre-sale announcement was being made, I was incessantly checking raspberrypi.com, which had suddenly morphed into a server maintenance message (which remains to this very day). The anticipation was intense, and some twenty minutes after it was supposed to happen was when I realized that the whole time I had been checking the wrong page. The announcement came instead on raspberrypi.org, their main blog, and by that time, it was certain that all the distributor’s sites were already collapsing under the crushing load of a million souls crying out for a taste of berry-scented silicon pastry.

On the next day, I checked the sites again, and all the order pages were already closed. Either way, it wasn’t terribly useful for me because most of them didn’t support Paypal. Fast forward a veritable eternity, on June 16th, I was notified via email that RS Components that I would be allowed to order the device some time in the near future. Sure enough, on the 22nd, another email gave me a link to the order form, which I promptly filled out and I began the process of waiting. Not really, since I had other stuff to do and most of my interest had already vaporized at the daunting 7 weeks it was supposed to take.

Another eternity later, it arrived in some rather nice packaging. It actually came as a bit of a surprise, because I had become so accustomed to waiting that I had never really expected it to materialize so suddenly. But when it did, it was everything I imagined and more. It came in this rather nice cardboard box, which I eventually cut in half with an X-Acto knife (which nowadays, I use for all my paper splicing needs) to build a makeshift case. I fumbled around in a closet and found a neglected 16GB SD card (probably back from the era when point-and-shoots were actually preferable to mobile phones) and installed that weird Debian distro (after having a little internal debate on what to install). But the first thing I had done was plugged it into a monitor through a HDMI-to-DVI converter. I took the charger from my Galaxy Nexus (I wasn’t using it for anything since I charge it in my room from my HP Touchpad Charger, and my Touchpad idly draining power from a cool inductive stand, the standardization of chargers is really pretty awesome), and used that as my Pi’s permanent power supply.

I also had a 2000mAh LiPo battery which I was going to use with my Arduino LilyPad for some cool foot-operated telegraph which I wanted to use as essentially a UPS for the Pi, but a bit of googling reveals that that might possibly entail actual electronicswork, so maybe that’s something for later.

I turned it on, and lo and behold it didn’t work. I actually never quite figured out why. Then, I tried plugging it into a really old 13 megaton CRT TV, which makes me realize how it’s sort of weird that the unit of megatons is hardly ever used for things other than atomic weapons, and now it feels oddly inappropriate for a hyperbole for the mass of a TV, but maybe it’s actually sort of appropriate because CRTs are terrifying. So analog seemed to work, except for this problem where my keyboard would keep repeating letters and not working well. That wasn’t a good start.

But after a little googling from my Chromebook, it turns out the keyboard issues came from the fact that I had plugged in my only spare USB keyboard which happened to be a Logitech Mouse+Keyboard+Speaker thing and my teensy Galaxy Nexus charger couldn’t eke out enough watts to power it. And the issue with the HDMI-to-DVI thing was just because I needed to restart with the cable plugged in. But neither of them posed a real material issue because I had been intending to use it as a headless rig from the start.

The first thing I really noticed was how surprisingly easy it was to install things. I had expected the ARM repositories to basically lack everything which might be useful, but it turns out that actually almost everything I wanted was available. I didn’t dare compiling anything, but Node (albeit a somewhat old version) was available from the repos, so I never really needed to. I had to manually update to a new version of npm, but that wasn’t that bad. I set up forever to run a few apps, but not much.

One of the main reasons I could justify getting the Raspberry Pi however, was to run my Facebook logging script on something other than my main computer, and aside from getting confused trying to use sendxmpp, it was fairly straightforward.

Posted in Hardware, Raspberry Pi.

Tagged with , , , , , , , , , , , , , , , , , , , , .


Swipe Gesture for Chrome

Here’s an extension which I actually released some time back, but never got around to writing a blog post for. Part of the reason was that the early reviews didn’t quite pan out, in large part due to not working. But I was using my Chromebook and I somehow felt a vague longing for some kind of multitouch gesture, and remembered that I had made this little extension (which I had disabled for some reason). Anyway, this is as appropriate a time as any to formally announce it to my probably remarkably small blog readership.

There is, however a tad bit of difficulty representing the function of it in pictures because really, it doesn’t have a big UI. It makes hardware more useful, and in its idealized form, should have no interface. But of course, we don’t live in a place where apps are perfectly idealized and either way, Apple has plenty of nice pretty pictures of people swiping fingers to the right.

I really fell in love with the Macbook multitouch gestures, almost at first sight. They just seemed so natural and so beautiful that I sort of felt that that was like the epitome of design or HCI perfection. And from that point, any time I used a laptop which wasn’t made by Apple (or even the ones which were made by Apple but were stuck in the barbaric ages preceding the inclusion of the glass multitouch pad, where its invention might have produced a scene like this), I felt thoroughly disgusted.

Flipping through the Chromium OS design papers, there is one page dedicated specifically to cool multitouch gestures which could be used. And as far as I’m aware the Samsung Series 5 550 (the new chromebook) is the only device which supports these gestures (thus far), and even then it’s only pinch to zoom and forward/back (three finger). All the other Chromebook users have been left out.

Another cool thing about the implementation is that it uses a certain webkitDirectionInvertedFromDevice property of the mousewheel events, which gives you a boolean value about whether or not the platform you’re on has some magical direction inversion like on OS X Lion or if you’ve enabled “simple scrolling” on Chrome OS. But this might not have been a good idea since swipe directions too are sort of inverted on those platforms naturally as well, so it might be better to not compensate for it.

Anyway, the implementation is actually quite simple. The current version doesn’t even break the 40 line mark, because all it does it it listens for mousewheel events on every page (via a content script), and it calculates the current acceleration. If that acceleration ever passes a certain threshold, it triggers a forward or back action. Right now, the threshold is preconfigured based on my own testing on a Samsung Series 5 (note, not 550) chromebook. But for people with other devices, I’m working on a second version which will be slightly more Apple-esque in its implementation.

Posted in Chrome Extensions, Design, Swipe Gesture.

Tagged with , , , , , , , , , , , , , , .


Distributed Pi: Revisited

So I’ve been interested in distributed computing for some time, since 2007, basically around the time I started doing web development. I’ve always sort of romanticized the notion of distributed computing because of its vast theoretical potential. Projects like BOINCSETI@Home and Folding@Home have always given me some kind of idealized notion of “computing for good”, inspiring some kind of useful social, scientific or otherwise beneficial change to humanity. But those projects never got the kind of adoption which could truly change the world, together they form networks which are specialized but can, in the end, only eke relatively minute performance.

Part of the problem lies in their intrinsic forbidding voluntary nature. It takes too much effort to install and amplifies the problems intrinsic to any “democratic” system: why bother? Voter apathy often stems from a feeling that an individual contribution regresses towards nothing, which is statistically certainly true, but never helpful if it plays a part in the participation of one of these collective entities.

In a broader sense, I’ve always suspected that part of what is necessary for technological progress is the loss of control. Certainly something which appears true from a human experience, that the vast arrays of neurons bound by a cranium are uninviting and wholly unwilling to expose their inner raw computation power to the emergent conscience within them. No doubt the human brain is good at computing, just not (as it appears) math, but intuition (especially of the physical kind) can only arise when a system internalizes some very complex math. We just can’t access it, because evolution or some other process has determined that it was something that had to be done away with on the climb to higher cognitive powers.

Likewise, I think computers are convergent in that sort of way. There’s a great ladder of abstraction which is growing taller and taller, a tower of babel of sorts, and at the end, perhaps we’ll find a similar goal, maybe not of god per se, but an equally sought target of consciousness or some other high level intellectual faculties contemporaneously denied to computers.

Part of this is losing control, which is inevitable and politically dangerous. Computing terminals are powerful and capable of much more than they’re being used for. In fact, most computers spend most time idling, the processors get ever faster not for the handling of the idle time but in order to smooth out the few bursts that actually require fast computation.

It’s impractical for operating systems to build into them some kind of distributed computing platform, however ideal that would be. It’s too contentious to ever get adopted, conceptually a short step away from the ability for a mega-corp to pilfer your precious information as well.

But the browser, specifically the web, provides us with an interesting opportunity. Here, we have significantly less fear of personal privacy, since it comes with an expectation of sorts for information sharing. The existence of client side scripting and its prevalence gives an implicit permission to exploit system resources during the site’s tenure.

Now, of these granted permissions, we have the freedom now to exploit in order to create something truly remarkable. Because it’s not so much voluntary on the part of the user, so much as voluntary on the part of the site maintainer who now has the responsibility of allocating and managing the system resources of his visitors, for their brief but additive virtual encounters. The users lose control, and that’s a good thing.

Old Stuff

When I first wrote that introduction, I mentioned that I was interested in distributed computing for quite a while, and that’s true. 2007 was, at time of writing five years ago. A bit less than a third of my life, which is fairly significant. It doesn’t even quite belong to recent memory, and perhaps has escaped living memory into something quasi-zomboid. Part of the reason why this section is relatively short compared to the other ones is because I really don’t have a very good memory of what that was like, and it’s rather sad that my previous attempts never had long writeups regarding their potential and process (however, there do tend to be more little updates about the process).

Anyway, as it’s been such a long time, I’m trying to dig through old stuff to pick out exactly what I did back then. The first things I can find are somewhat easier and simpler problems, finding palindromes, or at least words which spell different words when reversed (I believe this was based on a program I had written a few years prior in Visual Basic). Another was for cracking hashes in a distributed manner.

Back then, I think those were one of the first explorations into the concept, of distributed computing in the browser. It was the days before WebWorkers and Cross-Origin Resource Sharing was in its infancy. The pool of possible computation was however, still large but most of it was forbidden. Take note, however that distributed and parallel computing have existed for far longer, perhaps even longer than the idea of a computer.

Computers get faster each year, thanks to Moore’s Law and more and more people get connected to the Internet each year. Perhaps in addition to Metcalfe’s law with regard to the value of a telecommunication’s network, is the ability to harness idle computation power in order to increase the vale of its participants linearly as the network grows. There are almost a billion people who are connected to the Internet, and however insignificant of a contribution each of those terminals brings to the network would add up into something truly remarkable.

However, hash cracking and palindrome finding are fairly trivial. They have little real world practicality and don’t seem in any way representative of the greater problems facing humanity. While it’s a little impractical to aim for some kind of project which has immediate applications to the value of humanity, it’s certainly a worthy target which is worth approximating. They’re isolated examples which are easy to parallelize and hardly count as real ventures into the field.

Calculating digits of pi is marginally less useless and represents something which is significantly harder and tasks which may be closer to the kinds of computations which are performed in the real world. There have been other projects with similar goals, most notably, PiHex, which acts as a sort of vindication of this as a possible legitimate attempt. At this point, I have no idea if the algorithm works with digits in the order of trillions, and that’s one of the big reasons I’m not actually trying to succeed PiHex.

Revisting

Okay, so I decided to dig this up again.

Why? Because I really want to play around with server-side JS a little more (just in case getting a node vps has anything to do with merit). The sort of funny thing is that’s exactly what it used to be, back in the ancient past, before NodeJS existed (it was before Google Chrome was released, and V8 wasn’t open source). I used to have an application which would schedule jobs for computation on the client and provide them in a more useful format.

However, I never bothered saving the files outside of the AppJet web IDE and the code was lost when the service was discontinued. I tried porting to Google App Engine, but that version was plagued with strange bugs which ended up printing out the wrong digits after a few thousand were computed.

Right now, revisiting the idea, I’ve added a few things which are somewhat more characteristic of the change in web browsers since my first attempt. The old version tried to mock threading by the liberal application of setTimeout, which meant that most interface interaction wouldn’t be terribly affected. However, it did incur a noticeable slowdown. Now, it uses WebWorkers, which provides numerous interesting possibilities. First and perhaps most important is context isolation and lightweight, asynchronous embedding in multiple domains (origins). Since WebWorkers can work (see what I did there?) across multiple origins and still have access to XMLHttpRequest, and the advent of Cross-Origin Resource Sharing (CORS), it’s easy to embed this in a page. The low embedding overhead and the true multi-threading abilities bring a lot in the way of making this more than an intriguing concept and into something much more practical.

The code, which is now on the Github repository in a new folder named node has a very basic interface. Rather than manipulating the widths of divs for progress indicators, now it uses <progress> (it’s actually really cool to see how far the state of web platform’s state of affairs has changed over the past few years). One cool thing about the progress bar is that it guesses progress by dividing the current prime number by the end point, which has an interesting effect of making the progress bar faster towards the end (this happens because the distribution of prime numbers asymptotically declines). While I could possibly invert the effect in order to create a more linear progression fairly trivially, behavioral studies indicate that people who stare at progress bars all day feel less irritated (i.e. more satisfied) when the bar speeds up in the beginning and end (then again, the progress indicator isn’t really meant to be shown to people and if it is, the user experience is hardly something which is being optimized for, and this does nothing in the way of speeding up the beginning, in fact it’s quite the opposite by slowing down the beginning, so perhaps it should first linearize the progress and then map it to some function which as accelerated starts and ends, perhaps some trigonometric function).

Part of what is cool about the project is the underlying algorithm, a port of Fabrice Bellard’s optimized version of the Bailey–Borwein–Plouffe algorithm. Unique to the algorithm is that it uses up comparably little memory and as such it’s uniquely suited to distributed computing, especially in the browser environment which demands relatively thin clients (however, browser caching means that the download would only necessarily need to be completed once, and embedding the computation in an iframe with certain storage permissions and appCache could allow persistence). It’s a relatively short algorithm and fairly easy to port.

However, as it’s a digit extraction algorithm, it’s not very efficient for calculating digits sequentially. That actually is something of a feature which I neglected in the so-called modern incarnation. Older versions had the prime number (sub-job) allocation done on the server, which meant using up a lot of memory in keeping track of the jobs. The current one has a server which is entirely unaware of the actual computational process, which leads to being lighter persistence-wise but comes at the obvious cost of losing granularity in scheduling.

One optimization about this strategy is that a client doesn’t have to wait for a server to respond with another job request in order to continue. Instead, it operates in an almost completely asynchronous manner. On the client’s first request, the server gives it a job which is given by a starting point (a prime number to continue from) and a digit number. From that point, the client begins computing until that digit is complete, sending its progress back to the server once in every specified interval (a couple of seconds).

If the client was to disconnect in the middle of a computation (which can take quite some time, especially in later digits, so its more than certain to happen), the server will be able to resume the calculation by sending the request to another client after a certain expiration date has passed. So future refinements to the server (and possibly the client) could make it more efficient with processing digits of less significance by allocating jobs and taking into account certain checkpoints. For instance, all the prime numbers between 3 and some integer N have to be iterated through (where N is approximately 3 times the number of digits from the radix). Rather than maintaining a “single-threaded” system (as it currently does, the parallel processing power comes from sending out multiple digit segments to be processed), it could instead send out simultaneous requests for different segments. The client may have to be modified in order to stop at certain designated checkpoints to prevent clients from overlapping.

This sort of scheme would be more efficient than storing jobs used in previous versions, first of all by retaining the sort of “dumb” server which doesn’t do any real computation that contributes to the process. Instead it only acts as a mediator and persistence layer (as perhaps is the best role to give to any server). Rather than keeping track of every single prime number as a different row in some database, the server would only need to keep track of as many sections as clients exist.

The new version also has some interesting changes. First is the switch to NodeJS, which was hinted at before, since it acts as a persistent server rather than something which is called and disposed like a CGI process, the queuing system is now entirely in-memory. However, as soon as its possible, it writes out the completed digits into a file (pi.txt), and if the server is ever interrupted the computations resume from the last saved version of the digits (only the jobs which were processing during the shutdown are lost). However, this would need to be changed if it were to be some more ideal use of the digit extraction method, since in such a situation, the vast majority of jobs would be for a single span of digits (and it would really suck to lose all of those).

So in a sense, this update constitutes a regression, a step in the wrong direction in that it’s using the algorithm wrong. However, in another sense, it paves the way to a set up which has a few more ideal properties, a lighter weight server end and a more efficient use of computation time on the part of the clients. In essence, the shift from fixed sized discrete computational tasks (or jobs) to mere ranges. The obvious benefit to the latter is the reduced memory demand and the ability to operate continuously, i.e. without pausing for acquiring tasks and behaving truly asynchronously (in a very Node.JS style). Perhaps taking it further along this direction (since WebWorkers are allowed to access applicationCache), clients could be given a starting point and are just let to rip through a few hundred cycles, synchronizing less frequently. Some specialized data structures might be employed to keep track of individual contributions so that the ranges are doled out optimally. Maybe this should instead evolve into some kind of successor to the PiHex project rather than some incredibly slow and inefficient sequential pi calculation platform.

One thing which I didn’t explore was the concept of having a persistent socket to the server. For this set of circumstances, the benefits of maintaining a persistent bidirectional socket weren’t large enough to warrant that kind of development. Right now, it’s communicating through a few small GET requests to a server. Part of the reason GET was used rather than POST was that it was designed with the idea that it could theoretically run in a cross-domain manner, and sending a POST request requires preflighting with CORS headers. Since it’s only ever really sending 30 or so bytes at a time, any additional overhead should be avoided. But this itself is also perhaps an argument for using WebSockets (though certainly not long-polling, since that incurs even more overhead). Quite often the synchronization requests don’t require the server to give anything back, but HTTP-wise, the server’s always going to have to send about a hundred bytes of headers to fulfill the HTTP requirements. Also, whenever the server sends a request, it wastes about a third of a kilobyte with header information like the User-Agent. Network-wise it might in the end be more efficient just to have a low-overhead persistent socket connection.

Having a persistent socket would make synchronization interesting as well. While the changes wouldn’t be terribly drastic since there’s still the goal of minimizing transmission overhead, a few changes to the scheduling could be conveyed without necessarily waiting to the next checkpoint. And of course, there’s the trade-off which comes with the question of how long the checkpoints should be spaced. Right now, it’s something fairly low in the range of 10 seconds since the duration of a single page visit usually isn’t much more than a few seconds. However, if a client-side persistence layer is added, the checkpoints could be spaced out much wider. Perhaps instead of every few seconds, one could be made for every minute, hour, day or even week. Where a single computational task allocation could span multiple domains, multiple days and while the user is online and offline. However, having wider-spaced checkpoints does incur a cost in terms of synchronization. It reduces the operator’s sense of intimate awareness of what’s actually happening and increases the chance of overlaps.

Another possible step is some kind of framework which handles all of these problems automatically. Instead of having to manually manage variables and the synchronization of tasks between clients and servers, one could construct a domain specific language for distributed computing (a superset or subset of javascript which automatically manages the state of variables in loops and such to compile into some kind of client for generic parallelized algorithms). Maybe it’ll do something cool and look at which loop is the optimal one to split up based on the amount of data which needs to be sent across so that it could be minimized.

One of the things I played with while revisiting this project was playing with LLJS, another kind of specialized language which builds upon Javascript. As it’s marketed, the “bastard child of Javascript and C”, which manually manages memory. I was hoping that using a typed language might bring some speed improvements (not really, in this situation). However, LLJS might be a good basis for this auto-magical compiler for synchronous routines into largely parallelized code. However, maybe there are in fact limits to parallel computing, and it’s better to search for specific algorithms which have the right properties. And maybe in the end, the problem of porting it to Javascript and managing the client-server communication fades into a relative triviality.

Posted in Javascript Distributed Computing.

Tagged with , , , , , , , , , , , , , , , , , , , , , .


CloudFall: A Text Editor

This is probably a horrible time to think of writing an online text editor. The “market”, if you can call it that, is virtually saturated with worthy contenders and (figuratively) every day is marked by the entrance of some other text editor to the already crowded space. Part of the problem is that now, web based text editing components are actually pretty awesome. Ace and CodeMirror work really quite well and each have associated with them some very formidable integrated development environments, referring to Cloud9 and LightTable, respectively (albeit there are quite a few others, these seem to be probably the most funded). And at Google I/O 2012, they announced the new Chrome Packaged Apps, which expose new functionality (most relevant being the ability to open and save files from the local disk and operating offline by default), and their sample apps include no less than two text editors.

Nevertheless, I’ve always pined (probably an exaggeration) for the opportunity to indulge in something as meta as writing a text editor which I could use as a text editor for the text editor (which is, if you did not notice, that text editor). It’s probably the pinnacle of dogfooding (probably nice coming after a streak of making extensions that I never actually use). But this doesn’t really go anywhere in terms of explaining why I’m doing this rather than using someone else’s rather better developed text editor package. It really comes down to this chrome app which I was quite a fan of almost a year ago, called SourceKit. SourceKit is a text editor which runs in Chrome that synced with Dropbox. The version which was distributed on the Chrome Web Store never supported offline, which is sort of sad because that’s the one feature I really wanted. The other source of inspiration was Streamie, the real time node based twitter client which had an absolutely phenomenal scheme for contributing. You just needed to fork the repo on github and all your commits would be visible live on your own subdomain of the site.

Also, in the mean time, I had discovered this pretty awesome text editor (no, it’s not VI or Emacs, because I’m not nearly cool enough or dedicated enough to approach that steep precipice of a learning curve) called Sublime Text 2. I have a pretty good picture now of what exactly I want from a text editor based on that (actually, I don’t know if this is just some delusion which has manifested in my mind because I probably thought the same way about gedit and krita and komodo and aptana and notepad2 and notepad++ and dreamweaver before that).

The real drive to creating that text editor happened in the week before I was scheduled to attend Google I/O 2012. I knew I’d be in some situations lacking Internet, and I felt like writing words or code or something in that time. So in a few days I hacked together this text editor which had a vague resemblance to Sublime Text. It was based on Ace, like SourceKit before it, but obviously a newer version with a whole bunch of syntax and themes included. It used a modified version of the Dropbox component from SourceKit (it was changed moderately to accommodate Dropbox API 2.0 and to deal with binary things like array buffers and blobs). And I added a little heads-up-menu-esque file and command picker, quite reminiscent of Sublime Text (and to a lesser extent Ubuntu Unity).

And then I headed for the airplane having not actually used it much in practice hoping to be super productive while using it, which you could have probably been able to tell by the beginning of this sentence wouldn’t pan out. The first (quite major) flaw which I encountered was some bug which would end up wiping out any file that I tried to edit, and left me with a gaping chasm in my hard drive (this is a metaphor, because my Chromebook is solid state). I just hope nothing of value was lost, but it remains quite likely that it was the otherwise.

I had no real system for testing out changes to my own extension, and I was too paralyzed of the fear of deleting my own extension to try changing it. So the end result was that the entire duration passed with me hardly doing anything productive on the project, or using it for any productive means in itself. All that happened was my discovery that everyone is working on a text editor right now and I should probably quit right then and there and work on something perhaps more novel or productive. But I came back and fixed it and did some more on it, and I’m finally at the point at which I feel comfortable using it for some mostly useless things, like writing a blog post inside the project’s readme about the project itself.

Welcome to CloudFall. Yeah, I’m aware how dumb it sounds, but the fact that the new James Bond movie is going to be called Skyfall essentially demolishes any hope of using that name. But rather than using this as a vindication of how cool that name would be and abandoning it for some novel name, I’m just going to contrive something in a similar vain, hence the current working name. But rather than spending the first few lines complaining about the name of the product, I should probably lay down what exactly this project aims to be. It’s a text editor, not a terribly glamorous concept, I know, and this is especially not a terribly great time to start. This is hardly the first text editor, and certainly will not be the last (until this either never finishes or the world actually does end by the end of this year). It’s not the most full featured, but I guess it does need to have something which distinguishes it.

The main feature is largely inherited from SourceKit, which is the ability to sync with Dropbox. And instead of editing “live” off of Dropbox’s servers, it really is more of a sync. It’s built around Chrome’s FileSystem API, so it has its own sandboxed imaginary folder where it sticks all the files. Every time a file is loaded, it’s first downloaded onto its spot on the imaginary folder and subsequent edits get sent to both the local copy and the server. This architecture, in theory, means it probably won’t inadvertently overwrite, corrupt or delete your important data in the advent of some syncing issues. It keeps track of the latest synced revisions and tells Dropbox’s API that information so that it won’t try overwriting a newer version from elsewhere and It’ll somewhat gracefully save to a different file in such a circumstance (though that routine can hardly be declared well tested, so beware of complications). It should in that way retain very close to full functionality while offline, since just about everything it does is entirely offline (including the compilation of CoffeeScript and LESS, which is explained later on).

Built into the extension are a few of the tiny tweaks that I have installed on Sublime Text which I find fairly useful. For instance, the app includes a copy of the CoffeeScript and LESS compilers, and so whenever you are editing one of those types of files in the text editor, it’ll automatically compile and save it into the same directory whenever you hit the save key. That’s actually pretty useful because it gives CoffeeScript back that REPL (almost certainly a malapropism, but I’m not too familiar with developer work-cycle jargon, so please excuse that grievous offense) dynamic that JS developers are so familiar with. And to aid that sort of work, it can preview your files “live”, even offline. While it can’t open and edit binary files like pictures, it can still download and display them fine.

For writing walls of text which don’t really need syntax highlighting (I wonder if someone has a package on mainstream text editors that syntax highlights free form English grammar, sort of like giving people synesthesia for no real reason), it includes a word count widget. Also, because I feel like encouraging unhealthy behavior, right next to the word count is an indication of your current typing speed. I’m pretty sure someone could go into some moral discussion into why it’s a completely detrimental addition to create something which displays how quickly the author is typing because it encourages a mindset which doesn’t focus on creating the most concise or effective means of delivery for some message, but rather promotes meaningless verbosity. But sometimes I would imagine using this for school assignments and the ilk, and maybe it would be nice to know when I’m reaching my designated quota filling some unwritten word requirement. I’m not quite sure how reliable of a word count it gives, since the algorithm is primitive and by no means nuanced, but it should be able to give a sort of rough estimate of the number of words on a page.

I may have mentioned earlier that there are two primary systems for interacting with the application, both of which are keyboard shortcuts: Ctrl-O and Ctrl-P, meaning the file browser and the command palette, respectively. They both appear in the same sort of interface component and generally behave the same, but there are some slight differences in how they operate (obviously). There are actually two “browsing modes” for the file browser, where it shows files either from the local stored cache or online in Dropbox. That can be switched by selecting either “Browse Mode: Local Copy” or “Browse Mode: Remote Dropbox” from the command palette (though there should probably be some better interface logically placed on the file browser panel itself). Remote Dropbox only works when the user is online, so the default mode is the local one. The sole interface to this list of files is the filter search box on the bottom of that widget, where typing in stuff filters out things. Of the visible items, the cursor can be manipulated by using the arrow keys, and one of the options can be executed by hitting the enter key. The widget needs to be manually dismissed by hitting the Esc key (which allows you to semi-rapidly open several files or try different commands, like changing themes). When a folder is activated, the context then shifts to what is inside the folder, descending into the hierarchy. It’s possible to ascend by entering the directory “../”. To create a file or folder (Note that at time of writing, creating folders does not sync to Dropbox), just type “+yourfile.txt” (or if there are no search results, it’ll automatically select the option to create a file with that name). To delete a file, you likewise just prepend your command with a minus instead, for instance “-oldfile.txt”. It’s not nearly as counterintuitive or confusing as I presume this description is making it sound.

The command palette, given that the interface is basically the same has much in common with the file system, but it’s entirely flat and linear. It lacks the hierarchy that plagues the file system, and navigating it is considerably simpler. Since the list is filtered and updated live with every keystoke, it’s fairly easy to find whatever command you’re looking for. The filtering algorithm only searches for characters but disregards gaps, only paying attention to order. In other words, the query “seals”, may match “SEt SyntAx: coLdfuSion”, it also means that you can construct very short queries to find what you want.

At this point, I don’t think it’s a final project per se, there are still a few features are woefully dysfunctional like Find and Replace. But I think it fits my own use case, editing files on my Chromebook, even when offline. It’s fast and doesn’t take too many steps to start. But the interface isn’t exceedingly intuitive (Basically everything is accessed through two keyboard shortcuts). I haven’t found a witty logo for it (I just haven’t bothered looking), and at this point I’m just writing a blog post about it because I feel obligated to document whatever it is that I just did.

Maybe some day I’ll pick it up again, or if there’s some response at all to this blog post, and add those awesome collaboration features (which probably won’t be very useful because the entire thing is quite hackish and not particularly kind to the prospect of improvements). Maybe I’ll fix up the search and replace interface (read: get rid of whatever horrid mess exists and replace it with a new one from scratch). And maybe I might publish it in some form that is significantly less involved than cloning off github and packaging the app yourself.

Posted in Chrome Extensions, CloudFall.

Tagged with , , , , , , , , , , .


Pinball

Last year, for Computer Science 1, I built a pinball game in Java. Yes, Java, as in sans-script. I wrote it in Eclipse and it was pretty functional. Of course, the biggest part of the game was the collision detection, collision detection which, mind you, I built with only the slightest knowledge of how it’s actually done. So yeah, I made up a scheme. It sucks, but at least it sort of works. I was actually fairly proud of it since it was basically the only thing which was mildly interesting and non teacher directed (that is, it took more than one hundred ninety two seconds to write after the teacher gave a cue to attempt the task). I thought it was so cool in fact, that I decided to port it over to both Javascript and Coffeescript, the latter being a language which was uber-hip and awesome (I believe these events transpired approximately at the time of Coffeescript’s big one-oh release).

In essence, the collision detection scheme I devised was pretty simple. The ball experienced gravity in the form of a constant downward acceleration. It would move up and down. That was maintained by a variable (actually an array to mimic a vector), named position, and another one named velocity, where the y-coordinate for the latter was incremented by some term every time a frame was rendered. The collision detection system deals with the intersection between lines. First it takes the current position of the ball, and envisions that the current velocity is a line extending infinitely in that direction. All the other attributes of the environment are also lines, and the algorithm treats them all as unbounded (rather than segments). It quite trivially computes the intersection between the lines and calculates (with euclidean distance) how far that intersection is from the ball’s current point. If it detects a collision (the distance is less than the specified radius of the ball), then the ball’s direction is altered based on its current position and the tangent of the line it intersected with.

Fast forward a few months, and it’s July of 2011. Since then, I’ve basically forgotten about the pinball game, because it really isn’t cool enough to be worth revisiting. But I was in Virginia’s Summer Residential Governor’s School for Science/Math/Technology (hardly wholly unexpected). It was some dark night and I was in the back row of a white shuttle bus en route to an observatory where artificial light is verboten. Well, it started raining and the trip was cancelled and we were going back and I was bored. So I pulled out my iPad, which I had for some reason, and decided to look at what I had on my Dropbox which I could play around with offline. Lo and behold, there’s a pretty crappy Javascript pinball implementation (which notably, never having been finished is just a ball bouncing between a few lines). I changed the parameters and made weird shapes the ball hopelessly bouncing around in awkward trajectories, and I thought it was cool.

A friend, sitting next to me noticed that the balls were speeding up. He inquired whether or not I was using the euler method, and I didn’t know what that was. Either way, inadvertently, I was in fact using that. But the point was that here he was, making a fuss about the fact that my simulation in two dimensions blatantly ignores the conservation of momentum. And as such, I was compelled to finding a solution. I studied the code for a bit longer, trying to expose the nasty snippet which might have made it evade the laws of physics, albeit to no avail. In the midst of my struggle, he noticed at some point that since the trajectory paths are simple parabolas, it should be possible to find a closed form solution to the collision problem (rather than my iterative approach. In the end, I nobly gave up on my epic quest to fix the universe, and worked on some other crap.

A week ago, I finished my first formal (that is, in school) calculus class, where I learned what euler’s method was and how I inadvertently implemented it. But I still had to take the class final, which the teacher rather kindly decided to make extremely easy. And since I also had a huge math portfolio (which I decided to typeset with LaTeX, as I have for nearly every paper this year, for no real reason whatsoever) due that day, he made it take home, with the only caveat that we aren’t to collaborate on the final in his presence (eg. after school in his room), a subtle nod of assent to all other forms of aid. Well, at home, I thought that I should probably make use of all the tools I have at my disposal. I thought of that copy of Mathematica which hopefully wasn’t collecting dust on my hard drive platter because that might lead to a head crash, but alas was figuratively gathering dust.

I decided to start playing around with the awesomeness that is a computer algebra system, named after a guy whose last name is that otherwise inexplicable W on the periodic table (purely coincidental, unless Stephen Wolfram is actually a british iron manganese tungstate mineral). In what felt to be no time (at least it seemed interesting and productive enough that I in no way thought to discern the real time which elapsed), I had completed my final exam in a somewhat minimalistically formatted Mathematica notebook. I had, figuratively, fallen in love.

Soon afterwards, as I was working on some other completely unrelated project (creating a intraframe-only WebM encoder), I needed something somewhat cool to generate a series of frames to be encoded into video. Digging through my (newly reorganized) Dropbox project folder, I found that old incomplete pinball game. And I remembered that story and that brief suggestion and decided to use my newfound Mathematica prowess (that is, my ability to use Solve[] with some degree of it not puking) to find that solution to the collision between a parabola and a line. To my dismay, Mathematica returned an ungodly mess of an equation. And it also didn’t work for vertical lines.

But eventually, I did manage to get it to work. And thankfully now it obeys the laws of physics, at least to about 10×10^-13. And I also realized that that demo isn’t actually that cool but I used the older version anyway for my video encoder demo. So that’s the happy ending, everything lived happily ever after despite the fact the algorithm is still totally buggy and breaks when things are either too fast (velocity > 4000px/second) or too slow (velocity < 1px/second). And with corners, it occasionally fails. But I’m long since past the threshold of caring. At least now, you can speed it up a lot because the path prediction and the collision detection are completely separate from the render cycle (which also lets the speed become independent of processor speed). It uses requestAnimationFrame and other stuff, and if you want to try it, here.

Posted in Games, Pinball.

Tagged with , , , , , , , , .


Visualizing Facebook Activity

You might have noticed that I haven’t written much for this blog in the past few months. In truth, it’s because of school work, which has never really been something of an issue before. This is, quite probably the least productive stretch of time in my life thus far. I have a suspicion that this issue stems more psychologically than due to some radical increase in work load, but I haven’t looked in to testing that hypothesis (I’ve been collecting data hour-by-hour about what I’ve been doing in the past two years, so I could probably look into it if I were actually interested in that matter). But school’s nearing a close, and hopefully I can get back to a more productive lifestyle, maintaining my blog and most importantly, trying out cool things. I have a few things which I am working on at the moment which should be completed in the coming weeks (though I make no assurances). But since I have an internal goal for writing one blog post per month, I’m going to recycle a project from December of 2011.

Nearly every day, I inevitably end up glancing at my Facebook “buddy list” of sorts, wondering how many people are online. It’s a figure which almost always seems to depend on the time of day, and behaves almost like clockwork, there’s always a massive swarm of people online around 10-11pm, and hardly anyone is ever online at 4 in the morning. I guess the problem with drawing any conclusions from this in particular is how specific a group this graphic represents. It constitutes my friends, and in particular, my Facebook friends. Essentially all of them are people I’ve encountered in real life, and may or may not actually find interest in. But the thing that unites just about everyone is that they’re generally high school aged.

Before going on discussing how pretty of a chart this is, I think it’s worth going through what this chart actually represents. It’s quite easy to tell that this is in fact a polar chart, and on the inner circle, you can tell that it’s a 24 hour clock. Each of the rings represents a friend, and the rings are sorted by the total amount of time spent on Facebook in the given period. So you can see that toward the middle, the graph is almost opaque at every time, whereas on the fringes, the online activity is quite erratic and infrequent.

So, where does this data come from? It’s actually quite simple to get from the Facebook API. I have a cron job which runs every minute to run a FQL request and save the results to a specific log file.

The actual FQL which runs in order to retrieve the list of online users is

SELECT uid, name, online_presence FROM user WHERE online_presence IN (‘active’, ‘idle’) AND uid IN (SELECT uid2 FROM friend WHERE uid1 = me())

Basically, get the User ID, the name, and their online presence state for friends who are either active or idle in the list of the logged-in user. Since Facebook is an OAuth2-type API, you need an access token in order to do anything cool. I just use the Facebook Graph API Explorer to generate my access tokens. Just go press “Get Access Token”, and select (at minimum) the permissions “user_online_presence”, “friends_online_presence” and “offline_access”. Then copy and paste the revealed token into some authkey.txt and you should be set.

I have a python script to go through the log file and to render it as the polar chart which is depicted on the top of the page. The code used for that is frankly atrocious and the output is even more so. Python Imaging Library is used, which is a lovely library, except not for drawing graphics. There isn’t any smoothing or anti-aliasing on the arcs drawn by PIL and they all look hideous. So I render the chart at some absurdly high resolution and down-resize it in GIMP while adding layering, blurs and opacity in order to make the picture somewhat less atrocious. Also, it does’t support restricting the app to drawing a specific day of the week, even though it might be interesting to see the how the trend differs on a weekday versus weekend.

Something interesting about the appearance of the polar graph is that it almost resembles something of a digital fingerprint, and that brings up some interesting privacy considerations. Inside that graphic are the Facebook browsing habits of some two hundred people. There’s the question of how much this changes day by day for users, and to what extent this can be used to identify people. And even if a single ring doesn’t unambiguously represent a single person, the two hundred or so rings of their friends probably goes pretty far into identifying people. There’s also a striking amount of uniformity that says a lot about the type of people who I tend to associate with. Just at a glance, one can tell that there are very few people I’m friends with on Facebook who live in different timezones. Maybe what’s more dangerous than being able to identify a person is to be able to identify what kind of groups that person belongs to. And over the course of a day, just about everyone checks Facebook a few times.

Posted in Data, Facebook Visualization.

Tagged with , , , , , , , , , , .


Chrome Multitask Mode for Real with Multi-Pointer Xorg

Google’s multitask mode was only an April Fool’s joke, but inordinately amused among us (read: me) might even venture into attempting such an irrational feat. While watching the video, it evoked some memories of something which I had stumbled across a few years ago, called Multi-Pointer X. The rationale for creating that wasn’t nearly as insane as the Chrome Multitask video made it out to appear, instead I think it was just the framework to support multitouch and other sorts of alluring interactions on the Linux platform.

So I decided to look into that again and within a few minutes of Googling, it appears that Multi-Pointer X has now been incorporated into the X.org mainline as part of XInput2. So what exactly does it take to conjure and insult the multitasking gods?

The first step is quite logically to plug in another mouse. I happen to have three mice plugged into my computer anyway (there’s a perfectly logical rationale for how this came to be, my keyboard is some  wireless Microsoft keyboard+mouse set but I really hate the scrollbar on the Microsoft mouse, so I instead used another mouse for the longest time until it started failing and sheer lethargy prevailed over disconnecting that obsoleted peripheral) but regrettably, I don’t have an additional prehensile appendage to operate the third mouse.

Right now I’m using a pretty much unmodified version of the latest version of Ubuntu 11.10. I just opened up a terminal window and typed in “xinput list

antimatter15@antimatter15-desktop:~/online$ xinput list
⎡ Virtual core pointer id=2 [master pointer (3)]
⎜ ↳ Virtual core XTEST pointer id=4 [slave pointer (2)]
⎜ ↳ A4Tech PS/2+USB Mouse id=8 [slave pointer (2)]
⎜ ↳ Microsft Microsoft Wireless Optical Desktop® 2.20 id=11 [slave pointer (2)]
⎜ ↳ USB Laser Wheel Mouse id=9 [slave pointer (2)]
⎣ Virtual core keyboard id=3 [master keyboard (2)]
↳ Virtual core XTEST keyboard id=5 [slave keyboard (3)]
↳ Power Button id=6 [slave keyboard (3)]
↳ Power Button id=7 [slave keyboard (3)]
↳ Microsft Microsoft Wireless Optical Desktop® 2.20 id=10 [slave keyboard (3)]

Use the command “xinput create-master Auxiliary” in order to create a new input device, and try running “xinput list” again to confirm that the command has done your bidding.

antimatter15@antimatter15-desktop:~/online$ xinput create-master Auxiliary

Now it’s time for the third (or is it fourth) and final step, to reattach one of those master pointers into the auxiliary pointer. To do that, pick out some mouse. For me, I picked my “A4Tech PS/2+USB Mouse” which is was my mouse with a broken scroll wheel. You can see that it’s been given ID=8, so that’s the number I’ll be using for the next step.

antimatter15@antimatter15-desktop:~/online$ xinput reattach 8 "Auxiliary pointer"

And then I can now use two mice at the same time for whatever ungodly purpose I desire.’

It’s hard to take screenshots since it appears that every screenshotting or screencasting app which I can find seems to make the wholly unwarranted assumption that a desktop computer only has one cursor, but it does actually work. Though I don’t think I’ll ever be able to multitask through this scheme, it’s mentally jarring in far too many ways.

 

Posted in Fun.

Tagged with , , , , , , , , , , , .


MusicAlpha v2.0

MusicAlpha works again, and this wasn’t some minor update that fixed a small hole, the entire app has been rewritten. There’s a new user interface which has really quite beautiful animated polar progress bars, but the most significant part is that the mystical protobufs-over-https protocol has been decoded and documented.

It really started on January 30th, when Simon Weber contacted me for information on the upload process because his Unofficial-Google-Music-API received various user requests to implement uploading. I remembered a message sent a little while earlier offering help decoding the protobufs and tried out the method described, which involved using a disassembler to patch the curl_easy_setopt method to disable checking of SSL certificates. This turned out to be the trick and I used Fiddler to collect a few dumps of data for subsequent analysis. That was really quite a magical moment because everything afterwards was relatively straightforward. The huge stumbling block which had prohibited progress for these past months had been lifted and the task which was left, of reverse engineering the protocol, seemed eminently viable. With only minor hiccups over the next few days, I built a python prototype of the program. I created a .proto file which was somewhat human readable with field names based on guesses.We realized some pretty cool things or at least got a much better understanding of how Google Music seems to work (This blog post was started in the early part of February, but I’m finishing this up halfway through March).

For instance, the reason Music Manager doesn’t work on virtual machines is because they use the computer’s MAC address as a system for identifying individual devices. They have a cap on the number of devices which can be authorized to manage your music, which is currently set to 10. This caused a bit of confusion when I hit that device cap while testing (At one point I assumed that the MAC-esque field was actually a random string, so after fifteen or so tests everything started silently failing). However, it occurs to me that it isn’t particularly interesting to read the narrative of how exactly the current version of the application has come to exist. The technical overview of how the actual protocol works can be found on the google-music-protocol project page on Github. It includes a brief overview of the mechanism as well as sections which detail the specific messages which are sent to the server and also includes a simple reference implementation. Hopefully it should be enough information for anyone who is actually interested in doing something which relates to it (though I am curious if anyone could possibly create something interesting out of it, please contact me if you decide to do anything at all with it). So at this point, I’d like to skip over to the actual application-specific aspect: MusicAlpha itself. This is the story of how the rewrite came to be, and the rather lengthy time it took to to write this blog post describing it.

One of the first things I realized was that the entire application would essentially need to be rewritten. The code dealt primarily with altering the attributes of a song by mimicking the web client’s interface, something which wouldn’t be very useful now that all the metadata is sent via the magical protobufs scheme. Since that code gets scrapped, I figured why not scrap the rest. I wrote the first version while experimenting with darker color schemes and obscenely prominent border radii and gradients. I’m not sure that I can say it’s objectively better, but it does look a bit more interesting, especially that big two-ringed polar chart.

There’s some text to the right side, but it’s not particularly of interest. There’s a big file selection button, but that’s pretty uninteresting. Embroidered on the solid gray backdrop is the logo, a simple and unchanged music symbol- unchanged from the last version. I used, for the first time, custom web typography. Just some fancy looking sans-serif fonts I picked from Google’s Font Library. On the old application, I had two progress bars to show the upload progress, one is for individual songs and the other is the overall progress. I wanted to portray upload progress in a manner which was somewhat smoother and more aesthetically alluring than a mere linear bar.

In trying to create that better progress bar, I created a polar progress bar in canvas. All updates are smoothly interpolated and watching the bars spin around and invert is somewhat reminiscent to staring at an old record spin on a turntable, something which I’ve only seen a few times in my (rather short, at this moment) life, but magical enough to still inspire a sense of nostalgia.

Central to making an aesthetically appealing polar chart is making the animation smooth. This might be doable through various esoteric CSS transforms, but I’m not particularly knowledgeable in that area, and I’m not a huge fan of crafting elaborate animations with CSS. It lacks a certain amount of control (which manually updating and redrawing affords) and tends to feel extremely sluggish on computers which lack extremely modern graphics cards (such as mine). Since things like uploading a file involves a lot of guesswork and various aspects of the upload process are very much stochastic (ie. waiting for the server to respond), the smoothing system needs to be able to create a nice smooth animation out of highly erratic information. But it also needs to simultaneously relay information without having an excessive amount of lag.

In order to strike a sort of balance, it uses an interpolation algorithm which was employed in Steve Wittens’ js1k entry and you can see my code here. In order to minimize the CPU tax of running such an animation, it uses the HTML5 requestAnimationFrame function. This has the nice side effect that when you leave it uploading in the background for a while, it automatically stops updating the animation and quickly slides into position when you return.

There are two rings, the outer one being dark gray and the inner one being a somewhat lighter shade of gray (fitting into the mostly unsaturated theme of the interface). The outer ring represents the total upload progress while the inner ring represents the progress of an individual file. The outer ring “spins” in a sense, in a clockwise manner while the inner ring rotates in the opposite direction. Every time a ring completes a revolution, it “inverts”, for instance, when the outer ring fills up and becomes a solid circle, the ring begins to open from the opposite direction so that the imaginary end of the rotation continues unimpeded. The animation is somewhat difficult to describe in words so here’s a demo of the progress animation.

When the page first loads, the polar plot exists in the background, behind the prominent logo. It represents the song quota, the percentage out of the 20,000 song quota which Google Music imposes.

While implementing the algorithm, I had a little trouble because of SSL certificates. The domain which accepts all the protobufs encoded information has a certificate signed for the wrong domain, and browsers have a tendency to care a lot about certificate validity. To get around it, those requests are proxied by a simple server which is made with Google App Engine. All communication with that server itself is signed as well, and it doesn’t do any logging.

When I was about to publish this post and first announced that it works again, there was a bit of trouble which people had getting it to work. This was actually because Google Music allows each MAC address to be associated with a few accounts but forbids any additional accounts. The first released version would use a single mac address for all clients, but it has since been replaced with a sort of unique identifier which is stored in localStorage.

Apparently a browser based uploader may be something coming soon to Google Music. Enjoy this app while it lasts.

 

Posted in Chrome Extensions, MusicAlpha.

Tagged with , , , , , , , , .