

Distributed Pi: Revisited

So I’ve been interested in distributed computing for some time, since around 2007, basically when I started doing web development. I’ve always romanticized the notion of distributed computing because of its vast theoretical potential. Projects like BOINC, SETI@Home and Folding@Home have always given me some kind of idealized notion of “computing for good”, inspiring some useful social, scientific or otherwise beneficial change for humanity. But those projects never got the kind of adoption which could truly change the world; together they form networks which are specialized but can, in the end, only eke out relatively minute performance.

Part of the problem lies in their intrinsically forbidding, voluntary nature. It takes too much effort to install, and that amplifies the problem intrinsic to any “democratic” system: why bother? Voter apathy often stems from a feeling that an individual contribution regresses toward nothing, which is statistically true, but never helpful when participation in one of these collective entities is exactly what matters.

In a broader sense, I’ve always suspected that part of what is necessary for technological progress is the loss of control. It certainly appears true from the human experience: the vast arrays of neurons bound by a cranium are uninviting and wholly unwilling to expose their raw computational power to the emergent consciousness within them. No doubt the human brain is good at computing, just not (as it appears) at math; but intuition (especially of the physical kind) can only arise when a system internalizes some very complex math. We just can’t access it, because evolution or some other process has determined that such access had to be done away with on the climb to higher cognitive powers.

Likewise, I think computers are convergent in that sort of way. There’s a great ladder of abstraction which is growing taller and taller, a Tower of Babel of sorts, and at the end, perhaps we’ll find a similar goal, maybe not of god per se, but an equally sought target of consciousness or some other high-level intellectual faculties currently denied to computers.

Part of this is losing control, which is inevitable and politically dangerous. Computing terminals are powerful and capable of much more than they’re being used for. In fact, most computers spend most of their time idling; processors get ever faster not to handle the idle time but to smooth out the few bursts that actually require fast computation.

It’s impractical for operating systems to build some kind of distributed computing platform into themselves, however ideal that would be. It’s too contentious to ever get adopted, being conceptually a short step away from the ability of a mega-corp to pilfer your precious information as well.

But the browser, specifically the web, provides us with an interesting opportunity. Here, we have significantly less fear for personal privacy, since the web comes with an expectation of sorts for information sharing. The existence and prevalence of client-side scripting gives implicit permission to exploit system resources during a site’s tenure.

Now, with those granted permissions, we have the freedom to create something truly remarkable. It’s not so much voluntary on the part of the user as it is voluntary on the part of the site maintainer, who now has the responsibility of allocating and managing the system resources of his visitors for their brief but additive virtual encounters. The users lose control, and that’s a good thing.

Old Stuff

When I first wrote that introduction, I mentioned that I had been interested in distributed computing for quite a while, and that’s true. 2007 was, at time of writing, five years ago: a bit less than a third of my life, which is fairly significant. It doesn’t even quite belong to recent memory, and perhaps has escaped living memory into something quasi-zomboid. Part of the reason this section is relatively short compared to the others is that I really don’t have a very good memory of what that time was like, and it’s rather sad that my previous attempts never had long writeups regarding their potential and process (however, there do tend to be more little updates about the process).

Anyway, as it’s been such a long time, I’ve been trying to dig through old stuff to pick out exactly what I did back then. The first things I can find are somewhat easier and simpler problems: finding palindromes, or at least words which spell different words when reversed (I believe this was based on a program I had written a few years prior in Visual Basic). Another was for cracking hashes in a distributed manner.

Back then, I think those were among the first explorations into the concept of distributed computing in the browser. It was the days before WebWorkers, and Cross-Origin Resource Sharing was in its infancy. The pool of possible computation was still large, but most of it was forbidden. Take note, however, that distributed and parallel computing have existed for far longer, perhaps even longer than the idea of a computer.

Computers get faster each year thanks to Moore’s Law, and more and more people get connected to the Internet each year. Perhaps in addition to Metcalfe’s law regarding the value of a telecommunications network, there is the ability to harness idle computation power so that the value to each participant grows linearly as the network grows. There are almost a billion people connected to the Internet, and however insignificant a contribution each of those terminals brings to the network, it would add up into something truly remarkable.

However, hash cracking and palindrome finding are fairly trivial. They have little real-world practicality and don’t seem in any way representative of the greater problems facing humanity. While it’s a little impractical to aim for some kind of project with immediate value to humanity, it’s certainly a worthy target worth approximating. They’re isolated examples which are easy to parallelize and hardly count as real ventures into the field.

Calculating digits of pi is marginally less useless, and it represents something significantly harder, a task closer to the kinds of computations performed in the real world. There have been other projects with similar goals, most notably PiHex, which acts as a sort of vindication of this as a possibly legitimate attempt. At this point, I have no idea if the algorithm works with digits on the order of trillions, and that’s one of the big reasons I’m not actually trying to succeed PiHex.

Revisiting

Okay, so I decided to dig this up again.

Why? Because I really want to play around with server-side JS a little more (just in case getting a node vps has anything to do with merit). The sort of funny thing is that’s exactly what it used to be, back in the ancient past, before NodeJS existed (it was before Google Chrome was released, and V8 wasn’t open source). I used to have an application which would schedule jobs for computation on the client and provide them in a more useful format.

However, I never bothered saving the files outside of the AppJet web IDE and the code was lost when the service was discontinued. I tried porting to Google App Engine, but that version was plagued with strange bugs which ended up printing out the wrong digits after a few thousand were computed.

Right now, revisiting the idea, I’ve added a few things which are somewhat more characteristic of how web browsers have changed since my first attempt. The old version tried to mock threading with the liberal application of setTimeout, which meant that most interface interaction wouldn’t be terribly affected, but it did incur a noticeable slowdown. Now it uses WebWorkers, which provides numerous interesting possibilities. First and perhaps most important are context isolation and lightweight, asynchronous embedding in multiple domains (origins). Since WebWorkers can work (see what I did there?) across multiple origins and still have access to XMLHttpRequest, and with the advent of Cross-Origin Resource Sharing (CORS), it’s easy to embed this in a page. The low embedding overhead and the true multi-threading abilities go a long way toward making this more than an intriguing concept and into something much more practical.
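
As a rough illustration of that embedding (the worker filename, message fields and sync URL below are invented for this sketch, not the project’s actual ones):

// Hypothetical sketch: embedding the computation in a host page.
var worker = new Worker('pi-worker.js');
worker.onmessage = function (e) {
  // progress reports come back without ever blocking the page's UI thread
  console.log('digit', e.data.digit, 'progress', e.data.progress);
};
worker.postMessage({ start: 3, digit: 1000 }); // starting prime and target digit

// Inside pi-worker.js, the worker can phone home across origins,
// provided the server answers with CORS headers:
//   var xhr = new XMLHttpRequest();
//   xhr.open('GET', 'http://example.com/sync?digit=1000&prime=4507', true);
//   xhr.send();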

The code, which is now on the Github repository in a new folder named node, has a very basic interface. Rather than manipulating the widths of divs for progress indicators, it now uses <progress> (it’s actually really cool to see how far the state of the web platform has come over the past few years). One cool thing about the progress bar is that it guesses progress by dividing the current prime number by the end point, which has the interesting effect of making the progress bar speed up toward the end (this happens because the density of prime numbers asymptotically declines). I could fairly trivially invert the effect to create a more linear progression, but behavioral studies indicate that people who stare at progress bars all day feel less irritated (i.e. more satisfied) when the bar speeds up at the beginning and the end. Then again, the progress indicator isn’t really meant to be shown to people, and if it is, the user experience is hardly something being optimized for. And the current behavior does nothing to speed up the beginning; in fact it’s quite the opposite, slowing the beginning down. So perhaps it should first linearize the progress and then map it to some function with accelerated starts and ends, perhaps some trigonometric function.
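
Concretely, that two-step remapping might look something like the following sketch (the function names are mine, not the repo’s; the prime counting approximation π(x) ≈ x/ln x is what does the linearizing):

// Linearize progress, given the current prime p and the endpoint N:
// the fraction of primes already consumed is a better proxy for work done than p / N.
function linearProgress(p, N) {
  var done = p / Math.log(p);
  var total = N / Math.log(N);
  return Math.min(done / total, 1);
}

// Optional remapping so the bar rushes at both ends and dawdles in the middle,
// in the spirit of the trigonometric function mentioned above:
function perceptual(t) {
  return Math.acos(1 - 2 * t) / Math.PI;
}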

Part of what is cool about the project is the underlying algorithm, a port of Fabrice Bellard’s optimized version of the Bailey–Borwein–Plouffe algorithm. What’s unique about the algorithm is that it uses comparatively little memory, and as such it’s uniquely suited to distributed computing, especially in the browser environment, which demands relatively thin clients (however, browser caching means that the download would only need to be completed once, and embedding the computation in an iframe with certain storage permissions and appCache could allow persistence). It’s a relatively short algorithm and fairly easy to port.
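
For a rough idea of what digit extraction means, here is a minimal sketch of the plain Bailey–Borwein–Plouffe formula in JavaScript. It is not Bellard’s faster variant and not the code in the repo, and double precision means only the first handful of hex digits at any given position are trustworthy:

// Fast modular exponentiation: (base^e) mod m
function powMod(base, e, m) {
  var result = 1 % m;
  base = base % m;
  while (e > 0) {
    if (e & 1) result = (result * base) % m;
    base = (base * base) % m;
    e = Math.floor(e / 2);
  }
  return result;
}

// Fractional part of sum over k of 16^(d-k) / (8k + j)
function series(j, d) {
  var s = 0, k, denom;
  for (k = 0; k <= d; k++) {            // exact part via modular exponentiation
    denom = 8 * k + j;
    s += powMod(16, d - k, denom) / denom;
    s -= Math.floor(s);                 // keep only the fractional part
  }
  for (k = d + 1; k <= d + 100; k++) {  // short tail where 16^(d-k) < 1
    s += Math.pow(16, d - k) / (8 * k + j);
  }
  return s - Math.floor(s);
}

// Hex digits of pi starting just after position d (0 means right after the radix point)
function piHexDigits(d, count) {
  var x = 4 * series(1, d) - 2 * series(4, d) - series(5, d) - series(6, d);
  x = x - Math.floor(x);
  var out = '';
  for (var i = 0; i < count; i++) {
    x *= 16;
    var digit = Math.floor(x);
    out += digit.toString(16);
    x -= digit;
  }
  return out;
}

// piHexDigits(0, 6) should come out to '243f6a' (pi = 3.243f6a8885... in hex)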

However, as it’s a digit extraction algorithm, it’s not very efficient for calculating digits sequentially. That is actually something of a feature which I neglected in the so-called modern incarnation. Older versions had the prime number (sub-job) allocation done on the server, which meant using up a lot of memory to keep track of the jobs. The current one has a server which is entirely unaware of the actual computational process, which makes it lighter persistence-wise but comes at the obvious cost of losing granularity in scheduling.

One advantage of this strategy is that a client doesn’t have to wait for the server to respond with another job in order to continue. Instead, it operates in an almost completely asynchronous manner. On the client’s first request, the server gives it a job defined by a starting point (a prime number to continue from) and a digit number. From that point, the client keeps computing until that digit is complete, sending its progress back to the server once every specified interval (a couple of seconds).
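
In pseudo-concrete terms, the client’s side of that exchange might look like this sketch (the endpoint paths, payload fields and the resumeComputation helper are all invented for illustration):

// Hypothetical sketch of the client's asynchronous loop (runs inside the worker).
var SYNC_INTERVAL = 10 * 1000; // report back every so often

function getJSON(url, callback) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url, true);
  xhr.onload = function () { callback(JSON.parse(xhr.responseText)); };
  xhr.send();
}

getJSON('/job', function (job) {
  // job.digit: which digit to extract; job.prime: the prime to resume from
  var state = resumeComputation(job.digit, job.prime); // stand-in for the pi code
  setInterval(function () {
    // progress reports double as partial results, so a dropped client
    // only loses at most one interval's worth of work
    getJSON('/sync?digit=' + job.digit +
            '&prime=' + state.currentPrime +
            '&partial=' + state.partialSum, function () {});
  }, SYNC_INTERVAL);
});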

If the client disconnects in the middle of a computation (which can take quite some time, especially for later digits, so it’s more than likely to happen), the server will be able to resume the calculation by sending the request to another client after a certain expiration time has passed. So future refinements to the server (and possibly the client) could make it more efficient at processing digits of lesser significance by allocating jobs with certain checkpoints in mind. For instance, all the prime numbers between 3 and some integer N have to be iterated through (where N is approximately 3 times the number of digits from the radix point). Rather than maintaining a “single-threaded” system (as it currently does; the parallel processing power comes from sending out multiple digit segments to be processed), it could instead send out simultaneous requests for different segments of that prime range. The client may have to be modified to stop at certain designated checkpoints to prevent clients from overlapping.

This sort of scheme would be more efficient than storing jobs as in previous versions, first of all by retaining the sort of “dumb” server which doesn’t do any real computation that contributes to the process. Instead, it only acts as a mediator and persistence layer (perhaps the best role to give to any server). Rather than keeping track of every single prime number as a different row in some database, the server would only need to keep track of as many sections as there are clients.

The new version also has some interesting changes. First is the switch to NodeJS, which was hinted at before. Since it acts as a persistent server rather than something which is called and disposed of like a CGI process, the queuing system is now entirely in-memory. However, as soon as it’s possible, it writes the completed digits out to a file (pi.txt), and if the server is ever interrupted, the computation resumes from the last saved version of the digits (only the jobs which were processing during the shutdown are lost). This would need to change if it were put to a more ideal use of the digit extraction method, since in such a situation the vast majority of jobs would be for a single span of digits (and it would really suck to lose all of those).
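
A stripped-down sketch of that persistence behavior (not the actual server; request routing is elided and the structure here is only illustrative):

var fs = require('fs');
var http = require('http');

// resume from whatever was last flushed to disk
var digits = '';
try { digits = fs.readFileSync('pi.txt', 'utf8'); } catch (e) { /* first run */ }

var jobs = {}; // in-memory job queue; by design, lost on shutdown

function completeDigit(hex) {
  digits += hex;
  fs.writeFile('pi.txt', digits, function (err) {
    if (err) console.error('could not persist digits', err);
  });
}

http.createServer(function (req, res) {
  // hand out jobs, accept /sync progress reports, and call completeDigit()
  // when a digit finishes (all of that is elided in this sketch)
  res.end('ok');
}).listen(8080);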

So in a sense, this update constitutes a regression, a step in the wrong direction, in that it’s using the algorithm wrong. However, in another sense, it paves the way to a setup which has a few more ideal properties: a lighter-weight server end and a more efficient use of computation time on the part of the clients. In essence, it’s the shift from fixed-size discrete computational tasks (or jobs) to mere ranges. The obvious benefit of the latter is the reduced memory demand and the ability to operate continuously, i.e. without pausing to acquire tasks, behaving truly asynchronously (in a very Node.JS style). Perhaps taking it further along this direction (since WebWorkers are allowed to access applicationCache), clients could be given a starting point and just left to rip through a few hundred cycles, synchronizing less frequently. Some specialized data structures might be employed to keep track of individual contributions so that the ranges are doled out optimally. Maybe this should instead evolve into some kind of successor to the PiHex project rather than some incredibly slow and inefficient sequential pi calculation platform.

One thing which I didn’t explore was the concept of having a persistent socket to the server. For this set of circumstances, the benefits of maintaining a persistent bidirectional socket weren’t large enough to warrant that kind of development. Right now, it’s communicating through a few small GET requests to the server. Part of the reason GET was used rather than POST is that it was designed with the idea that it could theoretically run in a cross-domain manner, and sending a POST request requires preflighting with CORS headers. Since it’s only ever really sending 30 or so bytes at a time, any additional overhead should be avoided. But this itself is also perhaps an argument for using WebSockets (though certainly not long-polling, since that incurs even more overhead). Quite often the synchronization requests don’t require the server to give anything back, but HTTP-wise, the server’s always going to have to send about a hundred bytes of headers to fulfill the HTTP requirements. Also, whenever the client sends a request, it wastes about a third of a kilobyte on header information like the User-Agent. Network-wise it might in the end be more efficient just to have a low-overhead persistent socket connection.

Having a persistent socket would make synchronization interesting as well. While the changes wouldn’t be terribly drastic, since there’s still the goal of minimizing transmission overhead, a few changes to the scheduling could be conveyed without necessarily waiting for the next checkpoint. And of course, there’s the trade-off which comes with the question of how far apart the checkpoints should be spaced. Right now it’s something fairly low, in the range of 10 seconds, since the duration of a single page visit usually isn’t much more than a few seconds. However, if a client-side persistence layer were added, the checkpoints could be spaced out much wider. Perhaps instead of every few seconds, one could be made every minute, hour, day or even week, where a single computational task allocation could span multiple domains and multiple days, both while the user is online and offline. However, having wider-spaced checkpoints does incur a cost in terms of synchronization: it reduces the operator’s sense of intimate awareness of what’s actually happening and increases the chance of overlaps.

Another possible step is some kind of framework which handles all of these problems automatically. Instead of having to manually manage variables and the synchronization of tasks between clients and servers, one could construct a domain specific language for distributed computing (a superset or subset of javascript which automatically manages the state of variables in loops and such to compile into some kind of client for generic parallelized algorithms). Maybe it’ll do something cool and look at which loop is the optimal one to split up based on the amount of data which needs to be sent across so that it could be minimized.

One of the things I played with while revisiting this project was LLJS, another kind of specialized language which builds upon Javascript. As it’s marketed, it’s the “bastard child of Javascript and C”, which manually manages memory. I was hoping that using a typed language might bring some speed improvements (it didn’t, really, in this situation). However, LLJS might be a good basis for this auto-magical compiler from synchronous routines into largely parallelized code. Then again, maybe there are in fact limits to parallel computing, and it’s better to search for specific algorithms which have the right properties. And maybe in the end, the problem of porting it to Javascript and managing the client-server communication fades into relative triviality.

Posted in Javascript, Distributed Computing.



CloudFall: A Text Editor

This is probably a horrible time to think of writing an online text editor. The “market”, if you can call it that, is virtually saturated with worthy contenders, and (figuratively) every day is marked by the entrance of some other text editor into the already crowded space. Part of the problem is that web-based text editing components are now actually pretty awesome. Ace and CodeMirror work really quite well, and each has associated with it some very formidable integrated development environments, referring to Cloud9 and LightTable, respectively (albeit there are quite a few others; these just seem to be the best funded). And at Google I/O 2012, they announced the new Chrome Packaged Apps, which expose new functionality (most relevant being the ability to open and save files from the local disk and operating offline by default), and their sample apps include no less than two text editors.

Nevertheless, I’ve always pined (probably an exaggeration) for the opportunity to indulge in something as meta as writing a text editor which I could use as the text editor for the text editor (which is, if you did not notice, that text editor). It’s probably the pinnacle of dogfooding (and probably nice coming after a streak of making extensions that I never actually use). But this doesn’t really go anywhere in terms of explaining why I’m doing this rather than using someone else’s rather better developed text editor package. It really comes down to a Chrome app which I was quite a fan of almost a year ago, called SourceKit. SourceKit is a text editor which runs in Chrome and syncs with Dropbox. The version which was distributed on the Chrome Web Store never supported offline use, which is sort of sad because that’s the one feature I really wanted. The other source of inspiration was Streamie, the real-time Node-based Twitter client which had an absolutely phenomenal scheme for contributing: you just needed to fork the repo on GitHub and all your commits would be visible live on your own subdomain of the site.

Also, in the meantime, I had discovered this pretty awesome text editor (no, it’s not vi or Emacs, because I’m not nearly cool enough or dedicated enough to approach that steep precipice of a learning curve) called Sublime Text 2. I have a pretty good picture now of what exactly I want from a text editor based on that (actually, I don’t know if this is just some delusion which has manifested in my mind, because I probably thought the same way about gedit and Krita and Komodo and Aptana and Notepad2 and Notepad++ and Dreamweaver before that).

The real drive to create that text editor happened in the week before I was scheduled to attend Google I/O 2012. I knew I’d be in some situations lacking Internet, and I felt like writing words or code or something in that time. So in a few days I hacked together this text editor which bears a vague resemblance to Sublime Text. It’s based on Ace, like SourceKit before it, but obviously a newer version with a whole bunch of syntaxes and themes included. It uses a modified version of the Dropbox component from SourceKit (changed moderately to accommodate Dropbox API 2.0 and to deal with binary things like array buffers and blobs). And I added a little heads-up-menu-esque file and command picker, quite reminiscent of Sublime Text (and to a lesser extent Ubuntu Unity).

And then I headed for the airplane, having not actually used it much in practice, hoping to be super productive while using it, which, as you could probably tell by the beginning of this sentence, wouldn’t pan out. The first (quite major) flaw which I encountered was some bug which would end up wiping out any file that I tried to edit, leaving me with a gaping chasm in my hard drive (this is a metaphor, because my Chromebook is solid state). I just hope nothing of value was lost, but it remains quite likely that it was otherwise.

I had no real system for testing out changes to my own extension, and I was too paralyzed by the fear of deleting my own extension to try changing it. So the end result was that the entire duration passed with me hardly doing anything productive on the project, or using it for any productive means in itself. All that happened was my discovery that everyone is working on a text editor right now and I should probably quit right then and there and work on something perhaps more novel or productive. But I came back and fixed it and did some more work on it, and I’m finally at the point at which I feel comfortable using it for some mostly useless things, like writing a blog post inside the project’s readme about the project itself.

Welcome to CloudFall. Yeah, I’m aware how dumb it sounds, but the fact that the new James Bond movie is going to be called Skyfall essentially demolishes any hope of using that name. But rather than taking this as a vindication of how cool that name would be and abandoning it for some novel name, I’m just going to contrive something in a similar vein, hence the current working name. But rather than spending the first few lines complaining about the name of the product, I should probably lay down what exactly this project aims to be. It’s a text editor, not a terribly glamorous concept, I know, and this is especially not a terribly great time to start. This is hardly the first text editor, and it certainly will not be the last (unless this either never finishes or the world actually does end by the end of this year). It’s not the most full-featured, but I guess it does need to have something which distinguishes it.

The main feature is largely inherited from SourceKit: the ability to sync with Dropbox. And instead of editing “live” off of Dropbox’s servers, it really is more of a sync. It’s built around Chrome’s FileSystem API, so it has its own sandboxed imaginary folder where it sticks all the files. Every time a file is loaded, it’s first downloaded into its spot in the imaginary folder, and subsequent edits get sent to both the local copy and the server. This architecture, in theory, means it probably won’t inadvertently overwrite, corrupt or delete your important data in the event of some syncing issue. It keeps track of the latest synced revisions and passes that information to Dropbox’s API so that it won’t try overwriting a newer version from elsewhere, and it’ll somewhat gracefully save to a different file in such a circumstance (though that routine can hardly be declared well tested, so beware of complications). It should in that way retain very close to full functionality while offline, since just about everything it does is entirely offline (including the compilation of CoffeeScript and LESS, which is explained later on).
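
In rough strokes, that sync flow amounts to something like the following sketch, where dropbox.get/put and local.write are invented stand-ins for the real Dropbox and FileSystem components:

var lastSyncedRev = {};

function openFile(path, done) {
  dropbox.get(path, function (contents, rev) {
    local.write(path, contents, function () {  // cache into the sandboxed FileSystem
      lastSyncedRev[path] = rev;               // remember what the server last saw
      done(contents);
    });
  });
}

function saveFile(path, contents) {
  local.write(path, contents, function () {    // the local copy always wins first
    dropbox.put(path, contents, lastSyncedRev[path], function (newRev, conflicted) {
      if (conflicted) {
        // the remote side had a newer revision; save elsewhere instead of clobbering it
        dropbox.put(path + ' (conflicted copy)', contents, null, function () {});
      } else {
        lastSyncedRev[path] = newRev;
      }
    });
  });
}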

Built into the extension are a few of the tiny tweaks that I have installed on Sublime Text and find fairly useful. For instance, the app includes a copy of the CoffeeScript and LESS compilers, so whenever you are editing one of those types of files in the text editor, it’ll automatically compile and save the result into the same directory whenever you hit the save key. That’s actually pretty useful because it gives CoffeeScript back that REPL (almost certainly a malapropism, but I’m not too familiar with developer work-cycle jargon, so please excuse that grievous offense) dynamic that JS developers are so familiar with. And to aid that sort of work, it can preview your files “live”, even offline. While it can’t open and edit binary files like pictures, it can still download and display them fine.
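
The compile-on-save hook is conceptually tiny. A sketch, assuming the bundled compilers are loaded: CoffeeScript.compile is the browser compiler’s entry point, while compileLess and saveFile here are stand-ins for the app’s own plumbing rather than real names.

function onSave(path, source) {
  if (/\.coffee$/.test(path)) {
    var js = CoffeeScript.compile(source);              // throws on syntax errors
    saveFile(path.replace(/\.coffee$/, '.js'), js);
  } else if (/\.less$/.test(path)) {
    compileLess(source, function (css) {                // whatever the bundled LESS compiler exposes
      saveFile(path.replace(/\.less$/, '.css'), css);
    });
  }
}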

For writing walls of text which don’t really need syntax highlighting (I wonder if someone has a package on mainstream text editors that syntax-highlights free-form English grammar, sort of like giving people synesthesia for no real reason), it includes a word count widget. Also, because I feel like encouraging unhealthy behavior, right next to the word count is an indication of your current typing speed. I’m pretty sure someone could go into some moral discussion of why it’s a completely detrimental addition to create something which displays how quickly the author is typing, because it encourages a mindset which doesn’t focus on creating the most concise or effective means of delivering some message, but rather promotes meaningless verbosity. But sometimes I imagine using this for school assignments and the like, and maybe it would be nice to know when I’m reaching my designated quota, filling some unwritten word requirement. I’m not quite sure how reliable a word count it gives, since the algorithm is primitive and by no means nuanced, but it should be able to give a rough estimate of the number of words on a page.

I may have mentioned earlier that there are two primary systems for interacting with the application, both of which are keyboard shortcuts: Ctrl-O and Ctrl-P, for the file browser and the command palette, respectively. They both appear in the same sort of interface component and generally behave the same, but there are some slight differences in how they operate (obviously). There are actually two “browsing modes” for the file browser, where it shows files either from the local stored cache or online in Dropbox. That can be switched by selecting either “Browse Mode: Local Copy” or “Browse Mode: Remote Dropbox” from the command palette (though there should probably be some better interface logically placed on the file browser panel itself). Remote Dropbox only works when the user is online, so the default mode is the local one. The sole interface to this list of files is the filter search box on the bottom of that widget, where typing in stuff filters out things. Of the visible items, the cursor can be moved with the arrow keys, and one of the options can be executed by hitting the enter key. The widget needs to be manually dismissed by hitting the Esc key (which allows you to semi-rapidly open several files or try different commands, like changing themes). When a folder is activated, the context then shifts to what is inside the folder, descending into the hierarchy. It’s possible to ascend by entering the directory “../”. To create a file or folder (note that at time of writing, creating folders does not sync to Dropbox), just type “+yourfile.txt” (or, if there are no search results, it’ll automatically select the option to create a file with that name). To delete a file, you likewise just prepend your command with a minus instead, for instance “-oldfile.txt”. It’s not nearly as counterintuitive or confusing as I presume this description is making it sound.

The command palette, given that the interface is basically the same, has much in common with the file browser, but it’s entirely flat and linear. It lacks the hierarchy that plagues the file system, and navigating it is considerably simpler. Since the list is filtered and updated live with every keystroke, it’s fairly easy to find whatever command you’re looking for. The filtering algorithm only searches for the query’s characters in order and disregards the gaps between them. In other words, the query “seals” may match “SEt SyntAx: coLdfuSion”, which also means that you can construct very short queries to find what you want.
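
That filter is only a few lines; a sketch of the idea (names are mine, not the app’s):

// True if the query's characters appear in the candidate in order,
// ignoring case and everything in between.
function matches(query, candidate) {
  var q = query.toLowerCase(), c = candidate.toLowerCase();
  var i = 0;
  for (var j = 0; j < c.length && i < q.length; j++) {
    if (c[j] === q[i]) i++;
  }
  return i === q.length;
}

// matches('seals', 'Set Syntax: ColdFusion') === true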

At this point, I don’t think it’s a finished project per se; there are still a few features which are woefully dysfunctional, like Find and Replace. But I think it fits my own use case: editing files on my Chromebook, even when offline. It’s fast and doesn’t take too many steps to start. But the interface isn’t exceedingly intuitive (basically everything is accessed through two keyboard shortcuts). I haven’t found a witty logo for it (I just haven’t bothered looking), and at this point I’m just writing a blog post about it because I feel obligated to document whatever it is that I just did.

Maybe some day I’ll pick it up again, especially if there’s some response at all to this blog post, and add those awesome collaboration features (which probably won’t be very useful because the entire thing is quite hackish and not particularly kind to the prospect of improvements). Maybe I’ll fix up the search and replace interface (read: get rid of whatever horrid mess exists and replace it with a new one from scratch). And maybe I might publish it in some form that is significantly less involved than cloning it off GitHub and packaging the app yourself.

Posted in Chrome Extensions, CloudFall.



Pinball

Last year, for Computer Science 1, I built a pinball game in Java. Yes, Java, as in sans-script. I wrote it in Eclipse and it was pretty functional. Of course, the biggest part of the game was the collision detection, collision detection which, mind you, I built with only the slightest knowledge of how it’s actually done. So yeah, I made up a scheme. It sucks, but at least it sort of works. I was actually fairly proud of it, since it was basically the only thing which was mildly interesting and non-teacher-directed (that is, it took more than one hundred ninety-two seconds to write after the teacher gave a cue to attempt the task). I thought it was so cool, in fact, that I decided to port it over to both Javascript and Coffeescript, the latter being a language which was uber-hip and awesome (I believe these events transpired approximately at the time of Coffeescript’s big one-oh release).

In essence, the collision detection scheme I devised was pretty simple. The ball experienced gravity in the form of a constant downward acceleration, so it would arc up and down. That was maintained by a variable (actually an array to mimic a vector) named position, and another one named velocity, where the y-coordinate of the latter was incremented by some term every time a frame was rendered. The collision detection system deals with the intersection between lines. First it takes the current position of the ball and envisions the current velocity as a line extending infinitely in that direction. All the other attributes of the environment are also lines, and the algorithm treats them all as unbounded (rather than as segments). It quite trivially computes the intersection between the lines and calculates (with Euclidean distance) how far that intersection is from the ball’s current point. If it detects a collision (the distance is less than the specified radius of the ball), then the ball’s direction is altered based on its current position and the tangent of the line it intersected with.
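
Here is a condensed sketch of that scheme, not the original Java or Javascript code: it uses point-to-line distance rather than the ray-intersection distance described above, and the constants are made up, but the bounce logic is the same idea.

var GRAVITY = 2000; // px/s^2, arbitrary

// walls: infinite lines given by a point (x, y) and a direction (dx, dy)
// ball: { x, y, vx, vy, r }
function step(ball, walls, dt) {
  ball.vy += GRAVITY * dt;                         // constant downward acceleration

  for (var i = 0; i < walls.length; i++) {
    var w = walls[i];
    var nx = -w.dy, ny = w.dx;                     // wall normal
    var len = Math.sqrt(nx * nx + ny * ny);
    nx /= len; ny /= len;

    var dist = (ball.x - w.x) * nx + (ball.y - w.y) * ny;  // signed distance to the line
    var approaching = ball.vx * nx + ball.vy * ny;         // velocity toward/away from it

    // within one radius of the wall and heading into it: bounce
    if (Math.abs(dist) < ball.r && approaching * dist < 0) {
      // reflect the velocity about the wall's tangent
      ball.vx -= 2 * approaching * nx;
      ball.vy -= 2 * approaching * ny;
    }
  }

  ball.x += ball.vx * dt;
  ball.y += ball.vy * dt;
}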

Fast forward a few months, and it’s July of 2011. Since then, I had basically forgotten about the pinball game, because it really isn’t cool enough to be worth revisiting. But I was at Virginia’s Summer Residential Governor’s School for Science/Math/Technology (hardly wholly unexpected). It was some dark night and I was in the back row of a white shuttle bus en route to an observatory where artificial light is verboten. Well, it started raining, and the trip was cancelled, and we were going back, and I was bored. So I pulled out my iPad, which I had for some reason, and decided to look at what I had on my Dropbox which I could play around with offline. Lo and behold, there’s a pretty crappy Javascript pinball implementation (which, notably, never having been finished, is just a ball bouncing between a few lines). I changed the parameters and made weird shapes with the ball hopelessly bouncing around in awkward trajectories, and I thought it was cool.

A friend sitting next to me noticed that the balls were speeding up. He inquired whether or not I was using the Euler method, and I didn’t know what that was. Either way, inadvertently, I was in fact using it. But the point was that here he was, making a fuss about the fact that my simulation in two dimensions blatantly ignores the conservation of energy. And as such, I was compelled to find a solution. I studied the code for a bit longer, trying to expose the nasty snippet which might have made it evade the laws of physics, albeit to no avail. In the midst of my struggle, he noticed at some point that since the trajectory paths are simple parabolas, it should be possible to find a closed-form solution to the collision problem (rather than my iterative approach). In the end, I nobly gave up on my epic quest to fix the universe and worked on some other crap.

A week ago, I finished my first formal (that is, in school) calculus class, where I learned what Euler’s method was and how I had inadvertently implemented it. But I still had to take the class final, which the teacher rather kindly decided to make extremely easy. And since I also had a huge math portfolio (which I decided to typeset with LaTeX, as I have for nearly every paper this year, for no real reason whatsoever) due that day, he made it take-home, with the only caveat that we weren’t to collaborate on the final in his presence (e.g. after school in his room), a subtle nod of assent to all other forms of aid. Well, at home, I thought that I should probably make use of all the tools I have at my disposal. I thought of that copy of Mathematica which hopefully wasn’t collecting dust on my hard drive platter, because that might lead to a head crash, but which alas was figuratively gathering dust.

I decided to start playing around with the awesomeness that is a computer algebra system, named after a guy whose last name is that otherwise inexplicable W on the periodic table (purely coincidental, unless Stephen Wolfram is actually a British iron manganese tungstate mineral). In what felt like no time (at least it seemed interesting and productive enough that I in no way thought to discern the real time which elapsed), I had completed my final exam in a somewhat minimalistically formatted Mathematica notebook. I had, figuratively, fallen in love.

Soon afterwards, as I was working on some other completely unrelated project (creating an intraframe-only WebM encoder), I needed something somewhat cool to generate a series of frames to be encoded into video. Digging through my (newly reorganized) Dropbox project folder, I found that old incomplete pinball game. And I remembered that story and that brief suggestion, and decided to use my newfound Mathematica prowess (that is, my ability to use Solve[] with some degree of it not puking) to find that solution to the collision between a parabola and a line. To my dismay, Mathematica returned an ungodly mess of an equation. And it also didn’t work for vertical lines.
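
For the curious, the closed-form idea boils down to substituting the parabolic trajectory into the line’s equation and solving a quadratic in time. A small sketch of that reduction (nothing like the Mathematica output, and it ignores the ball’s radius); the vertical-line case simply falls out as the linear branch:

// trajectory: x(t) = x + vx t,  y(t) = y + vy t + g t^2 / 2
// line: a x + b y + c = 0
function collisionTime(ball, line, g) {
  var A = 0.5 * line.b * g;
  var B = line.a * ball.vx + line.b * ball.vy;
  var C = line.a * ball.x + line.b * ball.y + line.c;

  if (Math.abs(A) < 1e-12) {              // vertical line (b = 0) or no gravity: linear
    return B !== 0 ? -C / B : Infinity;
  }
  var disc = B * B - 4 * A * C;
  if (disc < 0) return Infinity;          // the parabola never crosses this line
  var r = Math.sqrt(disc);
  var t1 = (-B - r) / (2 * A), t2 = (-B + r) / (2 * A);

  var t = Infinity;                       // earliest strictly-future intersection
  if (t1 > 1e-9 && t1 < t) t = t1;
  if (t2 > 1e-9 && t2 < t) t = t2;
  return t;
}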

But eventually, I did manage to get it to work. And thankfully now it obeys the laws of physics, at least to about 10×10^-13. And I also realized that that demo isn’t actually that cool, but I used the older version anyway for my video encoder demo. So that’s the happy ending: everything lived happily ever after, despite the fact that the algorithm is still totally buggy and breaks when things are either too fast (velocity > 4000px/second) or too slow (velocity < 1px/second). And with corners, it occasionally fails. But I’m long since past the threshold of caring. At least now you can speed it up a lot, because the path prediction and the collision detection are completely separate from the render cycle (which also lets the speed become independent of processor speed). It uses requestAnimationFrame and other stuff, and if you want to try it, here.

Posted in Games, Pinball.



Visualizing Facebook Activity

You might have noticed that I haven’t written much for this blog in the past few months. In truth, it’s because of school work, which has never really been an issue before. This is quite probably the least productive stretch of time in my life thus far. I have a suspicion that this issue is more psychological than due to some radical increase in workload, but I haven’t looked into testing that hypothesis (I’ve been collecting data hour-by-hour about what I’ve been doing over the past two years, so I could probably look into it if I were actually interested in the matter). But school’s nearing a close, and hopefully I can get back to a more productive lifestyle, maintaining my blog and, most importantly, trying out cool things. I have a few things which I am working on at the moment which should be completed in the coming weeks (though I make no assurances). But since I have an internal goal of writing one blog post per month, I’m going to recycle a project from December of 2011.

Nearly every day, I inevitably end up glancing at my Facebook “buddy list” of sorts, wondering how many people are online. It’s a figure which almost always seems to depend on the time of day, and it behaves almost like clockwork: there’s always a massive swarm of people online around 10-11pm, and hardly anyone is ever online at 4 in the morning. I guess the problem with drawing any conclusions from this in particular is how specific a group this graphic represents. It constitutes my friends, and in particular, my Facebook friends. Essentially all of them are people I’ve encountered in real life, and may or may not actually take interest in. But the thing that unites just about everyone is that they’re generally high school aged.

Before going on to discuss how pretty a chart this is, I think it’s worth going through what the chart actually represents. It’s quite easy to tell that this is in fact a polar chart, and from the inner circle, you can tell that it’s a 24-hour clock. Each of the rings represents a friend, and the rings are sorted by the total amount of time spent on Facebook in the given period. So you can see that toward the middle, the graph is almost opaque at every time, whereas on the fringes, the online activity is quite erratic and infrequent.

So, where does this data come from? It’s actually quite simple to get from the Facebook API. I have a cron job which runs every minute to run an FQL request and save the results to a specific log file.

The actual FQL which runs in order to retrieve the list of online users is

SELECT uid, name, online_presence FROM user WHERE online_presence IN ('active', 'idle') AND uid IN (SELECT uid2 FROM friend WHERE uid1 = me())

Basically: get the user ID, the name, and the online presence state of friends of the logged-in user who are either active or idle. Since Facebook is an OAuth2-type API, you need an access token in order to do anything cool. I just use the Facebook Graph API Explorer to generate my access tokens. Just go press “Get Access Token”, and select (at minimum) the permissions “user_online_presence”, “friends_online_presence” and “offline_access”. Then copy and paste the revealed token into some authkey.txt and you should be set.
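
The cron job itself is just a dumb fetch-and-append. As an illustration only (the original uses a separate script, and the log filename here is made up), the per-minute polling step amounts to something like this against the Graph API’s fql endpoint:

var https = require('https');
var fs = require('fs');

var token = fs.readFileSync('authkey.txt', 'utf8').trim();
var fql = "SELECT uid, name, online_presence FROM user " +
          "WHERE online_presence IN ('active', 'idle') " +
          "AND uid IN (SELECT uid2 FROM friend WHERE uid1 = me())";

var url = 'https://graph.facebook.com/fql?q=' + encodeURIComponent(fql) +
          '&access_token=' + encodeURIComponent(token);

https.get(url, function (res) {
  var body = '';
  res.on('data', function (chunk) { body += chunk; });
  res.on('end', function () {
    // append one timestamped line per poll to the log file
    fs.appendFileSync('presence.log', Date.now() + '\t' + body + '\n');
  });
});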

I have a Python script to go through the log file and render it as the polar chart which is depicted at the top of the page. The code used for that is frankly atrocious, and the output is even more so. It uses the Python Imaging Library, which is a lovely library, except not for drawing graphics. There isn’t any smoothing or anti-aliasing on the arcs drawn by PIL, and they all look hideous. So I render the chart at some absurdly high resolution and down-resize it in GIMP while adding layering, blurs and opacity in order to make the picture somewhat less atrocious. Also, it doesn’t support restricting the chart to a specific day of the week, even though it might be interesting to see how the trend differs on a weekday versus a weekend.

Something interesting about the appearance of the polar graph is that it almost resembles a digital fingerprint, and that brings up some interesting privacy considerations. Inside that graphic are the Facebook browsing habits of some two hundred people. There’s the question of how much this changes day by day for users, and to what extent it can be used to identify people. And even if a single ring doesn’t unambiguously represent a single person, the two hundred or so rings of their friends probably go pretty far toward identifying them. There’s also a striking amount of uniformity that says a lot about the type of people I tend to associate with. Just at a glance, one can tell that there are very few people I’m friends with on Facebook who live in different timezones. Maybe what’s more dangerous than being able to identify a person is being able to identify what kind of groups that person belongs to. And over the course of a day, just about everyone checks Facebook a few times.

Posted in Data, Facebook Visualization.



Chrome Multitask Mode for Real with Multi-Pointer Xorg

Google’s multitask mode was only an April Fools’ joke, but the inordinately amused among us (read: me) might even venture into attempting such an irrational feat. While watching the video, it evoked some memories of something I had stumbled across a few years ago, called Multi-Pointer X. The rationale for creating that wasn’t nearly as insane as the Chrome Multitask video made it out to appear; instead, I think it was just the framework to support multitouch and other sorts of alluring interactions on the Linux platform.

So I decided to look into that again, and within a few minutes of Googling, it appears that Multi-Pointer X has now been incorporated into the X.org mainline as part of XInput2. So what exactly does it take to conjure and insult the multitasking gods?

The first step is quite logically to plug in another mouse. I happen to have three mice plugged into my computer anyway (there’s a perfectly logical rationale for how this came to be: my keyboard is some wireless Microsoft keyboard+mouse set, but I really hate the scroll wheel on the Microsoft mouse, so I instead used another mouse for the longest time until it started failing, and sheer lethargy prevailed over disconnecting that obsoleted peripheral), but regrettably, I don’t have an additional prehensile appendage to operate the third mouse.

Right now I’m using a pretty much unmodified version of the latest version of Ubuntu, 11.10. I just opened up a terminal window and typed in “xinput list”.

antimatter15@antimatter15-desktop:~/online$ xinput list
⎡ Virtual core pointer id=2 [master pointer (3)]
⎜ ↳ Virtual core XTEST pointer id=4 [slave pointer (2)]
⎜ ↳ A4Tech PS/2+USB Mouse id=8 [slave pointer (2)]
⎜ ↳ Microsft Microsoft Wireless Optical Desktop® 2.20 id=11 [slave pointer (2)]
⎜ ↳ USB Laser Wheel Mouse id=9 [slave pointer (2)]
⎣ Virtual core keyboard id=3 [master keyboard (2)]
↳ Virtual core XTEST keyboard id=5 [slave keyboard (3)]
↳ Power Button id=6 [slave keyboard (3)]
↳ Power Button id=7 [slave keyboard (3)]
↳ Microsft Microsoft Wireless Optical Desktop® 2.20 id=10 [slave keyboard (3)]

Use the command “xinput create-master Auxiliary” in order to create a new input device, and try running “xinput list” again to confirm that the command has done your bidding.

antimatter15@antimatter15-desktop:~/online$ xinput create-master Auxiliary

Now it’s time for the third (or is it fourth?) and final step: to reattach one of those slave pointers to the auxiliary master pointer. To do that, pick out some mouse. I picked my “A4Tech PS/2+USB Mouse”, which was my mouse with the broken scroll wheel. You can see that it’s been given id=8, so that’s the number I’ll be using for the next step.

antimatter15@antimatter15-desktop:~/online$ xinput reattach 8 "Auxiliary pointer"

And then I can now use two mice at the same time for whatever ungodly purpose I desire.

It’s hard to take screenshots, since it appears that every screenshotting or screencasting app I can find makes the wholly unwarranted assumption that a desktop computer only has one cursor, but it does actually work. Though I don’t think I’ll ever be able to multitask through this scheme; it’s mentally jarring in far too many ways.


Posted in Fun.



MusicAlpha v2.0

MusicAlpha works again, and this wasn’t some minor update that fixed a small hole; the entire app has been rewritten. There’s a new user interface with some really quite beautiful animated polar progress bars, but the most significant part is that the mystical protobufs-over-HTTPS protocol has been decoded and documented.

It really started on January 30th, when Simon Weber contacted me for information on the upload process, because his Unofficial-Google-Music-API had received various user requests to implement uploading. I remembered a message sent a little while earlier offering help decoding the protobufs and tried out the method described, which involved using a disassembler to patch the curl_easy_setopt call to disable checking of SSL certificates. This turned out to be the trick, and I used Fiddler to collect a few dumps of data for subsequent analysis. That was really quite a magical moment, because everything afterwards was relatively straightforward. The huge stumbling block which had prohibited progress for the past months had been lifted, and the task which was left, reverse engineering the protocol, seemed eminently viable. With only minor hiccups over the next few days, I built a Python prototype of the program. I created a .proto file which was somewhat human readable, with field names based on guesses. We realized some pretty cool things, or at least got a much better understanding of how Google Music seems to work (this blog post was started in the early part of February, but I’m finishing it up halfway through March).

For instance, the reason Music Manager doesn’t work on virtual machines is that it uses the computer’s MAC address as a system for identifying individual devices. There’s a cap on the number of devices which can be authorized to manage your music, which is currently set to 10. This caused a bit of confusion when I hit that device cap while testing (at one point I assumed that the MAC-esque field was actually a random string, so after fifteen or so tests everything started silently failing). However, it occurs to me that it isn’t particularly interesting to read the narrative of how exactly the current version of the application came to exist. The technical overview of how the actual protocol works can be found on the google-music-protocol project page on Github. It includes a brief overview of the mechanism as well as sections which detail the specific messages which are sent to the server, and it also includes a simple reference implementation. Hopefully it should be enough information for anyone who is actually interested in doing something which relates to it (though I am curious if anyone could possibly create something interesting out of it; please contact me if you decide to do anything at all with it). So at this point, I’d like to skip over to the actual application-specific aspect: MusicAlpha itself. This is the story of how the rewrite came to be, and of the rather lengthy time it took to write this blog post describing it.

One of the first things I realized was that the entire application would essentially need to be rewritten. The old code dealt primarily with altering the attributes of a song by mimicking the web client’s interface, something which wouldn’t be very useful now that all the metadata is sent via the magical protobufs scheme. Since that code had to be scrapped, I figured why not scrap the rest. I wrote the first version while experimenting with darker color schemes and obscenely prominent border radii and gradients. I’m not sure that I can say the new design is objectively better, but it does look a bit more interesting, especially that big two-ringed polar chart.

There’s some text on the right side, but it’s not particularly of interest. There’s a big file selection button, but that’s pretty uninteresting. Embroidered on the solid gray backdrop is the logo, a simple music symbol, unchanged from the last version. I used, for the first time, custom web typography: just some fancy-looking sans-serif fonts I picked from Google’s font library. The old application had two progress bars to show the upload progress, one for individual songs and the other for overall progress. I wanted to portray upload progress in a manner which was somewhat smoother and more aesthetically alluring than a mere linear bar.

In trying to create that better progress bar, I created a polar progress bar in canvas. All updates are smoothly interpolated, and watching the bars spin around and invert is somewhat reminiscent of staring at an old record spin on a turntable, something which I’ve only seen a few times in my (rather short, at this moment) life, but magical enough to still inspire a sense of nostalgia.

Central to making an aesthetically appealing polar chart is making the animation smooth. This might be doable through various esoteric CSS transforms, but I’m not particularly knowledgeable in that area, and I’m not a huge fan of crafting elaborate animations with CSS. It lacks a certain amount of control (which manually updating and redrawing affords) and tends to feel extremely sluggish on computers which lack extremely modern graphics cards (such as mine). Since things like uploading a file involve a lot of guesswork and various aspects of the upload process are very much stochastic (i.e. waiting for the server to respond), the smoothing system needs to be able to create a nice smooth animation out of highly erratic information. But it also needs to simultaneously relay information without an excessive amount of lag.

In order to strike a sort of balance, it uses an interpolation algorithm which was employed in Steve Wittens’ js1k entry and you can see my code here. In order to minimize the CPU tax of running such an animation, it uses the HTML5 requestAnimationFrame function. This has the nice side effect that when you leave it uploading in the background for a while, it automatically stops updating the animation and quickly slides into position when you return.
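
The gist of that interpolation is a frame-by-frame exponential ease toward whatever the upload code last reported. A sketch (the constants are arbitrary, the two counter-rotating rings are simplified down to a single arc, and it assumes a canvas element on the page):

var reported = 0;   // latest (erratic) value from the upload code, 0..1
var displayed = 0;  // what the ring actually shows

function drawRing(t) {
  var ctx = document.querySelector('canvas').getContext('2d');
  ctx.clearRect(0, 0, 300, 300);
  ctx.beginPath();
  ctx.arc(150, 150, 100, -Math.PI / 2, -Math.PI / 2 + t * 2 * Math.PI);
  ctx.lineWidth = 20;
  ctx.strokeStyle = '#444';
  ctx.stroke();
}

function frame() {
  displayed += (reported - displayed) * 0.1;  // exponential ease toward the target
  drawRing(displayed);
  requestAnimationFrame(frame);               // throttled automatically in background tabs
}
requestAnimationFrame(frame);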

There are two rings, the outer one being dark gray and the inner one being a somewhat lighter shade of gray (fitting into the mostly unsaturated theme of the interface). The outer ring represents the total upload progress, while the inner ring represents the progress of an individual file. The outer ring “spins”, in a sense, in a clockwise manner while the inner ring rotates in the opposite direction. Every time a ring completes a revolution, it “inverts”: for instance, when the outer ring fills up and becomes a solid circle, the ring begins to open from the opposite direction so that the imaginary end of the rotation continues unimpeded. The animation is somewhat difficult to describe in words, so here’s a demo of the progress animation.

When the page first loads, the polar plot exists in the background, behind the prominent logo. It represents the song quota, the percentage out of the 20,000 song quota which Google Music imposes.

While implementing the algorithm, I had a little trouble because of SSL certificates. The domain which accepts all the protobufs-encoded information has a certificate signed for the wrong domain, and browsers have a tendency to care a lot about certificate validity. To get around that, those requests are proxied through a simple server made with Google App Engine. All communication with that server is signed as well, and it doesn’t do any logging.

When I was about to publish this post and first announced that it works again, there was a bit of trouble people had getting it to work. This was actually because Google Music allows each MAC address to be associated with a few accounts but forbids any additional ones. The first released version used a single MAC address for all clients, but it has since been replaced with a sort of unique identifier which is stored in localStorage.
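
Generating and remembering that identifier is a couple of lines; a sketch (the localStorage key name here is made up):

function clientId() {
  var id = localStorage.getItem('musicalpha-uploader-id');
  if (!id) {
    // mint a random hex string once, then reuse it forever on this browser
    id = '';
    for (var i = 0; i < 12; i++) id += Math.floor(Math.random() * 16).toString(16);
    localStorage.setItem('musicalpha-uploader-id', id);
  }
  return id;
}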

Apparently a browser based uploader may be something coming soon to Google Music. Enjoy this app while it lasts.


Posted in Chrome Extensions, MusicAlpha.



liveswifers.org forum mirror

I first registered an account on liveswifers.org on December 11th, 2006. I was 11 years old at the time, which was around when the Ajax Animator began. I’m not sure of that, because five years past certainly constitutes ancient history for a teenager. The site was a huge influence on the development of the Ajax Animator, and it was there that I first encountered some of the future contributors to the project. In fact, the community was kind enough to create an entire subsection of the forum intended to nurture discussion of my pet project, which paralleled their eternal vaporware attempt at resurrecting their namesake program.

Over the next few years, the community decayed and the site became desolate and spam-ridden. There was a period in late 2008 when every indication was that the site would come to an abrupt demise when the domain registration expired. The still-active community created a backup on some other forum hosting site and prepared for the worst. I did my part by running WinHTTrack and mirroring the site on my hard disk. It turned out the domain was renewed, and the panic was for nought.

But, when the website finally became a desolate and abandoned wasteland a few years later, the domain lapsed and all the content was lost.

In a nostalgic fit earlier today, I dug up that archive and uploaded it to Google Code. Here you can browse the near entirety of the liveswifers forum as it was, frozen in carbonite those three years ago. I can’t quite place that date, December 2006, in context, but I would expect it to be approximately when the seeds of inspiration were planted. And so, maybe not for anyone else, but this site and all its content hold a special place in my mind, and deserve a final resting place shielded from the harsh internet.

Posted in Ajax Animator.



This site is going down tomorrow in protest of SOPA/PIPA

Not that anyone is going to particularly miss my blog (most of my other web pages will still be up because I’m far too lazy to go through all of them in order to make a slightly less futile protest). But this is one of the most interesting internet-wide events that I’ve ever encountered, and I should be ashamed not to take part in it. Copyright and its ongoing war with libertarian and anarchist mentalities has fascinated me ever since squinting through Free Culture on my iPhone more than five years ago. Going to Jonathan Zittrain’s Free Culture X keynote a few years ago was fun too.

I’m sure most would feel uncomfortable with my characterization of this as some cyberlibertarian movement, but I think it’s an entirely excusable position for someone living through a period of adolescence characterized by rebellion and Ayn Rand. It’s certainly not the only perspective, but it seems the most poignant and consistent, not that consistency is necessarily a goal of any legislative body. Legislative bodies are meant, to borrow from programming jargon, to monkey-patch a framework that ill-approximates the societal expectation of government.

Now excuse me while I contrive a metaphor that relates to something I don’t know much about and will seem cringe-worthy to anyone knowledgeable in the subject. Let’s say that the Earth is the shape of what the ideal government is. Not ideal in the sense that it’s perfect and elegant, but in the sense that it is empirically derived through the scientific process of experiment. In large part, this mass conforms to a very general principle (maximizing individual freedoms); I would characterize this as approximating the earth as a sphere. For nearly all applications, this rather crude geometric model works astonishingly well, leading some to even believe that this is what the world actually is: that everything else, the obsession with finding imperfection, is actually a delusion, and everything comes from measurement error (well, that’s where this metaphor breaks down, because nobody actually does this). Legislators shape the body of law to conform with the needs of society by poking and adding ridges where the overlay is incongruous. They do the passive work of refining the model, the geoid, to incredible precision. But there’s a more radical way: molding the earth into what our model was. We can forget about the lapses, and soon enough the surface of the earth will erode down into that perfect sphere of ideology.

There. I think I’ve fulfilled my obligation to write some words with some ostensible meaning with regard to something that pertains rather dearly to my life. If this passes and we descend into a dystopian nightmare reminiscent of 1984, at least I’ll have something to look back on. An old man filled with regret, waiting to die alone. Or at least without an internet. I did my best (but still far from enough, as is anything besides martyrdom) to preserve herd immunity to hazardous legislation, but alas we were stricken.

Posted in Meta.



Google Circles was Launched on Tau Day

I realized this a while ago, on June 29th, and started this blog post but never got around to filling it with content; still, I’m pretty sure the title can stand on its own. For those unfamiliar with tau, there’s this movement that says that pi is a bad circle constant: more often than not, mathematical equations use 2pi rather than pi, and for purity’s sake, that should be the value of the circle constant.

Google+ was launched on June 28, 2011 (or at least that was the day the limited field testing began, but I was part of that, so that’s the date I consider). In other words, 6/28, or 6.28.

Posted in Fun.



Updated Ajax Animator

Well, it’s the first update in about two years, and it wasn’t a big change.

Posted in Ajax Animator.
