somewhere to talk about random ideas and projects like everyone else



Facebook Chat Bot, Turing Completeness 31 December 2012

Okay, so I really want to maintain at least one blog post per month, and here I am, right before New Year’s with no apocalypse in sight to rapture me away from having to write a blog post. I don’t have anything particularly interesting ready to share, so here’s something fun that only took a few hours.

The rather long Bayeux tapestry of an image I have crammed to the right marks the culmination of a series of rather odd tangents. It also serves as a reminder for me to abandon hope#Overview_and_vestibule_of_Hell), because the shape of the conversation forces whatever post I plan on writing to be verbose enough as to fill all that vertical space so that the actual text here isn’t dwarfed by the image, which would be aesthetically jarring.

It all started the day before yesterday (I’m pretty sure it wasn’t actually then, but this is my abridged timeline and I’m entitled to indulge in whatever revisionist temptations I have as I’m writing this at a rather late hour, because 2013 is creeping closer at an uncomfortably fast rate and I still have some finite quantity of homework I haven’t attempted to start over the course of the past week), when my friend Robert realized that a rather significant portion of my chat responses include “wat”, “walp”, “lol”, “yeah”, “:P”, “cool”. From that, he logically sought to create a naive bayes predictor of me, because, why have a person nod in consent, when you can have a robot which can, to some degree of accuracy, automatically give that nod of assent without bothering a physical human being.

This isn’t quite the same thing as a traditional chat bot, because traditional chat bots don’t pretend to be actual people, largely because the state-of-the-art of artificial intelligence is quite a ways off from creating something which can suitably pass a Turing test, and even further away from being able to learn the entire knowledge of a person and provide an intelligent response to every imaginable stimulus. A closer approximation would be that this is a sort of semi-automated chat-macro system, whereby my generally useless responses of agreement are outsourced to a rudimentary script, fully capable of replicating my own rather unhelpful self. And from that it’s able to try to behave human by replicating a very very narrow subset of activity, deferring to a real human transparently.

It’s somewhat like how GMail has a priority inbox feature which uses magical algorithms to separate the wheat from the chaff, diverting your attention away from the less subjectively important emails toward the few which actually matter. Except instead of giving the user ultimate discretion over the emails, this one would simply reply with “lol”.

It’s pretty easy to see any realistic application of this as rare. But I could totally be wrong, and I could be seeing this from totally the wrong angle. It’s entirely plausible that some subtle variant of this idea is absolutely brilliant, and will serve as the future of networked communication for decades to come. Maybe as email evolves the way of the Dodo, Instant Messaging will become the semi-permanent high-persistence low-immediacy ironic-initialism that takes email’s place, and the rest of the world becomes burn-on-reading SnapChats. Maybe SnapChat gets combined with lifelogging, and every minute of your life gets divided into four second snippets, and sent to one of your four thousand middle-school friends, so they can bask in the new hyper-intimate super-network, fueled by non-discretionary artificial intelligence filters.

That was Robert’s end anyway. He went on combing through our accumulated ~30,000 messages with his handy Mathematica trying to identify pertinent features which might probably replace me. On my side of the thing, I started with the userscript which would run on my browser to intercept the messages he sent, process them, and to send out a response, if a suitable automated one existed.

This actually turned out to be slightly more difficult than I had anticipated, for a single reason, and it was and has been the same reason that has bothered me for quite some time. Firing events. I really probably should learn how to use Selenium or something to automate things in a nice and clean way.

And that’s completely ignoring the five hundred pound trivial solution in the room, which isn’t to use a 500lb gorilla as a metaphor. Facebook supports XMPP which is a nice open source protocol with a billion implementations and it’s really not that hard to automate. I mean I’ve even done it before. I have a Raspberry Pi after all, and I can use it to run useless services like this, because I’ve decided that that’s exactly what a Raspberry Pi is for. But I didn’t and I have no idea why. C’est la vie, I guess.

The secret ended up being to Google a bunch of phrases related to keyboard-event-emitting with initializeKeyboardEvent or something of that sort a lot of times and stealing some code from StackOverflow. It took a few tries, but eventually something that worked emerged, and it was cool.

For posterity, this was the code that worked:

But that wasn’t the end of it. No, now there was a piece of code which could convincingly send a message, but it also had to read messages too. That wasn’t hard, actually, I just created a MutationObserver object and named it after a Fringe character. But then, presumably because Robert’s code was still training on that 30k message corpus, or something inexplicable, I built a crappy rudimentary thing with four or so simple regular expression based rules.

One of them happened to be that if someone were to send me the message “lol”, I would inexplicably respond as well with “lol”. This wasn’t initially a problem because the idea was to create a robotic version of me and _only _me. But Robert found it cool too so he ran the same script. Problem was, that this meant that if either one of us were to introduce the word “lol”, it would spiral nigh perpetually ad nauseam.

Somehow this sparked the idea to implement simple esoteric languages on it. The first one was a simple substitution rule. 0 -> 1, 1 -> 10. And over a few iterations it grew and took up huge amounts of space.

Next up was implementing a tag system. I implemented the 2-tag collatz sequence one. The cool thing is that it’s actually insanely easy to implement. You just take a sequence of symbols, strip the first two, and append the first symbol after you’ve transformed it according to a certain transformation mapping.

It was interesting as each little computation of this tiny esolang of a chat conversation involved a round trip of hundreds of miles. But each tag was still letters, ‘a’, ‘b’, and ‘c’. And it was getting late, and I thought that maybe it would be nice to hark back at inside-joke/theory of sorts that sparked it all. So I decided that our tags could be, instead of solitary letters, comprise word symbols like “lol”, “d:” and “wat”, which perhaps is a subconscious reference to an old webcomic that I had posted a while ago about a steganographic system built on internet vernacular.

I updated my script, posted it to him, he ran it, and initiated it all by saying “lol lol lol”. And from then the line lengths grew and shrank like mountainous terrain, eventually converging on the final state, a solitary “lol”. And for some inexplicable reason, some weird quirk of fate, or some deeper underlying truth to the universe, Robert always got the last “lol”. When I had tried it with a random initiating code, it would always terminate in some odd number of iterations, leaving me with the lowly penultimate laugh.

It sort of broke after that, and we didn’t fix it.

Nonetheless, there’s something incredibly cool about how an idea for something questionably turing-test-worthy can evolve into something turing-complete.

And that’s a basic summary of how uneventful my winter break has been, incidentally written with exactly 1337 words.

Raspberry Pi 14 August 2012

As part of the shift between long multi-kiloword blog posts which are somewhat more like press releases back into a sort of more personal (i.e. blog-esque) format, I guess I’ll talk about my newly-arrived Raspberry Pi. Right now, there isn’t terribly much to talk about since I’ve only had it for about two weeks.

I’ve been planning on getting a Raspberry Pi for a pretty long time, and I was actually pretty excited about that. For the weeks preceding the official announcement, I built a tiny script which ran in a ten-minutely cron job which would basically download the purported Raspberry Pi store (, note the dot-com rather than dot-org, which their homepage is situated at) and compare the hash, notifying me via Ubuntu’s built in notification system.

On that sleepless night when the actual pre-sale announcement was being made, I was incessantly checking, which had suddenly morphed into a server maintenance message (which remains to this very day). The anticipation was intense, and some twenty minutes after it was supposed to happen was when I realized that the whole time I had been checking the wrong page. The announcement came instead on, their main blog, and by that time, it was certain that all the distributor’s sites were already collapsing under the crushing load of a million souls crying out for a taste of berry-scented silicon pastry.

On the next day, I checked the sites again, and all the order pages were already closed. Either way, it wasn’t terribly useful for me because most of them didn’t support Paypal. Fast forward a veritable eternity, on June 16th, I was notified via email that RS Components that I would be allowed to order the device some time in the near future. Sure enough, on the 22nd, another email gave me a link to the order form, which I promptly filled out and I began the process of waiting. Not really, since I had other stuff to do and most of my interest had already vaporized at the daunting 7 weeks it was supposed to take.

Another eternity later, it arrived in some rather nice packaging. It actually came as a bit of a surprise, because I had become so accustomed to waiting that I had never really expected it to materialize so suddenly. But when it did, it was everything I imagined and more. It came in this rather nice cardboard box, which I eventually cut in half with an X-Acto knife (which nowadays, I use for all my paper splicing needs) to build a makeshift case. I fumbled around in a closet and found a neglected 16GB SD card (probably back from the era when point-and-shoots were actually preferable to mobile phones) and installed that weird Debian distro (after having a little internal debate on what to install). But the first thing I had done was plugged it into a monitor through a HDMI-to-DVI converter. I took the charger from my Galaxy Nexus (I wasn’t using it for anything since I charge it in my room from my HP Touchpad Charger, and my Touchpad idly draining power from a cool inductive stand, the standardization of chargers is really pretty awesome), and used that as my Pi’s permanent power supply.

I also had a 2000mAh LiPo battery which I was going to use with my Arduino LilyPad for some cool foot-operated telegraph which I wanted to use as essentially a UPS for the Pi, but a bit of googling reveals that that might possibly entail actual electronicswork, so maybe that’s something for later.

I turned it on, and lo and behold it didn’t work. I actually never quite figured out why. Then, I tried plugging it into a really old 13 megaton CRT TV, which makes me realize how it’s sort of weird that the unit of megatons is hardly ever used for things other than atomic weapons, and now it feels oddly inappropriate for a hyperbole for the mass of a TV, but maybe it’s actually sort of appropriate because CRTs are terrifying. So analog seemed to work, except for this problem where my keyboard would keep repeating letters and not working well. That wasn’t a good start.

But after a little googling from my Chromebook, it turns out the keyboard issues came from the fact that I had plugged in my only spare USB keyboard which happened to be a Logitech Mouse+Keyboard+Speaker thing and my teensy Galaxy Nexus charger couldn’t eke out enough watts to power it. And the issue with the HDMI-to-DVI thing was just because I needed to restart with the cable plugged in. But neither of them posed a real material issue because I had been intending to use it as a headless rig from the start.

The first thing I really noticed was how surprisingly easy it was to install things. I had expected the ARM repositories to basically lack everything which might be useful, but it turns out that actually almost everything I wanted was available. I didn’t dare compiling anything, but Node (albeit a somewhat old version) was available from the repos, so I never really needed to. I had to manually update to a new version of npm, but that wasn’t that bad. I set up forever to run a few apps, but not much.

One of the main reasons I could justify getting the Raspberry Pi however, was to run my Facebook logging script on something other than my main computer, and aside from getting confused trying to use sendxmpp, it was fairly straightforward.

Visualizing Facebook Activity 29 May 2012

You might have noticed that I haven’t written much for this blog in the past few months. In truth, it’s because of school work, which has never really been something of an issue before. This is, quite probably the least productive stretch of time in my life thus far. I have a suspicion that this issue stems more psychologically than due to some radical increase in work load, but I haven’t looked in to testing that hypothesis (I’ve been collecting data hour-by-hour about what I’ve been doing in the past two years, so I could probably look into it if I were actually interested in that matter). But school’s nearing a close, and hopefully I can get back to a more productive lifestyle, maintaining my blog and most importantly, trying out cool things. I have a few things which I am working on at the moment which should be completed in the coming weeks (though I make no assurances). But since I have an internal goal for writing one blog post per month, I’m going to recycle a project from December of 2011.

Nearly every day, I inevitably end up glancing at my Facebook “buddy list” of sorts, wondering how many people are online. It’s a figure which almost always seems to depend on the time of day, and behaves almost like clockwork, there’s always a massive swarm of people online around 10-11pm, and hardly anyone is ever online at 4 in the morning. I guess the problem with drawing any conclusions from this in particular is how specific a group this graphic represents. It constitutes my friends, and in particular, my Facebook friends. Essentially all of them are people I’ve encountered in real life, and may or may not actually find interest in. But the thing that unites just about everyone is that they’re generally high school aged.

Before going on discussing how pretty of a chart this is, I think it’s worth going through what this chart actually represents. It’s quite easy to tell that this is in fact a polar chart, and on the inner circle, you can tell that it’s a 24 hour clock. Each of the rings represents a friend, and the rings are sorted by the total amount of time spent on Facebook in the given period. So you can see that toward the middle, the graph is almost opaque at every time, whereas on the fringes, the online activity is quite erratic and infrequent.

So, where does this data come from? It’s actually quite simple to get from the Facebook API. I have a cron job which runs every minute to run a FQL request and save the results to a specific log file.

The actual FQL which runs in order to retrieve the list of online users is

SELECT uid, name, online_presence FROM user WHERE online_presence IN (‘active’, ‘idle’) AND uid IN (SELECT uid2 FROM friend WHERE uid1 = me())

Basically, get the User ID, the name, and their online presence state for friends who are either active or idle in the list of the logged-in user. Since Facebook is an OAuth2-type API, you need an access token in order to do anything cool. I just use the Facebook Graph API Explorer to generate my access tokens. Just go press “Get Access Token”, and select (at minimum) the permissions “user_online_presence”, “friends_online_presence” and “offline_access”. Then copy and paste the revealed token into some authkey.txt and you should be set.

I have a python script to go through the log file and to render it as the polar chart which is depicted on the top of the page. The code used for that is frankly atrocious and the output is even more so. Python Imaging Library is used, which is a lovely library, except not for drawing graphics. There isn’t any smoothing or anti-aliasing on the arcs drawn by PIL and they all look hideous. So I render the chart at some absurdly high resolution and down-resize it in GIMP while adding layering, blurs and opacity in order to make the picture somewhat less atrocious. Also, it does’t support restricting the app to drawing a specific day of the week, even though it might be interesting to see the how the trend differs on a weekday versus weekend.

Something interesting about the appearance of the polar graph is that it almost resembles something of a digital fingerprint, and that brings up some interesting privacy considerations. Inside that graphic are the Facebook browsing habits of some two hundred people. There’s the question of how much this changes day by day for users, and to what extent this can be used to identify people. And even if a single ring doesn’t unambiguously represent a single person, the two hundred or so rings of their friends probably goes pretty far into identifying people. There’s also a striking amount of uniformity that says a lot about the type of people who I tend to associate with. Just at a glance, one can tell that there are very few people I’m friends with on Facebook who live in different timezones. Maybe what’s more dangerous than being able to identify a person is to be able to identify what kind of groups that person belongs to. And over the course of a day, just about everyone checks Facebook a few times.

Schedule Compare 02 September 2011

Posting this here is almost certainly useless. I assume very very few people who read this blog tend to be in the target 13-18 year old facebook-using high school student demographic. By the unlikely chance you are (if you aren’t, you can forward this to the nearest person who fits into this demographic, as I will add later, that I’m desperately looking for users).

It’s that time of year. The brief window where summer vacation isn’t technically over yet as school hasn’t started but you still know your classes for next year. You’re frantically attempting to complete those long procrastinated summer assignments, or like me, you’re desperately trying to avoid them by giving a yourself a false sense of productivity by building random apps.

My first foray into the realm of creating Facebook applications is fairly simple. It compares class schedules. In truth, the reason I made this was probably not the fact that I enjoy making useful tools, but more likely residual bitterness of rejection by a sci/tech high school over three years ago which has a school-specific schedule comparing app. Nonetheless, a neat side effect of this attempt is that it does happen to be quite cool.

This is also my first published app which is written in the CoffeeScript language. For those of you unaware, CoffeeScript is a language which is syntactically similar to Python but compiles into Javascript. It’s not a nasty GWT-esque compilation, but a relatively clean one (barring the underscores that result when you try doing comprehensions and the really cool stuff). I’ve always meant to write stuff in CoffeeScript, as it has quite a few awesome features. Most importantly is probably the ability to declare a function with two characters (->) rather than a massive “function(){}” and the array comprehensions.

Compiled (or should I say Transpiled?) languages have odd a few annoying properties, especially with debugging. The biggest issue was probably setting up everything: running a script which uses inotifywait to automatically compile your CoffeeScript once you hit “Save” on your editor of choice (gedit just because it works and comes with Ubuntu). Then when errors happen, your line numbers don’t match up and that’s also annoying.

The Facebook API is actually pretty good. My app reads Facebook schedules from your friends’ statuses. It’s not quite as easy as it should be. I could search the user’s news feed and that would be trivial except that it only gives me a subset of the statues that I want to be able to process. When using FQL (which I ended up pronouncing Feequel which sounds a bit like Fecal because it’s a SQL derivative, even though you’re not supposed to pronounce it “sequel”) it would only return/search the most recent status. I ended up doing a FQL request for each and every friend that the current user has, which is a on average a pretty big number. Fortunately it doesn’t seem like Faecbook has any API limits. Awesome.

For the longest time I was confused because my app inexplicably only worked for me. It turns out that my queries returned blank results for everyone else because I didn’t request the right permissions. That’s terrible. Absolutely terrible. First of all, the developer shouldn’t be entitled to have those magical privileges that the end users can’t have. It’s insanely confusing. And don’t just silently return no results and make the developer question his own sanity.

But it was a permissions issue - a one line fix in the end.

It’s also quite depressing that nobody’s using it. It’s pretty server intensive at the moment and it’s running on Google App Engine, which has that new pricing which means I should have my free quota expire after something like a meager 100 users. But I haven’t really come close to that. Why? I guess I have little influence over friends.