somewhere to talk about random ideas and projects like everyone else



Handwriting Recognition with Microcontrollers 30 June 2015

For my final project in 6.115, a microcomputer electronics class which I (and apparently nobody else) affectionately refer to as “leeblab”, I built a simple gestural input system. At its core lies an ordinary 8x8 LED matrix hacked into a low-res CCD and display coupled with a gutted expo dry erase marker used as a light pen. And per class requirements, it used a rube goldbergian cascade of TTL logic, an 8051, Cypress PSoC 5, an Arduino Pro Mini to process and massage the signals into USB HID compliant form, so that a computer might be able to use the contraption as a keyboard.

I had an 8x8 LED matrix laying around, and ‘twas the season that I had to come up with a final project for 6.115, a microcomputer electronics lab class. I vaguely recalled that an individual LED would generate a potential difference if you pummel it with enough photons. So I figured a cool and somewhat clever thing to do would be to create a display which could simultaneously act as a camera (pretty orweillian in retrospect).

I'm not totally sure about this, but I think this was the pinout of the LED matrix that I had. Notice that there doesn't seem to be any sensible mapping between spatial position and the corresponding pins

The LED matrix was something like this one. It’s wired using a technique called charlieplexing, where there’s a long wire along each row that connects the anodes of the LEDs, and another long weire along each column which connects the cathodes (modulo dyslexia).

You can imagine taking a battery and a few clips and touching one point along the row wires and another point on the column wires and see a single pixel light up at the intersection of those columns and rows.

Raspberry Pi 14 August 2012

As part of the shift between long multi-kiloword blog posts which are somewhat more like press releases back into a sort of more personal (i.e. blog-esque) format, I guess I’ll talk about my newly-arrived Raspberry Pi. Right now, there isn’t terribly much to talk about since I’ve only had it for about two weeks.

I’ve been planning on getting a Raspberry Pi for a pretty long time, and I was actually pretty excited about that. For the weeks preceding the official announcement, I built a tiny script which ran in a ten-minutely cron job which would basically download the purported Raspberry Pi store (, note the dot-com rather than dot-org, which their homepage is situated at) and compare the hash, notifying me via Ubuntu’s built in notification system.

On that sleepless night when the actual pre-sale announcement was being made, I was incessantly checking, which had suddenly morphed into a server maintenance message (which remains to this very day). The anticipation was intense, and some twenty minutes after it was supposed to happen was when I realized that the whole time I had been checking the wrong page. The announcement came instead on, their main blog, and by that time, it was certain that all the distributor’s sites were already collapsing under the crushing load of a million souls crying out for a taste of berry-scented silicon pastry.

On the next day, I checked the sites again, and all the order pages were already closed. Either way, it wasn’t terribly useful for me because most of them didn’t support Paypal. Fast forward a veritable eternity, on June 16th, I was notified via email that RS Components that I would be allowed to order the device some time in the near future. Sure enough, on the 22nd, another email gave me a link to the order form, which I promptly filled out and I began the process of waiting. Not really, since I had other stuff to do and most of my interest had already vaporized at the daunting 7 weeks it was supposed to take.

Another eternity later, it arrived in some rather nice packaging. It actually came as a bit of a surprise, because I had become so accustomed to waiting that I had never really expected it to materialize so suddenly. But when it did, it was everything I imagined and more. It came in this rather nice cardboard box, which I eventually cut in half with an X-Acto knife (which nowadays, I use for all my paper splicing needs) to build a makeshift case. I fumbled around in a closet and found a neglected 16GB SD card (probably back from the era when point-and-shoots were actually preferable to mobile phones) and installed that weird Debian distro (after having a little internal debate on what to install). But the first thing I had done was plugged it into a monitor through a HDMI-to-DVI converter. I took the charger from my Galaxy Nexus (I wasn’t using it for anything since I charge it in my room from my HP Touchpad Charger, and my Touchpad idly draining power from a cool inductive stand, the standardization of chargers is really pretty awesome), and used that as my Pi’s permanent power supply.

I also had a 2000mAh LiPo battery which I was going to use with my Arduino LilyPad for some cool foot-operated telegraph which I wanted to use as essentially a UPS for the Pi, but a bit of googling reveals that that might possibly entail actual electronicswork, so maybe that’s something for later.

I turned it on, and lo and behold it didn’t work. I actually never quite figured out why. Then, I tried plugging it into a really old 13 megaton CRT TV, which makes me realize how it’s sort of weird that the unit of megatons is hardly ever used for things other than atomic weapons, and now it feels oddly inappropriate for a hyperbole for the mass of a TV, but maybe it’s actually sort of appropriate because CRTs are terrifying. So analog seemed to work, except for this problem where my keyboard would keep repeating letters and not working well. That wasn’t a good start.

But after a little googling from my Chromebook, it turns out the keyboard issues came from the fact that I had plugged in my only spare USB keyboard which happened to be a Logitech Mouse+Keyboard+Speaker thing and my teensy Galaxy Nexus charger couldn’t eke out enough watts to power it. And the issue with the HDMI-to-DVI thing was just because I needed to restart with the cable plugged in. But neither of them posed a real material issue because I had been intending to use it as a headless rig from the start.

The first thing I really noticed was how surprisingly easy it was to install things. I had expected the ARM repositories to basically lack everything which might be useful, but it turns out that actually almost everything I wanted was available. I didn’t dare compiling anything, but Node (albeit a somewhat old version) was available from the repos, so I never really needed to. I had to manually update to a new version of npm, but that wasn’t that bad. I set up forever to run a few apps, but not much.

One of the main reasons I could justify getting the Raspberry Pi however, was to run my Facebook logging script on something other than my main computer, and aside from getting confused trying to use sendxmpp, it was fairly straightforward.

Determining if a Mousewheel Event Results in Scroll 14 August 2012

So here’s a somewhat technical post, actually it’s pretty technical. But either way the premise is sort of simple to understand, and probably so is the context. I’m working on Swipe Gesture 2.0, which basically tries to take Chrome and Safari on OS X Lion’s awesome back-forward transitions and make them work on other operating systems. See, the thing is that multitouch isn’t _strictly _a requirement for it to work, a lot of computers just have the little bars on the bottom and right of the track bar (often with a little somewhat abrasive textured surface so you don’t accidentally tread upon it). Regardless, the title is a bit of a misnomer, because even though the event is called the “mousewheel”, it’s hardly meant to be observed from an actual mouse (or a wheel), instead it means the scrolling gesture on some kind of trackpad, either multitouch or not.

Well, first, I guess I’ll talk about the difference between how Lion and Leopard do it. The way Leopard did it was pretty cool but not particularly applicable to other platforms since it relied on the existence of a three-finger gesture. As in, you needed some kind of touchpad which was cool enough to support three-finger multiouch, reliably. It also behaved completely independent of the current zoom or scroll position, which makes implementation in software entirely trivial given access to some drivers which can recognize three fingers on a touchpad.

Lion did it a completely different way. Instead of creating an entirely new gesture which was entirely dedicated to the singular task of navigating through history, it conflated the notions of scrolling with navigating, which sort of makes sense. Apple’s quite dedicated to skeuomorphic metaphors, and they want to treat the web more like literal pages. A user can move it around to better keep certain things in view, and the physical movement to slide a sheet out of view is just an extension of that panning gesture.

However, technically this poses a completely different challenge, because this requires you to distinguish between scrolls and navigation requests. Scrolling is always the default behavior, but the navigation swipe gesture happens when scrolling wouldn’t actually result in anything. However, many implementations of scrolling are at least somewhat kinetic, often it’s emphasized in software (in the form of smooth-scrolling) or hardware (scroll wheels that don’t click but instead move basically freely) or because your arm has to obey the laws of kinematics (unless it doesn’t, in which case that’s certainly fascinating). So not only does the software have to determine when a mouse wheel action results manifests as a scroll, it has to see if it was the user’s intent to do the extraneous scrolling.

This is done by clustering the mouse wheel movements together temporally. Scroll events flow in in discrete chunks, and you can split events off into little buckets (in a sense), where if there isn’t any event sent within some arbitrary threshold (say, 500msecs or half a second), you stick stuff in a new bucket. This way, lets say you scroll from the top of the page to the bottom and you’re sort of excited, and spin the wheel as fast as possible, you hit the bottom of the page but it’s not some instant stop. You continue scrolling (because you’re just that excited, and just can’t stop) for a little bit more. Ignoring the fact that you probably won’t have a vertical/horizontal event handler (though there are some sort of intriguing possibilities for this, one idea is to have the upper threshold trigger full screen). Without segmenting them into certain buckets, it doesn’t recognize that the time when you’re ramming into the top of the page is part of the same general gesture as when you were scrolling, and it may interpret that as an intentional gesture. So that’s one part which makes it a bit more complex.

So now, you have these series of mousewheel events conveniently delimited into little gesture-chunks. The next part is determining whether or not the gesture-chunks are part of a scroll action or not.

Thankfully that’s a really simple thing to do. Just look at the document’s scrollTop and check if it’s zero (or scrollLeft for horizontal stuff) or whatever value is the width of the element. If it can’t scroll no more, then you have a winner and you can start the falling balloons and confetti.

Except it’s not that easy, because the document isn’t the only thing which can scroll. Thanks to the glory of overflow:scroll, there are lots of things which can scroll. Things which aren’t necessarily documents may be in arbitrary scroll positions to wreak havoc on your well-meaning heuristics.

So back to the drawing board, I guess. Actually, to think of it, maybe it’s simpler to listen for the scroll event, which fires when a scroll happens, and quite intuitively doesn’t fire when a scroll doesn’t happen. And mouse wheel actions always precede scroll ones (because the wheel events bubble and are cancelable, so you can prevent a scroll from happening). The only problem is that scroll events don’t bubble. As in, when a scroll event happens on some element, it’s not going to show up on the document, it’s only going to show up if you’re listening on that specific element at the right point in time.

The naive approach to this dilemma is just to attach a scroll listener to every single event on the document, and to reattach to some other elements whenever the DOM tree is modified in some way. This means the overhead grows rather significantly when pages are larger, in a way which could be likened to O(n) time where n represents the number of nodes in the document. If you want, you could lazily do it by attaching the scroll listener only once the wheel event has fired, but that would cause a significant delay when attempting to legitimately scroll.

Another thing you could do, is to make another assumption: that the element which gets scrolled has to be some parent of the element which the mouse is currently over. Making that assumption, we can add a mousewheel listener to the root of the document, as those kinds of events actually do bubble. And since they’re mouse events, once you capture it, you can get a clientX and clientY, comprising the current coordinates of the mouse. And with that, you can get the element immediately below the cursor with document.elementFromPoint. And since the scroll might fire on any one of the elements which are parents of the current element, you ascend up the tree and add a listener on all of those (until, of course you hit the document element, at which point you can’t go any further up). This yields performance which could essentially be modeled with O(log n), quite a bit better than O(n).

So now the finished process is fairly simple, you listen for a mousewheel event, and when it happens we determine the element, and ascend the tree, yada yada. That scroll listener, when fired, sets a global variable lastDetectedScroll to the current timestamp. We set a little temporary variable set to the before time and then we set a little timer, 150 milliseconds. It usually only takes like four to see if a scroll thingy happened, but let’s be safe by having an order-of-magnitude threshold. The Cuckoo clock rings, and we check if the lastDetectedScroll is the same thing, and if it is, it’s a swipe, and otherwise, it’s a scroll.

Here’s a little demo: