somewhere to talk about random ideas and projects like everyone else

stuff

#touchscreen

ShinyTouch/OpenCV 15 July 2010

I have yet to give up entirely on ShinyTouch, my experiment in creating a touch screen input system which requires virtually no setup. For people who haven’t read my posts from last year, it works because magically things look shinier when you look at them from an angle. And so if you mount a camera at an angle (it doesn’t need to be as extreme as the screenshot above), you end up seeing a reflection on the surface of the screen (this could be aided by a transparent layer of acrylic or by having a glossy display, but as you can see, mine are matte and they still work). The other pretty obvious idea of ShinyTouch is that on a reflective surface, especially observed from a non-direct angle, the distance from the reflection (I guess my eighth grade science teacher would say the “virtual image”) to the apparent finger, or “real image”, is twice the distance from either to the surface of the display. In other words, the reflection gets closer to you when you get closer to the mirror.

A webcam usually gives a two-dimensional bitmap of data (and one non-spatial dimension of time). This gives (after a perspective transform) the X and Y positions of the finger. But an important aspect of a touchscreen, and what this technology is also capable of as a “zero-touch screen”, is a Z axis: the distance between the finger and the screen. A touchscreen has a binary Z-axis: touch or no touch. Since you can measure the distance between the apparent real finger and its reflection, you can get the Z-axis. That’s how ShinyTouch works.
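To make the geometry concrete, here’s a minimal sketch (not the actual ShinyTouch code) of how a Z value and a touch decision could fall out once you have the y coordinates of the fingertip and its reflection in the perspective-corrected frame. The function name and threshold are made up for illustration.

    # Minimal sketch of the Z-axis idea (not the actual ShinyTouch code).
    # On a reflective surface seen at an angle, the gap between the fingertip
    # and its mirror image is roughly twice the finger's height above the screen.

    def estimate_z(finger_y, reflection_y, touch_threshold=3):
        """Return (hover height in pixels, is_touch) from hypothetical y coordinates."""
        gap = abs(reflection_y - finger_y)   # distance between real and virtual image
        z = gap / 2.0                        # height above the surface
        return z, z <= touch_threshold       # a touchscreen only needs the binary answer

    # Example: a 2-pixel gap reads as a touch, a 40-pixel gap as hovering.
    print(estimate_z(120, 122))   # -> (1.0, True)
    print(estimate_z(120, 160))   # -> (20.0, False)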

Last year someone was interested and actually contributed some code. Eventually we both agreed that my code was crap, so he decided to rewrite it, this time using less PIL and pixel manipulation and opting for more OpenCV, so it’s faster. The project died a bit early this year, and with nothing more to do, I decided to revive it. His code had some neat features:

  • Better perspective transforms
  • Faster
  • Less Buggy
  • Simpler configuration (track bars instead of key combinations and editing JSON files)
  • Yellow square to indicate which corner to click when calibrating (actually, I wrote that feature)

It was left, however, in a pretty unfinished state. It couldn’t do anything more than generate config files through a nice UI and do a perspective transform on the raw video feed. So in the last few days, I added a few more features.

  • Convert the perspective-transformed frame into grayscale
  • Apply a 6x6 gaussian blur filter
  • Apply a binary threshold filter
  • Copy it over to PIL and shrink the canvas by 75% for performance reasons
  • Hack a Python flood-fill function to do blob detection (because I couldn’t compile any python bindings for the opencv blob library)
  • Filter those blobs (sort of)

Basically, this means ShinyTouch can now do multi-touch. Though the Z-axis processing, which is really what the project is all about, still sucks. Like it sucks a lot. But when it does work (on a rare occasion), you get multitouch (yay). If TUIO gets ported (again), it’ll probably be able to interface with all the neat TUIO-based apps.
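For the curious, that per-frame processing chain translates roughly into modern OpenCV like this. It’s a sketch with made-up names and thresholds, not the actual code (which used the old cv/PIL APIs and a hand-rolled flood fill), and cv2 wants odd kernel sizes, so a 5x5 blur stands in for the 6x6 one.

    # Rough modern-OpenCV rendition of the pipeline described above.
    import cv2

    def find_blobs(frame_bgr):
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)          # grayscale
        blur = cv2.GaussianBlur(gray, (5, 5), 0)                    # stand-in for the 6x6 blur
        _, mask = cv2.threshold(blur, 60, 255, cv2.THRESH_BINARY)   # binary threshold
        small = cv2.resize(mask, None, fx=0.25, fy=0.25,
                           interpolation=cv2.INTER_NEAREST)         # shrink for speed
        # connectedComponentsWithStats plays the role of the flood-fill blob hack
        n, labels, stats, centroids = cv2.connectedComponentsWithStats(small)
        return [tuple(centroids[i]) for i in range(1, n)
                if stats[i, cv2.CC_STAT_AREA] > 20]                 # crude size filter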

Code here: http://github.com/antimatter15/shinytouch/ Please help, you probably don’t want to try it (yet).


How I would design a touchscreen browser 24 July 2009

This is, again, an old idea of mine. I drew it on a sheet of paper maybe a year ago, but I just remembered it.

A common theme with modern browsers is maximizing screen real estate (which I don’t actually care about, because I have 2 huge monitors). But if I were to have a netbook or some otherwise technically constrained device, I would think that screen real estate is important.

My idea is pretty cool. There is only a tab bar on top. It’s allocated to the tabs as usual, and there is a new tab button on the side. But here, the new tab button occupies the entire rest of the tab bar, because space is precious. Sort of like the Mozilla Fennec browser.

Forward and backward navigation is achieved by throwing (not just gentle pushing, throwing; it should be kinetic, and if you don’t throw hard enough, it just shows some text saying the equivalent of “throw harder!”).

At least in the way I browse, I don’t enter URLs often unless I’m on about:blank. So there is no URL bar. To find what URL you’re on, or to enter a new one, simply double tap on the current tab. It expands and fills the tab bar with a text box and the other tabs are condensed to icons.

Swiping down shows a drop-down for a tab with options to do things like bookmark or view source.

Throwing a tab down (which is a more violent swipe) removes the tab. Something partly inspired by the Mac OS X dock.

The new tab button could also be a menu, swiping down to reveal a menu of bookmarks to select from.

And the new tab page could be almost like a desktop, with widgets, gadgets and whatever (Google Wave? If only I got my dev invite :’( ). Well, in my idea, the top portion of the new tab page could be the URL bar and the rest could be whatever other browsers are doing, plus maybe some widgets/gadgets, Dashboard or Plasma style.


ShinyTouch Zero Setup Single Touch Surface Retrofitting Technology 11 July 2009

So MirrorTouch is really nice: it’s quite accurate, very fast, quite cheap, and it’s my idea :)

But while trying to hook the script up to my webcam and looking at the live webcam feed from it pointing at my monitor (aside from the awesome infinite-mirror effect!), I discovered an effect that’s quite painfully obvious but that I had dismissed earlier: reflection.

So a few months ago, I just sat in the dark with a few flashlights and a 6 in square block of acrylic, and explored the multitouch technologies with them. Shining the flashlight through the side, I could replicate the FTIR (Frustrated Total Internal Reflection) effect used in almost all multitouch systems. Looking from underneath, with a sheet of paper on top and shining the flashlight up, I could experiment with Rear DI (Diffused Illumination). Shining it from the side but above the surface, I could see the principle of LLP (Laser Light Plane; actually here, it’s more accurately like LED-LP). MirrorTouch comes from looking at it with one end tilted toward a mirror.

If you look at a mirror, not directly on but at an angle, however slight, you can notice that the reflection (or shadow, or virtual image, whatever you want to call it) only comes into “contact” with the real image (the finger) when the finger is in physical contact with the reflective medium. From the diagram below, you can see the essence of the effect. When there is a touch, the reflection is to the immediate right (with this camera positioning) of the finger. If the reflection is not to the immediate right, then it is not a touch.

[ShinyTouch diagram: the setup from the perspective of the camera]

It’s a very, very simple concept, but I disregarded it because real monitors aren’t that shiny. But when I hooked the webcam up to the monitor, it turns out they are. I have a matte display, and it’s actually really shiny from a moderately extreme angle.

So I hacked the MirrorTouch code quite a bit and I have something new: ShinyTouch (for lack of a better name). ShinyTouch takes the dream of MirrorTouch one step further by reducing setup time to practically nothing. Other than the basic unmodified webcam, it takes absolutely nothing. No mirrors, no powered light sources, no lasers, speakers, batteries, bluetooth, wiimotes, microphones, acrylic, switches, silicon, colored tape, vellum, paper, tape, glue, soldering, LEDs, light bulbs, bandpass filters, none of that. Just mount your camera at whatever angle looks nice and run the software.

And for those who don’t really pay attention, this is more than finger tracking. A simple method of detecting the position of your fingers with no knowledge of the depth is not at all easy to use. The Wiimote method and the colored-tape methods are basically this.

The sheer simplicity of the hardware component is what really makes the design attractive. However, there is a cost. It’s not multitouch capable (actually it is, but the occlusion it suffers from rules out any commonly used multitouch gestures). It’s slower than MirrorTouch. It doesn’t work very well in super bright environments, and it needs calibration.

Calibration is, at the current stage of development, excruciatingly complicated. However, it could be made quite simple in comparison. The current procedure involves painfully extracting color values by hand from an image editor of your choice. Then the program needs to run while you fix the color diff ranges. Before that, you need to do a 4-click monitor calibration (which could theoretically be eliminated). That 4-point clicking calibration could be removed entirely by making the camera detect a certain color pattern from the monitor to find the corners. After that, the screen could ask you to click a certain box, which would be captured pre-touch and post-touch and diffed to get a finger RGB range. From that point, the user would be asked to follow a point as it moves around the monitor to gather a color reflection diff range.
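The pre-touch/post-touch diff step could look something like the sketch below. It’s purely hypothetical (the names, padding, and thresholds are mine), just to illustrate how a finger RGB range might be pulled out of the two captures.

    # Hypothetical sketch of the semi-automatic calibration step described above:
    # capture the target box before and after the touch, and derive a rough RGB
    # range for "finger-colored" pixels from whatever changed.
    import numpy as np

    def finger_color_range(pre_touch, post_touch, min_diff=30, pad=15):
        """pre_touch / post_touch: HxWx3 uint8 crops of the calibration box."""
        diff = np.abs(post_touch.astype(int) - pre_touch.astype(int)).sum(axis=2)
        changed = post_touch[diff > min_diff]      # pixels the finger covered
        if changed.size == 0:
            return None                            # no touch detected, ask again
        lo = changed.min(axis=0) - pad             # widen the range a little
        hi = changed.max(axis=0) + pad
        return np.clip(lo, 0, 255), np.clip(hi, 0, 255)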

The current algorithm is quite awesome. It searches the grid pixel by pixel, scanning horizontally from right to left (not left to right). Once it finds a row of 3 pixel matches for the finger color, it stops parsing, records the point, and passes it over to the reflection analysis routine. There are (or were) 3 ways to search for the reflection. The first one I made is a simple diff between the reflection and its surroundings. It finds the difference between the color of the point immediately to the right of the finger and the point to its top-right. The idea is that if there is no reflection, those colors should roughly match, and if they don’t, you can roughly determine that it is a touch.
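The scanning step by itself might look something like this simplified sketch, assuming some color-range test `matches_finger` produced by calibration (the function and parameter names are mine, not from the real code):

    # Simplified sketch of the right-to-left finger scan (not the real code).
    def find_finger(frame, matches_finger, run_length=3):
        """frame: HxWx3 array; matches_finger(pixel) -> bool for the finger color range."""
        height, width = frame.shape[:2]
        for y in range(height):
            run = 0
            for x in range(width - 1, -1, -1):      # scan each row right to left
                run = run + 1 if matches_finger(frame[y, x]) else 0
                if run == run_length:               # 3 consecutive matches: stop here
                    return x + run_length - 1, y    # rightmost pixel of the run
        return None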

That first diff test was later superseded by something that calculates the average of the color of the pixel on the top-right of the finger and the color of the finger itself. The average should theoretically equal the color of the reflection, so it diffs the averaged color with the color immediately to the right (the hypothetical reflection) and compares them.

There is another algorithm, really simple, for when it’s very, very bright (near a window or something) and the reflection is totally overshadowed (pardon the pun, it wasn’t really intended) by the finger’s shadow. So instead of looking for a reflection, it looks for a shadow, which the algorithm treats as just a dark patch (color below a certain threshold). That one is obviously the simplest, and not really reliable either.
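Put together, the three touch tests could be sketched roughly like this. The thresholds and names are invented for illustration, and the pixel offsets assume the same camera orientation as above (reflection to the immediate right of the finger, background to its top-right):

    # Sketch of the three touch tests described above (thresholds are made up).
    import numpy as np

    def color_diff(a, b):
        return np.abs(a.astype(int) - b.astype(int)).sum()

    def touch_by_surrounding(frame, x, y, tol=40):
        # test 1: reflection pixel (right of finger) vs. background pixel (top-right);
        # if they differ noticeably, there is a reflection, so it's a touch
        return color_diff(frame[y, x + 1], frame[y - 1, x + 1]) > tol

    def touch_by_average(frame, x, y, finger_color, tol=40):
        # test 2: the reflection should look like the average of finger and background
        expected = (frame[y - 1, x + 1].astype(int) + finger_color.astype(int)) // 2
        return color_diff(frame[y, x + 1], expected) < tol

    def touch_by_shadow(frame, x, y, dark=60):
        # test 3 (bright rooms): look for a dark shadow patch instead of a reflection
        return frame[y, x + 1].mean() < dark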

One big issue is that currently the ranges are global, but in practice the ranges need to vary for individual sections of the screen. So the next feature that should be implemented is dividing the screen into several sections, each with its own color ranges. It’s a bit more complex than the current system but totally feasible.
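One possible shape for those per-section ranges, purely as a sketch (the grid size and lookup are hypothetical):

    # Hypothetical per-section range lookup: map a screen coordinate to a grid cell,
    # where each cell would carry its own calibrated color ranges.
    def section_for(x, y, width, height, cols=4, rows=3):
        return (min(x * cols // width, cols - 1),
                min(y * rows // height, rows - 1))

    # ranges[(col, row)] would hold that cell's finger and reflection ranges,
    # filled in during calibration and looked up instead of the current globals.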

So the current program can function as a crude paint program, and some sample images are in the bottom portion of this post.
