somewhere to talk about random ideas and projects like everyone else

stuff

#syntax

Methods and Functions 16 December 2015

I watched a recording of Rob Pike’s talk Simplicity is Complicated a few weeks ago, and I was struck by his sentiment that many popular languages seem to be converging by adding popular features from other languages.

In particular, it seems that many popular languages combine aspects of object-oriented and functional programming: support for classes, higher-order functions, and reduction. Sometimes this leads to situations where the distinction between methods and functions is confusing or inconsistent.

In Javascript, you find the length of an array or string by reading a property, "blah".length, but in Python[^2] it’s done through a function invocation, len("blah").

This dichotomy is more evident when operations get chained in sequence to process some stream of data. Here’s an example in Java 8 of the object-oriented approach where methods are chained[^1]:

double average = roster.stream()
    .filter(p -> p.getGender() == Person.Sex.MALE)
    .mapToInt(Person::getAge)
    .average()
    .getAsDouble();

An alternative way to write the code is by thinking about it as function composition— as you can do in Haskell:

(sum (map sqrt (filter (>3) [1,2,3,4,5,6,7])))

An interesting exercise is to pay attention to how your eyes move as you scan through a snippet of code to figure out what it does.

To do that, let’s abstract away the slightly different syntax for these different languages and imagine a hypothetical language where x.y.z is syntactic sugar for z(y(x)). That is, when a function name is suffixed after a . behind some expression, it is equivalent to wrapping the entire expression and using it as the first argument.

More concretely, 32.double is equivalent to double(32), and 49.divide(7) is equivalent to divide(49, 7).
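To make the rewrite rule concrete, here is a rough Python sketch of it. Python has no such sugar, so this fakes it with a small wrapper object; the Chain class and the functions table are made up for illustration, and every call needs explicit parentheses.

```python
import math


class Chain:
    """Wrap a value so that Chain(x, fns).f(a).g() evaluates g(f(x, a))."""

    def __init__(self, value, functions):
        self.value = value
        self.functions = functions

    def __getattr__(self, name):
        # Only reached for names that are not real attributes of Chain.
        try:
            func = self.functions[name]
        except KeyError:
            raise AttributeError(name)

        def call(*args):
            # The wrapped value becomes the first argument.
            return Chain(func(self.value, *args), self.functions)

        return call


# Hypothetical function table, just enough for the examples in this post.
functions = {
    "double": lambda x: x * 2,
    "divide": lambda x, y: x / y,
    "map": lambda xs, f: [f(x) for x in xs],
    "filter": lambda xs, keep: [x for x in xs if keep(x)],
    "sort": sorted,
    "sum": sum,
}

chained = (Chain(range(0, 20), functions)
           .map(math.sqrt).filter(lambda x: x > 3).sort().sum().value)
nested = sum(sorted(filter(lambda x: x > 3, map(math.sqrt, range(0, 20)))))
```

The last two expressions compute the same number; only the reading order differs.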

For a more complex example:

range(0, 20).map(x: sqrt(x)).filter(x: x > 3).sort.sum

sum(sort(filter(map(range(0, 20), x: sqrt(x)), x: x > 3)))

With the first code snippet, the process can be read left-to-right, and you can imagine it as the story of a little chunk of data flowing through the system.

With the second code snippet, you instead read the function from the inside-out. To a certain extent, it’s right-to-left, but you have to check the other side to make sure you aren’t forgetting any extra arguments.

Let’s try to visualize this:

sum(sort(filter(map(range(0, 20), x: sqrt(x)), x: x > 3)))
                    ------------
                ----            ------------
         -------                             ----------
    ----                                               -
---                                                      -

You can see here that interpreting the behavior with the functional style[^3] begins in the middle and approaches the beginning right-to-left. But when you reach some new function name, you have to zig-zag to the opposite side to visually inspect whether or not the method has any additional arguments.

range(0, 20).map(x: sqrt(x)).filter(x: x > 3).sort.sum
------------
            ----------------
                            -----------------
                                             -----
                                                  ----

The method chaining approach (read: OOP style) is nice because it fits the conceptual model of data flowing through a sequence of transformations, and it doesn’t disrupt the typical western left-to-right reading order.

To me, this seems much easier both to read and to write (for I am not so blessed as to use Paredit and must carefully tread a vast sea of syntax to manually close my parens) and I’ve always been mildly infuriated trying to learn Haskell[^4] because function composition feels much less natural[^5].

One of the stated reasons for why Python uses len is that, with a prefix function, you can tell the type of an expression at a glance by looking at its head. When you’re scanning through a line and you see that the outermost function is “len”, you can immediately tell that the output is a number. When length is determined in postfix, you can’t determine the final type of an expression until you look at the end of the expression.

Another problem is that this linearized sequential syntax only helps when the structure of an expression is relatively linear. Imagine instead that you have an expression like

sum(range(floor(sqrt(multiply(3, 4))), divide(square(5), 2)))
                     --------------           
                -----              -          ---------
          ------                    -  -------         ----
    ------                           --                    -
----                                                        -

In this case, both multiply(3, 4) and square(5) sit at relatively similar depths, which gives the diagram its humps and makes it look like a snake eating an elephant.

Flattening it with the . syntax then forces range into a little invisible spot in the middle and that makes it a lot harder to interpret the behavior of the program.

3.multiply(4).sqrt.floor.range(5.square.divide(2)).sum

In fact, this dot stuff is beginning to look a lot like the postfix notation in stack-based programming languages like FORTH (read: similarly unreadable).

That actually leads to the nifty realization that it’s kind of like tail call optimization for the human mind.

When reading some code composed of nested function application, you end up having to keep track of how many levels deep within the nesting you are. There’s an idea in psychology that human working memory can contain up to 7 ± 2 items. So reading some deeply nested set of expressions is naturally difficult.

But if the expression tree naturally represents a sequence of actions transforming a piece of data, you can reorganize it in such a way that the depth of your mental stack doesn’t keep growing and objects can be substituted in place.


Passing objects as the first argument isn’t really all that strange. The way classes are implemented in Python, methods are just functions whose first argument, self, is passed implicitly.

class Dog:
    def talk(self, text):
        print("woof!", text, "bark!")

shibe = Dog()
shibe.talk("wow! such speak!")

So with the code above, shibe.talk("wow! such speak!") is really just shorthand for Dog.talk(shibe, "wow! such speak!").

Neither is it particularly strange to access properties through function calls. This seems to be quite typical for accessing record fields in Haskell.

data Car = Car {
    company :: String,
    model :: String,
    year :: Int
}
-- variable names must start with a lowercase letter
let wusTangClan = Car {company="Ford", model="Wustang", year=1967}
model wusTangClan -- "Wustang"

[^3]: Note that while I’m characterizing this as the functional style, many functional languages such as F# have a “pipe-forward” operator “|>” which has a similar effect. Haskell seems to have a convention where the data is passed as the last argument rather than the first argument, which makes implementing this particular strategy a bit clumsy.

[^2]: In the Python FAQ Guido describes the rationale behind the “len” function as opposed to using a method on strings and lists. Interestingly more recent revisions of the FAQ make it sound as if it were some mistake rendered unfixable by history.

[^1]: Something analogous to this sort of chained-method style is also used in popular frameworks like d3 and jQuery.

[^4]: Well, there’s also that incident where I tried installing some Hello World example with Cabal and ended up, through a far more infuriating series of events, rendering my Macbook unbootable.

[^5]: In a similar vein, I’d like there to be a “@\” in Mathematica so that I can do data @\ transformation rather than the other way around with the existing “/@” operator.


Array Addition and Other Fun Javascript Hacks 30 June 2013

fun hacks

It’s times like these that I’m reminded of my favorite childhood TV show (actually it’s kind of a close contest between this and Spongebob), Mythbusters, and that perennial warning: “don’t try this at home”. But this isn’t actually dangerous in any physically life-damaging way, it’s just dangerous in that it’s a very bad idea. The whole thing started with a friend who was irked by a few Javascript oddities, one of them being the inability to add arrays or tuples of numbers.

I actually came across the idea some time ago while reading about cross-site JSON exploits. But the premise for adding arrays is rather simple: since the language only supports subtracting and adding numbers (and concatenating strings), the Javascript engine will try to convert an array to a primitive by calling its valueOf method. There really isn’t a numeric representation of an array, so the default conversion falls back to a string and the arithmetic usually just produces NaN (or, for +, a concatenated string), which isn’t tremendously useful. However, we can override the valueOf function to return any number we want.

We can create a valueOf function which assigns numbers to arrays according to a specific pattern, such that the resultant number can reveal (without much ambiguity) which arrays were involved, and whether they were added or subtracted. There are probably a myriad of ways to construct numbers like that, but one of the simplest (or at least the first one that I could come up with) is to raise some given radix (at least 3) to a unique number that increments every time valueOf is called. You can understand it pretty simply in base 10.

Let’s say our first array is [1, 2, 3] and our second array is [5, 4, 3]. Every time the instance’s valueOf is called, we plop it onto a temporary global array and record the index. In this case, let’s assign the first array index 1, and the second array index 2. If we raise the radix 10 to that index, we can get the respective unique numbers: 10 and 100. Now these numbers can be added or subtracted, leading to 110 or 90 or -90 (or unaltered, leaving 10 and 100). To find out what operations and what numbers are involved in the operation, we can first add 1000, which is 10 raised to the size of the global array, which has the useful effect of making everything positive. Now we’re left with 1110, 1090, 0910, 1010, and 1100. We can ignore the first and last digits, and each remaining digit can be either a 1, a 0 or a 9 (if it’s anything else, this is just a number, not the result of a magical array addition). A 1 tells us the number was added, a 0 tells us the number isn’t used, and a 9 tells us that the number was subtracted. The one wrinkle is that a 9 borrows from the digit to its left, so that neighbor reads one lower than it otherwise would (in 1090, the 9 means the first array was subtracted, and the 0 beside it is really a borrowed-from 1, meaning the second array was added).
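The decoding step can be sketched in a few lines. This is Python rather than the Javascript of the actual hack, and it is my own reconstruction of the idea, not the code from the project: encode and decode are hypothetical names, and instead of adding 1000 first it peels digits off the signed number directly, handling the borrow that a 9 takes from the digit to its left.

```python
RADIX = 10  # any radix >= 3 works; 10 keeps the digits readable


def encode(index):
    """Number a (hypothetical) valueOf override would return for slot `index`."""
    return RADIX ** index


def decode(n, slots):
    """Return {index: +1 or -1} for the slots that were added or subtracted,
    or None if n does not look like an encoded sum."""
    signs = {}
    for index in range(slots):
        digit = n % RADIX  # Python keeps this non-negative even when n < 0
        if digit == 1:
            signs[index] = +1
            n -= 1
        elif digit == RADIX - 1:
            # A 9 means "subtracted"; undo the borrow it took from the next digit.
            signs[index] = -1
            n += 1
        elif digit != 0:
            return None  # an ordinary number, not an encoded array expression
        n //= RADIX
    return signs if n == 0 else None


first, second = encode(1), encode(2)  # 10 and 100, as in the example above
print(decode(second - first, 3))      # {1: -1, 2: 1}
```

Anything with a digit other than 0, 1 or 9 is rejected, which is the "this is just a number" case.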

This leaves a weird problem though, which is how you’re going to convert a number back into that array transparently. And this segues into the more atrocious segment of this hack. If and when Skynet and the other assorted malevolent (uh… I mean, misunderstood) artificial intelligences develop their courts, the fembots and gentledroids of the jury will no doubt consider me guilty of whatever felonious usercrimes may exist. They’ll consider me an equal of Norman Bytes, butchering idiomatic javascript in the shower.

What exactly makes it such a heinous offence, you may find yourself asking. The answer is simple: the unholy matrimony of with and Object.defineProperty (also, keep in mind that it isn’t DOMA’s fault).

The great thing about Object.defineProperty is that it lets you get in the middle of the whole variable assignment process. The problem is that this only works with object properties, not top level variables. But Javascript has a nice (read: wretched) with statement which lets you treat variables as if they were object properties. This still leaves a slight problem because there’s no way to define a catch-all property. And if nothing so far has wrought utter terror to your soul, this last critical part may very well do exactly that. Since any variable that gets used must appear by name in the source, we can use the Function toString method to decompile the source and use a simple regular expression (any symbol which starts with a letter, $ or _, followed by any of those or a digit) to extract candidate variable tokens. For each one, we can define a getter and setter on a new object, which is subsequently passed as the first argument to the function whose first enclosed statement is a with.
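The token-extraction step is the least scary part, so here is a rough sketch of it. It is in Python and runs over a source string rather than a real Function toString result; candidate_names and the exact pattern are my own stand-ins, not the code from the hack. Note that a character class written literally as A-z would also match a few punctuation characters, so the sketch spells out both letter ranges.

```python
import re

# Identifier-like tokens: a letter, $ or _ followed by letters, digits, $ or _.
IDENTIFIER = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def candidate_names(source):
    """Return the distinct identifier-like tokens in `source`, in order of first use."""
    seen = []
    for token in IDENTIFIER.findall(source):
        if token not in seen:
            seen.append(token)
    return seen


source = "function () { total = [1, 2, 3] + $other; return total }"
print(candidate_names(source))  # ['function', 'total', '$other', 'return']
```

Keywords like function and return come along for the ride, which is why these are only candidates; defining a getter for a name that never gets looked up is harmless.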

The power of intercepting all function calls, variable declarations and retrievals then comes by recursively creating another fake object filled with getters and setters whenever a property is accessed. For method calls to primitive types like strings and numbers, we do the same type of sorcery but directly on their respective prototype properties. Whenever a function is passed a number which happens to match the pattern for an array addition or subtraction, we can passively intercept and substitute its value. Any string which matches a certain pattern of CSS selectors can be then transparently substituted with the NodeList which results from a document.querySelectorAll. And we can change all the variable declarations for a for..in loop such that array values are used instead of keys.

And now, four minutes before the end of this month, I’ve successfully yet again managed to eke out a blog post to fulfill my quota. And I guess I don’t have a Humane Society to prove that no humans were harmed in the making of this blog post, but how bad could it possibly be? It’s only 150 lines.


CloudFall: A Text Editor 06 July 2012

This is probably a horrible time to think of writing an online text editor. The “market”, if you can call it that, is virtually saturated with worthy contenders and (figuratively) every day is marked by the entrance of some other text editor to the already crowded space. Part of the problem is that now, web-based text editing components are actually pretty awesome. Ace and CodeMirror work really quite well, and each has associated with it a very formidable integrated development environment: Cloud9 and Light Table, respectively (there are quite a few others, but these seem to be probably the most funded). And at Google I/O 2012, they announced the new Chrome Packaged Apps, which expose new functionality (most relevant being the ability to open and save files on the local disk and to operate offline by default), and their sample apps include no fewer than two text editors.

Nevertheless, I’ve always pined (probably an exaggeration) for the opportunity to indulge in something as meta as writing a text editor which I could use as a text editor for the text editor (which is, if you did not notice, that text editor). It’s probably the pinnacle of dogfooding (probably nice coming after a streak of making extensions that I never actually use). But this doesn’t really go anywhere in terms of explaining why I’m doing this rather than using someone else’s rather better developed text editor package. It really comes down to this Chrome app which I was quite a fan of almost a year ago, called SourceKit. SourceKit is a text editor which runs in Chrome and syncs with Dropbox. The version which was distributed on the Chrome Web Store never supported offline use, which is sort of sad because that’s the one feature I really wanted. The other source of inspiration was Streamie, the real-time Node-based Twitter client which had an absolutely phenomenal scheme for contributing. You just needed to fork the repo on GitHub and all your commits would be visible live on your own subdomain of the site.

Also, in the meantime, I had discovered this pretty awesome text editor (no, it’s not vi or Emacs, because I’m not nearly cool enough or dedicated enough to approach that steep precipice of a learning curve) called Sublime Text 2. I have a pretty good picture now of what exactly I want from a text editor based on that (actually, I don’t know if this is just some delusion which has manifested in my mind because I probably thought the same way about gedit and krita and komodo and aptana and notepad2 and notepad++ and dreamweaver before that).

The real drive to creating that text editor happened in the week before I was scheduled to attend Google I/O 2012. I knew I’d be in some situations lacking Internet, and I felt like writing words or code or something in that time. So in a few days I hacked together this text editor which had a vague resemblance to Sublime Text. It was based on Ace, like SourceKit before it, but obviously a newer version with a whole bunch of syntax and themes included. It used a modified version of the Dropbox component from SourceKit (it was changed moderately to accommodate Dropbox API 2.0 and to deal with binary things like array buffers and blobs). And I added a little heads-up-menu-esque file and command picker, quite reminiscent of Sublime Text (and to a lesser extent Ubuntu Unity).

And then I headed for the airplane, having not actually used it much in practice, hoping to be super productive with it, which, as you could probably tell from the beginning of this sentence, wouldn’t pan out. The first (quite major) flaw which I encountered was some bug which would end up wiping out any file that I tried to edit, and left me with a gaping chasm in my hard drive (this is a metaphor, because my Chromebook is solid state). I just hope nothing of value was lost, but it remains quite likely that it was otherwise.

I had no real system for testing out changes to my own extension, and I was too paralyzed by the fear of deleting my own extension to try changing it. So the end result was that the entire duration passed with me hardly doing anything productive on the project, or using it for any productive means in itself. All that happened was my discovery that everyone is working on a text editor right now and I should probably quit right then and there and work on something perhaps more novel or productive. But I came back and fixed it and did some more on it, and I’m finally at the point at which I feel comfortable using it for some mostly useless things, like writing a blog post inside the project’s readme about the project itself.

Welcome to CloudFall. Yeah, I’m aware how dumb it sounds, but the fact that the new James Bond movie is going to be called Skyfall essentially demolishes any hope of using that name. But rather than using this as a vindication of how cool that name would be and abandoning it for some novel name, I’m just going to contrive something in a similar vein, hence the current working name. But rather than spending the first few lines complaining about the name of the product, I should probably lay down what exactly this project aims to be. It’s a text editor, not a terribly glamorous concept, I know, and this is especially not a terribly great time to start. This is hardly the first text editor, and certainly will not be the last (until this either never finishes or the world actually does end by the end of this year). It’s not the most full featured, but I guess it does need to have something which distinguishes it.

The main feature is largely inherited from SourceKit, which is the ability to sync with Dropbox. And instead of editing “live” off of Dropbox’s servers, it really is more of a sync. It’s built around Chrome’s FileSystem API, so it has its own sandboxed imaginary folder where it sticks all the files. Every time a file is loaded, it’s first downloaded onto its spot on the imaginary folder and subsequent edits get sent to both the local copy and the server. This architecture, in theory, means it probably won’t inadvertently overwrite, corrupt or delete your important data in the event of some syncing issues. It keeps track of the latest synced revisions and tells Dropbox’s API that information so that it won’t try overwriting a newer version from elsewhere, and it’ll somewhat gracefully save to a different file in such a circumstance (though that routine can hardly be declared well tested, so beware of complications). It should in that way retain very close to full functionality while offline, since just about everything it does is entirely offline (including the compilation of CoffeeScript and LESS, which is explained later on).
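The revision bookkeeping is easier to see in miniature. What follows is a toy Python model under heavy assumptions: FakeRemote is an in-memory stand-in and not the Dropbox API, and Editor, put, and the "(conflicted copy)" naming are all invented for illustration. It only shows the shape of the idea: write locally first, send along the last revision you saw, and save to a different file when the remote has moved on.

```python
def conflicted_name(path):
    """Hypothetical naming scheme for the 'save to a different file' case."""
    stem, dot, ext = path.rpartition(".")
    if dot:
        return f"{stem} (conflicted copy){dot}{ext}"
    return f"{path} (conflicted copy)"


class FakeRemote:
    """In-memory stand-in for the remote store; NOT the Dropbox API."""

    def __init__(self):
        self.files = {}  # path -> (revision, contents)

    def put(self, path, contents, parent_rev):
        current_rev = self.files.get(path, (0, None))[0]
        if parent_rev != current_rev:
            # Someone else saved a newer version: keep both copies.
            path = conflicted_name(path)
            current_rev = self.files.get(path, (0, None))[0]
        self.files[path] = (current_rev + 1, contents)
        return path, current_rev + 1


class Editor:
    def __init__(self, remote):
        self.remote = remote
        self.local = {}       # path -> contents (the sandboxed local copy)
        self.synced_rev = {}  # path -> last revision we know the remote had

    def save(self, path, contents):
        self.local[path] = contents  # always write the local copy first
        parent_rev = self.synced_rev.get(path, 0)
        saved_path, rev = self.remote.put(path, contents, parent_rev)
        self.synced_rev[saved_path] = rev
        return saved_path
```

With two Editor instances sharing one FakeRemote, the second one to save a file it has never synced ends up writing the conflicted copy, and the first one's version stays put.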

Built into the extension are a few of the tiny tweaks that I have installed on Sublime Text which I find fairly useful. For instance, the app includes a copy of the CoffeeScript and LESS compilers, and so whenever you are editing one of those types of files in the text editor, it’ll automatically compile and save it into the same directory whenever you hit the save key. That’s actually pretty useful because it gives CoffeeScript back that REPL (almost certainly a malapropism, but I’m not too familiar with developer work-cycle jargon, so please excuse that grievous offense) dynamic that JS developers are so familiar with. And to aid that sort of work, it can preview your files “live”, even offline. While it can’t open and edit binary files like pictures, it can still download and display them fine.

For writing walls of text which don’t really need syntax highlighting (I wonder if someone has a package on mainstream text editors that syntax highlights free form English grammar, sort of like giving people synesthesia for no real reason), it includes a word count widget. Also, because I feel like encouraging unhealthy behavior, right next to the word count is an indication of your current typing speed. I’m pretty sure someone could go into some moral discussion into why it’s a completely detrimental addition to create something which displays how quickly the author is typing because it encourages a mindset which doesn’t focus on creating the most concise or effective means of delivery for some message, but rather promotes meaningless verbosity. But sometimes I would imagine using this for school assignments and the ilk, and maybe it would be nice to know when I’m reaching my designated quota filling some unwritten word requirement. I’m not quite sure how reliable of a word count it gives, since the algorithm is primitive and by no means nuanced, but it should be able to give a sort of rough estimate of the number of words on a page.
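For a sense of how primitive "primitive" can be, here is a minimal Python sketch of both widgets. It is not the editor's actual algorithm; word_count and words_per_minute are hypothetical names, and a "word" here is just any run of non-whitespace characters.

```python
import re


def word_count(text):
    """Rough count: every run of non-whitespace characters is one word."""
    return len(re.findall(r"\S+", text))


def words_per_minute(words_typed, seconds_elapsed):
    """Average typing speed over the given interval."""
    if seconds_elapsed <= 0:
        return 0.0
    return words_typed / (seconds_elapsed / 60)


print(word_count("It's a text editor,  not a terribly glamorous concept"))  # 9
print(words_per_minute(90, 120))  # 45.0
```

A count like this treats stray punctuation and code fragments as words, which is about the level of nuance you would expect.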

I may have mentioned earlier that there are two primary systems for interacting with the application, both of which are keyboard shortcuts: Ctrl-O and Ctrl-P, meaning the file browser and the command palette, respectively. They both appear in the same sort of interface component and generally behave the same, but there are some slight differences in how they operate (obviously). There are actually two “browsing modes” for the file browser, where it shows files either from the local stored cache or online in Dropbox. That can be switched by selecting either “Browse Mode: Local Copy” or “Browse Mode: Remote Dropbox” from the command palette (though there should probably be some better interface logically placed on the file browser panel itself). Remote Dropbox only works when the user is online, so the default mode is the local one. The sole interface to this list of files is the filter search box on the bottom of that widget, where typing in stuff filters out things. Of the visible items, the cursor can be manipulated by using the arrow keys, and one of the options can be executed by hitting the enter key. The widget needs to be manually dismissed by hitting the Esc key (which allows you to semi-rapidly open several files or try different commands, like changing themes). When a folder is activated, the context then shifts to what is inside the folder, descending into the hierarchy. It’s possible to ascend by entering the directory “../”. To create a file or folder (note that at time of writing, creating folders does not sync to Dropbox), just type “+yourfile.txt” (or if there are no search results, it’ll automatically select the option to create a file with that name). To delete a file, you likewise just prepend your command with a minus instead, for instance “-oldfile.txt”. It’s not nearly as counterintuitive or confusing as I presume this description is making it sound.
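The little grammar of the filter box fits in a handful of lines. This Python sketch is only an illustration of the rules described above; parse_entry and the action names are made up, and the real widget presumably does more.

```python
def parse_entry(text):
    """Classify what was typed into the file browser's filter box."""
    if text == "../":
        return ("ascend", None)
    if len(text) > 1 and text[0] == "+":
        return ("create", text[1:])
    if len(text) > 1 and text[0] == "-":
        return ("delete", text[1:])
    # Anything else just narrows the list of visible files.
    return ("filter", text)


print(parse_entry("+yourfile.txt"))  # ('create', 'yourfile.txt')
print(parse_entry("-oldfile.txt"))   # ('delete', 'oldfile.txt')
```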

The command palette, given that the interface is basically the same, has much in common with the file system, but it’s entirely flat and linear. It lacks the hierarchy that plagues the file system, and navigating it is considerably simpler. Since the list is filtered and updated live with every keystroke, it’s fairly easy to find whatever command you’re looking for. The filtering algorithm only searches for characters but disregards gaps, only paying attention to order. In other words, the query “seals” may match “SEt SyntAx: coLdfuSion”. It also means that you can construct very short queries to find what you want.
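That kind of matching is just a subsequence test. Here is a small Python sketch of it, assuming case is ignored; fuzzy_match and filter_commands are hypothetical names and not the editor's own code.

```python
def fuzzy_match(query, candidate):
    """True if the characters of `query` appear in `candidate` in order, gaps allowed."""
    remaining = iter(candidate.lower())
    # `ch in remaining` consumes the iterator up to the match, which enforces order.
    return all(ch in remaining for ch in query.lower())


def filter_commands(query, commands):
    """Keep only the commands the query matches, preserving their order."""
    return [command for command in commands if fuzzy_match(query, command)]


commands = ["Set Syntax: coldfusion", "Set Theme: monokai", "Browse Mode: Local Copy"]
print(filter_commands("seals", commands))  # ['Set Syntax: coldfusion']
```

Because gaps are free, a query as short as "bml" is enough to single out "Browse Mode: Local Copy" in this list.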

At this point, I don’t think it’s a final project per se; there are still a few features that are woefully dysfunctional, like Find and Replace. But I think it fits my own use case, editing files on my Chromebook, even when offline. It’s fast and doesn’t take too many steps to start. But the interface isn’t exceedingly intuitive (basically everything is accessed through two keyboard shortcuts). I haven’t found a witty logo for it (I just haven’t bothered looking), and at this point I’m just writing a blog post about it because I feel obligated to document whatever it is that I just did.

Maybe some day, or if there’s some response at all to this blog post, I’ll pick it up again and add those awesome collaboration features (which probably won’t be very useful because the entire thing is quite hackish and not particularly kind to the prospect of improvements). Maybe I’ll fix up the search and replace interface (read: get rid of whatever horrid mess exists and replace it with a new one from scratch). And maybe I might publish it in some form that is significantly less involved than cloning off GitHub and packaging the app yourself.