Sunday, January 21, 2007

How to Breathe, Functionally (part 1)

David Welton writes that the problem with functional languages is that you don't get to mentally "catch your breath" between computations; piling function invocations on function invocations makes it hard to understand what's actually being calculated. He has an amusing summary of how functional code can become unreadable:

Do this with the result of this with what this produces when given this input that is derived from this function which takes the result of this other thing after calculating the result of something that is derived from the output of a function that ...

Indeed, I've seen functional code like this (much of it written by me). Here's a particular gem:

f splitPoints = map fst $ tail $ scanl (flip ($) . snd) ([], toSplit) $
                map splitAt splitPoints

I wrote this code a year ago, and I don't know what it does! It's hard to read and hard to understand. But so is imperative code that says

Add 5 to A, now if B is not zero we jump to where we subtract one from B and if C is less than 3 we add 4 to C otherwise we divide C by 2 and then add it to A, then we ...

The reason both pieces of pseudocode are hard to understand is that they don't use the abstraction features built into the language. Good imperative code uses meaningful variable names, uses the structured-programming primitives the language provides, and abstracts complex operations into functions.
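The same treatment can be applied to the Haskell gem above. Tracing the scanl by hand suggests that it cuts a list into consecutive chunks whose lengths are given by splitPoints, so here is a sketch of a more breathable version. The names splitIntoChunks, chunk, and rest are hypothetical, and toSplit, which was a free variable in the original, is assumed here to be an ordinary argument.

```haskell
-- The original one-liner, with toSplit passed in explicitly (an assumption;
-- in the post it comes from some enclosing scope).
original :: [a] -> [Int] -> [[a]]
original toSplit splitPoints =
  map fst $ tail $ scanl (flip ($) . snd) ([], toSplit) $
    map splitAt splitPoints

-- A rewrite with named intermediate results (hypothetical names).
-- Each step takes one chunk off the front and recurses on the rest.
splitIntoChunks :: [Int] -> [a] -> [[a]]
splitIntoChunks [] _ = []
splitIntoChunks (n : ns) xs = chunk : splitIntoChunks ns rest
  where
    (chunk, rest) = splitAt n xs

main :: IO ()
main = do
  print (splitIntoChunks [2, 3] "abcdefgh")  -- prints ["ab","cde"]
  print (original "abcdefgh" [2, 3])         -- prints ["ab","cde"]
```

The explicit recursion is longer than the scanl version, but each line does one thing, and the where clause gives the reader a place to stop and see what a single step produces.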

My rule of thumb (which is, heh, not always religiously followed) is that a piece of program structure that does more than about 3-6 things is probably too complicated; it seems like that's about as much local context as I can hold in my head comfortably (and the number is more 3 than 6 these days, if you get my drift). As David puts it,

If you'll bear with a poor comparison, sometimes programming is like a song. You do the tricky section, then you go back to an easier section and catch your breath for a bit.

The problem seems to be that people trained on imperative languages, when they're tossed into the ocean of functional coding, can't find the guideposts and landmarks that would help them impose some structure on the water. Should I put this complicated expression inside a "let", or break it out into another function? When do I use a higher-order function to unify two dissimilar pieces of code? Should I write this code in terms of higher-order library functions or directly? It's easy to wander off down a chain of nested invocations of anonymous higher-order functions, or composed partially applied library functions, just "because you can" and end up forgetting what you were trying to do in the first place.

Luckily, some of the basic rules of keeping your code readable remain more or less the same: assign intermediate results to variables with meaningful names, and abstract complicated operations into separate functions. To tackle David's example:

You get sections with DoThisAndThisAndThatWithTheResultsOfSelectThisFromTableXYZ that you have to slow down to think through.


GoToTheNextTrickyBit where ManyThingsHappen.

All in all, you wind up with a sort of ebb and flow that works out pretty well if you're in tune with the code being written, going from intense calculations, to a line or two where you do no more than store things in a variable, take a breather, and prepare for the next important stanza.

In other words, the way to "take a breath" in a functional language is to bind an intermediate result to a name, or to pull a complicated operation out into its own well-named function. One Haskell approximation to David's example is

do DoThis
   results <- SelectThisFromTableXYZ
   someVal <- AndThat results
   GoToTheNextTrickyBit someVal

If you're writing pure functional code, you might write

let results = SelectThisFromTableXYZ
    someVal = AndThat results
in GoToTheNextTrickyBit someVal

Of course, this will become unreadable if you write a huge complicated nested "let" expression in place of "AndThat". For a truly dreadful example of this kind of coding, check out the Gnucash Scheme code that generates the "Cash Flow" report (warning: read this in short bursts, or you WILL go blind). This is exactly analogous to writing too many nested loops and conditionals in imperative code, and the fix is the same: find logically distinct bits of code and extract them into separate functions.
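For the curious, here is a small runnable sketch of that shape. Everything in it is a hypothetical stand-in for David's pseudocode: the Row type, the canned table contents, and the bodies of selectThisFromTableXYZ, andThat, and goToTheNextTrickyBit are all invented for illustration, and the names are lowercased because Haskell reserves capitalized identifiers for constructors.

```haskell
-- Hypothetical stand-in for a database row.
type Row = (String, Int)

-- Stand-in for the query; a real program would talk to a database here.
selectThisFromTableXYZ :: IO [Row]
selectThisFromTableXYZ = pure [("apples", 3), ("pears", 5)]

-- The "tricky section", extracted and named so it can be read on its own.
andThat :: [Row] -> Int
andThat rows = sum (map snd rows)

-- The next tricky bit, likewise kept in its own function.
goToTheNextTrickyBit :: Int -> String
goToTheNextTrickyBit total = "total: " ++ show total

-- Monadic version: each line is one breath.
report :: IO String
report = do
  results <- selectThisFromTableXYZ
  let someVal = andThat results
  pure (goToTheNextTrickyBit someVal)

-- Pure version: the same structure with let instead of do.
reportPure :: [Row] -> String
reportPure results =
  let someVal = andThat results
  in goToTheNextTrickyBit someVal

main :: IO ()
main = report >>= putStrLn  -- prints total: 8
```

If andThat grew into a page of nested lets, the next step would be to split it again, not to inline it back into report.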

I have a few more thoughts, but this post is getting too long, so I'll inflict them on the world in another post. The rule that short things are easier to understand applies to blog posts as well as code. :-)