Wednesday, 8 January 2014

Weighing filter & map Against for’s guards and yield

On the surface, it seems that a combination of filter and map can achieve the same end as a for-comprehension with judicious use of guards (e.g. if-expressions) and yield.  Or to say the same thing in code:

b.filter(_ % 2 == 0).map(2 * _)

takes the same inputs and produces the same outputs (with the same lack of side effects) as:

for (elem <- b if elem % 2 == 0) yield 2 * elem
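To check the claim, the two forms can be run side by side. This is only a minimal sketch: the snippet above never says what b holds, so a small Vector[Int] is assumed here, and the wrapping object Equivalence is a made-up name for illustration.

```scala
object Equivalence {
  // Assumption: b is a small Vector[Int]; the post does not define it.
  val b: Vector[Int] = Vector(1, 2, 3, 4, 5, 6)

  // Method-chaining form: keep the even elements, then double them.
  val viaFilterMap: Vector[Int] = b.filter(_ % 2 == 0).map(2 * _)

  // for-comprehension form: the guard tests each element, not the collection.
  val viaFor: Vector[Int] = for (elem <- b if elem % 2 == 0) yield 2 * elem

  def main(args: Array[String]): Unit = {
    println(viaFilterMap) // Vector(4, 8, 12)
    println(viaFor)       // Vector(4, 8, 12)
  }
}
```

Both vals hold the same elements in the same order, and b itself is left untouched.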

Cay Horstmann (for it is from his book, Scala for the Impatient, that this code and the jumping-off point for this post are taken) points out that the former is preferred by those more experienced or comfortable with the functional idiom, but what I want to examine for a little while is whether there are any other reasons why you would pick one over the other.

Legibility

Let’s kick off with my old drum, and consider how clean and clear to the eye and comprehending mind they both are.  If I’m being honest, in the example above, the filter/map combo has it hands down.  Just as the method calls chain, so does your understanding as you follow it through, from left to right.  With the for, on the other hand, you need to skip over the elem <- b, jump to the guard to get an image of what elem is, and then jump on to the right again to the yield.  A lot of work for such a simple piece of logic.

It also helps greatly (now that I have a good mental construct for map) that we can read the method names and immediately call to mind what is happening.  In the for version there are no such explicit semantic clues to latch onto.

Flexibility

So far it’s a resounding 1-0 in favour of filter and map, but let’s redress the balance.  The example we’ve used so far is an incredibly simple one. What if we needed to have two generators?

for (i <- 1 to 3; j <- 1 to 3) ...

Simple. However, our filter and map combo can’t compete, because here we’re calling methods which each need to be on a single object instance.  Nothing in that pair produces the same effect as simply (I’m sure you could get all complicated, for instance by nesting calls, but there’s nothing which will have the “just what I expected” feel of this common for element).
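For the curious, here is roughly what "getting all complicated" looks like. The sketch assumes a body for the loop above (the original elides it), namely yielding each (i, j) pair, and it has to reach beyond filter and map for flatMap, which is what Scala itself uses when it rewrites a for with more than one generator. The object name TwoGenerators is made up for illustration.

```scala
object TwoGenerators {
  // Hypothetical body: yield every (i, j) pair. The post leaves the body out.
  val viaFor: IndexedSeq[(Int, Int)] =
    for (i <- 1 to 3; j <- 1 to 3) yield (i, j)

  // The method-call equivalent: flatMap for the outer generator,
  // map for the innermost one.
  val viaFlatMap: IndexedSeq[(Int, Int)] =
    (1 to 3).flatMap(i => (1 to 3).map(j => (i, j)))

  def main(args: Array[String]): Unit = {
    println(viaFor == viaFlatMap) // true
    println(viaFor.size)          // 9
  }
}
```

The nesting grows with every extra generator, whereas the for version just gains another line.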

Before we move on, we ought to kick it up another notch and mention that yield expressions can get pretty powerful too (note this example was lazily and brazenly stolen from Atomic Scala by Bruce Eckel and Dianne Marsh):

def yielding3(v: Vector[Int]): Vector[Int] = {
  for {
    n <- v
    if n < 10
    isOdd = (n % 2 != 0)
    if isOdd
  } yield {
    val u = n * 10
    u + 2
  }
}

Here, while it is conceivable that in some circumstances our filter and map combo could achieve something similar, it would likely be at the expense of their legibility (which is worth a lot to my mind).

Before we finish this aspect, it’s worth pointing out that for’s multiple guards could be represented easily by chaining the same number of filter calls instead of the single one shown at the top of this post, but that’s not quite enough to tip this one back towards a no-score-draw.
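To make the comparison concrete, here is one possible filter/map rendering of yielding3, with one filter call per guard. It is a sketch rather than anything from either book: yielding3Chained and the enclosing object Yielding are names invented for this post, and the for version is repeated alongside only so the two can be compared.

```scala
object Yielding {
  // The for-comprehension version, as in the example above.
  def yielding3(v: Vector[Int]): Vector[Int] =
    for {
      n <- v
      if n < 10
      isOdd = (n % 2 != 0)
      if isOdd
    } yield {
      val u = n * 10
      u + 2
    }

  // Hypothetical translation: each guard becomes a filter call,
  // and the yield block becomes the body of map.
  def yielding3Chained(v: Vector[Int]): Vector[Int] =
    v.filter(_ < 10)
      .filter(_ % 2 != 0)
      .map { n =>
        val u = n * 10
        u + 2
      }

  def main(args: Array[String]): Unit = {
    val input = Vector(1, 3, 5, 8, 11, 13)
    println(yielding3(input))        // Vector(12, 32, 52)
    println(yielding3Chained(input)) // Vector(12, 32, 52)
  }
}
```

The chained version holds up here, but note that the named intermediate value isOdd has gone; once a value defined mid-comprehension is needed in the yield as well, the chain starts passing tuples around and the for pulls ahead.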

Summary

So, it seems that there are indeed times when you would want to use for in preference to filter and map. In all cases it’s when you want to unleash a bit of the generator / yield power beyond the simple use case.  Seems pretty sensible to me, and a rule-of-thumb I’ll be applying in the future.