Friday, 17 January 2014

Classes in Scala - Beyond the (Java) Expected

There is a lot more power to the Scala type system than in any other language I know, and I’ve yet to come across, let alone understand, most of it. (I’ve tried a few times now to compose a post on this topic, but realised each time I was still way off being in a position to do so.) However, in my latest reading of Scala for the Impatient, I’ve come across some interesting extensions to what I learned from Atomic Scala.

Object-Private Fields

As you’d expect (if you’re coming from Java as I have) methods in a (Scala) class can access the private fields of all objects of that class. But you can also add a more restrictive access flag so that methods can only access private fields of the current object:

private[this] var name = "" // accessing someObject.name is not allowed

example from Scala for the Impatient by Cay Horstmann, pp51

Behind the scenes (which I’m beginning to realise is a good way to grok these things) class-private fields have private getters and setters generated for them, whereas object-private fields have neither getters nor setters.

You can, if you like, even go one step further, replacing “private[this]” with “private[className]”. In this case, the field can be accessed only by methods of the named class (which must be the class being defined or an enclosing class).  Here, the compiler has to generate public getters and setters because the JVM doesn’t support this fine-grained level of access control.
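To make the difference between class-private and object-private concrete, here is a minimal sketch. The Counter class and its member names are my own illustration (not from the book), and it assumes Scala 2 style syntax:

```scala
// Illustrative names only; a sketch of the two access levels.
class Counter {
  private var value = 0        // class-private: any Counter can see it
  private[this] var hits = 0   // object-private: only this instance can see it

  def increment(): Unit = { value += 1; hits += 1 }

  // Fine: a method may read the private field of another Counter.
  def isLess(other: Counter): Boolean = value < other.value

  // Would not compile: hits is private[this], so other.hits is out of reach.
  // def fewerHits(other: Counter): Boolean = hits < other.hits

  def current: Int = value
  def ownHits: Int = hits
}
```

Uncommenting fewerHits should produce a compile error, which is the whole point of the more restrictive flag.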

Bean Properties

A simple one this. Scala by default creates getters which simply have the same name as the field, and setters which are the name of the field plus “_=”.  This doesn’t meet the JavaBeans spec, so to also get the expected getFieldName and setFieldName methods you can annotate the field with @BeanProperty:

import scala.beans.BeanProperty

class Person {
    @BeanProperty var name : String = _
}

example from Scala for the Impatient by Cay Horstmann, pp52
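As a quick check of what the annotation buys you, the sketch below exercises both the Scala-style and the JavaBeans-style accessors on the same field. BeanPerson and BeanDemo are hypothetical names of mine, and I initialise the field with an empty string rather than the underscore default:

```scala
import scala.beans.BeanProperty

// Hypothetical class, mirroring the Person example above.
class BeanPerson {
  @BeanProperty var name: String = ""
}

object BeanDemo {
  def run(): (String, String) = {
    val p = new BeanPerson
    p.name = "Fred"            // Scala-style setter (name_=)
    val viaBean = p.getName()  // JavaBeans-style getter
    p.setName("Wilma")         // JavaBeans-style setter
    (viaBean, p.name)          // Scala-style getter
  }
}
```

Both pairs of methods read and write the same underlying field, so Java frameworks that expect beans and Scala code can share the class.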

Primary Constructor Parameters

These can use @BeanProperty, private[this] and private[className] forms on top of the “expected” ones.

Additionally, if you declare a Primary Constructor Parameter without a val or a var form, then how the parameter is processed depends upon how you use it inside the class. In the following, name and age are used by a method, so they become immutable, object-private fields:

class Person(name: String, age: Int) {
    def description = name + " is " + age + " years old"
}

example from Scala for the Impatient by Cay Horstmann, pp55

However, if no method uses the parameters they are not saved as fields at all, and they are instead simply treated as parameters that can be accessed within the Primary Constructor and no more.
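Here is a small sketch putting the three flavours of primary constructor parameter side by side. Villager and its nickname parameter are my own additions for illustration:

```scala
// name: plain parameter, used in a method, so kept as an object-private field
// age: val parameter, so a public getter is generated
// nickname: var parameter, so a public getter and setter are generated
class Villager(name: String, val age: Int, var nickname: String) {
  def description: String = name + " is " + age + " years old"
}

object VillagerDemo {
  def run(): (String, Int, String) = {
    val v = new Villager("Fred", 42, "F")
    v.nickname = "Freddy"   // allowed: nickname is a var
    // v.name               // would not compile: name is not a member
    (v.description, v.age, v.nickname)
  }
}
```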

As suggested by Martin Odersky and helpfully repeated by Cay, I found it useful to think of Scala classes as being able to take parameters just like methods do.  That makes perfect sense to me.

Before we move on, a final word on making a Primary Constructor private:

class Person private(val id: Int) { ... }

example from Scala for the Impatient by Cay Horstmann, pp57

Lovely for enforcing builders etc, and also just like in Java.  Bosh!
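A minimal sketch of how that might look in practice, assuming a companion object acts as the factory. Account and withId are hypothetical names, and the validation rule is only there to give the factory something to do:

```scala
// The primary constructor is private, so only Account and its
// companion object can call it.
class Account private (val id: Int)

object Account {
  // Hypothetical factory: refuses non-positive ids.
  def withId(id: Int): Option[Account] =
    if (id > 0) Some(new Account(id)) else None
}
```

Outside of these two definitions, new Account(7) should not compile, so every instance has to come through withId. (In the REPL the class and its companion need to be entered together, for example with :paste.)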

Nested Classes

In our final topic, again, Scala seems hyper-flexible.  Anything (pretty much) can go inside anything else (functions within functions for example) and it’s no surprise to find out that classes can go within other classes.  I won’t lift the whole example from Cay’s book here (it’s in the free section, go and look at the bottom of page 57) but the important thing to note is that instances have their own classes, just like instances have their own fields too.  That is to say, two nested classes of two different instances of the containing class, are two different classes.

It sounds a little counter intuitive at first, but once you have a play with it it makes more and more sense.  However, if you do want to have things as you’d expect from a Java perspective, you can always stick the nested class definition in the companion object (which, you’ll recall is shared across all instances of a class).  Or (and this pushed me out of my comfort-zone significantly) you can use a type projection, but I’ll not go into that here. I get the impression that once I get to things like that then I’ll really be moving towards the domain of advanced Scala consumer.
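For anyone who wants to have that play, here is a cut-down sketch in the spirit of the book's example (this is my own reduced version, not a copy of it, and the anyContacts field is an assumption added purely to show a type projection):

```scala
import scala.collection.mutable.ArrayBuffer

class Network {
  class Member(val name: String) {
    // Only members of the *same* Network instance fit in here.
    val contacts = new ArrayBuffer[Member]
    // Type projection: a Member of *any* Network fits in here.
    val anyContacts = new ArrayBuffer[Network#Member]
  }

  def join(name: String): Member = new Member(name)
}

object NestedDemo {
  val chatter = new Network
  val myFace = new Network

  val fred = chatter.join("Fred")      // type chatter.Member
  val wilma = chatter.join("Wilma")    // type chatter.Member
  val barney = myFace.join("Barney")   // type myFace.Member

  fred.contacts += wilma       // fine: same network
  // fred.contacts += barney   // would not compile: different network
  fred.anyContacts += barney   // fine, thanks to Network#Member
}
```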

Wednesday, 8 January 2014

Weighing filter & map Against for’s guards and yield

On the surface, it seems that a combination of filter and map can achieve the same end as a for-comprehension with judicious use of guards (e.g. if-expressions) and yield.  Or to say the same thing in code:

b.filter(_ % 2 == 0).map(2 * _)

takes the same inputs and produces the same outputs (with the same lack of side effects) as:

for (elem <- b if elem % 2 == 0) yield 2 * elem

Cay Horstmann (for it is from his book, Scala for the Impatient that this code and the jumping-off point for this post is taken) points out that the former is preferred by those more experienced or comfortable with the functional idiom, but what I want to examine for a little while is if there are any other reasons why you would pick one over the other.
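To convince myself the two really are interchangeable, here is a tiny self-contained sketch. The sample Vector standing in for b is my own assumption:

```scala
object FilterMapVsFor {
  val b = Vector(1, 2, 3, 4, 5, 6)   // sample data, purely illustrative

  // Method-chaining style
  val viaMethods = b.filter(_ % 2 == 0).map(2 * _)

  // for with a guard and yield
  val viaFor = for (elem <- b if elem % 2 == 0) yield 2 * elem
}
```

Both values come out as Vector(4, 8, 12).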

Legibility

Let’s kick off with my old drum, and consider how clean and clear to the eye and comprehending mind they both are.  If I’m being honest, in the example above, the filter/map combo has it hands down.  Just as the method calls chain, so does your understanding as you follow it through, from left to right.  With the for on the other hand, you need to skim past the elem <- b, jump to the guard to get an image of what elem is, and then jump on again to the yield.  A lot of work for such a simple piece of logic.

It also helps greatly (now I have a good mental construct for map) that we can read the method names and immediately call to mind what is happening.  In the for version there are no such explicit semantic clues to latch onto.

Flexibility

So far it’s a resounding 1-0 in favour of filter and map, but let’s redress the balance.  The example we’ve used so far is an incredibly simple one. What if we needed to have two generators?

for (i <- 1 to 3; j <- 1 to 3) ...

Simple. However our filter and map combo can’t compete because here we’re calling methods which need to be on single object instances.  There is nothing available to us which can produce the same effect as simply (I’m sure you could get all complicated, but there’s nothing which will have the “just what I expected” feel of this common for element).
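For the curious, the "all complicated" route does exist once flatMap joins the party. The sketch below is roughly how I understand a two-generator for with a yield to be expanded; the yield expression (10 * i + j) is my own choice to give it something to collect:

```scala
object TwoGenerators {
  // The for version, collecting results with yield
  val viaFor = for (i <- 1 to 3; j <- 1 to 3) yield 10 * i + j

  // A hand-written equivalent: flatMap for the outer generator,
  // map for the inner one
  val viaMethods = (1 to 3).flatMap(i => (1 to 3).map(j => 10 * i + j))
}
```

It works, but the nesting makes the point about legibility rather well.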

Before we move on, we ought to kick it up another notch and mention that yield expressions can get pretty powerful too (note this example was lazily and brazenly stolen from Atomic Scala by Bruce Eckel and Dianne Marsh):

def yielding3(v:Vector[Int]) : Vector[Int] = {
  for {
    n <- v
    if n < 10
    isOdd = (n % 2 != 0)
    if(isOdd)
  } yield {
    val u = n * 10
    u + 2
  }
}

Here, while it is conceivable that in some circumstances our filter and map combo could achieve something similar, it would likely be at the expense of their legibility (which is worth a lot to my mind).

Before we finish this aspect, it’s worth pointing out that for’s multiple guards could be represented easily by chaining the same number of filter calls instead of the single one shown at the top of this post, but that’s not quite enough to tip this one back towards a no-score-draw.
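As a rough comparison, here is one way the yielding3 logic might be written with chained filter calls and a map. The name yielding3Chained and the sample input are my own assumptions; the for version is repeated so the two can be checked against each other:

```scala
object Yielding {
  // The for version, as in the Atomic Scala example above
  def yielding3(v: Vector[Int]): Vector[Int] = {
    for {
      n <- v
      if n < 10
      isOdd = (n % 2 != 0)
      if (isOdd)
    } yield {
      val u = n * 10
      u + 2
    }
  }

  // Hypothetical chained version: one filter per guard, then a map
  def yielding3Chained(v: Vector[Int]): Vector[Int] =
    v.filter(_ < 10).filter(_ % 2 != 0).map(n => n * 10 + 2)
}
```

In this case the chained version stays fairly readable, but the named intermediate values (isOdd, u) have had to be folded away, which is where legibility would start to suffer in a bigger example.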

Summary

So, it seems that there are indeed reasons why you would want to use for in preference to filter and map. In all cases it’s when you want to unleash a bit of the generator / yield power beyond the simple use case.  Seems pretty sensible to me, and a rule-of-thumb I’ll be applying in the future.

Functions and Methods – A Subtle Distinction

There are a great many benefits to learning something like Scala, when you have an educational background like I do.  Even small snippets of text can flesh out whole expanses of a conceptual landscape, and not just a Scala-one.

Today it is the following lines from Cay Horstmann’s Scala for the Impatient:

“Scala has functions in addition to methods. A method operates on an object, but a function doesn’t. C++ has functions as well, but in Java you imitate them with static methods”

from Scala for the Impatient by Cay Horstmann, pp 21

I’ve been here before, but in the mass of words I copied from Wikipedia at that point, this simple distinction was lost on me (it’s there, but implicitly). It’s an important distinction too – a method has an object, a function doesn’t - and another fine conceptual tool which, added to my armoury of subtle distinctions and concepts, should help me get further under the skin of both Scala and other languages.
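A tiny sketch of the distinction as I currently understand it. I'm treating sqrt from scala.math as the "function" here, in the same spirit as the book; the triple value is my own addition to show a function as a value in its own right:

```scala
import scala.math.sqrt

object FunctionsAndMethods {
  // A function: called on its own, with no object in front of it
  val viaFunction: Double = sqrt(16.0)

  // A method: it operates on an object (here, the String "Hello")
  val viaMethod: String = "Hello".toUpperCase

  // A function value: can be stored, passed around and applied later
  val triple: Int => Int = x => x * 3
}
```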

Tuesday, 7 January 2014

Back on the Old for-Comprehensions Again – Filters and Guard Conditions

NOTE: There's an update to this post, based on a Google+ comment from Chris Phelps.  Cheers Chris!
I’m back on Scala’s for comprehension construct, courtesy of chapter 2 of Cay Horstmann’s “Scala for the Impatient”.  As you’d expect from a book with such a title, he wastes no time in getting to the point, and after a quick (though comprehensive) intro to the basic single-generator-and-little-else flavour he’s off into multiple generators:
for (i <- 1 to 3; j <- 1 to 3)
    print((10 * i + j) + " ") // Prints 11 12 13 21 22 23 31 32 33

from Scala for the Impatient, by Cay Horstmann, pp 20
The generators here are in the first set of parens, and give us the i and j variables which are then used in the body of the loop. All very clear. 
Then we’re onto the addition of guard conditions:
for (i <- 1 to 3; j <- 1 to 3 if i != j)
    print((10 * i + j) + " ") // Prints 12 13 21 23 31 32
from Scala for the Impatient, by Cay Horstmann, pp20
Now before we had “filters” which seemed to do a similar thing – only continue evaluation of that iteration of the loop if they evaluated to true themselves - but filters were applied in the body whereas here the guard condition is found within the generator parens, at least in this example.
A bit of reading ahead soon reveals that these “filters” and “guard conditions” seem to be the same thing, but things just looked different because there are two valid forms of the for-comprehension syntax – one with parens and semi-colons (Cay’s way in) and the other with curly braces (the Atomic Scala opener).  But before we make any sweeping statements, let’s check to make sure.  Here’s what happens when we start with a parens-version and convert it to the curly-braces flavour:
for (i <- 1 to 3; from = 4 - i; j <- from to 3)
  print((10 * i + j) + " ")
// Prints 13 22 23 31 32 33

for {             // added a newline and replaced the open-paren with a curly brace
  i <- 1 to 3     // added a newline and dropped the semicolon
  from = 4 - i    // added a newline and dropped the semicolon
  j <- from to 3  // added a newline and dropped the semicolon
}                 // added a newline and replaced the close-paren with a curly brace
print((10 * i + j) + " ")
// Still prints 13 22 23 31 32 33

adapted from Scala for the Impatient, by Cay Horstmann, pp20
As expected, they are the same.  I’m not sure if that added flexibility makes me happy or sad.  I can however cope with the different ways of referring to filters/guard-conditions. But wait, there’s no time for emotion, yield is about to be introduced.
And here again another subtlety I wasn’t previously aware of seems to have arisen – without a yield, what we have is called a “for-loop”, but once we’re yielding it’s a “for-comprehension”.  However, as I am already aware, yielding creates a collection whose type is compatible with the first generator, so it’s swings and roundabouts.
Update (08/02/2014): Chris Phelps (one of my Scala-mentors, for which I'm exceedingly grateful) comments: "Underneath the syntactic sugar, the "filter" or "guard" actually uses a variant of the filter method (withFilter), so if you thought those filters were related to the filter method, well spotted."  Looks like I'm on the right track.
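Following up on Chris's comment, here is a sketch of how I believe the guarded, two-generator example maps onto method calls. The yield is my own addition so there is a result to compare, and the exact expansion the compiler produces may differ in detail:

```scala
object GuardDesugar {
  // The guard-condition version, collecting results with yield
  val viaFor = for (i <- 1 to 3; j <- 1 to 3 if i != j) yield 10 * i + j

  // Hand-written equivalent: the guard becomes a withFilter call
  val viaMethods =
    (1 to 3).flatMap(i => (1 to 3).withFilter(j => i != j).map(j => 10 * i + j))
}
```

Both give 12, 13, 21, 23, 31, 32, matching the output printed by the guard example from the book.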

Friday, 13 December 2013

Conceptualising map and flatMap

This time, we’ve got a specially selected guest post from Chris at CodingFrog.  He’s way further down the Scala-learning path than I am, but that just means this post contains maximal goodness and he’s very kindly offered to share it with us. It certainly furthered my understanding in many areas, and made me want to look more into monads.  Take it away sir…

In this guest post, I wanted to address a few thoughts about map and flatMap. A number of types in the standard Scala libraries (so-called "monads", though there's a little more to monads than this - but this is not a monad post) have these helpful methods. Both map and flatMap are higher-order functions, meaning they take functions as arguments and apply these to the type's contents.

The first exposure most developers have to the map operation is in the context of a collection type. The map operation on a collection applies a mapping function to all the contents of the collection. Given a collection, for example a list, and a function, the function is applied to each element in the collection and a new collection made up of the results is returned. The mapping function takes a single argument, of the same type (or a supertype) of the collection contents, and returns a result, potentially of another type:

scala> def myFun(x: Int) = x * x

myFun: (x: Int)Int

scala> List(1,2,3) map { myFun }

res0: List[Int] = List(1, 4, 9)

This example shows a named function being passed to map, but of course a lambda could be used as well:

scala> List(1,2,3) map { x => x * x }

res1: List[Int] = List(1, 4, 9)

We can also use the '_' placeholder, though I personally like to be slightly more explicit with my lambdas to maintain readability. Note that I am also using infix notation for map and flatMap in all these examples, but the dotted calling style is perfectly valid as well.
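For reference, the three styles mentioned above might look like this side by side (a small sketch; the object and value names are only for illustration):

```scala
object MapStyles {
  // Infix notation with an explicit lambda, as used in this post
  val infixLambda = List(1, 2, 3) map { x => x * x }

  // The same call in the dotted style
  val dotted = List(1, 2, 3).map(x => x * x)

  // The '_' placeholder, here doubling each element
  val placeholder = List(1, 2, 3).map(_ * 2)
}
```

One wrinkle worth knowing: squaring cannot be written as `_ * _`, because each underscore stands for a separate parameter, which is one more reason to prefer the explicit lambda there.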

Before going further, I'd like to comment on the naming of map and the relationship between the operation and the data type which shares this name. These two uses of “map” are slightly different, but there is a connection. First, the Map data type, as you likely know, is a type which contains key-value pairs. If you think about this slightly differently, this is a conceptual function which "maps" a relationship - given a key input, it produces a value output. The map operation applies a mapping function to values in a collection. You could even pass a Map data type to a map function to convert a collection of keys into a collection of values!:

scala> val myMap = Map(1 -> 1, 2->4, 3->9)

myMap: scala.collection.immutable.Map[Int,Int] = Map(1 -> 1, 2 -> 4, 3 -> 9)

scala> List(1,2,3) map { myMap }

res2: List[Int] = List(1, 4, 9)

Like map, the first exposure to flatMap is usually in a collection context. Suppose instead of a squaring function that returns a single value, you have a mapping function that returns a list from a given input:

scala> List(1,2,3) map { x => for {

     | y <- 1 to x toList

     | } yield y

     | }

res3: List[List[Int]] = List(List(1), List(1, 2), List(1, 2, 3))

In this case, instead of a simple list List[Int], we got a nested list List[List[Int]]! This may be what we want, but suppose we wanted to have a simple list. We can "flatten" this nested list, to remove a level of wrapping and give us a single list. flatMap does that for us in one step. So far so good:

scala> List(1,2,3) flatMap { x => for {

     | y <- 1 to x toList

     | } yield y

     | }

res4: List[Int] = List(1, 1, 2, 1, 2, 3)
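The "one step" claim can be checked directly. In this sketch, upTo is a hypothetical helper standing in for the inline for-expression used in the REPL session:

```scala
object FlattenDemo {
  // Hypothetical helper: the list 1..x for a given x
  def upTo(x: Int): List[Int] = (1 to x).toList

  val nested = List(1, 2, 3).map(upTo)       // List[List[Int]]
  val twoSteps = nested.flatten              // map, then flatten
  val oneStep = List(1, 2, 3).flatMap(upTo)  // flatMap does both
}
```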

So thinking about collections, it seems like we will be using the map operation much more frequently than we use the flatMap operation. After all, we are much more likely to use a function which gives us a single output for an input rather than one that creates nested collections. Again, we will have cases where we DO want that, but they are less common than the straightforward case.

But this isn't the case for other monad types. Let's look at the type Try. The Try type is a type which represents an operation that can fail. Try has two cases - Success(t) which contains a successful result t; and Failure(ex) which contains an exception. In other words, instead of throwing the exception, we've grabbed it and stuffed it in a box. You don't know until you look in the box whether it contains the Success or the Failure (and you haven't jumped up the call stack to a catch clause). You can find out what is in the box with pattern matching:

scala> def examine(myTry: Try[Int]): Int = {

     | myTry match {

     | case Success(t) => t

     | case Failure(e) => -1

     | }

     | }

examine: (myTry: scala.util.Try[Int])Int

This is where I started to struggle to adapt my intuition of List map and flatMap, to the Try ideas of map and flatMap. So what would these mean? map takes a function which takes a value (of type T) and converts it to another value (of type U, which may or may not be the same as type T). Now map on Try has a specific behaviour guarantee: for Success values, it will call the function on the value in the Success, and wrap the result back up in a Try:

case Success(t) => Try(f(t))

For a Failure, it just returns this - in other words, it skips calling the function, and just returns the original failure:

case Failure(e) => this
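A short sketch of those two behaviours in action. The value names and the sample exception are assumptions for illustration; the last line shows a consequence of wrapping the call in Try, namely that an exception thrown by the mapping function itself ends up as a Failure:

```scala
import scala.util.{Try, Success, Failure}

object TryMapDemo {
  // Success: the function is applied and the result re-wrapped
  val doubled: Try[Int] = Success(3).map(_ * 2)

  // Failure: the function is skipped and the failure passed along
  val boom = new IllegalStateException("boom")
  val failed: Try[Int] = Failure(boom)
  val stillFailed: Try[Int] = failed.map(_ * 2)

  // The mapping function throws, so the result is a Failure
  val caught: Try[Int] = Success(1).map(_ / 0)
}
```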

But what if our mapping function wants to be able to represent failures? It's not hard to imagine wanting to have a sequence of operations, each of which could throw some exception. The chain will return a value if all operations in the sequence are successful, or else return the first failure. We might try to do this by using map. In this case, calling map on the first Try will call the mapping function, which itself returns a Try value, and wrap that in a Try value:

scala> import scala.util.{Try, Success, Failure}

import scala.util.{Try, Success, Failure}


scala> def dangerOp1(i: Int): Try[Int] = {

     | if (i != 0) Success(i)

     | else Failure(new IllegalArgumentException)

     | }

dangerOp1: (i: Int)scala.util.Try[Int]


scala> val myTry = Success(5)

myTry: scala.util.Success[Int] = Success(5)


scala> myTry map { dangerOp1 }

res5: scala.util.Try[scala.util.Try[Int]] = Success(Success(5))


scala> val myTry2 = Failure(new NoSuchElementException)

myTry2: scala.util.Failure[Nothing] = Failure(java.util.NoSuchElementException)


scala> myTry2 map { dangerOp1 }

res6: scala.util.Try[scala.util.Try[Int]] = Failure(java.util.NoSuchElementException)

So we've gone from Try[T] to Try[Try[T]], just because we wanted to be able to have our mapping function return some exception cases. This could easily get out of hand if we want to keep mapping more and more functions which could return exceptional cases with Try! What can we do?

If we look back at our List example, we see that this is really the same case as when our mapping function itself returned a List. There we went from List[T] to List[List[T]], while here we're going from Try[T] to Try[Try[T]]. Our solution here is the same as it was there: we need to flatten a layer of nesting. We could take our first result, map our function, and then flatten the result, or we could do the map and the flatten in one step:

scala> myTry flatMap { dangerOp1 }

res7: scala.util.Try[Int] = Success(5)


scala> myTry2 flatMap { dangerOp1 }

res8: scala.util.Try[Int] = Failure(java.util.NoSuchElementException)

Learning map on List gave us this good (or at least, better) intuition for what mapping a higher order function onto that List means. But this didn't give us quite a rich enough intuition of what flatMap means. Applying functions that create lists to members of a list is just not a common enough pattern to really feel it in our bones. Other types, though, will give us the nested construction as our default pattern. It's quite easy, maybe even expected, to run a sequence of steps that each produce exceptions wrapped in Try. A sequence of operations where each one may return an error value wrapped in the Option monad seems pretty likely. When we step up to doing asynchronous coding, we can easily envision combining a sequence of operations that will complete in the future, hooking each up to be run in sequence whenever the previous step completes, by using the Future monad. In all these cases, flatMap is a much more natural and basic operation than map, which would keep wrapping each step in another level of nesting. By studying map and flatMap in terms of these types, we can get a better intuitive feel for how these operations combine values of these types, rather than falling back to our List intuition of these operations.
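The same nesting shows up with Option. In this sketch everything (the users map, emailOf, domainOf) is made up for illustration; each step may or may not produce a value:

```scala
object OptionChain {
  // Hypothetical lookup table
  val users = Map("fred" -> "fred@example.com")

  def emailOf(user: String): Option[String] = users.get(user)

  def domainOf(email: String): Option[String] = {
    val at = email.indexOf('@')
    if (at >= 0) Some(email.substring(at + 1)) else None
  }

  // map keeps wrapping: Option[Option[String]]
  val nested: Option[Option[String]] = emailOf("fred").map(domainOf)

  // flatMap keeps a single layer: Option[String]
  val flat: Option[String] = emailOf("fred").flatMap(domainOf)

  // A miss at the first step short-circuits the rest
  val missing: Option[String] = emailOf("wilma").flatMap(domainOf)
}
```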

All of this has nice implications for for-comprehensions, which rely heavily on flatMap internally. But let's leave that for further study, and maybe another post.
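As a small taster of that connection, here is a sketch with two hypothetical fallible steps (parse and reciprocal are my own names), chained once with flatMap and once with a for-comprehension. As far as I understand it, the second form is expanded into something very like the first:

```scala
import scala.util.{Try, Success, Failure}

object TryForDemo {
  // Hypothetical step 1: may fail with a NumberFormatException
  def parse(s: String): Try[Int] = Try(s.trim.toInt)

  // Hypothetical step 2: fails on zero
  def reciprocal(i: Int): Try[Double] =
    if (i != 0) Success(1.0 / i)
    else Failure(new IllegalArgumentException("zero"))

  def viaFlatMap(s: String): Try[Double] =
    parse(s).flatMap(reciprocal)

  def viaFor(s: String): Try[Double] =
    for {
      n <- parse(s)
      r <- reciprocal(n)
    } yield r
}
```

Either way, the first failing step decides the result and later steps are never run.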

Even More Legibility Wins (and no Losses) (Part 3 in an Occasional Series): Special “Option” Edition

I posted twice before about some of the things in Scala which I think help and hinder legibility – all from my personal perspective of course.  Well, here’s the third instalment.  It’s dedicated to the Option.  #winning:

Wins

  • Option types
    • just the concept in general – I think it’ll be a long time before I fully grok how great these really are
    • not to mention they’re biased (c.f. Atomic Scala, Atom: “Handling Non-Values with Option”, pp 352)
    • but also their being handled seamlessly in for comprehensions
    • as well as their having foreach and map functions on them
    • and the Option names: Some and None
  • null (even from Java code) being wrapped as a None
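A few of these wins in code form, as a rough sketch (the values are arbitrary examples of mine):

```scala
object OptionWins {
  // null wrapped up as a None
  val fromNull: Option[String] = Option(null: String)

  // map only touches the value if there is one
  val shout: Option[String] = Option("Fred").map(_.toUpperCase)

  // for comprehensions handle Options seamlessly...
  val combined: Option[Int] = for { a <- Some(2); b <- Some(3) } yield a + b

  // ...and a single None anywhere makes the whole result None
  val short: Option[Int] =
    for { a <- Some(2); b <- (None: Option[Int]) } yield a + b
}
```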

Losses

  • None – not that I can think of anyway…

Undecided

  • The name “Option” – Bruce and Dianne aren’t convinced. I don’t hate it. (c.f. Atomic Scala, Atom: “Handling Non-Values with Option”, pp 355)

Post-script

Excitingly, the same idiomatic concept makes it into the Scala equivalent of Java’s try/catch: Try(…) which produces either a Success or a Failure.  Far more on that to come in the next post (and much more), this time from a special guest: Colorado-resident, CodingFrog.

Wednesday, 20 November 2013

(Many) Legibility Wins and (a few) Losses (Part 2 in an Occasional Series)

I posted before about some of the things in Scala which I think help and hinder legibility – all from my personal perspective of course.  Well, here’s the next instalment:

Wins

  • yield
  • “Any” type, and its immediate sub-types
  • Pattern matching with types
  • Pattern matching with tuples
  • rockets (<- and =>) – so far at least…
  • (new Duck).myMethod – to keep the scope of the Duck instance as tight as possible
  • traits – I’ve always liked mixins anyway, but nothing is broken from what I liked in Ruby
  • the terse syntax (sans curly braces etc.) for defining classes when you don’t need it – e.g. “class MyClass”
  • val c = new MyObject with MyTrait – when you don’t need to reuse the class definition elsewhere
  • sealed – does what it says on the tin
  • zip – ditto
  • take – ditto
  • + – adding a new key-value pair to an existing map. Gives you a new map back
  • catches and pattern matching
  • try blocks being expressions
  • exceptions being only for exceptional circumstances – a nice idiom
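A handful of the wins above, sketched in one place. All names and values here are illustrative assumptions rather than anything from Atomic Scala:

```scala
object LegibilityWins {
  // zip and take: they do what they say on the tin
  val zipped = List(1, 2, 3).zip(List("a", "b", "c"))
  val firstTwo = List(1, 2, 3).take(2)

  // + on a map gives you a new map back; the original is untouched
  val ages = Map("Fred" -> 42)
  val more = ages + ("Wilma" -> 40)

  // Pattern matching with types and tuples
  def describe(x: Any): String = x match {
    case _: Int    => "an Int"
    case _: String => "a String"
    case (_, _)    => "a pair"
    case _         => "something else"
  }

  // try blocks are expressions, and catches use pattern matching
  val parsed: Int =
    try "oops".toInt
    catch { case _: NumberFormatException => -1 }
}
```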

Losses

  • explicit setter definition (_=) - :$
  • set unions with ‘|’
  • set differences with ‘&~’ or ‘--’
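For completeness, this is the sort of thing I mean by the losses. Temperature is a hypothetical class of mine to show the explicit setter syntax:

```scala
object LegibilityLosses {
  val union = Set(1, 2) | Set(2, 3)    // set union
  val diff1 = Set(1, 2, 3) &~ Set(2)   // set difference
  val diff2 = Set(1, 2, 3) -- Set(2)   // set difference, again
}

// Explicit setter definition: the method name is "degrees_="
class Temperature {
  private var celsius = 0.0
  def degrees: Double = celsius
  def degrees_=(d: Double): Unit = { celsius = d }
}
```

With both degrees and degrees_= defined, callers can write t.degrees = 21.5 as if degrees were a plain var, which is nice at the call site even if the definition raises an eyebrow.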

Undecided

  • no parens for methods which have no side effects – sometimes, when these are abstract, they make you double-take a little

Overall, things are getting less and less surprising and more and more #winning.  Perhaps my resistance is being worn down. Perhaps I’m just more open to the experience.  Or perhaps all this is taking hold. One thing I do know however is that some of the method names on the collections classes are just plain wrong, but I don’t think I’m alone in thinking that.