Friday 13 December 2013

Conceptualising map and flatMap

This time, we’ve got a specially selected guest post from Chris at CodingFrog.  He’s way further down the Scala-learning path than I am, but that just means this post contains maximal goodness, and he’s very kindly offered to share it with us. It certainly furthered my understanding in many areas, and made me want to look more into monads.  Take it away sir…

In this guest post, I wanted to address a few thoughts about map and flatMap. A number of types in the standard Scala libraries (so-called "monads", though there's a little more to monads than this - but this is not a monad post) have these helpful methods. Both map and flatMap are higher-order functions, meaning they take functions as arguments and apply these to the type's contents.

The first exposure most developers have to the map operation is in the context of a collection type. The map operation on a collection applies a mapping function to all the contents of the collection. Given a collection, for example a list, and a function, the function is applied to each element in the collection and a new collection made up of the results is returned. The mapping function takes a single argument, of the same type as (or a supertype of) the collection contents, and returns a result, potentially of another type:

scala> def myFun(x: Int) = x * x

myFun: (x: Int)Int

scala> List(1,2,3) map { myFun }

res0: List[Int] = List(1, 4, 9)

This example shows a named function being passed to map, but of course a lambda could be used as well:

scala> List(1,2,3) map { x => x * x }

res1: List[Int] = List(1, 4, 9)

We can also use the '_' placeholder, though I personally like to be slightly more explicit with my lambdas to maintain readability. Note that I am also using infix notation for map and flatMap in all these examples, but the dotted calling style is perfectly valid as well.
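As a small sketch of the alternatives just mentioned (the value names here are my own, purely for illustration), the same kind of mapping can be written with an explicit lambda, with the dotted calling style, or with the '_' placeholder:

```scala
// The same list, mapped three ways.
val numbers = List(1, 2, 3)

val explicitLambda = numbers map { x => x * x } // infix call, named parameter
val dottedStyle    = numbers.map(x => x * x)    // dotted call, same lambda

// Each '_' stands for a *separate* parameter, so `_ * _` would be a
// two-argument function rather than x * x. Doubling needs only one '_':
val doubled = numbers.map(_ * 2)
```

The placeholder reads well when the argument is used exactly once; for anything more involved the named form is arguably clearer.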

Before going further, I'd like to comment on the naming of map and the relationship between the operation and the data type which shares this name. These two uses of “map” are slightly different, but there is a connection. First, the Map data type, as you likely know, is a type which contains key-value pairs. If you think about this slightly differently, this is a conceptual function which "maps" a relationship - given a key input, it produces a value output. The map operation applies a mapping function to values in a collection. You could even pass a Map data type to a map function to convert a collection of keys into a collection of values!:

scala> val myMap = Map(1 -> 1, 2->4, 3->9)

myMap: scala.collection.immutable.Map[Int,Int] = Map(1 -> 1, 2 -> 4, 3 -> 9)

scala> List(1,2,3) map { myMap }

res2: List[Int] = List(1, 4, 9)

Like map, the first exposure to flatMap is usually in a collection context. Suppose instead of a squaring function that returns a single value, you have a mapping function that returns a list from a given input:

scala> List(1,2,3) map { x => for {

     | y <- (1 to x).toList

     | } yield y

     | }

res3: List[List[Int]] = List(List(1), List(1, 2), List(1, 2, 3))

In this case, instead of a simple list List[Int], we got a nested list List[List[Int]]! This may be what we want, but suppose we wanted to have a simple list. We can "flatten" this nested list, to remove a level of wrapping and give us a single list. flatMap does that for us in one step. So far so good:

scala> List(1,2,3) flatMap { x => for {

     | y <- (1 to x).toList

     | } yield y

     | }

res4: List[Int] = List(1, 1, 2, 1, 2, 3)
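To make the "one step" claim concrete, here is a minimal sketch comparing map followed by flatten with a single flatMap. The helper name countUpTo is an assumption of mine, standing in for the for-expression used above:

```scala
// A mapping function that returns a list for each input.
def countUpTo(x: Int): List[Int] = (1 to x).toList

val nested  = List(1, 2, 3).map(countUpTo)     // List[List[Int]]
val twoStep = nested.flatten                    // remove one level of nesting
val oneStep = List(1, 2, 3).flatMap(countUpTo)  // map and flatten together
```

Both routes should give the same flat list; flatMap simply saves the intermediate nested value.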

So thinking about collections, it seems like we will be using the map operation much more frequently than we use the flatMap operation. After all, we are much more likely to use a function which gives us a single output for an input rather than one that creates nested collections. Again, we will have cases where we DO want that, but they are less common than the straightforward case.

But this isn't the case for other monad types. Let's look at the type Try. The Try type is a type which represents an operation that can fail. Try has two values - Success(t) which contains a successful result t; and Failure(ex) which contains an exception. In other words, instead of throwing the exception, we've grabbed it and stuffed it in a box. You don't know until you look in the box whether it contains the Success or the Failure (and you haven't jumped up the call stack to a catch clause). You can find out what is in the box with pattern matching:

scala> def examine(myTry: Try[Int]): Int = {

     | myTry match {

     | case Success(t) => t

     | case Failure(e) => -1

     | }

     | }

examine: (myTry: scala.util.Try[Int])Int

This is where I started to struggle to adapt my intuition of List map and flatMap to the Try ideas of map and flatMap. So what would these mean? map takes a function which takes a value (of type T) and converts it to another value (of type U, which may or may not be the same as type T). Now map on Try has a specific behaviour guarantee: for Success values, it will call the function on the value in the Success, and wrap the result back up in a Try:

case Success(t) => Try(f(t))

For a Failure, it just returns this - in other words, it skips calling the function, and just returns the original failure:

case Failure(e) => this
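Putting those two cases together, the behaviour can be sketched as a standalone function. To be clear, mapSketch is a hypothetical helper of my own and not the library source; it only mirrors the two cases described above:

```scala
import scala.util.{Try, Success, Failure}

// Hypothetical helper (not the library implementation): behaves like Try's map.
def mapSketch[T, U](tryValue: Try[T])(f: T => U): Try[U] = tryValue match {
  case Success(t) => Try(f(t))   // run f; Try(...) captures anything f throws
  case Failure(e) => Failure(e)  // skip f and pass the original failure along
}

val squared = mapSketch(Success(3): Try[Int])(x => x * x)

val boom: Try[Int] = Failure(new IllegalStateException("boom"))
val skipped = mapSketch(boom)(x => x * x)

// If the mapping function itself throws, the exception ends up in a Failure.
val threw = mapSketch(Success(0): Try[Int])(x => 1 / x)
```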

But what if our mapping function wants to be able to represent failures? It's not hard to imagine wanting to have a sequence of operations, each of which could throw some exception. The chain will return a value if all operations in the sequence are successful, or else return the first failure. We can try to do this by using map. In this case, calling map on the first Try will call the mapping function, which itself returns a Try value, and wrap that in a Try value:

scala> import scala.util.{Try, Success, Failure}

import scala.util.{Try, Success, Failure}


scala> def dangerOp1(i: Int): Try[Int] = {

     | if (i != 0) Success(i)

     | else Failure(new IllegalArgumentException)

     | }

dangerOp1: (i: Int)scala.util.Try[Int]


scala> val myTry = Success(5)

myTry: scala.util.Success[Int] = Success(5)


scala> myTry map { dangerOp1 }

res5: scala.util.Try[scala.util.Try[Int]] = Success(Success(5))


scala> val myTry2 = Failure(new NoSuchElementException)

myTry2: scala.util.Failure[Nothing] = Failure(java.util.NoSuchElementException)


scala> myTry2 map { dangerOp1 }

res6: scala.util.Try[scala.util.Try[Int]] = Failure(java.util.NoSuchElementException)

So we've gone from Try[T] to Try[Try[T]], just because we wanted to be able to have our mapping function return some exception cases. This could easily get out of hand if we want to keep mapping more and more functions which could return exceptional cases with Try! What can we do?

If we look back at our List example, we see that this is really the same case as when our mapping function itself returned a List. There we went from List[T] to List[List[T]], while here we're going from Try[T] to Try[Try[T]]. Our solution here is the same as it was there: we need to flatten a layer of nesting. We could take our first result, map our function, and then flatten the result, or we could do the map and the flatten in one step:

scala> myTry flatMap { dangerOp1 }

res7: scala.util.Try[Int] = Success(5)


scala> myTry2 flatMap { dangerOp1 }

res8: scala.util.Try[Int] = Failure(java.util.NoSuchElementException)
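The "map and then flatten" route mentioned above can be spelled out as well. This is a small sketch reusing the dangerOp1 definition from earlier; the other value names are mine:

```scala
import scala.util.{Try, Success, Failure}

def dangerOp1(i: Int): Try[Int] =
  if (i != 0) Success(i) else Failure(new IllegalArgumentException)

val start: Try[Int] = Success(5)

val nested  = start.map(dangerOp1)      // Try[Try[Int]]
val twoStep = nested.flatten            // Try[Int], one layer removed
val oneStep = start.flatMap(dangerOp1)  // Try[Int], in a single step

// A successful start can still end in failure if the mapped operation fails.
val zero = (Success(0): Try[Int]).flatMap(dangerOp1)
```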

Learning map on List gave us this good (or at least, better) intuition for what mapping a higher order function onto that List means. But this didn't give us quite a rich enough intuition of what flatMap means. Applying functions that create lists to members of a list is just not a common enough pattern to really feel it in our bones. Other types, though, will give us the nested construction as our default pattern. It's quite easy, maybe even expected, to run a sequence of steps that each produce exceptions wrapped in Try. A sequence of operations where each one may return an error value wrapped in the Option monad seems pretty likely. When we step up to doing asynchronous coding, we can easily envision combining a sequence of operations that will complete in the future, hooking each up to be run in sequence whenever the previous step completes, by using the Future monad. In all these cases, flatMap is a much more natural and basic operation than map, which would keep wrapping each step in another level of nesting. By studying map and flatMap in terms of these types, we can get a better intuitive feel for how these operations combine values of these types, rather than falling back to our List intuition of these operations.
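The Option case looks much the same. The following is only a sketch: the lookup tables and function names are invented for illustration, but the shape (each step may come back empty) is the pattern being described:

```scala
// Hypothetical lookup tables, purely for illustration.
val userToAccount    = Map("alice" -> 1001, "bob" -> 1002)
val accountToBalance = Map(1001 -> 250.0)

def accountFor(user: String): Option[Int]    = userToAccount.get(user)
def balanceFor(account: Int): Option[Double] = accountToBalance.get(account)

// map nests the second Option inside the first...
val nested = accountFor("alice").map(balanceFor)         // Option[Option[Double]]

// ...while flatMap keeps a single layer at every step.
val aliceBalance = accountFor("alice").flatMap(balanceFor) // both lookups succeed
val bobBalance   = accountFor("bob").flatMap(balanceFor)   // account, but no balance
val carolBalance = accountFor("carol").flatMap(balanceFor) // no account at all
```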

All of this has nice implications for for-comprehensions, which rely heavily on flatMap internally. But let's leave that for further study, and maybe another post.
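As a taster, here is a rough sketch of that connection. The functions parse and reciprocal are made up for the example; the point is only that the for-comprehension and the hand-written flatMap/map chain should behave the same way:

```scala
import scala.util.{Try, Success, Failure}

// Two illustrative operations that can each fail.
def parse(s: String): Try[Int] = Try(s.toInt)

def reciprocal(i: Int): Try[Double] =
  if (i != 0) Success(1.0 / i) else Failure(new ArithmeticException("zero"))

// Written as a for-comprehension...
def viaFor(s: String): Try[Double] =
  for {
    n <- parse(s)
    r <- reciprocal(n)
  } yield r * 100

// ...and as the chain it corresponds to: flatMap for every generator
// except the last, which becomes a map carrying the yield expression.
def viaFlatMap(s: String): Try[Double] =
  parse(s).flatMap(n => reciprocal(n).map(r => r * 100))
```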

Even More Legibility Wins (and no Losses) (Part 3 in an Occasional Series): Special “Option” Edition

I posted twice before about some of the things in Scala which I think help and hinder legibility – all from my personal perspective of course.  Well, here’s the third instalment.  It’s dedicated to the Option.  #winning:

Wins

  • Option types
    • just the concept in general – I think it’ll be a long time before I fully grok how great these really are
    • not to mention they’re biased (c.f. Atomic Scala, Atom: “Handling Non-Values with Option”, pp 352)
    • but also their being handled seamlessly in for comprehensions
    • as well as their having foreach and map functions on them
    • and the Option names: Some and None
  • null (even from Java code) being wrapped as a None
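A few of these wins can be sketched in code. The value names below are mine and the example is deliberately tiny; it is an illustration of the points in the list, not a tour of the whole Option API:

```scala
// Option(...) turns a possibly-null reference into Some or None.
val javaStyleNull: String = null
val wrappedNull  = Option(javaStyleNull)  // None
val wrappedValue = Option("frog")         // Some("frog")

// map transforms the contents if present and does nothing on None.
val shouted   = wrappedValue.map(_.toUpperCase)
val stillNone = wrappedNull.map(_.toUpperCase)

// Options compose in a for comprehension; any None makes the result None.
val combined = for {
  a <- Option(2)
  b <- Option(3)
} yield a * b

// foreach runs its side effect only when a value is present.
var seen = List.empty[String]
wrappedValue.foreach(v => seen = v :: seen)
wrappedNull.foreach(v => seen = v :: seen)
```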

Losses

  • None – not that I can think of anyway…

Undecided

  • The name “Option” – Bruce and Dianne aren’t convinced. I don’t hate it. (c.f. Atomic Scala, Atom: “Handling Non-Values with Option”, pp 355)

Post-script

Excitingly, the same idiomatic concept makes it into the Scala equivalent of Java’s try/catch: Try(…) which produces either a Success or a Failure.  Far more on that to come in the next post (and much more), this time from a special guest: Colorado-resident, CodingFrog.
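As a quick preview, and only a sketch with names of my own choosing, wrapping a risky expression in Try(…) looks like this:

```scala
import scala.util.{Try, Success, Failure}

val good = Try("42".toInt)         // the expression succeeds
val bad  = Try("forty-two".toInt)  // the exception is captured, not thrown

// Pattern matching tells the two cases apart.
def describe(t: Try[Int]): String = t match {
  case Success(n) => s"got $n"
  case Failure(e) => s"failed with ${e.getClass.getSimpleName}"
}
```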

Wednesday 20 November 2013

(Many) Legibility Wins and (a few) Losses (Part 2 in an Occasional Series)

I posted before about some of the things in Scala which I think help and hinder legibility – all from my personal perspective of course.  Well, here’s the next instalment:

Wins

  • yield
  • “Any” type, and its immediate sub-types
  • Pattern matching with types
  • Pattern matching with tuples
  • rockets (<- and =>) – so far at least…
  • (new Duck).myMethod – to keep the scope of the Duck instance as tight as possible
  • traits – I’ve always liked mixins anyway, but nothing is broken from what I liked in Ruby
  • the terse syntax (sans curly braces etc.) for defining classes when you don’t need it – e.g. “class MyClass”
  • val c = new MyObject with MyTrait – when you don’t need to reuse the class definition elsewhere
  • sealed – does what it says on the tin
  • zip – ditto
  • take – ditto
  • + – adding a new key-value pair to an existing map. Gives you a new map back
  • catches and pattern matching
  • try blocks being expressions
  • exceptions being only for exceptional circumstances – a nice idiom
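Several of the wins above fit into one short sketch. Everything named here (letters, ages, describe and so on) is invented for illustration:

```scala
val letters  = List("a", "b", "c")
val zipped   = letters.zip(List(1, 2, 3))  // pairs elements up by position
val firstTwo = letters.take(2)             // the first two elements

// + on an immutable Map returns a new map; the original is untouched.
val ages     = Map("Ann" -> 30)
val moreAges = ages + ("Bob" -> 25)

// Pattern matching with types and with tuples.
def describe(x: Any): String = x match {
  case i: Int    => s"an Int: $i"
  case s: String => s"a String: $s"
  case (a, b)    => s"a pair of $a and $b"
  case _         => "something else"
}

// try is an expression, and catch uses pattern matching.
val parsed = try { "12x".toInt } catch { case _: NumberFormatException => -1 }
```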

Losses

  • explicit setter definition (_=) - :$
  • set unions with ‘|’
  • set differences with ‘&~’ or ‘--’
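For reference, here is a small sketch of the set operators in question next to their named equivalents, which I find easier on the eye. The sets themselves are made up:

```scala
val evens = Set(2, 4, 6)
val small = Set(1, 2, 3)

val unionSymbolic = evens | small       // union
val unionNamed    = evens.union(small)

val diffSymbolic  = evens &~ small      // difference
val diffDashes    = evens -- small      // difference again
val diffNamed     = evens.diff(small)
```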

Undecided

  • no parens for methods which have no side effects – sometimes, when these are abstract, they make you double-take a little

Overall, things are getting less and less surprising and more and more #winning.  Perhaps my resistance is being worn down. Perhaps I’m just more open to the experience.  Or perhaps all this is taking hold. One thing I do know however is that some of the method names on the collections classes are just plain wrong, but I don’t think I’m alone in thinking that.

Wednesday 13 November 2013

Scala’s Type System – A Mid-Learning Checkpoint

Update (21st January, 2014): There is a subtle error in the section on “Enumerations (by extending Enumeration)”.  It’s fixed now.

WARNING: THIS POST CONTAINS WAY MORE QUESTIONS THAN ANSWERS

ALSO, APOLOGIES IF THE SCOPE OF THIS POST RIDES ROUGH-SHOD OVER THE CANONICAL DEFINITION OF A “TYPE-SYSTEM”.  IT SIMPLY CONTAINS EVERYTHING I BRING TO MIND WHEN I THINK ABOUT TYPES IN SCALA OR ANY LANGUAGE.  PLEASE FEEL FREE TO POINT OUT MY IDIOSYNCRACIES IN THE COMMENTS

You hear statements like “the best thing about Scala is its type system” and “Scala has a powerful type system” being flung around a lot.  For a long time, it’s been my aim to get a deep enough understanding of this aspect, as the people making these statements (in various forms) are people I respect a great deal.

I’ve made some brief forays into this territory before. Option Types is something I’ve heard Dick Wall talk about (and also something I think I understand despite not having covered it in my reading yet; thanks Dick) and the Fast Track to Scala course introduced me to the core class hierarchy (which seemed infinitely sensible – anything which includes “Any” as a type is clearly trying to strive for something I can relate to).  But these and other small aspects I’ve come across haven’t yet enabled me to honestly say I could defend the title of this post.

However, this isn’t to say I doubt I will get to this point eventually.  Just not yet.

To this end, I’ve just gone back over what I’ve learned so far about the type system.  It boils down to a set of facts. Let’s begin with the ones which shouldn’t be surprising at all to a Java developer, plus a few little bits (signposted with italics) which might raise an eyebrow or produce a smile:

  • Classes and Objects – instantiate a class to create an object instance. Classes have fields and operations/methods
  • Creating Classes – class bodies are executed when the classes are created
  • Methods inside classes – methods have special access to other class elements (other methods and fields)
  • Fields – Always objects. Can be vals (immutable) or vars (mutable), or functions
  • Class Arguments – like constructors, but a list placed after the class name. Add a val/var to each definition to encapsulate it. You can have varargs too (but remember to put the definition last in the argument list)
  • Named and Default Arguments – you can specify the names of, and defaults for, Class arguments. If all args have defaults you can call “new *” without using any parentheses
  • Overloading – methods can be overloaded, as long as the argument lists differ (i.e. the signatures)
  • Constructors – automatically generated for us if we do nothing.  The expressions they contain are treated as statements, that is, considered purely for their side effects. The result of the final expression is ignored, and the constructed object is returned instead. You can't use return half way through a set of constructor expressions either
  • Auxiliary Constructors – constructor overloading is possible, by defining a method called “this”. All must first call the Primary constructor (which is the constructor produced by the class argument list together with the class body) again using “this”. This means that ultimately, the primary constructor is always called first. Also you can’t use val or var defined arguments, as that would mean the field was only generated by that auxiliary constructor.  This guarantees all classes have the same structure
  • Case Classes – automatically creates all the fields for you as if you put the val keyword in front of each of them. (You can make the field a var if you like by pre-pending “var” to the definition.) You create them without having to use the “new” keyword. They also provide a nice default toString implementation. A case class cannot (from Scala 2.10 onwards) inherit from another case class
  • Parameterized Types – at initialisation time, tells the compiler what type of object the container holds
  • Type Inference – don’t bother specifying the type explicitly, you don’t need to
  • Inheritance – inherit from another class using the extends keyword.  A derived class can only extend one base class, but a base class can be extended by any number of derived classes
  • Base Class Initialization – Scala guarantees all constructors are called within a class hierarchy during initialisation. If a base class has constructor arguments, then any class that inherits from that base must provide those arguments during construction. Derived-class primary constructors can call any of the overloaded, auxiliary constructors in the base class by providing the necessary constructor arguments in the call. You can’t call base-class constructors inside of overloaded derived-class constructors. The primary constructor is the “gateway” for all the overloaded constructors
  • Overriding Methods – provide an implementation in a derived class of a method in the base class (distinguished by the method signature).  The “override” keyword must be provided so Scala knows you intended to override. This gives us polymorphism just as you’d expect from Java. If you want to invoke the base-class version of a method, use the keyword “super”
  • Abstract Classes – like an ordinary class, except that one or more methods or fields is incomplete (i.e. without a definition or initialisation). Signified by the keyword “abstract” before the “class” keyword.  Use of the “override” keyword is optional in the definition of abstract methods in concrete subclasses
  • Polymorphism - If we create a class extending another abstract class A along with traits B and C, we can choose to treat that class as if it were only an A or only a B or only a C
  • Composition – just put something inside (i.e. as a field). Typically this is done with one (abstract) class and many traits with definitions but not implementations, deferring this implementation until the concrete base classes are created (aka “delay concreteness”)
  • Type Parameters – like Java Generics, from the perspective of the user 
  • Type Parameter Constraints – impose constraints on type parameters (again c.f. Java Generics)
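A handful of the items above can be seen together in one sketch. The class names (Pet, Animal, Dog, Point) are my own inventions and the example is intentionally small:

```scala
// Class arguments with a default; `val` turns each one into a field.
class Pet(val name: String, val legs: Int = 4) {
  // Auxiliary constructor: it must begin by calling another constructor.
  def this() = this("Unnamed", 4)
  def describe: String = s"$name has $legs legs"
}

// An abstract class leaves `sound` without a definition.
abstract class Animal {
  def sound: String
  def speak: String = s"I say $sound"
}

class Dog extends Animal {
  def sound = "Woof"                              // `override` optional here
  override def speak: String = super.speak + "!"  // required for a concrete method
}

// Case class: fields, a factory (no `new`), toString and equality for free.
case class Point(x: Int, y: Int)
```

Named and default arguments then let a caller write new Pet(name = "Bird", legs = 2) or simply new Pet("Rex").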

So far, so (mostly) predictable. Admittedly, there’s some syntactic sugar in it all (another way of thinking about the bits in italics) but there’s nothing to stretch the Java mind too much.  But there’s a lot more. The bits which go much further off-piste from a Java perspective (that I’ve come across so far) are as follows:

  • Functions as Objects – pass them around like any other object, and define methods to take functions as arguments
  • Function Literals – anonymous functions, frequently only used once. Defined by the ‘=>’ symbol. You can even assign an anonymous function to a var or val
  • Pattern Matching with Types – as well as pattern matching against values, you can pattern match against types
  • “Any” Type – any type, including functions
  • Pattern Matching with Case Classes – case classes were originally designed for this purpose.  When working with case classes, a match expression can even extract the argument fields
  • Enumerations (by extending Enumeration) – a collection of names. Enumeration is typically extended into an object. Within this object we define the set of vals that the enumeration represents, and initialize each of them by calling the Enumeration.Value method.  Each call to Enumeration.Value returns a new instance of an inner class, also called Value. Furthermore, you can then alias the new enumeration object to the type Value (using the “type” keyword) to allow us to treat it as a type. For more information see Cay Horstmann’s “Scala for the Impatient”, pp. 65-66.
  • Enumerations (as a Subtype of a Trait) – we can also create something like an Enumeration by using a Tagging Trait.  Having taken this leap it doesn’t seem too great a leap to want to do other OO-type things with our Enumeration
  • Tuples – you can return more than one thing from a method, using a nice syntactic sugar to create and access and unpack them.  The same unpacking idiom is also accessible to case classes
  • Objects – using the “object” keyword creates something you can’t create instances of – it already is an instance.  That means that “this” still works
  • Companion Objects – associated by having the same name as a class. If you create a field in the companion object, it produces a single piece of data for that field no matter how many instances of the associated class you make. Under the covers, creating a case class automatically creates a companion object with a factory method called “apply”. When you “construct” a new case class without using the “new” keyword Scala is actually calling “apply” (passing in the parameters you provided) and returning a new instance of the class
  • Traits – for assembling small, logical concepts, letting you acquire capabilities piecemeal rather than inheriting as a clump.  Ideally they represent a single concept. Defined using the “trait” keyword.  Mixed-in to a class by “extends” / “with” keywords (the former if it is the first, and there is no inheritance, the latter for all others).  You can mixin as many traits as you like, into concrete or abstract classes. Traits cannot be instantiated on their own. Traits can have values and functions, and be wholly or partially abstract if required (though they don’t have to be). Concrete classes which mixin traits must provide implementations of the abstract elements.  Abstract classes which mixin traits need not implement the abstract elements, but the concrete subclasses of them must.  Traits can extend from other traits, as well as abstract and concrete classes. If, when you mixin more than one trait, you combine two methods with the same signature (the name plus the type) you can (must) resolve these collisions by hand using super[Type].methodName
  • Tagging Traits – a way of grouping classes or objects together. Sometimes an alternative to Enumerations (combined with case objects). Indicated by the keyword “sealed” which indicates no more subtypes than the ones you see in the current source file
  • Case Objects – like a case class, but an object
  • Type Hierarchy – this is pretty fundamentally different from the one you’ll expect if coming from Java
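Here is a sketch pulling a few of these together: an Enumeration, the tagging-trait alternative with case objects and case classes, pattern matching that extracts fields, and a companion object. All of the names (Level, Shape, Counter and friends) are hypothetical:

```scala
// Enumeration by extending Enumeration.
object Level extends Enumeration {
  type Level = Value             // alias so `Level` can be used as a type
  val Low, Medium, High = Value  // each call to Value creates a new instance
}

// The alternative: a sealed tagging trait with case objects and case classes.
sealed trait Shape
case object Dot extends Shape
case class Circle(radius: Double) extends Shape
case class Rect(width: Double, height: Double) extends Shape

// Pattern matching can extract the case-class fields.
def area(shape: Shape): Double = shape match {
  case Dot        => 0.0
  case Circle(r)  => math.Pi * r * r
  case Rect(w, h) => w * h
}

// Companion object: one shared piece of data plus an `apply` factory.
class Counter(val id: Int)
object Counter {
  private var created = 0
  def apply(): Counter = { created += 1; new Counter(created) }
  def total: Int = created
}
```

Because Shape is sealed, the compiler can check that the match in area covers every subtype, which is part of what makes the tagging-trait approach attractive.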

There’s already a lot in what I’ve listed here, and from having used these elements, I can testify to the solid, expressive-but-terse code it lets you write.  But from reading ahead a little, and having listened in on conversations between people far brighter than myself, I know this is just the basics.  I’ll post back on those more in-depth topics once I come across them.

Onward!

Tuesday 12 November 2013

Dianne Marsh Talks About “Demystifying Scala” with Scott Hanselman

Just a quick post this, but it’s been pointed out to me that Dianne Marsh (co-author of Atomic Scala and Director of Engineering at Netflix) recorded a recent episode of Scott Hanselman’s Hanselminutes.  In it she talks about demystifying Scala.  Definitely worth a listen.

Monday 30 September 2013

More on the Subtleties of Scala and the Uniform Access Principle

I posted before about Scala and the Uniform Access Principle.  At that point I felt pretty smug that I’d noticed something clever.  Well I didn’t realise everything, but luckily the Atomic Scala Atom: “Uniform Access and Setters” was there to bring me fully up to speed.

Once you’ve seen it, it makes perfect sense, but when I came into this chapter-ette, I’d been under the impression that you could swap defs for vals or vars as you pleased.  However, it soon became clear that there was more to think about, and this thinking comes down to contracts; contracts that the language makes with us, and which it (and we) can’t break.

For this discussion, the relevant Scala contracts are:

  • vals are immutable
  • vars aren’t
  • functions (def) can return different values and so Scala can’t guarantee the result will always be the same

This means that you can’t just implement fields and methods from an abstract base type in a subtype using any old variable or function. You must make sure you don’t break the contract that was made already in the parent.  Explicitly:

  • an abstract def can be implemented as a val or a var
  • an abstract var can be implemented as a def as long as you provide a setter as well as a getter

You’ll note that abstract vals can’t be implemented with defs.  That makes sense if we think about it.  A def could return various things – Scala can’t guarantee it’ll always be the same (especially if you consider overriding), whereas vals are immutable.  Broken contract? Denied.
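A minimal sketch of the allowed directions, using class and member names I have made up (Base, Impl, d1, d2, fixed):

```scala
abstract class Base {
  def d1: Int     // abstract def
  def d2: Int     // abstract def
  val fixed: Int  // abstract val
}

class Impl extends Base {
  val d1 = 1      // abstract def implemented as a val: allowed
  var d2 = 2      // abstract def implemented as a var: allowed
  val fixed = 3   // abstract val implemented as a val
  // def fixed = 3 // would be rejected: a def can't stand in for a val
}
```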

An Added Subtlety

But wait a minute, we missed a fourth contract.  That second bullet mentioned setters. The contracts in play here are actually four-fold:

  • vals are immutable
  • vars aren’t
  • functions (def) can return different values and so Scala can’t guarantee the result will always be the same
  • vars require getters and setters

But we can still roll with that if we add another little piece of Scala sugar and supply a setter method in the subtype:

def d3 = n
def d3_=(newVal:Int) = (n = newVal)

Here, the “def d3_=…” line adds the setter Scala needs to fulfil the contracts and we’re back in action.
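For completeness, here is the same pair of definitions in a self-contained sketch. The surrounding classes (Holder, HolderImpl) and the private backing field are assumptions of mine, added so the fragment can run on its own:

```scala
abstract class Holder {
  var d3: Int         // abstract var: promises a getter AND a setter
}

class HolderImpl extends Holder {
  private var n = 0   // hypothetical backing field

  def d3 = n          // getter
  def d3_=(newVal: Int): Unit = { n = newVal }  // setter, enables `obj.d3 = 42`
}
```

With both in place, assignment syntax such as holder.d3 = 42 is rewritten by the compiler into a call to d3_=.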

Does This Also Stand For Non-Abstract Overrides?

One final thing to consider is how uniform the Scala implementation of the principle really is. Pretty well universal as far as I can see, because going beyond the scope of the Atom, what happens when the superclass and its methods and fields aren’t abstract?  It turns out it’s exactly the same as above, as long as you remember your override keywords.  Predictable. Nice.
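A small sketch of the non-abstract case, limited to the def-to-val direction (the one I am sure about) and using invented names:

```scala
class Parent {
  def greeting: String = "hello"  // concrete def
  val limit: Int = 10             // concrete val
}

class Child extends Parent {
  override val greeting: String = "hi"  // concrete def overridden by a val
  override val limit: Int = 20          // val overridden by a val
}
```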

Wednesday 25 September 2013

Scala is an Expression-Based Language Not a Statement-Based One

I listened to a great podcast from the Java Posse Roundup 2013 on the train home last night: “Functional Programming”.
During it, Dick Wall briefly described how Scala is an “expression-based” language as opposed to a “statement-based” one (because everything in it is an expression) and that this was one of the main reasons why he liked it. 
In short:
  • if a language is expression-based, it means everything has a value (i.e. returns something),
  • but if it is statement-based, some things need to rely on side-effects to work (e.g. in Java if/else doesn’t have a value, it returns nothing)
Now, I’ve looked into the terms statement and expression before on this blog, but the full weight of the concept hadn’t then struck home.  Listening to the podcast was the final push I needed.  Consequently, I did a little more reading on the topic.  I was planning on writing a post here about what I found, but instead found Joel Spolsky had got there before me.
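To make the idea concrete, here is a small sketch (value names are mine) of constructs that would be statements in Java but each produce a value in Scala:

```scala
val n = 7

// if/else has a value...
val parity = if (n % 2 == 0) "even" else "odd"

// ...so does match...
val size = n match {
  case x if x < 5 => "small"
  case _          => "large"
}

// ...so does try/catch...
val parsed = try "oops".toInt catch { case _: NumberFormatException => 0 }

// ...and so does a block, whose value is that of its last expression.
val blockResult = { val a = 2; val b = 3; a * b }
```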

Monday 23 September 2013

Simple Tuples for the Tired, Plus a Lesson in the Uniform Access Principle

I’m back on the early train again.  I’m tackling Tuples again too. (Atomic Scala, “Tuples” atom.)

I’m after the “Brevity” and “A Question of Style” atoms too, so the more succinct syntax coupled with something my Java-brain isn’t tuned into (plus, I would argue, the early hour) meant I did a quadruple-take and then a little investigation before Tuple unpacking made sense to me.  The reason?  This idiom (for I’m guessing it’s idiomatic):

def f = (1, 3.14, "Mouse", false, "Altitude")

What do we know from this, despite the spare syntax?

  1. it’s a function (because of the “def”)
  2. it’s called “f”
  3. it takes no arguments, and does not mutate state (no parentheses just after the method name)
  4. it comprises a single expression (no curly braces)
  5. the return type is a tuple
  6. this return type is inferred by the compiler (no need to specify in the method signature) (thanks Semeru-san)

With that in mind, what does this function do?  Well, nothing. It only serves to make and return a 5-value tuple which can be captured in a val or var; the first element of which is an Int, the second a Double, the third and fifth are Strings and the fourth is a Boolean.  In short, it’s a very simple tuple-builder.

Next step in the atom is to prove to ourselves that a tuple is just a collection of vals.  Firstly, we allocate our tuple to a val:

val (n, d, a, b, h) = f

That’s a little scary to the uninitiated (or tired) too. But it’s simply saying that I want to call the previously defined tuple-maker function, f, and then store the elements of the resulting tuple in an explicitly defined tuple of vals. This is called unpacking, and it means we have a val for each individual element and can manipulate them individually (notice the order change in the second line; the “is” function comes from AtomicTest):

(n, d, a, b, h) is (1,3.14,"Mouse",false,"Altitude")
(a, b, n, d, h) is ("Mouse", false, 1, 3.14, "Altitude")

The final little chunk is some tuple indexing.  Again, this has an unfamiliar syntax, but makes sense once you roll it around a little:

f._1 is 1           // NOTE: NOT ZERO-INDEXED!
f._2 is 3.14
f._3 is "Mouse"
f._4 is false
f._5 is "Altitude"

This seemed a little haywire at first, but it makes perfect sense.  Again, breaking it down, we have f, the tuple-maker function from before, which returns a tuple of elements in a specific order when called.  So after the “f” we have a tuple.  We can then call methods on this (using the “.” in this example) and the methods we are calling here are the ones for indexing.
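If you don’t have AtomicTest to hand, the same steps can be sketched with plain vals. The name whole is my own addition, to show that the tuple can also be kept in one piece:

```scala
// The tuple-maker from above.
def f = (1, 3.14, "Mouse", false, "Altitude")

val (n, d, a, b, h) = f  // unpack into five separate vals
val whole = f            // or keep the tuple as a single value

// Indexing is one-based: _1 is the first element.
val first = whole._1
val third = f._3
```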

Lessons Learned

  1. Trying to learn a new type of language when still waking up is difficult
  2. There is a secret point that this atom is also making: In this simplest of simple functions, f, we are seeing the uniform access principle in action. That is to say, as functions are first-class, I should be able to do with a function what I would do with any other object, and it should be transparent.  It was this transparency that pulled me up short.  Now, having realised this, I’m a lot more comfortable with the principle, and another small piece of Scala syntax

Tuesday 17 September 2013

Slowly Constructing a Pseudo-Backus-Naur Form

“For Comprehensions” (or just “Comprehensions”) are famously (on this blog at least) where I “lost it” during TypeSafe’s Fast Track to Scala course right back at the start of this, and despite tackling it again on my own via Scala in Action it’s still not properly stuck.  At the time, the “losing it” pretty much felt like I had little familiar to grip onto.  Nothing looked as I’d expected it (the syntax for example) and the terminology wrapped around things that were tossed out to help me was alien too.  That first post (linked above) already makes clear I’m not coming at Scala from the traditional CompSci background, and in many respects, this unholy combination of factors is precisely why I’m diving into this – I know this is an area I am both weak in, and will benefit greatly from learning.  The second post (again linked above), while a lot more positive, still flounders a little in the big-ness of it all.  This post is a stripping back, and a bit of solid progress forward too.

The “Comprehensions” Atom

So now I’m tackling them again; the Atomic Scala aversion therapy approach has got me up to this point, and I’ve just had my first trip back into the water at the scene where I had my terrible experience last time.  But this time I’m prepared.  I’ve seen the basic Scala for construct on its own, stripped down and simple.  I’ve also been steadily introduced to much basic but needed terminology – “expression” for example, and “infix” – and built up a solid and dependable understanding of it.  Besides this, I’ve slowly jettisoned my must-be-C-like syntax dependence with some hardcore functions-as-objects fiddling, and parentheses and semi-colons have been ripped away from me left, right and centre, but I’m still comfortable.

But even with all this in the mental kit-bag, I’m going to take this slowly.  I’m fortunate that this is also the approach Bruce and Dianne feel is appropriate.  Typically a new concept is introduced not only with the hardcore details, but also with some lighter discussion of why, or how Scala folks mostly tend to use something.  They are also very up front about when they’ve avoided looking under such-and-such a rock for now; but they do point out the rocks, so you know there’ll be a return for more at a later date.

One such example in this Atom is the discussion around the possible flavours of for-comprehension using either parentheses or curly braces (Atomic Scala, pp.177).  A cursory glance at this topic, considering the (surface) similarity of the Scala for-comprehension with the Java for loop, might indicate that starting an introduction to such significant syntax with surrounding parens rather than curly braces might be more intuitive.  It’s not.  My biggest mistake was to bring anything of the Java for with me when I started on this Scala construct.  By going in first with a multi-line formatting and curly braces at the top and bottom, announcing (subtly) a block, the authors give me something to read, line-by-line.  To me, it’s a bonus that this happens to also be the way most Scala code is written.

Now that I have the metre of the construct we’re looking at, I can begin to look at the individual elements.  The first step is to re-visit the generator.  The statement beginning “the for loop you saw in For Loops was a comprehension with a single generator…” (my italics) implies to the careful reader that things will get hairier later on in the generator space - but not yet - we’re sticking with one.  I’ve already commented that I like this syntax. (And just for the record, I think the Java for syntax, and by extension the C/C++ syntax, is terrible.)  Let’s keep moving.

Next up is the addition of some filters.  These are expressed in the form of if expressions which I’ve also seen a lot previously. It’s beginning to look as if every element that is pulled out of the input by the generator (or should I say generated?) drops down through the various filters which follow, only proceeding if the expression returns true.

The last piece in this puzzle is some definitions.  While these aren’t picked out as explicitly as the other two constructs, it seems clear that these are the remaining statements which all seem to be allocating values to vals or vars, either ones which are scoped within the comprehension body or outwith it.  isOdd = (n % 2 != 0) is one such definition.  The authors note the lack of val / var declaration for this and I do too.  Scala is managing things for us here.

Putting It All Back Together

The final step is to pull it all back together to see what we have in totality – and I’m finding that this is easiest by my building a mini, mental pseudo-Backus-Naur Form. (See how nicely formatted my mind is? :D ):

for ([generator]) {
    [filter | definition] (*)
}

Please note, I find B.-N.F. as illegible as anything else, and the above chunk makes no attempt to be valid in any way. (It’s pseudo B.-N. F. remember). But it does provide me a way of representing what I’m building up on the page/screen.

When we do this it seems clear that we have an incredibly powerful tool for working our ways through collections and filtering each of the elements in turn to restrict things to only the ones we really need, using definitions as required to make our code clean and clear.
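
To make the three ingredients concrete, here is a minimal sketch of a comprehension with one generator, two filters and one definition. The object name `ComprehensionSketch`, the method `oddSquaresBelow` and the sample numbers are all assumptions made up for illustration; they are not taken from Atomic Scala.

```scala
object ComprehensionSketch {
  // Hypothetical helper: squares of the odd numbers in v that are below limit.
  def oddSquaresBelow(v: Vector[Int], limit: Int): Vector[Int] = {
    var result = Vector[Int]()
    for {
      n <- v                  // generator: pulls each element out of v in turn
      if n < limit            // filter: only elements below the limit carry on
      isOdd = (n % 2 != 0)    // definition: note there is no val or var keyword
      if isOdd                // filter that uses the definition
    } result = result :+ (n * n)
    result
  }
}
```

Calling `ComprehensionSketch.oddSquaresBelow(Vector(1, 2, 3, 4, 5, 6, 7), 6)` gives `Vector(1, 9, 25)`. Each element drops down through the lines in order and only reaches the body if every filter lets it through.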

But What About yield?

Helpfully for me, this just made sense.  Having come across the same keyword in Ruby (though it’s used differently, with closures) I already had a mental construct of what this keyword should do to a running system that fits this use-case just as nicely, even when we get to the multi-line-block variety.  Let’s add it to the mini-mental pseudo-B.-N. F.:

for ([generator]) {
    [filter | definition] (*)
} [expression using element from the left-hand-side of the generator | yield] {
    [expression using element from the left-hand-side of the generator] (*)
}
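
As a rough illustration of the multi-line-block variety of yield, the sketch below builds a new Vector from the elements that survive the filter. The names `YieldSketch` and `describeEvens` are hypothetical, chosen only for this example.

```scala
object YieldSketch {
  // Hypothetical helper: one description per even number in v.
  def describeEvens(v: Vector[Int]): Vector[String] =
    for {
      n <- v              // generator
      if n % 2 == 0       // filter: keep only the even numbers
    } yield {
      // multi-line block: the last expression is what gets yielded
      val half = n / 2
      s"$n is twice $half"
    }
}
```

`YieldSketch.describeEvens(Vector(1, 2, 3, 4))` gives `Vector("2 is twice 1", "4 is twice 2")`. No var is needed to collect results, because the comprehension itself is now an expression that produces the new collection.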

What Next?

I’ve been around this game long enough to know that it’s not going to stay as simple as I currently have it in my head.  But this feels nice and solid enough a piece of scaffolding to be getting on with.  I’m not going to speculate on where the additional complexity is going to come from.  I’m happy for now to roll this elegant piece of syntax around my brain pan for a while and see how it feels.

Monday 16 September 2013

It’s Nice That…

In Scala, Arrays are a lot more similar to other collections classes (like Vector). E.g.:

def sumIt(args:Int*) = {
    args.reduce((sum, n) => sum + n)
}

Looks pretty similar to:

def sumIt(args:Vector[Int]) = {
    args.reduce((sum, n) => sum + n)
}
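
A small sketch to show the similarity side by side. The object and method names are assumptions for illustration only; the point is that the same reduce call works unchanged on varargs, a Vector and an Array.

```scala
object SumSketch {
  // Varargs: inside the method, args can be treated as a sequence of Int.
  def sumArgs(args: Int*): Int = args.reduce((sum, n) => sum + n)

  // The same body works for a Vector...
  def sumVector(args: Vector[Int]): Int = args.reduce((sum, n) => sum + n)

  // ...and for an Array.
  def sumArray(args: Array[Int]): Int = args.reduce((sum, n) => sum + n)
}
```

All three return 6 when given 1, 2 and 3. Note that reduce throws an exception on an empty collection, so this sketch assumes at least one element is passed.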

Wednesday 11 September 2013

Legibility Wins and Losses (Part 1 in an Occasional Series)

I keep banging on about Scala’s syntax and the resulting legibility.  I’ve also mentioned more than once that I think the style propounded in Atomic Scala is the clearest I’ve read anywhere, and is likely to be a deciding factor in the book’s longevity.

What follows is a round-up of the wins, and losses, for legibility, that I’ve encountered so far.  This is all opinion, so please feel free to ignore everything in this post.

Wins

  • Scripts and classes
  • No semi-colons
  • vals and vars (as well as immutability as a general concept)
  • type inference
  • function definitions
  • The for loop (and how they play with Range literals)
  • Class Arguments
  • Named arguments
  • Default arguments
  • Case Classes (because of the terse syntax – no need for a body!, the removal of the need for “new” and the free “toString”)
  • Some of the methods in core classes (e.g. ‘to’ and ‘until’ when creating Ranges)
  • Overloading
  • String Interpolation
  • Optional Parameterized Types (square brackets just seem more apt for a DSL, but being optional means I can avoid using them unless it’s necessary)

Losses

  • :: (aka “prepend to a List”)
  • Some of the methods in the Collections classes (e.g. Vector.sorted)
  • Constructors (where did they go? they’ve just become a “run” of the body of a class. And yes, I know about the “Constructors” Atom in Atomic Scala)
  • Auxiliary Constructors
  • Function definitions within function definitions
  • """bleh""" (triple quotes; really?)

Undecided

  • Pattern matchers
  • The Scala Style Guide
  • Optional parentheses
  • Optional ‘.’ when calling methods on objects
  • Return type inferral

Could I have a style guide which just banned the “Losses” way of doing things?  Yes. We do that in Java all the time.  Might I be wrong about some of these?  Yes.  Noob’s prerogative.  Would an IDE help? Yes, syntax highlighting is a boon in my opinion.  Have they chipped away at my enthusiasm?  Not really.  The more I see, the easier it becomes to read.  It’s generally getting less and less outlandish looking with every passing day.

Pattern Matching, Take 2, Part 1

Things are about to get interesting.  I’ve made it to the point in Atomic Scala where Bruce (Eckel) and Dianne (Marsh) recommend you jump in if you're proficient in another language (not me, hopefully) or if you didn’t want to make damn certain you grokked the basics, before moving onto the core stuff (definitely me).  So what’s up? Pattern Matching that’s what’s up.

The first time I came across the Pattern Matching Syntax, like many things in Scala, I wasn’t a fan. I’m rapidly coming to the conclusion however that this initial revulsion is down to the way things are normally presented, rather than the syntax itself.  My primary requirement (in order for it to enter my head and slosh around up there accruing supporting information) is that it reads.  That’s one of Dianne and Bruce’s strengths; the first encounters with new syntactical elements always serve as a way to “read it to yourself in your head”. For example, you could narrate their introductory example on pp.130 as:

“take this color [sic] object and match it to these cases: ({)
    in the case when it’s “red” => then produce result “RED”
    in the case when it’s “blue” => then produce result “BLUE”
    in the case when it’s “green” => then produce result “GREEN”
    and in the case when it’s anything else (_) => then produce result “UNKNOWN COLOR: ” + color”

That feels right to me. I know it can get a helluva lot more complicated, but to have that basic understanding to hang my hat on helps a lot.  Helpfully Bruce and Dianne also signpost specifically where things are going to get hairy later on. But I don’t have to worry about that yet.  First I’ve got the exercises to cement things.
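
For reference, the narration above corresponds to code along these lines. This is a reconstruction from the narration rather than a copy of the book’s listing, and the names `ColorSketch` and `matchColor` are assumptions.

```scala
object ColorSketch {
  // "take this color and match it to these cases"
  def matchColor(color: String): String = color match {
    case "red"   => "RED"                       // in the case when it's "red"
    case "blue"  => "BLUE"                      // in the case when it's "blue"
    case "green" => "GREEN"                     // in the case when it's "green"
    case _       => "UNKNOWN COLOR: " + color   // anything else
  }
}
```

`ColorSketch.matchColor("red")` produces "RED", and `ColorSketch.matchColor("purple")` produces "UNKNOWN COLOR: purple".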

Monday 26 August 2013

The for Loop “Reads”. Not Sure About Some of the Vector Methods Though…

The claim that Scala doesn’t “force you to jump through hoops [with the syntax]” (Atomic Scala pp.109-110) holds true for the for loop, especially when you “read” it as follows:

// "for i gets each value of the Vector"
for (i <- Vector(1, 2, 3, 4)) {
    println("the value of i is: " + i)
}

I especially like that it parses in this instance, in the same way a lot of Ruby syntax does.  (Update, 5th September, 2013: As Summary 2, pp. 126-7 of Atomic Scala points out, the Scala “for” focusses on the loop, rather than the counting.  Precisely. And if you want to count? Loop through a range.  Beautifully explicit. You even get the Range’s “to” and “until” methods. Luverly)

But what about “Vector.sorted”?  I follow the argument (Atomic Scala pp.110) that Scala likes to “leave things in place” and instead create new copies of things. But for me, a good method name always is, or starts with a verb: getXxxx, sortXxxx, calculateXxxx.  That’s because methods act on the object they belong to, using the parameters they are passed.

But is this just another thing I’m trying to carry across from the well-worn Java/OO world?  My dictionary tells me (we focussed more on creativity at my school…) “sorted” is an adjective, and my dictionary also tells me that an adjective is

“a word naming an attribute of a noun, such as sweet”

(Oxford Dictionary of English)

Now if we see it in context:

val sortedVal = v4.sorted

And read it, knowing that Scala prefers to leave the original objects alone, and return a copy (and in this case a sorted copy) then perhaps I can warm to this.  I don’t think I’ll have too much of a problem, as long as things are consistent in this rule.  But doesn’t “Vector.sum” already break this rule? (Shouldn’t it be “summed”?)
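
A quick sketch of the “leave the original alone, return a copy” behaviour. The contents of v4 are an assumption here, since the post does not show them.

```scala
object SortedSketch {
  val v4 = Vector(3, 1, 2)      // assumed contents, for illustration only
  val sortedVal = v4.sorted     // a new, sorted Vector; v4 itself is untouched
  val total = v4.sum            // also leaves v4 alone, but returns an Int
}
```

After this, `SortedSketch.v4` is still `Vector(3, 1, 2)` while `SortedSketch.sortedVal` is `Vector(1, 2, 3)`. One possible reading of the naming is that sorted describes the collection you get back, whereas sum names the single value you get back, but that is a guess rather than anything the book states.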

Small Things That Make Me Happy #1

When you display a Vector:

val vector = Vector(1, 2, 3)

println(vector) // prints “Vector(1, 2, 3)”

It prints in the same form as the initialization expression.

That makes me happy.

(from Atomic Scala, pp.109)

That’s Annoying

Annoyingly, the Scala REPL doesn’t show all the possible completions when you press the [TAB] key.  Apparently the Scala documentation may contain other features.

I wonder why this is?  Something else to add to the “To Find Out” list…

Additionally, when running in the REPL, scripts will “sometimes have behaviour that is not proper for regular Scala programs.” (Atomic Scala, pp 79)

Atomic Scala has TDD Baked In

I’m still slowly working through Bruce Eckel’s and Dianne Marsh’s Atomic Scala.  I just got to the Testing “Atom” (Chapter) (pp. 94-100).  It’s the first atom (that I can recall anyway) which has merited a sub-atom, entitled “Testing as Part of Programming”, and it introduces TDD.

I’m a massive fan of TDD, and by this I mean Kent-Beck-Red-Green-Refactor-Test-Driven-Design-TDD, and these guys get it.  They keep it simple (it’s a beginner’s book after all) but it’s the first time I’ve seen TDD being brought in so early to teach a new language (hell, I’ve never even seen it for frameworks outside of Beck’s “Test-Driven Development by Example”), and it works excellently. 

Given the known, but little-discussed mental-offloading benefits of TDD, I’m surprised they’re such pathfinders.  A real shame really, but others’ loss is their gain…

Wednesday 14 August 2013

REPL :paste Power Move (via Atomic Scala)

This is a blatant cut-and-paste, but as it’s about cutting-and-pasting, I think it’s almost allowed. (from Atomic Scala by Bruce Eckel and Dianne Marsh, Summary I, pp. 65)

To put the Scala REPL into multi-line, delayed-interpretation, paste-mode, use the :paste command. To evaluate when you’re done, hit CTRL-D.

Good times.

Statements and Expressions (via Atomic Scala)

NOTE: I’ve posted on the vague area of this topic before

As previously discussed, not having a strict CompSci background seems to be a particular hindrance when trying to learn Scala, at least when you are following the standard texts. One example of this is around the terminology regarding “Statements” and “Expressions”. I know it’s important I know the difference, but what is the difference?  “Atomic Scala, Atom: Expressions” (pp. 54) to the rescue:

A statement changes state […] An expression expresses [a] result.

Now, not only is this clear, it’s terse, and memorable too. I’d expect nothing less from a book co-written by the author of Thinking in Java.  Good ole’ breezy simplicity.  A section of my mind now feels free to really grok the goodness of immutability and case classes.
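
A tiny sketch of the distinction, using made-up names (`ExpressionSketch`, `label`, `bump`) that do not come from the book:

```scala
object ExpressionSketch {
  // Expression: it expresses a result and changes nothing.
  // In Scala even if/else is an expression, so it can be the whole method body.
  def label(n: Int): String = if (n % 2 == 0) "even" else "odd"

  // Statement-style code: run for its effect on state, producing only Unit.
  var counter = 0
  def bump(): Unit = { counter += 1 }
}
```

`ExpressionSketch.label(4)` produces "even" and leaves everything as it was. `ExpressionSketch.bump()` produces nothing useful, but counter is different afterwards.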

“Atomic Scala” is Available (as a PDF too)

FULL DISCLOSURE: I was a proof-reader/tester on an early draft of Atomic Scala, and I know both Bruce and Dianne.

I’ve posted before about Atomic Scala, for which I hold out great hope.  It was always my aim to use it as the wedge with which I would force my way into Scala-greatness.  The problem was, I do all my reading on the train, and there was no way I was going to lug a technical tome back and forth to work on top of my laptop.

Until now (well, now-ish, I’m a little slow in getting to this post) there has been no electronic version.  Ideally I’m looking for a Kindle (.mobi) version, but anything electronic would do.  This meant that I’d been fighting my way through Nilanjan Raychaudhuri’s Scala in Action. Until now.

I’m pleased to say that I’ve heard from Bruce and Dianne that you can now get an Atomic Scala PDF (with Kindle .mobi to come later when it’s available at no extra cost) for $20.00 from Gumroad.  (It’s in Beta, but that’s for technical, not content reasons – the dead-tree version has been about for more than six months.)

I’ve got it, and am loving it.  I highly recommend you get a copy too.  If you want to dip your toe in first, the first 100 pages are available for free.

Thursday 27 June 2013

Pattern Matching Syntax – Case by Case

I’ve just read about Pattern Matching.  The concepts made perfect sense, but as is frequently the case with me, the syntax still seemed hard to grasp.  Why were there all these bits? Which bit went where?  Why?

def my_method(myArg:TheArgType) = myArg match {   // match myArg against the following cases
    case myArg_is_same_as_this => do_this()       // and return the result I’m guessing
    case _ => do_that()                           // (ditto)
}

Scala in Action, Nilanjan Raychaudhuri, Chapter 2 (with a few slight changes for additional clarity)

That seems to be the simplest form.  That makes perfect sense. Next we kick it up a notch to replace simple value-matching with type-matching:

def my_method(arg:AnyRef) = arg match {  // match the arg against the following cases
    case s:String => do_this()           // now we have the object:Type syntax
    case _ => do_that()                 
}

Scala in Action, Nilanjan Raychaudhuri, Chapter 2 (with a few slight changes for additional clarity)

Having stepped through it, this still makes a lot of sense – especially the reappearance in the pattern of the object:Type syntax from class and method declarations. Jumping ahead a little (but not too much) I can also see that I can use the matched item reference (e.g. “s”) on the right hand side of the rocket (“=>”):

def my_method(arg:AnyRef) = arg match {  // match the arg against the following cases
    case s:String => "this works as I know I’ve got a String " + s
    case _ => do_that()                 
}
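
Since do_this() and do_that() are placeholders, here is a self-contained version that can actually be run. The fallback string "not a String" and the names `TypeMatchSketch` and `describe` are assumptions added for illustration.

```scala
object TypeMatchSketch {
  def describe(arg: AnyRef): String = arg match {
    // s is bound to arg, already typed as a String, so we can use it on the right
    case s: String => "this works as I know I've got a String " + s
    // anything that is not a String lands here
    case _         => "not a String"
  }
}
```

`TypeMatchSketch.describe("hello")` takes the first case; `TypeMatchSketch.describe(List(1))` falls through to the second.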

Now I’m ready to kick it up the next notch and try on some “Infix Operator Patterns”.  First off, I’m forearmed – I know what an Infix Operator is.  Next up, my initial reaction is that this syntax looks a little impenetrable:

scala>  List(1, 2, 3, 4) match {
            case firstItem :: secondItem :: restOfTheItems => List(firstItem, secondItem)
            case _ => Nil
        }
res0: List[Int] = List(1, 2)

Scala in Action, Nilanjan Raychaudhuri, Chapter 2 (with a few slight changes for additional clarity)

I must admit, the first time I read this it didn’t parse at all.  Having now crept up on it slowly, I think I see why.  It’s not the sudden abundance of double colons; rather it’s the purpose of the match.  We’re using things here to do some extraction, and the case statement is in some ways just a wrapper to ensure that nothing untoward goes on when there isn’t a firstItem and secondItem present. 

There’s also the issue that this example has suddenly moved to the REPL, and we’re just defining our match as a one-off, operating on an explicit List literal which will always match the first pattern. Consequently the second pattern seems a little superfluous.  That confused me, as I thought I was missing something to begin with.
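
The second pattern earns its keep once the match is wrapped in a method and fed lists of different lengths. A minimal sketch, with the method name `firstTwo` being an assumption:

```scala
object ListMatchSketch {
  def firstTwo(items: List[Int]): List[Int] = items match {
    // matches any list with at least two elements; the rest may be empty
    case firstItem :: secondItem :: restOfTheItems => List(firstItem, secondItem)
    // fewer than two elements: nothing to extract
    case _ => Nil
  }
}
```

`ListMatchSketch.firstTwo(List(1, 2, 3, 4))` gives `List(1, 2)`, while `ListMatchSketch.firstTwo(List(1))` gives `Nil` because there is no secondItem to bind.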

Finally (for now) we’re throwing some additional guard clauses into the mix.  Again, I’m forearmed, having met guard clauses before.

def rangeMatcher(num:Int) = num match {
    case within10  if within10  <= 10               => println("within 0 to 10")
    case within100 if within100 <= 100              => println("within 11 to 100")
    case beyond100 if beyond100 < Integer.MAX_VALUE => println("beyond 100")
}

Looking back, guard clauses were just “if some_comparison_which_evaluates_to_true_or_false”.  In this case we’ve been able to use one where previously we had the type in the type matcher.  If I wrap the guard condition in (redundant) parentheses then it’s a little clearer where the boundaries are:

def rangeMatcher(num:Int) = num match {
    case within10  if (within10  <= 10)               => println("within 0 to 10")
    case within100 if (within100 <= 100)              => println("within 11 to 100")
    case beyond100 if (beyond100 < Integer.MAX_VALUE) => println("beyond 100")
}

Yet again, moving my way through it slowly has helped all the pieces fall into place.  Again Scala wins the prize for not-immediately-obvious-syntax, but then that is eclipsed by winning the prize for incredibly-powerful-in-a-usable-way.
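
A variation that returns the message instead of printing it makes the guards easy to check. This is a sketch under assumptions: the names `RangeSketch` and `rangeLabel` are made up, and the last guard has been swapped for a plain wildcard.

```scala
object RangeSketch {
  def rangeLabel(num: Int): String = num match {
    case n if n <= 10  => "within 0 to 10"      // guard: the first true one wins
    case n if n <= 100 => "within 11 to 100"    // only reached if n > 10
    case _             => "beyond 100"          // catch-all, no guard needed
  }
}
```

The wildcard is a deliberate change. With the guard `< Integer.MAX_VALUE`, the value Integer.MAX_VALUE itself matches none of the three cases and the match throws a scala.MatchError; the catch-all avoids that.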

Wednesday 12 June 2013

Terminology Confusion “Functions / Operations / …?” (Part Two of an Ongoing Series)

Again my lack of a formal (i.e. any) Computing Science education gets the better of me.  What is the difference between a Function, an Operation and a Method? (I think I half know some of them, but half-knowing has already got me into trouble.)

Let’s hit Wikipedia again:

Subroutine (aka Function)

From Wikipedia.

In computer programming, a subroutine is a sequence of program instructions that perform a specific task, packaged as a unit. This unit can then be used in programs wherever that particular task should be performed. Subprograms may be defined within programs, or separately in libraries that can be used by multiple programs.

In different programming languages a subroutine may be called a procedure, a function, a routine, a method, or a subprogram. The generic term callable unit is sometimes used.[1]

As the name subprogram suggests, a subroutine behaves in much the same way as a computer program that is used as one step in a larger program or another subprogram. A subroutine is often coded so that it can be started (called) several times and/or from several places during one execution of the program, including from other subroutines, and then branch back (return) to the next instruction after the call once the subroutine's task is done.

 

Operation (mathematics)

From Wikipedia.

In its simplest meaning in mathematics and logic, an operation is an action or procedure which produces a new value from one or more input values, called "operands". There are two common types of operations: unary and binary. Unary operations involve only one value, such as negation and trigonometric functions. Binary operations, on the other hand, take two values, and include addition, subtraction, multiplication, division, and exponentiation.

Operations can involve mathematical objects other than numbers. The logical values true and false can be combined using logic operations, such as and, or, and not. Vectors can be added and subtracted. Rotations can be combined using the function composition operation, performing the first rotation and then the second. Operations on sets include the binary operations union and intersection and the unary operation of complementation. Operations on functions include composition and convolution.

Method

From Wikipedia.

In object-oriented programming, a method is a subroutine (or procedure) associated with a class. Methods define the behavior to be exhibited by instances of the associated class at program run time. Methods have the special property that at runtime, they have access to data stored in an instance of the class (or class instance or class object or object) they are associated with and are thereby able to control the state of the instance.[1]

Conclusion

So where does that leave us?  Operations are disambiguated quite simply – they are a sub-set of operators.  Functions are a more general name for Methods, a term which is tied to Object Oriented programming. Excellent.
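
In Scala terms, the three words might look something like this. It is only a sketch: the class `Counter` and the other names are hypothetical, and Scala blurs the lines more than the Wikipedia definitions suggest.

```scala
object TermsSketch {
  class Counter(start: Int) {
    // Method: a subroutine associated with a class, with access to the instance's data.
    def plus(n: Int): Int = start + n
  }

  // Function: a value in its own right, not tied to any instance.
  val double: Int => Int = x => x * 2

  // Operation: addition of two operands. In Scala, + is itself a method on Int,
  // so 2 + 3 can also be written with a dot and parentheses.
  val sum = (2).+(3)
}
```

`new TermsSketch.Counter(10).plus(5)` gives 15, `TermsSketch.double(4)` gives 8, and `TermsSketch.sum` is 5.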

Tuesday 11 June 2013

Infix and Postfix (and Prefix)

Jargon Alert!  Two (OK, three) more terms from the mysterious worlds of Maths and Computing Science.  Let’s go to Wikipedia:

Infix Notation

From Wikipedia, the free encyclopedia

Infix notation is the common arithmetic and logical formula notation, in which operators are written infix-style between the operands they act on (e.g. 2 + 2). It is not as simple to parse by computers as prefix notation ( e.g. + 2 2 ) or postfix notation ( e.g. 2 2 + ), but many programming languages use it due to its familiarity.

In infix notation, unlike in prefix or postfix notations, parentheses surrounding groups of operands and operators are necessary to indicate the intended order in which operations are to be performed. In the absence of parentheses, certain precedence rules determine the order of operations.

Update (5th September, 2013): The statement above about parentheses doesn’t refer to the use of parentheses to group the arguments passed to a method.  It means that they should be used to indicate evaluation order. (c.f. Atomic Scala pp.54-57).  It is common for Scala to be described as having “infix notation” but by this people generally mean that (as in the case with AtomicTest’s “is” method) a method can be called without a preceding dot (‘.’) and following parentheses for arguments. (c.f. Atomic Scala pp.119 for an example of this).
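
To show what “infix notation” tends to mean in Scala conversations, here is a short sketch comparing the dotted and infix calling styles. The names are made up for illustration; `map` and `to` are ordinary standard library methods.

```scala
object InfixSketch {
  // Dotted style: dot before the method name, parentheses around the argument.
  val dotted = List(1, 2, 3).map(x => x * x)

  // Infix style: the same method call, with the dot dropped.
  val infix = List(1, 2, 3) map (x => x * x)

  // Range's "to" is usually written infix; 1.to(3) is the dotted equivalent.
  val range = 1 to 3
}
```

Both `dotted` and `infix` are `List(1, 4, 9)`. The parentheses around the lambda here group the argument; they are not the evaluation-order parentheses the Wikipedia text is talking about.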

 

Reverse Polish Notation (aka Postfix Notation)

From Wikipedia, the free encyclopedia

Reverse Polish notation (RPN) is a mathematical notation in which every operator follows all of its operands, in contrast to Polish notation, which puts the operator in the prefix position. It is also known as postfix notation and is parenthesis-free as long as operator arities are fixed. The description "Polish" refers to the nationality of logician Jan Łukasiewicz, who invented (prefix) Polish notation in the 1920s.

The reverse Polish scheme was proposed in 1954 by Burks, Warren, and Wright[1] and was independently reinvented by F. L. Bauer and E. W. Dijkstra in the early 1960s to reduce computer memory access and utilize the stack to evaluate expressions. The algorithms and notation for this scheme were extended by Australian philosopher and computer scientist Charles Hamblin in the mid-1950s.[2][3]

In computer science, postfix notation is often used in stack-based and concatenative programming languages. It is also common in dataflow and pipeline-based systems, including Unix pipelines.

 

Polish notation (aka Prefix notation)

From Wikipedia, the free encyclopedia

Polish notation, also known as Polish prefix notation or simply prefix notation, is a form of notation for logic, arithmetic, and algebra. Its distinguishing feature is that it places operators to the left of their operands. If the arity of the operators is fixed, the result is a syntax lacking parentheses or other brackets that can still be parsed without ambiguity. The Polish logician Jan Łukasiewicz invented this notation in 1924 in order to simplify sentential logic.

The term Polish notation is sometimes taken (as the opposite of infix notation) to also include Polish postfix notation, or Reverse Polish notation, in which the operator is placed after the operands.[1]

When Polish notation is used as a syntax for mathematical expressions by interpreters of programming languages, it is readily parsed into abstract syntax trees and can, in fact, define a one-to-one representation for the same. Because of this, Lisp and related programming languages define their entire syntax in terms of prefix notation (and others use postfix notation).

In Łukasiewicz’s 1951 book, Aristotle’s Syllogistic from the Standpoint of Modern Formal Logic, he mentions that the principle of his notation was to write the functors before the arguments to avoid brackets and that he had employed his notation in his logical papers since 1929.[4] He then goes on to cite, as an example, a 1930 paper he wrote with Alfred Tarski on the sentential calculus.[5]

While no longer used much in logic,[citation needed] Polish notation has since found a place in computer science.

Simple.  Now when we come across this again in the future, I’ll not look so blank.

[Deep Breath] For-Comprehensions

As I already noted, for-comprehensions are the point at which I lost track of the Fast Track to Scala course.  I think the word “for” had lulled me into a false sense of security.  It didn’t last long.

So now I’ve hit them again, in Chapter 2 of Scala in Action, and this time I’m determined to understand them.  The rest of this post will comprise a running commentary on my progress towards this goal.

The Syntax

Already I can see why I got so confused:

'for' ('(' Enumerators ')' | '{' Enumerators '}') {nl} ['yield'] Expr

Scala in Action, Nilanjan Raychaudhuri, Chapter 2

But having just typed it in (no cut-and-paste for me), I can already see that, as before with functions and function-literals, either parentheses or curly braces can be used to wrap the ‘Enumerators’. What else can I learn?

A Small First Step

On top of this, it must be my lucky day, as Nilanjan decides to tackle this from a perspective of tradition, and seeing as Java is the new Cobol, tradition is something that makes me very comfortable.  Let’s iterate through some collections:

val files = new java.io.File(".").listFiles
for(file <- files) {
    val filename = file.getName
    if(filename.endsWith(".scala")) println (file)
}

from Scala in Action, Nilanjan Raychaudhuri, Chapter 2

This is so damn legible that I’m already one step ahead of Nilanjan when he says that this looks very much like the equivalent foreach construct in Java.  (Mmmmmm, the warm glow of familiarity.) But let’s not jump ahead. We’re getting some terminology too – the “file <- files” element of all this is called a generator, and its job is to iterate through a collection.  Why not call it an “Enumerator” like the previous syntax snippet?  Don’t get too far ahead of yourself. I’m sure we’ll get to that.

OK. Continue.

Next, Add Some Definitions and Guard Clauses

Right, now we’re going to do a lot more within the generator.

for(
    file <- files;
    filename = file.getName;
    if(filename.endsWith(".scala"))
) println (file)

from Scala in Action, Nilanjan Raychaudhuri, Chapter 2

This is precisely one of the concepts I now realise I’d missed during the course.  Armed with my knowledge of the Scala-ites’ love of reducing things, I can begin to unpick what has happened here.  Nilanjan tells me we can add definitions and guard clauses within a for loop. I also notice that we’ve had to add semi-colons (first sighting of them in Scala-land). The former (filename = file.getName;) simply defines a new val and the latter (if(filename.endsWith(".scala"))) runs a check against it. If the guard clause is true, then the body of the loop executes, in this case, println (file).

Before we move on, I wonder if we can inline it even more?:

for(
    file <- files;
    if(file.getName.endsWith(".scala"))
) println (file)

Sure can.  I think one trap I need to avoid falling into here is to think of this as being like a standard Java for loop.  The semi-colons (I am guessing) are there to help the compiler and that is all.  I now know I can inline the definitions. But can I have more of them?:

for(
    file <- files;
    filename = file.getName;
    timeNow = new java.util.Date()

    if(filename.endsWith(".scala"))
) println ("time now: " + timeNow + ", file: " + file)

Yup. This works fine, even with the just-noticed missing semi-colon (try it yourself).
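
For anyone who wants to try the generator, definition and guard combination without depending on what happens to be in the current directory, here is a sketch that works on a plain list of names. The object `FileFilterSketch`, the method `scalaFiles` and the sample file names are all assumptions, and a ListBuffer stands in for println so the result can be inspected.

```scala
object FileFilterSketch {
  def scalaFiles(names: List[String]): List[String] = {
    val found = scala.collection.mutable.ListBuffer[String]()
    for (
      name <- names;                   // generator
      lower = name.toLowerCase;        // definition: no val keyword
      if (lower.endsWith(".scala"))    // guard clause
    ) found += name                    // body: runs only when the guard is true
    found.toList
  }
}
```

`FileFilterSketch.scalaFiles(List("A.scala", "b.txt", "C.SCALA"))` gives `List("A.scala", "C.SCALA")`. The lower-casing is an extra that the book’s example does not do; it is there only to give the definition a job.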

But before I inadvertently fall off the deep end again, let’s reel in the off-piste enthusiasm and get back to the book.

Multiple Generators

Now we’re getting complicated. Apparently we can specify more than one generator, and the loop is executed across both of them, just as if the latter was nested inside the former.  In light of that, the Scala syntax seems surprisingly simple (simpler than any way I can think of doing it in Java anyway):

scala> val aList = List(1, 2, 3)
aList: List[Int] = List(1, 2, 3)
scala> val bList = List(4, 5, 6)
bList: List[Int] = List(4, 5, 6)
scala> for { a <- aList ; b <- bList } println(a + b)
5
6
7
6
7
8
7
8
9

from Scala in Action, Nilanjan Raychaudhuri, Chapter 2

Nice.

One more thing before we turn to the next page, Nilanjan threw in that curly-braces-instead-of-parens trick I spotted earlier.  Tricksy.  He also put everything on one line. I was waiting for that, so that wasn’t so much of a surprise.
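
To check the “just as if the latter was inside the former” reading, this sketch runs the two-generator form and the explicitly nested form side by side. The object and method names are assumptions, and results are collected in a ListBuffer rather than printed so they can be compared.

```scala
object NestedSketch {
  val aList = List(1, 2, 3)
  val bList = List(4, 5, 6)

  // Two generators in one comprehension.
  def viaGenerators(): List[Int] = {
    val out = scala.collection.mutable.ListBuffer[Int]()
    for { a <- aList; b <- bList } out += (a + b)
    out.toList
  }

  // The same thing written as one loop nested inside another.
  def viaNestedLoops(): List[Int] = {
    val out = scala.collection.mutable.ListBuffer[Int]()
    for (a <- aList) {
      for (b <- bList) out += (a + b)
    }
    out.toList
  }
}
```

Both methods give `List(5, 6, 7, 6, 7, 8, 7, 8, 9)`, the same order as the printed output above: the second generator runs to completion for each value of the first.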

Interim Summary – The Imperative Form

Summary-time. Apparently all of these examples of the for-comprehension are in the imperative form. In this form we specify statements that will get executed by the loop (e.g. println (file) and println(a + b)) and nothing gets returned.  OK. I get that.  However, mapping back to the syntax definition at the top of this post, what I still don’t get is why we just talked about “statements” instead of “expr(essions)”, and “generators” instead of “enumerators”. Perhaps that will become clear later on. Let’s keep on trucking.

Onward! The Functional Form (aka “Sequence Comprehension”)

Now we’re working with values rather than executing statements.  That means they’re objects, and remembering that everything in Scala is an Object I’ve got my wits about me.

scala> for {a <- aList ; b <- bList} yield a + b
res3: List[Int] = List(5, 6, 7, 6, 7, 8, 7, 8, 9)

from Scala in Action, Nilanjan Raychaudhuri, Chapter 2

Now luckily for me, (although this was the point during the course when I was lulled into a false sense of security) I have come across the yield keyword before, in Ruby-land.  But is it really lucky? After some back-and-forth reading / comparison, I’ve decided to not think about Ruby’s yield too much.  All I want to take is the warm glow I get from a familiar keyword, but tackle the Scala meaning on its own terms. Let’s see if I manage.

Nilanjan deftly expresses the difference between this and the previous example.  Whereas originally we were just using the a and b values within the loop, now we’re returning the values from the loop.  Unsurprisingly with this in mind the result is a List[Int].  We can then store this result and use it:

scala> val result = for {a <- aList ; b <- bList} yield a + b
result: List[Int] = List(5, 6, 7, 6, 7, 8, 7, 8, 9)

scala> for (r <- result) println(r)
5
6
7
6
7
8
7
8
9

from Scala in Action, Nilanjan Raychaudhuri, Chapter 2

He then points out that while this new form is more verbose (yup) it separates the computation (adding a to b) from the use of the result (in our case, simply printing it).  And his claim that this improves reusability and compatibility?  I can see how it does.  And now I understand it, and have a nice syntax style I can parse, both styles seem very legible too.
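
A sketch of what that separation buys: once the sums exist as a List, the same result can be used in more than one way without re-running the loop. The names below are assumptions. The flatMap/map line is the equivalent way of writing a two-generator comprehension with yield, included here only for comparison.

```scala
object YieldSeparationSketch {
  val aList = List(1, 2, 3)
  val bList = List(4, 5, 6)

  // The computation: produce the values.
  val sums = for { a <- aList; b <- bList } yield a + b

  // The same computation written with flatMap and map.
  val sameSums = aList.flatMap(a => bList.map(b => a + b))

  // Two different uses of the one result.
  val total = sums.sum
  val rendered = sums.mkString(", ")
}
```

Here `total` is 63 and `rendered` is "5, 6, 7, 6, 7, 8, 7, 8, 9", and neither use needed to know how the sums were produced.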

Here We Are & This Is It

That was all surprisingly sane.  As usual, taking it slowly, reading around a little, and trying out a few edge cases to cement my understanding helped a great deal.  I think I’ll try the Koan related to this now to see how I get on, and if there is anything else I can learn.

A Haunted House

Virginia Woolf published a collection of short stories entitled “A Haunted House”.  The eponymous opening story of the collection is among the first things I ever read by her as an author and it took me about 15 minutes to get off the first page.  Why is this relevant to my study of Scala? Without spoiling anything about the story itself, it gave me the exact same impression of repeatedly sliding down the surface of the words, never really understanding what was going on.  Scala in Action or a ghost story, both could leave me similarly stumped.

But in both cases I really wanted to grasp what was going on. I realised after a few attempts (each having ended with my looking blankly at the line of words in front of me, realising I’d missed something in the words just a few lines above) that all I needed to do was read very slowly, and very carefully.  The problem was, in this information-everywhere age, with Facebook status update streams, and Twitter firehoses, and RSS, and bookmarks, and email, and …, and … it is easy to become accustomed to glancing over things; simple to say “Yup, I read it” when in fact we merely cast our eyes in its general direction, smiled at and only subconsciously digested the pretty pictures, checked there were no actions for ourselves and then moved on.

I’ve become so used to this way of reading that I can read an entire work of literature in this trance, not really paying attention to anything at all very much.  And so here is why A Haunted House (and everything else by Woolf that I’ve managed to read) is like Scala in Action. Because, to really learn Scala, and to become a better developer, I can’t just consume all the source texts in a cursory way.  I need to pick my way through them, line by line sometimes, just like Woolf has taught me to do. I need to always be aware, to keep my mind from wandering, and to stop when “my buffer overfloweth”, but plough on when I still have some mental capacity spare.  And I need to enjoy myself as I go.  Because challenging yourself is fun.

Monday 3 June 2013

Terminology Confusion - “Variables” and val (Part One of an Ongoing Series)

In the grand Scala-brain-load I’ve reached variables.  It seems that in this case, close reading can lead to confusion.  I had assumed that when we talked about a “variable” we only meant a var, but it seems that this can also refer to a val:

“In Scala there are two ways you can define variables: var and val. A val is a single assignment variable sometimes called [a] value.” (Italics are mine)

Scala in Action, Nilanjan Raychaudhuri, Chapter 1

I’m not sniping. Nilanjan is completely correct in his use of “variable” to refer to vals but it highlights the need for me to keep my eyes open and my brain engaged.
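A minimal sketch of the distinction, with names invented purely for illustration: both declarations below are “variables” in the book’s sense, but only the var can be assigned a second time.

```scala
object ValVarSketch {
  // A val is a single-assignment variable: it gets its value exactly once.
  val fixed: Int = 1
  // fixed = 2   // would not compile: reassignment to val

  // A var can be reassigned as often as you like.
  var counter: Int = 1
  counter = counter + 1
}
```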

Toire Wa?

(Thanks to @ChrisPhelps for his Japanese help and input into the form of this post.)

As a language, Japanese is highly sensitive to context. (I suspect that English is highly sensitive to context too, but I’m trying to illustrate a point so run with me.) As an example, and because I’m British, let’s consider the following:

Toire wa doko? - トイレはどこ。

Toire wa doko desu ka? - トイレはどこですか。

These are two forms of asking where the nearest toilet is in Japanese.  The difference here is not primarily one of politeness (which is also a strong factor in the language) but instead it is one of context.  If I wanted to be really terse (in an emergency for example) I could even be understood with this:

Toire wa? - トイレは。

However, if I wanted to be very specific (and, let’s be honest, quite polite) I could use this form:

Toire wa doko ni arimasu ka? - トイレはどこにありますか。

And at the extreme end (which you probably won't hear apart from in samurai movies or Rurouni Kenshin) there's always:

Toire wa doko de gozaimasu ka? - トイレはどこでございますか。

In all cases, apart from the tersest and the most verbose, it is totally fine with the listener to use all these forms – as long as the context is applicable.

So how does this relate to Scala? Well, it’s struck me, from the time I attended the Typesafe “Fast Track to Scala” course that context can allow you a lot of leeway here too.  In fact, in discussions about how the interpreter would view the code we’d written, there was mention of how it is perfectly acceptable to leave out stuff the Scala interpreter (i.e. listener) can discern from context.

So bearing this in mind, you might understand how, as a learner, I’ve only just now seen an instance where taking syntax elements away in Scala made things more understandable.  Up until this glimpse of pure joy I’d been working through every example meticulously putting things back in so that I could map back to something approaching the verbosity of Java which I grasp and feel comfortable with.  Note I don’t say I like it, but it is a place of safety for me. 

However, when I was in the Ruby world (for only about six months mind) I grew to like the clarity and expressiveness of much of the syntax. Things like this and this set my heart racing at the prospect of actually legible code.

So how has Scala managed it? Function literals are the answer. Take this example:

scala> val evenNumbers = List(2, 4, 6, 8, 10)
evenNumbers: List[Int] = List(2, 4, 6, 8, 10)
scala> evenNumbers.foldLeft(0) { _ + _ }
res4: Int = 30

Is it because it actually looks so much like the closure syntax in Ruby? Perhaps.
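For comparison, here is a sketch of what the terse literal stands for when the removed pieces are put back in. The object and value names (and the parameter names acc and n) are mine, not the book’s; all three calls compute the same sum.

```scala
object FoldSketch {
  val evenNumbers = List(2, 4, 6, 8, 10)

  // Fully spelled out: parameter names and types written by hand.
  val verbose: Int = evenNumbers.foldLeft(0)((acc: Int, n: Int) => acc + n)

  // Parameter types left for the compiler to infer from context.
  val inferred: Int = evenNumbers.foldLeft(0)((acc, n) => acc + n)

  // Placeholder form: each underscore stands for the next parameter, in order.
  val terse: Int = evenNumbers.foldLeft(0) { _ + _ }
}
```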

But let’s have a little fiddle. Is it possible to go too far? With standard functions, we can drop the curly braces and things are still fine. Can we do this with their literal cousins?  If this works:

scala> def max(a: Int, b:Int) = { if(a> b) a else b }
max: (a: Int, b: Int)Int
scala> max (10, 200)
res9: Int = 200
scala> def max(a: Int, b:Int) = if(a> b) a else b
max: (a: Int, b: Int)Int
scala> max (10, 200)
res6: Int = 200

Does this?:

scala> evenNumbers.foldLeft(0) { _ + _ }
res7: Int = 30
scala> evenNumbers.foldLeft(0)  _ + _
res8: String => java.lang.String = <function1>

It seems not. Although this isn’t an error, it’s certainly not what I’d hoped for.  I’ll need to come back to this later when I’ve learned more.
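Without trying to explain the odd result above, a small sketch of the forms that do behave: the literal seems to need some delimiter around it, but parentheses do the job just as well as curly braces. The names here are illustrative only.

```scala
object DelimiterSketch {
  val evenNumbers = List(2, 4, 6, 8, 10)

  // Curly braces around the function literal, as in the original example.
  val withBraces: Int = evenNumbers.foldLeft(0) { _ + _ }

  // Ordinary parentheses work too.
  val withParens: Int = evenNumbers.foldLeft(0)(_ + _)

  // The literal can also be given a name first and passed in afterwards.
  val add: (Int, Int) => Int = _ + _
  val withNamed: Int = evenNumbers.foldLeft(0)(add)
}
```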

But before I leave on a happy, satisfied note, there’s one more thing.  The book has assumed I’m clever and stated:

Now, with your new knowledge of function literals, it should be pretty obvious that _.isUpper is a function literal:

    val hasUpperCase = name.exists(_.isUpper)

Scala in Action, Nilanjan Raychaudhuri, Chapter 1

I’m afraid Nilanjan is being optimistic regarding myself again.  Let’s try some deduction and see where that gets us…

  1. First Observation: I’d assumed that function literals would come after the method name and the parens. No evidence of this here;
  2. Second Observation: I’d also assumed that function literals would be contained within curly braces. That’s clearly wrong too;
  3. Third Observation: isUpper has no parens.

So how can I discern that it’s a function literal?  The hint in the text seems to be the presence of the underscore (_), but we also learned a little earlier that they can also be used to assign default values to variables.  What I can see however is the dot (.) appended after the underscore, and that (so far at least) indicates that a function is being called (isUpper), and the fact it is appended onto an underscore does strongly imply it’s a literal.  Would it work with curly braces in place of the standard parens? It turns out it would:

scala> val hasUpperCase = "name".exists{_.isUpper}
hasUpperCase: Boolean = false

The final observation, not related to function literals specifically, means that I think I can assume isUpper has no side effects (it has no parens of its own).
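One way to convince yourself that _.isUpper really is a function literal is to write out the longer forms it abbreviates. This is a sketch with made-up names; "Name" is an extra sample string added so that at least one call returns true.

```scala
object PlaceholderSketch {
  // Longhand: a named parameter with an explicit type.
  val explicit: Boolean = "Name".exists((c: Char) => c.isUpper)

  // Parameter type inferred from what exists expects.
  val inferred: Boolean = "Name".exists(c => c.isUpper)

  // Placeholder form: the underscore stands in for the single parameter.
  val placeholder: Boolean = "Name".exists(_.isUpper)

  // The all-lowercase string from the REPL line above has no upper-case characters.
  val lowerOnly: Boolean = "name".exists(_.isUpper)

  // The literal is a value in its own right and can be stored and called later.
  val isUpperFn: Char => Boolean = _.isUpper
}
```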

I get the feeling I’ll need to come back to this.  It’s solidifying, but nowhere near enough for my liking yet.