handouts/scala-ho.tex
changeset 238 527fdb90fffe
parent 237 370c0647a9bf
child 239 68d98140b90b
--- a/handouts/scala-ho.tex	Wed Sep 03 11:01:49 2014 +0100
+++ b/handouts/scala-ho.tex	Wed Sep 03 14:57:43 2014 +0100
@@ -33,7 +33,7 @@
 lectures because its functional programming-style allows me to
 implement the functions we will discuss with very small
 code-snippets. If I had to do this in Java, for example, I
-would first have to run through heaps of boilerplate code and
+would first have to go through heaps of boilerplate code and
 the code-snippets would not look pretty. Since the Scala
 compiler is free, you can download the code-snippets and run
 every example I give. But if you prefer, you can also easily
@@ -87,7 +87,7 @@
 between a function that returns a result, like addition, and a
 function that causes a side-effect, like \code{print}. We
 shall come back to this point later, but if you are curious
-now, the latter kind of functions always have as return type
+now, the latter kind of functions always has as return type
 \code{Unit}.
 
 If you want to write a stand-alone app in Scala, you can
@@ -120,13 +120,15 @@
 hello world
 \end{lstlisting}
 
+\noindent You might need to adapt the path to where you have
+installed Scala.
 
 \subsection*{Inductive Datatypes}
 
 The elegance and conciseness of Scala programs are often a
-result of inductive datatypes that can be easily defined. For
-example in ``every-day mathematics'' we define regular
-expressions simply by giving the grammar
+result of inductive datatypes that can be easily defined in
+Scala. For example in ``every-day mathematics'' we define
+regular expressions simply by giving the grammar
 
 \begin{center}
 \begin{tabular}{r@{\hspace{2mm}}r@{\hspace{2mm}}l@{\hspace{13mm}}l}
@@ -178,7 +180,7 @@
 datatypes, feel free to define for example binary trees.
 
 Once you make a definition like the one above in Scala, you
-can represent, for example, the regular expression for $a + b$
+can represent the regular expression for $a + b$, for example,
 as \code{ALT(CHAR('a'), CHAR('b'))}. Expressions such as
 \code{'a'} stand for ASCII characters, though in the output
 syntax, as you can see below, the quotes are omitted. In a
@@ -284,7 +286,6 @@
 right after the condition---there is no \code{then} keyword in
 Scala.
 
-
 The real power of Scala comes, however, from the ability to
 define functions by \emph{pattern matching}. In the
 \code{collatz} function above we need to test each case using a
@@ -353,10 +354,10 @@
 \lstinputlisting[language=Scala]{../progs/collatz2.scala}
 
 
-\noindent Although in this case the pattern-match does not
-improve the code in any way. In cases like \code{rev} it
-is really crucial. The underscore in the last case indicates
-that we do not care what the pattern looks like. Thus Line 4
+\noindent Although in this particular case the pattern-match
+does not improve the code in any way. In cases like \code{rev}
+it is really crucial. The underscore in Line 4 indicates that
+we do not care what the patter8n looks like. Thus this case
 acts like a default case whenever the cases above did not
 match. Cases are always tried out from top to bottom.
  
@@ -365,13 +366,13 @@
 Coming from Java or C, you might be surprised that Scala does
 not really have loops. It has instead, what is in functional
 programming called \emph{maps}. To illustrate how they work,
-lets assume you have a list of numbers from 1 to 10 and want to
-build the list of squares. The list of numbers from 1 to 10 
+lets assume you have a list of numbers from 1 to 8 and want to
+build the list of squares. The list of numbers from 1 to 8 
 can be constructed in Scala as follows:
 
 \begin{lstlisting}[language=Scala,numbers=none]
-scala> (1 to 10).toList
-res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
+scala> (1 to 8).toList
+res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8)
 \end{lstlisting}
 
 \noindent Generating from this list the list of squares in a
@@ -385,26 +386,26 @@
 element is transformed (for example squared) and a list over
 which this function should work. There are two forms to
 express such maps in Scala. The first way is called a
-for-comprehension. Squaring the numbers from 1 to 10 would
-look in this form as follows:
+\emph{for-comprehension}. Squaring the numbers from 1 to 8
+would look in this form as follows:
 
 \begin{lstlisting}[language=Scala,numbers=none]
-scala> for (n <- (1 to 10).toList) yield n * n
-res2: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100)
+scala> for (n <- (1 to 8).toList) yield n * n
+res2: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64)
 \end{lstlisting}
 
 \noindent The important keywords are \code{for} and
 \code{yield}. This for-comprehension roughly states that from
 the list of numbers we draw \code{n}s and compute the result
 of \code{n * n}. As you can see, we specified the list where
-each \code{n} comes from, namely \code{(1 to 10).toList}, and
+each \code{n} comes from, namely \code{(1 to 8).toList}, and
 how each element needs to be transformed. This can also be
 expressed in a second way in Scala by using directly
 \code{map}s as follows:
 
 \begin{lstlisting}[language=Scala,numbers=none]
-scala> (1 to 10).toList.map(n => n * n)
-res3 = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100)
+scala> (1 to 8).toList.map(n => n * n)
+res3 = List(1, 4, 9, 16, 25, 36, 49, 64)
 \end{lstlisting}
 
 \noindent In this way, the expression \code{n => n * n} stands
@@ -424,16 +425,16 @@
 this list, we can write
 
 \begin{lstlisting}[language=Scala,numbers=none]
-scala> (1 to 10).toList.map(n => n % 3)
-res4 = List(1, 2, 0, 1, 2, 0, 1, 2, 0, 1)
+scala> (1 to 8).toList.map(n => n % 3)
+res4 = List(1, 2, 0, 1, 2, 0, 1, 2)
 \end{lstlisting}
 
-\noindent If we, however, transform the numbers 1 to 10 not
+\noindent If we, however, transform the numbers 1 to 8 not
 into a list, but into a set, and then compute the reminders
 modulo 3 we obtain
 
 \begin{lstlisting}[language=Scala,numbers=none]
-scala> (1 to 10).toSet[Int].map(n => n % 3)
+scala> (1 to 8).toSet[Int].map(n => n % 3)
 res5 = Set(2, 1, 0)
 \end{lstlisting}
 
@@ -446,7 +447,7 @@
 written as
 
 \begin{lstlisting}[language=Scala,numbers=none]
-scala> for (n <- (1 to 10).toSet[Int]) yield n % 3
+scala> for (n <- (1 to 8).toSet[Int]) yield n % 3
 res5 = Set(2, 1, 0)
 \end{lstlisting}
 
@@ -472,9 +473,8 @@
 res7 = List((1,1), (1,3), (2,2), (3,1), (3,3))
 \end{lstlisting}
 
-\noindent 
-The \code{if}-condition filters out all pairs where the
-sum is not even.
+\noindent The \code{if}-condition in the for-comprehension
+filters out all pairs where the sum is not even.
 
 While hopefully this all looks reasonable, there is one
 complication: In the examples above we always wanted to
@@ -489,12 +489,8 @@
 would look as follows
 
 \begin{lstlisting}[language=Scala,numbers=none]
-scala> for (n <- (1 to 5).toList) println(n)
-1
-2
-3
-4
-5
+scala> for (n <- (1 to 5).toList) print(n)
+12345
 \end{lstlisting}
 
 \noindent
@@ -526,28 +522,18 @@
 
 
 \begin{lstlisting}[language=Scala,numbers=none]
-scala> (1 to 5).toList.foreach(n => println(n))
-1
-2
-3
-4
-5
+scala> (1 to 5).toList.foreach(n => print(n))
+12345
 \end{lstlisting}
 
 
 \noindent or even just
 
-
 \begin{lstlisting}[language=Scala,numbers=none]
-scala> (1 to 5).toList.foreach(println)
-1
-2
-3
-4
-5
+scala> (1 to 5).toList.foreach(print)
+12345
 \end{lstlisting}
 
-
 \noindent Again I hope this reminds you a bit of your
 lambda-calculus lessons, where an explanation is given why
 both forms produce the same result.
@@ -567,7 +553,7 @@
 \code{String} and \code{BigInt}, but also user-defined ones,
 like \code{Rexp}. Unfortunately, types can be a thorny
 subject, especially in Scala. For example, why do we need to
-give the type to \code{toSet[Int]} but not to \code{toList}?
+give the type to \code{toSet[Int]}, but not to \code{toList}?
 The reason is the power of Scala, which sometimes means it
 cannot infer all necessary typing information. At the
 beginning while getting familiar with Scala, I recommend a
@@ -591,11 +577,17 @@
 
 There are a few special type-constructors that fall outside
 this pattern. One is for tuples, where the type is written
-with parentheses. For example \code{(Int, Int, String)} for a
-triple consisting of two integers and a string. Tuples are
-helpful if you want to define functions with multiple
-results, say the function returning the quotient and reminder
-of two numbers. For this you might define:
+with parentheses. For example 
+
+\begin{lstlisting}[language=Scala, numbers=none]
+(Int, Int, String)
+\end{lstlisting}
+
+\noindent is for a triple (a tuple with three components---two
+integers and a string). Tuples are helpful if you want to
+define functions with multiple results, say the function
+returning the quotient and reminder of two numbers. For this
+you might define:
 
 
 \begin{lstlisting}[language=Scala, numbers=none]
@@ -603,13 +595,13 @@
 \end{lstlisting}
 
 
-\noindent
-Since this function returns a pair of integers, its 
-return type needs to be \code{(Int, Int)}. Incidentally,
-this is also the input type of this function. Notice it takes
-\emph{two} arguments, namely \code{m} and \code{n}, both
-of which are integers. They are ``packaged'' in a pair.
-Consequently the complete type of \code{quo_rem} is
+\noindent Since this function returns a pair of integers, its
+return type needs to be of type \code{(Int, Int)}.
+Incidentally, this is also the input type of this function.
+Notice this function takes \emph{two} arguments, namely
+\code{m} and \code{n}, both of which are integers. They are
+``packaged'' in a pair. Consequently the complete type of
+\code{quo_rem} is
 
 \begin{lstlisting}[language=Scala, numbers=none]
 (Int, Int) => (Int, Int)
@@ -620,7 +612,6 @@
 \code{Int => String} is for a function that takes an integer as argument
 and produces a string. A function of this type is for instance
 
-
 \begin{lstlisting}[language=Scala,numbers=none]
 def mk_string(n: Int) : String = n match {
   case 0 => "zero"
@@ -630,7 +621,6 @@
 } 
 \end{lstlisting}
 
-
 \noindent It takes an integer as argument and returns a
 string. Unlike other functional programming languages, there
 is in Scala no easy way to find out the types of existing
@@ -648,36 +638,65 @@
 
 
 \begin{lstlisting}[language=Scala,numbers=none]
-def chk_string(n: Int, s: String) : Boolean = 
+def chk_string(n: Int)(s: String) : Boolean = 
   mk_string(n) == s
 \end{lstlisting}
 
 
-\noindent which checks whether the integer \code{n} corresponds
-to the name \code{s} given by the function \code{mk\_string}.
+\noindent which checks whether the integer \code{n}
+corresponds to the name \code{s} given by the function
+\code{mk\_string}. Notice the unusual way of specifying the
+arguments of this function: the arguments are given one after
+the other, instead of being in a pair (what would be the type
+of this function then?). This way of specifying the arguments
+can be useful, for example in situations like this
+
+\begin{lstlisting}[language=Scala,numbers=none]
+scala> List("one", "two", "three", "many").map(chk_string(2))
+res4 = List(false, true, false, false)
+
+scala> List("one", "two", "three", "many").map(chk_string(3))
+res5 = List(false, false, false, true)
+\end{lstlisting}
+
+\noindent In each case we can give to \code{map} a specialised
+version of \code{chk_string}---once specialised to 2 and once
+to 3. This kind of ``specialising'' a function is called
+\emph{partial application}---we have not yet given to this
+function all arguments it needs, but only one of them.
 
 Coming back to the type \code{Int => String => Boolean}. The
 rule about such function types is that the right-most type
 specifies what the function returns (a boolean in this case).
 The types before that specify how many arguments the function
-expects and what is their type (in this case two arguments,
+expects and what their type is (in this case two arguments,
 one of type \code{Int} and another of type \code{String}).
 Given this rule, what kind of function has type
 \mbox{\code{(Int => String) => Boolean}}? Well, it returns a
 boolean. More interestingly, though, it only takes a single
 argument (because of the parentheses). The single argument
 happens to be another function (taking an integer as input and
-returning a string).
+returning a string). Remember that \code{mk_string} is just 
+such a function. So how can we use it? For this define
+the somewhat silly function \code{apply_3}
+
+\begin{lstlisting}[language=Scala,numbers=none]
+def apply_3(f: Int => String): Bool = f(3) == "many"
+
+scala> apply_3(mk_string)
+res6 = true
+\end{lstlisting}
 
-Now you might ask, what is the point of having function as 
-arguments to other functions? In Java there is no need of this 
-kind of feature. But in all functional programming languages, 
-including Scala, it is really essential. Above you already
-seen \code{map} and \code{foreach} which need this. Consider 
-the functions \code{print} and \code{println}, which both 
-print out strings, but the latter adds a line break. You can
-call \code{foreach} with either of them and thus changing how,
-for example, five numbers are printed.
+You might ask, apart from silly functions like above, what is
+the point of having function as arguments to other functions?
+In Java there is indeed no need of this kind of feature. But
+in all functional programming languages, including Scala, it
+is really essential. Above you already seen \code{map} and
+\code{foreach} which need this. Consider the functions
+\code{print} and \code{println}, which both print out strings,
+but the latter adds a line break. You can call \code{foreach}
+with either of them and thus changing how, for example, five
+numbers are printed.
 
 
 \begin{lstlisting}[language=Scala,numbers=none]
@@ -752,7 +771,7 @@
 feature is about implicit type conversions. If you have
 regular expressions and want to use them for language
 processing you often want to recognise keywords in a language,
-for example \code{for}, \code{if}, \code{yield} and so on. But
+for example \code{for},{} \code{if},{} \code{yield} and so on. But
 the basic regular expression, \code{CHAR}, can only recognise
 a single character. In order to recognise a whole string, like
 \code{ for}, you have to put many of those together using
@@ -773,7 +792,13 @@
 
 In Scala you can do better by ``hiding'' the conversion
 functions. The keyword for doing this is \code{implicit} and
-it needs a built-in library called \code{implicitConversions}.
+it needs a built-in library called 
+
+\begin{lstlisting}[language=Scala,numbers=none]
+scala.language.implicitConversions
+\end{lstlisting}
+
+\noindent
 Consider the code
 
 
@@ -803,7 +828,7 @@
 
 \begin{lstlisting}[language=Scala,numbers=none]
 scala> ALT("ab", "ac")
-res9: ALT = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c)))
+res9 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c)))
 \end{lstlisting}
 
 \noindent \code{ALT} expects two regular expressions
@@ -815,7 +840,7 @@
 some further syntactic sugar for regular expressions:
 
 
-\begin{lstlisting}[language=Scala]
+\begin{lstlisting}[language=Scala, numbers=none]
 implicit def RexpOps(r: Rexp) = new {
   def | (s: Rexp) = ALT(r, s)
   def ~ (s: Rexp) = SEQ(r, s)
@@ -839,7 +864,7 @@
 
 \begin{lstlisting}[language=Scala,numbers=none]
 scala> "ab" | "ac"
-res10: ALT = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c)))
+res10 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c)))
 \end{lstlisting}