| 
     1 \documentclass{article} | 
         | 
     2 \usepackage{../style} | 
         | 
     3 \usepackage{../langs} | 
         | 
     4 \usepackage{marvosym} | 
         | 
     5   | 
         | 
     6 %cheat sheet  | 
         | 
     7 %http://worldline.github.io/scala-cheatsheet/  | 
         | 
     8   | 
         | 
     9 \begin{document} | 
         | 
    10   | 
         | 
    11 \section*{A Crash-Course on Scala} | 
         | 
    12   | 
         | 
    13 Scala is a programming language that combines functional and  | 
         | 
    14 object-oriented programming-styles. It has received quite a  | 
         | 
    15 bit of attention in the last five years or so. One reason for  | 
         | 
    16 this attention is that, like the Java programming language,  | 
         | 
    17 Scala compiles to the Java Virtual Machine (JVM) and therefore  | 
         | 
    18 Scala programs can run under MacOSX, Linux and  | 
         | 
    19 Windows.\footnote{There are also experimental backends for | 
         | 
    20 Android and JavaScript; and also work is under way to have a  | 
         | 
    21 native compiler, see \url{https://github.com/scala-native/scala-native}.} Unlike Java, however, Scala often | 
         | 
    22 allows programmers to write very concise and elegant code.  | 
         | 
    23 Some therefore say: Scala is the much better Java. A number of  | 
         | 
    24 companies, The Guardian, Twitter, Coursera, FourSquare,  | 
         | 
    25 LinkedIn to name a few, either use Scala exclusively in  | 
         | 
    26 production code, or at least to some substantial degree. It  | 
         | 
    27 also seems to be useful in job-interviews (in Data Science)  | 
         | 
    28 according to this annectotical report  | 
         | 
    29   | 
         | 
    30 \begin{quote} | 
         | 
    31 \url{https://techcrunch.com/2016/06/14/scala-is-the-new-golden-child/} | 
         | 
    32 \end{quote} | 
         | 
    33   | 
         | 
    34 \noindent  | 
         | 
    35 If you want to try out Scala yourself, the official Scala compiler can be  | 
         | 
    36 downloaded from  | 
         | 
    37   | 
         | 
    38 \begin{quote} | 
         | 
    39 \url{http://www.scala-lang.org} | 
         | 
    40 \end{quote} | 
         | 
    41   | 
         | 
    42 \noindent  | 
         | 
    43 A ready-made bundle with the Eclipse IDE is at  | 
         | 
    44   | 
         | 
    45 \begin{quote} | 
         | 
    46 \url{http://scala-ide.org/download/sdk.html} | 
         | 
    47 \end{quote} | 
         | 
    48   | 
         | 
    49 Why do I use Scala in the AFL module? Actually, you can do  | 
         | 
    50 \emph{any} part of the coursework in \emph{any} programming | 
         | 
    51 language you like. I use Scala for showing you code during the  | 
         | 
    52 lectures because its functional programming-style allows me to  | 
         | 
    53 implement the functions we will discuss with very small  | 
         | 
    54 code-snippets. If I had to do this in Java, I would first have  | 
         | 
    55 to go through heaps of boilerplate code and the code-snippets  | 
         | 
    56 would not look pretty. Since the Scala compiler is free, you  | 
         | 
    57 can download the code-snippets and run every example I give.  | 
         | 
    58 But if you prefer, you can also easily translate them into any  | 
         | 
    59 other functional language, for example Haskell, Swift,  | 
         | 
    60 Standard ML, F$^\#$, Ocaml and so on.  | 
         | 
    61   | 
         | 
    62 Developing programs in Scala can be done with the Eclipse IDE  | 
         | 
    63 and also with the IntelliJ IDE, but for the small programs I will  | 
         | 
    64 develop the good old Emacs-editor is adequate for me and I  | 
         | 
    65 will run the programs on the command line. One advantage of  | 
         | 
    66 Scala over Java is that it includes an interpreter (a REPL, or  | 
         | 
    67 \underline{R}ead-\underline{E}val-\underline{P}rint-\underline{L}oop) | 
         | 
    68 with which you can run and test small code-snippets without  | 
         | 
    69 the need of the compiler. This helps a lot with interactively  | 
         | 
    70 developing programs. Once you installed Scala, you can start  | 
         | 
    71 the interpreter by typing on the command line:  | 
         | 
    72   | 
         | 
    73 \begin{lstlisting}[language={},numbers=none,basicstyle=\ttfamily\small] | 
         | 
    74 $ scala  | 
         | 
    75 Welcome to Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM).  | 
         | 
    76 Type in expressions for evaluation. Or try :help.  | 
         | 
    77   | 
         | 
    78 scala>  | 
         | 
    79 \end{lstlisting} | 
         | 
    80   | 
         | 
    81 \noindent Of course the precise response may vary due to the  | 
         | 
    82 version and platform where you installed Scala. At the Scala  | 
         | 
    83 prompt you can type things like \code{2 + 3} \keys{Ret} and | 
         | 
    84 the output will be  | 
         | 
    85   | 
         | 
    86 \begin{lstlisting}[numbers=none] | 
         | 
    87 scala> 2 + 3  | 
         | 
    88 res0: Int = 5  | 
         | 
    89 \end{lstlisting} | 
         | 
    90   | 
         | 
    91 \noindent indicating that the result of the addition is of  | 
         | 
    92 type \code{Int} and the actual result is 5. Another classic | 
         | 
    93 example you can try out is  | 
         | 
    94   | 
         | 
    95 \begin{lstlisting}[numbers=none] | 
         | 
    96 scala> print("hello world") | 
         | 
    97 hello world  | 
         | 
    98 \end{lstlisting} | 
         | 
    99   | 
         | 
   100 \noindent Note that in this case there is no result. The  | 
         | 
   101 reason is that \code{print} does not actually produce a result | 
         | 
   102 (there is no \code{resXX} and no type), rather it is a | 
         | 
   103 function that causes the \emph{side-effect} of printing out a | 
         | 
   104 string. Once you are more familiar with the functional  | 
         | 
   105 programming-style, you will know what the difference is  | 
         | 
   106 between a function that returns a result, like addition, and a  | 
         | 
   107 function that causes a side-effect, like \code{print}. We | 
         | 
   108 shall come back to this point later, but if you are curious  | 
         | 
   109 now, the latter kind of functions always has \code{Unit} as | 
         | 
   110 return type.  | 
         | 
   111   | 
         | 
   112 If you want to write a stand-alone app in Scala, you can  | 
         | 
   113 implement an object that is an instance of \code{App}, say | 
         | 
   114   | 
         | 
   115 \begin{lstlisting}[numbers=none] | 
         | 
   116 object Hello extends App { | 
         | 
   117     println("hello world") | 
         | 
   118 }  | 
         | 
   119 \end{lstlisting} | 
         | 
   120   | 
         | 
   121 \noindent save it in a file, say {\tt hello-world.scala}, and | 
         | 
   122 then run the compiler and runtime environment:  | 
         | 
   123   | 
         | 
   124 \begin{lstlisting}[language={},numbers=none,basicstyle=\ttfamily\small] | 
         | 
   125 $ scalac hello-world.scala  | 
         | 
   126 $ scala Hello  | 
         | 
   127 hello world  | 
         | 
   128 \end{lstlisting} | 
         | 
   129   | 
         | 
   130 As mentioned above, Scala targets the JVM and consequently  | 
         | 
   131 Scala programs can also be executed by the bog-standard Java  | 
         | 
   132 Runtime. This only requires the inclusion of {\tt | 
         | 
   133 scala-library.jar}, which on my computer can be done as  | 
         | 
   134 follows:  | 
         | 
   135   | 
         | 
   136 \begin{lstlisting}[language={},numbers=none,basicstyle=\ttfamily\small] | 
         | 
   137 $ scalac hello-world.scala  | 
         | 
   138 $ java -cp /usr/local/src/scala/lib/scala-library.jar:. Hello  | 
         | 
   139 hello world  | 
         | 
   140 \end{lstlisting} | 
         | 
   141   | 
         | 
   142 \noindent You might need to adapt the path to where you have  | 
         | 
   143 installed Scala.  | 
         | 
   144   | 
         | 
   145 \subsection*{Inductive Datatypes} | 
         | 
   146   | 
         | 
   147 The elegance and conciseness of Scala programs are often a  | 
         | 
   148 result of inductive datatypes that can be easily defined in  | 
         | 
   149 Scala. For example in ``every-day mathematics'' we define  | 
         | 
   150 regular expressions simply by giving the grammar  | 
         | 
   151   | 
         | 
   152 \begin{center} | 
         | 
   153 \begin{tabular}{r@{\hspace{2mm}}r@{\hspace{2mm}}l@{\hspace{13mm}}l} | 
         | 
   154   $r$ & $::=$ &   $\ZERO$            & null\\  | 
         | 
   155         & $\mid$ & $\ONE$            & empty string\\  | 
         | 
   156         & $\mid$ & $c$               & single character\\  | 
         | 
   157         & $\mid$ & $r_1 \cdot r_2$   & sequence\\  | 
         | 
   158         & $\mid$ & $r_1 + r_2$       & alternative / choice\\  | 
         | 
   159         & $\mid$ & $r^\star$             & star (zero or more)\\  | 
         | 
   160   \end{tabular} | 
         | 
   161 \end{center} | 
         | 
   162   | 
         | 
   163 \noindent This grammar specifies what regular expressions are  | 
         | 
   164 (essentially a kind of tree-structure with three kinds of  | 
         | 
   165 inner nodes---sequence, alternative and star---and three kinds  | 
         | 
   166 of leave nodes---null, empty and character). If you are  | 
         | 
   167 familiar with Java, it might be an instructive exercise to  | 
         | 
   168 define this kind of inductive datatypes in Java\footnote{Happy | 
         | 
   169 programming! \Smiley} and then compare it with how it can be  | 
         | 
   170 implemented in Scala.  | 
         | 
   171   | 
         | 
   172 Implementing the regular expressions from above in Scala is  | 
         | 
   173 actually very simple: It first requires an \emph{abstract | 
         | 
   174 class}, say, \code{Rexp}. This will act as the type for | 
         | 
   175 regular expressions. Second, it requires a case for each  | 
         | 
   176 clause in the grammar. The cases for $\ZERO$ and $\ONE$ do not  | 
         | 
   177 have any arguments, while in all the other cases we do have  | 
         | 
   178 arguments. For example the character regular expression needs  | 
         | 
   179 to take as an argument the character it is supposed to  | 
         | 
   180 recognise. In Scala, the cases without arguments are called  | 
         | 
   181 \emph{case objects}, whereas the ones with arguments are | 
         | 
   182 \emph{case classes}. The corresponding Scala code is as | 
         | 
   183 follows:  | 
         | 
   184   | 
         | 
   185 \begin{lstlisting}[numbers=none] | 
         | 
   186 abstract class Rexp   | 
         | 
   187 case object ZERO extends Rexp  | 
         | 
   188 case object ONE extends Rexp  | 
         | 
   189 case class CHAR (c: Char) extends Rexp  | 
         | 
   190 case class SEQ (r1: Rexp, r2: Rexp) extends Rexp   | 
         | 
   191 case class ALT (r1: Rexp, r2: Rexp) extends Rexp   | 
         | 
   192 case class STAR (r: Rexp) extends Rexp   | 
         | 
   193 \end{lstlisting} | 
         | 
   194   | 
         | 
   195 \noindent In order to be an instance of \code{Rexp}, each case | 
         | 
   196 object and case class needs to extend \code{Rexp}. Given the | 
         | 
   197 grammar above, I hope you can see the underlying pattern. If  | 
         | 
   198 you want to play further with such definitions of inductive  | 
         | 
   199 datatypes, feel free to define for example binary trees.  | 
         | 
   200   | 
         | 
   201 Once you make a definition like the one above in Scala, you  | 
         | 
   202 can represent the regular expression for $a + b$, for example,  | 
         | 
   203 as \code{ALT(CHAR('a'), CHAR('b'))}. Expressions such as | 
         | 
   204 \code{'a'} stand for ASCII characters, though in the output | 
         | 
   205 syntax, as you can see below, the quotes are omitted. In a  | 
         | 
   206 later section we will see how we can support the more  | 
         | 
   207 mathematical infix notation for regular expression operators  | 
         | 
   208 in Scala. If you want to assign this regular expression to a  | 
         | 
   209 variable, you can use the keyword \code{val} and type | 
         | 
   210   | 
         | 
   211 \begin{lstlisting}[numbers=none] | 
         | 
   212 scala> val r = ALT(CHAR('a'), CHAR('b')) | 
         | 
   213 r: ALT = ALT(CHAR(a),CHAR(b))  | 
         | 
   214 \end{lstlisting} | 
         | 
   215   | 
         | 
   216 \noindent As you can see, in order to make such assignments,  | 
         | 
   217 no \code{new} or constructor is required in the class (as in | 
         | 
   218 Java). However, if there is the need for some non-standard  | 
         | 
   219 initialisation, you can of course define such a constructor in  | 
         | 
   220 Scala too. But we omit such ``tricks'' here.   | 
         | 
   221   | 
         | 
   222 Note that Scala in its response says the variable \code{r} is | 
         | 
   223 of type \code{ALT}, not \code{Rexp}. This might be a bit | 
         | 
   224 unexpected, but can be explained as follows: Scala always  | 
         | 
   225 tries to find the most general type that is needed for a  | 
         | 
   226 variable or expression, but does not ``over-generalise''. In  | 
         | 
   227 our definition the type \code{Rexp} is more general than | 
         | 
   228 \code{ALT}, since it is the abstract class for all regular | 
         | 
   229 expressions. But in this particular case there is no need to  | 
         | 
   230 give \code{r} the more general type of \code{Rexp}. This is | 
         | 
   231 different if you want to form a list of regular expressions,  | 
         | 
   232 for example  | 
         | 
   233   | 
         | 
   234 \begin{lstlisting}[numbers=none] | 
         | 
   235 scala> val ls = List(ALT(CHAR('a'), CHAR('b')), ZERO) | 
         | 
   236 ls: List[Rexp] = List(ALT(CHAR(a),CHAR(b)), ZERO)  | 
         | 
   237 \end{lstlisting} | 
         | 
   238   | 
         | 
   239 \noindent In this case, Scala needs to assign a common type to  | 
         | 
   240 the regular expressions so that it is compatible with the  | 
         | 
   241 fact that lists can only contain elements of a single type. In  | 
         | 
   242 this case the first common type is \code{Rexp}.\footnote{If you | 
         | 
   243 type in this example, you will notice that the type contains  | 
         | 
   244 some further information, but let us ignore this for the  | 
         | 
   245 moment.}   | 
         | 
   246   | 
         | 
   247 For compound types like \code{List[...]}, the general rule is | 
         | 
   248 that when a type takes another type as argument, then this  | 
         | 
   249 argument type is written in angle-brackets. This can also  | 
         | 
   250 contain nested types as in \code{List[Set[Rexp]]}, which is a | 
         | 
   251 list of sets each of which contains regular expressions.  | 
         | 
   252   | 
         | 
   253 \subsection*{Functions and Pattern-Matching} | 
         | 
   254   | 
         | 
   255 I mentioned above that Scala is a very elegant programming  | 
         | 
   256 language for the code we will write in this module. This  | 
         | 
   257 elegance mainly stems from the fact that in addition to  | 
         | 
   258 inductive datatypes, also functions can be implemented very  | 
         | 
   259 easily in Scala. To show you this, let us first consider a  | 
         | 
   260 problem from number theory, called the \emph{Collatz-series}, | 
         | 
   261 which corresponds to a famous unsolved problem in  | 
         | 
   262 mathematics.\footnote{See for example | 
         | 
   263 \url{http://mathworld.wolfram.com/CollatzProblem.html}.} | 
         | 
   264 Mathematicians define this series as:  | 
         | 
   265   | 
         | 
   266 \[  | 
         | 
   267 collatz_{n + 1} \dn  | 
         | 
   268 \begin{cases} | 
         | 
   269 \frac{1}{2} * collatz_n & \text{if $collatz_n$ is even}\\ | 
         | 
   270 3 * collatz_n + 1 & \text{if $collatz_n$ is odd} | 
         | 
   271 \end{cases} | 
         | 
   272 \]  | 
         | 
   273   | 
         | 
   274 \noindent The famous unsolved question is whether this  | 
         | 
   275 series started with any $n > 0$ as $collatz_0$ will always  | 
         | 
   276 return to $1$. This is obvious when started with $1$, and also  | 
         | 
   277 with $2$, but already needs a bit of head-scratching for the  | 
         | 
   278 case of $3$.  | 
         | 
   279   | 
         | 
   280 If we want to avoid the head-scratching, we could implement  | 
         | 
   281 this as the following function in Scala:  | 
         | 
   282   | 
         | 
   283 \lstinputlisting[numbers=none]{../progs/collatz.scala} | 
         | 
   284   | 
         | 
   285 \noindent The keyword for function definitions is \code{def} | 
         | 
   286 followed by the name of the function. After that you have a  | 
         | 
   287 list of arguments (enclosed in parentheses and separated by  | 
         | 
   288 commas). Each argument in this list needs its type to be  | 
         | 
   289 annotated. In this case we only have one argument, which is of  | 
         | 
   290 type \code{BigInt}. This type stands in Scala for arbitrary | 
         | 
   291 precision integers (in case you want to try out the function  | 
         | 
   292 on really big numbers). After the arguments comes the type of  | 
         | 
   293 what the function returns---a Boolean in this case for  | 
         | 
   294 indicating that the function has reached 1. Finally, after the  | 
         | 
   295 \code{=} comes the \emph{body} of the function implementing | 
         | 
   296 what the function is supposed to do. What the \code{collatz} | 
         | 
   297 function does should be pretty self-explanatory: the function  | 
         | 
   298 first tests whether \code{n} is equal to 1 in which case it | 
         | 
   299 returns \code{true} and so on. | 
         | 
   300   | 
         | 
   301 Notice the quirk in Scala's syntax for \code{if}s: The condition | 
         | 
   302 needs to be enclosed in parentheses and the then-case comes  | 
         | 
   303 right after the condition---there is no \code{then} keyword in | 
         | 
   304 Scala.  | 
         | 
   305   | 
         | 
   306 The real power of Scala comes, however, from the ability to  | 
         | 
   307 define functions by \emph{pattern matching}. In the | 
         | 
   308 \code{collatz} function above we need to test each case using a | 
         | 
   309 sequence of \code{if}s. This can be very cumbersome and brittle | 
         | 
   310 if there are many cases. If we wanted to define a function  | 
         | 
   311 over regular expressions in Java, for example, which does not  | 
         | 
   312 have pattern-matching, the resulting code would just be  | 
         | 
   313 awkward.  | 
         | 
   314   | 
         | 
   315 Mathematicians already use the power of pattern-matching when  | 
         | 
   316 they define the function that takes a regular expression and  | 
         | 
   317 produces another regular expression that can recognise the  | 
         | 
   318 reversed strings. They define this function as follows:  | 
         | 
   319   | 
         | 
   320 \begin{center} | 
         | 
   321 \begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l} | 
         | 
   322 $rev(\ZERO)$   & $\dn$ & $\ZERO$\\  | 
         | 
   323 $rev(\ONE)$      & $\dn$ & $\ONE$\\  | 
         | 
   324 $rev(c)$             & $\dn$ & $c$\\  | 
         | 
   325 $rev(r_1 + r_2)$     & $\dn$ & $rev(r_1) + rev(r_2)$\\  | 
         | 
   326 $rev(r_1 \cdot r_2)$ & $\dn$ & $rev(r_2) \cdot rev(r_1)$\\  | 
         | 
   327 $rev(r^*)$                   & $\dn$ & $rev(r)^*$\\  | 
         | 
   328 \end{tabular} | 
         | 
   329 \end{center} | 
         | 
   330   | 
         | 
   331 \noindent It is defined by recursion analysing each pattern of  | 
         | 
   332 what the regular expression might look like. The corresponding  | 
         | 
   333 Scala code looks very similar to this definition, thanks to  | 
         | 
   334 pattern-matching.  | 
         | 
   335   | 
         | 
   336 \lstinputlisting[language=Scala]{../progs/rev.scala} | 
         | 
   337   | 
         | 
   338 \noindent The keyword for starting a pattern-match is  | 
         | 
   339 \code{match} followed by a list of \code{case}s. Before the | 
         | 
   340 match keyword can be another pattern, but often, as in the  | 
         | 
   341 case above, it is just a variable you want to pattern-match  | 
         | 
   342 (the \code{r} after \code{=} in Line 1). | 
         | 
   343   | 
         | 
   344 Each case in this definition follows the structure of how we  | 
         | 
   345 defined regular expressions as inductive datatype. For example  | 
         | 
   346 the case in Line 3 you can read as: if the regular expression  | 
         | 
   347 \code{r} is of the form \code{EMPTY} then do whatever follows | 
         | 
   348 the \code{=>} (in this case just return \code{EMPTY}). Line 5 | 
         | 
   349 reads as: if the regular expression \code{r} is of the form | 
         | 
   350 \code{ALT(r1, r2)}, where the left-branch of the alternative is | 
         | 
   351 matched by the variable \code{r1} and the right-branch by | 
         | 
   352 \code{r2} then do ``something''. The ``something'' can now use the | 
         | 
   353 variables \code{r1} and \code{r2} from the match.  | 
         | 
   354   | 
         | 
   355 If you want to play with this function, call it for example  | 
         | 
   356 with the regular expression $ab + ac$. This regular expression  | 
         | 
   357 can recognise the strings $ab$ and $ac$. The function   | 
         | 
   358 \code{rev} produces $ba + ca$, which can recognise the reversed | 
         | 
   359 strings $ba$ and $ca$.  | 
         | 
   360   | 
         | 
   361 In Scala each pattern-match can also be guarded as in  | 
         | 
   362   | 
         | 
   363 \begin{lstlisting}[ numbers=none] | 
         | 
   364 case Pattern if Condition => Do_Something  | 
         | 
   365 \end{lstlisting} | 
         | 
   366   | 
         | 
   367 \noindent This allows us, for example, to re-write the   | 
         | 
   368 \code{collatz}-function from above as follows: | 
         | 
   369    | 
         | 
   370 \lstinputlisting[language=Scala]{../progs/collatz2.scala} | 
         | 
   371   | 
         | 
   372   | 
         | 
   373 \noindent Although in this particular case the pattern-match  | 
         | 
   374 does not improve the code in any way. In cases like  | 
         | 
   375 \code{rev}, however, it is really crucial. The underscore in | 
         | 
   376 Line 4 indicates that we do not care what the pattern looks  | 
         | 
   377 like. Thus this case acts like a default case whenever the  | 
         | 
   378 cases above did not match. Cases are always tried out from top  | 
         | 
   379 to bottom.  | 
         | 
   380    | 
         | 
   381 \subsection*{Loops, or better the Absence thereof} | 
         | 
   382   | 
         | 
   383 Coming from Java or C, you might be surprised that Scala does  | 
         | 
   384 not really have loops. It has instead, what is in functional  | 
         | 
   385 programming called, \emph{maps}. To illustrate how they work, | 
         | 
   386 let us assume you have a list of numbers from 1 to 8 and want to  | 
         | 
   387 build the list of squares. The list of numbers from 1 to 8   | 
         | 
   388 can be constructed in Scala as follows:  | 
         | 
   389   | 
         | 
   390 \begin{lstlisting}[numbers=none] | 
         | 
   391 scala> (1 to 8).toList  | 
         | 
   392 res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8)  | 
         | 
   393 \end{lstlisting} | 
         | 
   394   | 
         | 
   395 \noindent Generating from this list, the list of squares in a  | 
         | 
   396 programming language such as Java, you would assume the list  | 
         | 
   397 is given as a kind of array. You would then iterate, or loop,  | 
         | 
   398 an index over this array and replace each entry in the array  | 
         | 
   399 by the square. Right? In Scala, and in other functional  | 
         | 
   400 programming languages, you use maps to achieve the same.   | 
         | 
   401   | 
         | 
   402 A map essentially takes a function that describes how each  | 
         | 
   403 element is transformed (for example squared) and a list over  | 
         | 
   404 which this function should work. There are two forms to  | 
         | 
   405 express such maps in Scala. The first way is called a  | 
         | 
   406 \emph{for-comprehension}. Squaring the numbers from 1 to 8 | 
         | 
   407 would look in this form as follows:  | 
         | 
   408   | 
         | 
   409 \begin{lstlisting}[numbers=none] | 
         | 
   410 scala> for (n <- (1 to 8).toList) yield n * n  | 
         | 
   411 res2: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64)  | 
         | 
   412 \end{lstlisting} | 
         | 
   413   | 
         | 
   414 \noindent The important keywords are \code{for} and | 
         | 
   415 \code{yield}. This for-comprehension roughly states that from | 
         | 
   416 the list of numbers we draw \code{n}s and compute the result | 
         | 
   417 of \code{n * n}. As you can see, we specified the list where | 
         | 
   418 each \code{n} comes from, namely \code{(1 to 8).toList}, and | 
         | 
   419 how each element needs to be transformed. This can also be  | 
         | 
   420 expressed in a second way in Scala by using directly  | 
         | 
   421 \code{map}s as follows: | 
         | 
   422   | 
         | 
   423 \begin{lstlisting}[numbers=none] | 
         | 
   424 scala> (1 to 8).toList.map(n => n * n)  | 
         | 
   425 res3 = List(1, 4, 9, 16, 25, 36, 49, 64)  | 
         | 
   426 \end{lstlisting} | 
         | 
   427   | 
         | 
   428 \noindent In this way, the expression \code{n => n * n} stands | 
         | 
   429 for the function that calculates the square (this is how the  | 
         | 
   430 \code{n}s are transformed). This expression for functions | 
         | 
   431 might remind you of your lessons about the lambda-calculus  | 
         | 
   432 where this would have been written as $\lambda n.\,n * n$. It  | 
         | 
   433 might not be obvious, but for-comprehensions are just  | 
         | 
   434 syntactic sugar: when compiling, Scala translates  | 
         | 
   435 for-comprehensions into equivalent maps. This even works  | 
         | 
   436 when for-comprehensions get more complicated (see below).  | 
         | 
   437   | 
         | 
   438 The very charming feature of Scala is that such maps or  | 
         | 
   439 for-comprehensions can be written for any kind of data  | 
         | 
   440 collection, such as lists, sets, vectors, options and so on.  | 
         | 
   441 For example if we instead compute the reminders modulo 3 of  | 
         | 
   442 this list, we can write  | 
         | 
   443   | 
         | 
   444 \begin{lstlisting}[numbers=none] | 
         | 
   445 scala> (1 to 8).toList.map(n => n % 3)  | 
         | 
   446 res4 = List(1, 2, 0, 1, 2, 0, 1, 2)  | 
         | 
   447 \end{lstlisting} | 
         | 
   448   | 
         | 
   449 \noindent If we, however, transform the numbers 1 to 8 not  | 
         | 
   450 into a list, but into a set, and then compute the reminders  | 
         | 
   451 modulo 3 we obtain  | 
         | 
   452   | 
         | 
   453 \begin{lstlisting}[numbers=none] | 
         | 
   454 scala> (1 to 8).toSet[Int].map(n => n % 3)  | 
         | 
   455 res5 = Set(2, 1, 0)  | 
         | 
   456 \end{lstlisting} | 
         | 
   457   | 
         | 
   458 \noindent This is the correct result for sets, as there are  | 
         | 
   459 only three equivalence classes of integers modulo 3. Note that  | 
         | 
   460 in this example we need to ``help'' Scala to transform the  | 
         | 
   461 numbers into a set of integers by explicitly annotating the  | 
         | 
   462 type \code{Int}. Since maps and for-comprehensions are | 
         | 
   463 just syntactic variants of each other, the latter can also be  | 
         | 
   464 written as  | 
         | 
   465   | 
         | 
   466 \begin{lstlisting}[numbers=none] | 
         | 
   467 scala> for (n <- (1 to 8).toSet[Int]) yield n % 3  | 
         | 
   468 res5 = Set(2, 1, 0)  | 
         | 
   469 \end{lstlisting} | 
         | 
   470   | 
         | 
   471 For-comprehensions can also be nested and the selection of   | 
         | 
   472 elements can be guarded. For example if we want to pair up  | 
         | 
   473 the numbers 1 to 4 with the letters a to c, we can write  | 
         | 
   474   | 
         | 
   475 \begin{lstlisting}[numbers=none] | 
         | 
   476 scala> for (n <- (1 to 4).toList;   | 
         | 
   477             m <- ('a' to 'c').toList) yield (n, m) | 
         | 
   478 res6 = List((1,a), (1,b), (1,c), (2,a), (2,b), (2,c),   | 
         | 
   479             (3,a), (3,b), (3,c), (4,a), (4,b), (4,c))  | 
         | 
   480 \end{lstlisting} | 
         | 
   481   | 
         | 
   482 \noindent   | 
         | 
   483 Or if we want to find all pairs of numbers between 1 and 3  | 
         | 
   484 where the sum is an even number, we can write  | 
         | 
   485   | 
         | 
   486 \begin{lstlisting}[numbers=none] | 
         | 
   487 scala> for (n <- (1 to 3).toList;   | 
         | 
   488             m <- (1 to 3).toList;  | 
         | 
   489             if (n + m) % 2 == 0) yield (n, m)  | 
         | 
   490 res7 = List((1,1), (1,3), (2,2), (3,1), (3,3))  | 
         | 
   491 \end{lstlisting} | 
         | 
   492   | 
         | 
   493 \noindent The \code{if}-condition in the for-comprehension | 
         | 
   494 filters out all pairs where the sum is not even.  | 
         | 
   495   | 
         | 
   496 While hopefully this all looks reasonable, there is one  | 
         | 
   497 complication: In the examples above we always wanted to  | 
         | 
   498 transform one list into another list (e.g.~list of squares),  | 
         | 
   499 or one set into another set (set of numbers into set of  | 
         | 
   500 reminders modulo 3). What happens if we just want to print out  | 
         | 
   501 a list of integers? Then actually the for-comprehension  | 
         | 
   502 needs to be modified. The reason is that \code{print}, you | 
         | 
   503 guessed it, does not produce any result, but only produces  | 
         | 
   504 what is in the functional-programming-lingo called a  | 
         | 
   505 side-effect. Printing out the list of numbers from 1 to 5  | 
         | 
   506 would look as follows  | 
         | 
   507   | 
         | 
   508 \begin{lstlisting}[numbers=none] | 
         | 
   509 scala> for (n <- (1 to 5).toList) print(n)  | 
         | 
   510 12345  | 
         | 
   511 \end{lstlisting} | 
         | 
   512   | 
         | 
   513 \noindent  | 
         | 
   514 where you need to omit the keyword \code{yield}. You can | 
         | 
   515 also do more elaborate calculations such as  | 
         | 
   516   | 
         | 
   517 \begin{lstlisting}[numbers=none] | 
         | 
   518 scala> for (n <- (1 to 5).toList) { | 
         | 
   519   val square_n = n * n  | 
         | 
   520   println(s"$n * $n = $square_n")   | 
         | 
   521 }  | 
         | 
   522 1 * 1 = 1  | 
         | 
   523 2 * 2 = 4  | 
         | 
   524 3 * 3 = 9  | 
         | 
   525 4 * 4 = 16  | 
         | 
   526 5 * 5 = 25  | 
         | 
   527 \end{lstlisting} | 
         | 
   528   | 
         | 
   529 \noindent In this code I use a variable assignment (\code{val | 
         | 
   530 square_n = ...} ) and also what is called in Scala a  | 
         | 
   531 \emph{string interpolation}, written \code{s"..."}. The latter | 
         | 
   532 is for printing out an equation. It allows me to refer to the  | 
         | 
   533 integer values \code{n} and \code{square\_n} inside a string. | 
         | 
   534 This is very convenient for printing out ``things''.   | 
         | 
   535   | 
         | 
   536 The corresponding map construction for functions with   | 
         | 
   537 side-effects is in Scala called \code{foreach}. So you  | 
         | 
   538 could also write  | 
         | 
   539   | 
         | 
   540   | 
         | 
   541 \begin{lstlisting}[numbers=none] | 
         | 
   542 scala> (1 to 5).toList.foreach(n => print(n))  | 
         | 
   543 12345  | 
         | 
   544 \end{lstlisting} | 
         | 
   545   | 
         | 
   546   | 
         | 
   547 \noindent or even just  | 
         | 
   548   | 
         | 
   549 \begin{lstlisting}[numbers=none] | 
         | 
   550 scala> (1 to 5).toList.foreach(print)  | 
         | 
   551 12345  | 
         | 
   552 \end{lstlisting} | 
         | 
   553   | 
         | 
   554 \noindent Again I hope this reminds you a bit of your  | 
         | 
   555 lambda-calculus lessons, where an explanation is given why  | 
         | 
   556 both forms produce the same result.  | 
         | 
   557   | 
         | 
   558   | 
         | 
   559 If you want to find out more about maps and functions with  | 
         | 
   560 side-effects, you can ponder about the response Scala gives if  | 
         | 
   561 you replace \code{foreach} by \code{map} in the expression | 
         | 
   562 above. Scala will still allow \code{map} with side-effect | 
         | 
   563 functions, but then reacts with a slightly interesting result.  | 
         | 
   564   | 
         | 
   565 \subsection*{Types} | 
         | 
   566   | 
         | 
   567 In most functional programming languages, types play an  | 
         | 
   568 important role. Scala is such a language. You have already  | 
         | 
   569 seen built-in types, like \code{Int}, \code{Boolean}, | 
         | 
   570 \code{String} and \code{BigInt}, but also user-defined ones, | 
         | 
   571 like \code{Rexp}. Unfortunately, types can be a thorny | 
         | 
   572 subject, especially in Scala. For example, why do we need to  | 
         | 
   573 give the type to \code{toSet[Int]}, but not to \code{toList}? | 
         | 
   574 The reason is the power of Scala, which sometimes means it  | 
         | 
   575 cannot infer all necessary typing information. At the  | 
         | 
   576 beginning while getting familiar with Scala, I recommend a  | 
         | 
   577 ``play-it-by-ear-approach'' to types. Fully understanding  | 
         | 
   578 type-systems, especially complicated ones like in Scala, can  | 
         | 
   579 take a module on their own.\footnote{Still, such a study can | 
         | 
   580 be a rewarding training: If you are in the business of  | 
         | 
   581 designing new programming languages, you will not be able to  | 
         | 
   582 turn a blind eye to types. They essentially help programmers  | 
         | 
   583 to avoid common programming errors and help with maintaining  | 
         | 
   584 code.}  | 
         | 
   585   | 
         | 
   586 In Scala, types are needed whenever you define an inductive  | 
         | 
   587 datatype and also whenever you define functions (their  | 
         | 
   588 arguments and their results need a type). Base types are types  | 
         | 
   589 that do not take any (type)arguments, for example \code{Int} | 
         | 
   590 and \code{String}. Compound types take one or more arguments, | 
         | 
   591 which as seen earlier need to be given in angle-brackets, for  | 
         | 
   592 example \code{List[Int]} or \code{Set[List[String]]} or  | 
         | 
   593 \code{Map[Int, Int]}. | 
         | 
   594   | 
         | 
   595 There are a few special type-constructors that fall outside  | 
         | 
   596 this pattern. One is for tuples, where the type is written  | 
         | 
   597 with parentheses. For example   | 
         | 
   598   | 
         | 
   599 \begin{lstlisting}[ numbers=none] | 
         | 
   600 (Int, Int, String)  | 
         | 
   601 \end{lstlisting} | 
         | 
   602   | 
         | 
   603 \noindent is for a triple (a tuple with three components---two  | 
         | 
   604 integers and a string). Tuples are helpful if you want to  | 
         | 
   605 define functions with multiple results, say the function  | 
         | 
   606 returning the quotient and reminder of two numbers. For this  | 
         | 
   607 you might define:  | 
         | 
   608   | 
         | 
   609   | 
         | 
   610 \begin{lstlisting}[ numbers=none] | 
         | 
   611 def quo_rem(m: Int, n: Int) : (Int, Int) = (m / n, m % n)  | 
         | 
   612 \end{lstlisting} | 
         | 
   613   | 
         | 
   614   | 
         | 
   615 \noindent Since this function returns a pair of integers, its  | 
         | 
   616 return type needs to be of type \code{(Int, Int)}. | 
         | 
   617 Incidentally, this is also the input type of this function.  | 
         | 
   618 Notice this function takes \emph{two} arguments, namely | 
         | 
   619 \code{m} and \code{n}, both of which are integers. They are | 
         | 
   620 ``packaged'' in a pair. Consequently the complete type of  | 
         | 
   621 \code{quo_rem} is | 
         | 
   622   | 
         | 
   623 \begin{lstlisting}[ numbers=none] | 
         | 
   624 (Int, Int) => (Int, Int)  | 
         | 
   625 \end{lstlisting} | 
         | 
   626   | 
         | 
   627 Another special type-constructor is for functions, written as  | 
         | 
   628 the arrow \code{=>}. For example, the type \code{Int => | 
         | 
   629 String} is for a function that takes an integer as input  | 
         | 
   630 argument and produces a string as result. A function of this  | 
         | 
   631 type is for instance  | 
         | 
   632   | 
         | 
   633 \begin{lstlisting}[numbers=none] | 
         | 
   634 def mk_string(n: Int) : String = n match { | 
         | 
   635   case 0 => "zero"  | 
         | 
   636   case 1 => "one"  | 
         | 
   637   case 2 => "two"  | 
         | 
   638   case _ => "many"   | 
         | 
   639 }   | 
         | 
   640 \end{lstlisting} | 
         | 
   641   | 
         | 
   642 \noindent It takes an integer as input argument and returns a  | 
         | 
   643 string. Unlike other functional programming languages, there  | 
         | 
   644 is in Scala no easy way to find out the types of existing  | 
         | 
   645 functions, except by looking into the documentation  | 
         | 
   646   | 
         | 
   647 \begin{quote} | 
         | 
   648 \url{http://www.scala-lang.org/api/current/} | 
         | 
   649 \end{quote} | 
         | 
   650   | 
         | 
   651 The function arrow can also be iterated, as in   | 
         | 
   652 \code{Int => String => Boolean}. This is the type for a function | 
         | 
   653 taking an integer as first argument and a string as second,  | 
         | 
   654 and the result of the function is a boolean. Though silly, a  | 
         | 
   655 function of this type would be  | 
         | 
   656   | 
         | 
   657   | 
         | 
   658 \begin{lstlisting}[numbers=none] | 
         | 
   659 def chk_string(n: Int)(s: String) : Boolean =   | 
         | 
   660   mk_string(n) == s  | 
         | 
   661 \end{lstlisting} | 
         | 
   662   | 
         | 
   663   | 
         | 
   664 \noindent which checks whether the integer \code{n} | 
         | 
   665 corresponds to the name \code{s} given by the function | 
         | 
   666 \code{mk\_string}. Notice the unusual way of specifying the | 
         | 
   667 arguments of this function: the arguments are given one after  | 
         | 
   668 the other, instead of being in a pair (what would be the type  | 
         | 
   669 of this function then?). This way of specifying the arguments  | 
         | 
   670 can be useful, for example in situations like this  | 
         | 
   671   | 
         | 
   672 \begin{lstlisting}[numbers=none] | 
         | 
   673 scala> List("one", "two", "three", "many").map(chk_string(2)) | 
         | 
   674 res4 = List(false, true, false, false)  | 
         | 
   675   | 
         | 
   676 scala> List("one", "two", "three", "many").map(chk_string(3)) | 
         | 
   677 res5 = List(false, false, false, true)  | 
         | 
   678 \end{lstlisting} | 
         | 
   679   | 
         | 
   680 \noindent In each case we can give to \code{map} a specialised | 
         | 
   681 version of \code{chk_string}---once specialised to 2 and once | 
         | 
   682 to 3. This kind of ``specialising'' a function is called  | 
         | 
   683 \emph{partial application}---we have not yet given to this | 
         | 
   684 function all arguments it needs, but only some of them.  | 
         | 
   685   | 
         | 
   686 Coming back to the type \code{Int => String => Boolean}. The | 
         | 
   687 rule about such function types is that the right-most type  | 
         | 
   688 specifies what the function returns (a boolean in this case).  | 
         | 
   689 The types before that specify how many arguments the function  | 
         | 
   690 expects and what their type is (in this case two arguments,  | 
         | 
   691 one of type \code{Int} and another of type \code{String}). | 
         | 
   692 Given this rule, what kind of function has type  | 
         | 
   693 \mbox{\code{(Int => String) => Boolean}}? Well, it returns a | 
         | 
   694 boolean. More interestingly, though, it only takes a single  | 
         | 
   695 argument (because of the parentheses). The single argument  | 
         | 
   696 happens to be another function (taking an integer as input and  | 
         | 
   697 returning a string). Remember that \code{mk_string} is just  | 
         | 
   698 such a function. So how can we use it? For this define  | 
         | 
   699 the somewhat silly function \code{apply_3}: | 
         | 
   700   | 
         | 
   701 \begin{lstlisting}[numbers=none] | 
         | 
   702 def apply_3(f: Int => String): Bool = f(3) == "many"  | 
         | 
   703   | 
         | 
   704 scala> apply_3(mk_string)  | 
         | 
   705 res6 = true  | 
         | 
   706 \end{lstlisting} | 
         | 
   707   | 
         | 
   708 You might ask: Apart from silly functions like above, what is  | 
         | 
   709 the point of having functions as input arguments to other  | 
         | 
   710 functions? In Java there is indeed no need of this kind of  | 
         | 
   711 feature: at least in the past it did not allow such  | 
         | 
   712 constructions. I think, the point of Java 8 is to lift this  | 
         | 
   713 restriction. But in all functional programming languages,  | 
         | 
   714 including Scala, it is really essential to allow functions as  | 
         | 
   715 input argument. Above you already seen \code{map} and | 
         | 
   716 \code{foreach} which need this. Consider the functions | 
         | 
   717 \code{print} and \code{println}, which both print out strings, | 
         | 
   718 but the latter adds a line break. You can call \code{foreach} | 
         | 
   719 with either of them and thus changing how, for example, five  | 
         | 
   720 numbers are printed.  | 
         | 
   721   | 
         | 
   722   | 
         | 
   723 \begin{lstlisting}[numbers=none] | 
         | 
   724 scala> (1 to 5).toList.foreach(print)  | 
         | 
   725 12345  | 
         | 
   726 scala> (1 to 5).toList.foreach(println)  | 
         | 
   727 1  | 
         | 
   728 2  | 
         | 
   729 3  | 
         | 
   730 4  | 
         | 
   731 5  | 
         | 
   732 \end{lstlisting} | 
         | 
   733   | 
         | 
   734   | 
         | 
   735 \noindent This is actually one of the main design principles  | 
         | 
   736 in functional programming. You have generic functions like  | 
         | 
   737 \code{map} and \code{foreach} that can traverse data containers, | 
         | 
   738 like lists or sets. They then take a function to specify what  | 
         | 
   739 should be done with each element during the traversal. This  | 
         | 
   740 requires that the generic traversal functions can cope with  | 
         | 
   741 any kind of function (not just functions that, for example,  | 
         | 
   742 take as input an integer and produce a string like above).  | 
         | 
   743 This means we cannot fix the type of the generic traversal  | 
         | 
   744 functions, but have to keep them  | 
         | 
   745 \emph{polymorphic}.\footnote{Another interestic topic about | 
         | 
   746 types, but we omit it here for the sake of brevity.}   | 
         | 
   747   | 
         | 
   748 There is one more type constructor that is rather special. It  | 
         | 
   749 is called \code{Unit}. Recall that \code{Boolean} has two | 
         | 
   750 values, namely \code{true} and \code{false}. This can be used, | 
         | 
   751 for example, to test something and decide whether the test  | 
         | 
   752 succeeds or not. In contrast the type \code{Unit} has only a | 
         | 
   753 single value, written \code{()}. This seems like a completely | 
         | 
   754 useless type and return value for a function, but is actually  | 
         | 
   755 quite useful. It indicates when the function does not return  | 
         | 
   756 any result. The purpose of these functions is to cause  | 
         | 
   757 something being written on the screen or written into a file,  | 
         | 
   758 for example. This is what is called they cause some effect on   | 
         | 
   759 the side, namely a new content displayed on the screen or some  | 
         | 
   760 new data in a file. Scala uses the \code{Unit} type to indicate | 
         | 
   761 that a function does not have a result, but potentially causes  | 
         | 
   762 some side-effect. Typical examples are the printing functions,   | 
         | 
   763 like \code{print}. | 
         | 
   764   | 
         | 
   765   | 
         | 
   766 \subsection*{Cool Stuff} | 
         | 
   767   | 
         | 
   768 The first wow-moment I had with Scala was when I came across  | 
         | 
   769 the following code-snippet for reading a web-page.   | 
         | 
   770   | 
         | 
   771   | 
         | 
   772 \begin{lstlisting}[ numbers=none] | 
         | 
   773 import io.Source  | 
         | 
   774 val url = """http://www.inf.kcl.ac.uk/staff/urbanc/"""  | 
         | 
   775 Source.fromURL(url)("ISO-8859-1").take(10000).mkString | 
         | 
   776 \end{lstlisting} | 
         | 
   777   | 
         | 
   778   | 
         | 
   779 \noindent These three lines return a string containing the  | 
         | 
   780 HTML-code of my webpage. It actually already does something  | 
         | 
   781 more sophisticated, namely only returns the first 10000  | 
         | 
   782 characters of a webpage in case it is too large. Why is that  | 
         | 
   783 code-snippet of any interest? Well, try implementing  | 
         | 
   784 reading-from-a-webpage in Java. I also like the possibility of  | 
         | 
   785 triple-quoting strings, which I have only seen in Scala so  | 
         | 
   786 far. The idea behind this is that in such a string all  | 
         | 
   787 characters are interpreted literally---there are no escaped  | 
         | 
   788 characters, like \verb|\n| for newlines.  | 
         | 
   789   | 
         | 
   790 My second wow-moment I had with a feature of Scala that other  | 
         | 
   791 functional programming languages do not have. This feature is  | 
         | 
   792 about implicit type conversions. If you have regular  | 
         | 
   793 expressions and want to use them for language processing you  | 
         | 
   794 often want to recognise keywords in a language, for example  | 
         | 
   795 \code{for},{} \code{if},{} \code{yield} and so on. But the | 
         | 
   796 basic regular expression \code{CHAR} can only recognise a | 
         | 
   797 single character. In order to recognise a whole string, like  | 
         | 
   798 \code{for}, you have to put many of those together using | 
         | 
   799 \code{SEQ}: | 
         | 
   800   | 
         | 
   801   | 
         | 
   802 \begin{lstlisting}[numbers=none] | 
         | 
   803 SEQ(CHAR('f'), SEQ(CHAR('o'), CHAR('r'))) | 
         | 
   804 \end{lstlisting} | 
         | 
   805   | 
         | 
   806 \noindent This gets quickly unreadable when the strings and  | 
         | 
   807 regular expressions get more complicated. In other functional  | 
         | 
   808 programming languages, you can explicitly write a conversion  | 
         | 
   809 function that takes a string, say \dq{\pcode{for}}, and | 
         | 
   810 generates the regular expression above. But then your code is  | 
         | 
   811 littered with such conversion functions.  | 
         | 
   812   | 
         | 
   813 In Scala you can do better by ``hiding'' the conversion  | 
         | 
   814 functions. The keyword for doing this is \code{implicit} and | 
         | 
   815 it needs a built-in library called   | 
         | 
   816   | 
         | 
   817 \begin{lstlisting}[numbers=none] | 
         | 
   818 scala.language.implicitConversions  | 
         | 
   819 \end{lstlisting} | 
         | 
   820   | 
         | 
   821 \noindent  | 
         | 
   822 Consider the code  | 
         | 
   823   | 
         | 
   824   | 
         | 
   825 \begin{lstlisting}[language=Scala] | 
         | 
   826 import scala.language.implicitConversions  | 
         | 
   827   | 
         | 
   828 def charlist2rexp(s: List[Char]) : Rexp = s match { | 
         | 
   829   case Nil => EMPTY  | 
         | 
   830   case c::Nil => CHAR(c)  | 
         | 
   831   case c::s => SEQ(CHAR(c), charlist2rexp(s))  | 
         | 
   832 }  | 
         | 
   833   | 
         | 
   834 implicit def string2rexp(s: String) : Rexp =   | 
         | 
   835   charlist2rexp(s.toList)  | 
         | 
   836 \end{lstlisting} | 
         | 
   837   | 
         | 
   838   | 
         | 
   839 \noindent where the first seven lines implement a function  | 
         | 
   840 that given a list of characters generates the corresponding  | 
         | 
   841 regular expression. In Lines 9 and 10, this function is used  | 
         | 
   842 for transforming a string into a regular expression. Since the  | 
         | 
   843 \code{string2rexp}-function is declared as \code{implicit}, | 
         | 
   844 the effect will be that whenever Scala expects a regular  | 
         | 
   845 expression, but I only give it a string, it will automatically  | 
         | 
   846 insert a call to the \code{string2rexp}-function. I can now | 
         | 
   847 write for example  | 
         | 
   848   | 
         | 
   849 \begin{lstlisting}[numbers=none] | 
         | 
   850 scala> ALT("ab", "ac") | 
         | 
   851 res9 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c)))  | 
         | 
   852 \end{lstlisting} | 
         | 
   853   | 
         | 
   854 \noindent Recall that \code{ALT} expects two regular | 
         | 
   855 expressions as arguments, but I only supply two strings. The  | 
         | 
   856 implicit conversion function will transform the string into a  | 
         | 
   857 regular expression.  | 
         | 
   858   | 
         | 
   859 Using implicit definitions, Scala allows me to introduce  | 
         | 
   860 some further syntactic sugar for regular expressions:  | 
         | 
   861   | 
         | 
   862   | 
         | 
   863 \begin{lstlisting}[ numbers=none] | 
         | 
   864 implicit def RexpOps(r: Rexp) = new { | 
         | 
   865   def | (s: Rexp) = ALT(r, s)  | 
         | 
   866   def ~ (s: Rexp) = SEQ(r, s)  | 
         | 
   867   def % = STAR(r)  | 
         | 
   868 }  | 
         | 
   869   | 
         | 
   870 implicit def stringOps(s: String) = new { | 
         | 
   871   def | (r: Rexp) = ALT(s, r)  | 
         | 
   872   def | (r: String) = ALT(s, r)  | 
         | 
   873   def ~ (r: Rexp) = SEQ(s, r)  | 
         | 
   874   def ~ (r: String) = SEQ(s, r)  | 
         | 
   875   def % = STAR(s)  | 
         | 
   876 }  | 
         | 
   877 \end{lstlisting} | 
         | 
   878   | 
         | 
   879    | 
         | 
   880 \noindent This might seem a bit overly complicated, but its effect is  | 
         | 
   881 that I can now write regular expressions such as $ab + ac$   | 
         | 
   882 simply as  | 
         | 
   883   | 
         | 
   884   | 
         | 
   885 \begin{lstlisting}[numbers=none] | 
         | 
   886 scala> "ab" | "ac"  | 
         | 
   887 res10 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c)))  | 
         | 
   888 \end{lstlisting} | 
         | 
   889   | 
         | 
   890    | 
         | 
   891 \noindent I leave you to figure out what the other  | 
         | 
   892 syntactic sugar in the code above stands for.  | 
         | 
   893    | 
         | 
   894 One more useful feature of Scala is the ability to define  | 
         | 
   895 functions with varying argument lists. This is a feature that  | 
         | 
   896 is already present in old languages, like C, but seems to have  | 
         | 
   897 been forgotten in the meantime---Java does not have it. In the  | 
         | 
   898 context of regular expressions this feature comes in handy:  | 
         | 
   899 Say you are fed up with writing many alternatives as  | 
         | 
   900   | 
         | 
   901   | 
         | 
   902 \begin{lstlisting}[numbers=none] | 
         | 
   903 ALT(..., ALT(..., ALT(..., ...)))  | 
         | 
   904 \end{lstlisting} | 
         | 
   905   | 
         | 
   906   | 
         | 
   907 \noindent To make it difficult, you do not know how deep such  | 
         | 
   908 alternatives are nested. So you need something flexible that  | 
         | 
   909 can take as many alternatives as needed. In Scala one can  | 
         | 
   910 achieve this by adding a \code{*} to the type of an argument. | 
         | 
   911 Consider the code  | 
         | 
   912   | 
         | 
   913   | 
         | 
   914 \begin{lstlisting}[language=Scala] | 
         | 
   915 def Alts(rs: List[Rexp]) : Rexp = rs match { | 
         | 
   916   case Nil => NULL  | 
         | 
   917   case r::Nil => r  | 
         | 
   918   case r::rs => ALT(r, Alts(rs))  | 
         | 
   919 }  | 
         | 
   920   | 
         | 
   921 def ALTS(rs: Rexp*) = Alts(rs.toList)  | 
         | 
   922 \end{lstlisting} | 
         | 
   923   | 
         | 
   924   | 
         | 
   925 \noindent The function in Lines 1 to 5 takes a list of regular  | 
         | 
   926 expressions and converts it into an appropriate alternative  | 
         | 
   927 regular expression. In Line 7 there is a wrapper for this  | 
         | 
   928 function which uses the feature of varying argument lists. The  | 
         | 
   929 effect of this code  is that I can write the regular  | 
         | 
   930 expression for keywords as  | 
         | 
   931   | 
         | 
   932   | 
         | 
   933 \begin{lstlisting}[numbers=none] | 
         | 
   934 ALTS("for", "def", "yield", "implicit", "if", "match", "case") | 
         | 
   935 \end{lstlisting} | 
         | 
   936   | 
         | 
   937   | 
         | 
   938 \noindent Again I leave it to you to find out how much this  | 
         | 
   939 simplifies the regular expression in comparison with if I had  | 
         | 
   940 to write this by hand using only the ``plain'' regular  | 
         | 
   941 expressions from the inductive datatype.  | 
         | 
   942   | 
         | 
   943 \subsection*{More Info} | 
         | 
   944   | 
         | 
   945 There is much more to Scala than I can possibly describe in  | 
         | 
   946 this document. Fortunately there are a number of free books  | 
         | 
   947 about Scala and of course lots of help online. For example  | 
         | 
   948   | 
         | 
   949 \begin{itemize} | 
         | 
   950 \item \url{http://www.scala-lang.org/docu/files/ScalaByExample.pdf} | 
         | 
   951 \item \url{http://www.scala-lang.org/docu/files/ScalaTutorial.pdf} | 
         | 
   952 \item \url{https://www.youtube.com/user/ShadowofCatron} | 
         | 
   953 \item \url{http://docs.scala-lang.org/tutorials} | 
         | 
   954 \item \url{https://www.scala-exercises.org} | 
         | 
   955 \end{itemize} | 
         | 
   956   | 
         | 
   957 \noindent There is also a course at Coursera on Functional  | 
         | 
   958 Programming Principles in Scala by Martin Odersky, the main  | 
         | 
   959 developer of the Scala language. And a document that explains  | 
         | 
   960 Scala for Java programmers  | 
         | 
   961   | 
         | 
   962 \begin{itemize} | 
         | 
   963 \item \small\url{http://docs.scala-lang.org/tutorials/scala-for-java-programmers.html} | 
         | 
   964 \end{itemize} | 
         | 
   965   | 
         | 
   966 While I am quite enthusiastic about Scala, I am also happy to  | 
         | 
   967 admit that it has more than its fair share of faults. The  | 
         | 
   968 problem seen earlier of having to give an explicit type to  | 
         | 
   969 \code{toSet}, but not \code{toList} is one of them. There are | 
         | 
   970 also many ``deep'' ideas about types in Scala, which even to  | 
         | 
   971 me as seasoned functional programmer are puzzling. Whilst  | 
         | 
   972 implicits are great, they can also be a source of great  | 
         | 
   973 headaches, for example consider the code:  | 
         | 
   974   | 
         | 
   975 \begin{lstlisting}[numbers=none] | 
         | 
   976 scala>  List (1, 2, 3) contains "your mom"  | 
         | 
   977 res1: Boolean = false  | 
         | 
   978 \end{lstlisting} | 
         | 
   979   | 
         | 
   980 \noindent Rather than returning \code{false}, this code should | 
         | 
   981 throw a typing-error. There are also many limitations Scala  | 
         | 
   982 inherited from the JVM that can be really annoying. For  | 
         | 
   983 example a fixed stack size. One can work around this  | 
         | 
   984 particular limitation, but why does one have to?  | 
         | 
   985 More such `puzzles' can be found at  | 
         | 
   986   | 
         | 
   987 \begin{center} | 
         | 
   988   \url{http://scalapuzzlers.com} and | 
         | 
   989   \url{http://latkin.org/blog/2017/05/02/when-the-scala-compiler-doesnt-help/} | 
         | 
   990 \end{center} | 
         | 
   991   | 
         | 
   992 Even if Scala has been a success in several high-profile  | 
         | 
   993 companies, there is also a company (Yammer) that first used  | 
         | 
   994 Scala in their production code, but then moved away from it.  | 
         | 
   995 Allegedly they did not like the steep learning curve of Scala  | 
         | 
   996 and also that new versions of Scala often introduced  | 
         | 
   997 incompatibilities in old code. In the past two months  | 
         | 
   998 there have also been two forks of the Scala compiler.  | 
         | 
   999 It needs to be seen what the future brings for Scala.  | 
         | 
  1000   | 
         | 
  1001 So all in all, Scala might not be a great teaching language,  | 
         | 
  1002 but I hope this is mitigated by the fact that I never require  | 
         | 
  1003 you to write any Scala code. You only need to be able to read  | 
         | 
  1004 it. In the coursework you can use any programming language you  | 
         | 
  1005 like. If you want to use Scala for this, then be my guest; if  | 
         | 
  1006 you do not want, stick with the language you are most familiar  | 
         | 
  1007 with.  | 
         | 
  1008   | 
         | 
  1009   | 
         | 
  1010   | 
         | 
  1011 \end{document} | 
         | 
  1012   | 
         | 
  1013 %%% Local Variables:   | 
         | 
  1014 %%% mode: latex  | 
         | 
  1015 %%% TeX-master: t  | 
         | 
  1016 %%% End:   |