# HG changeset patch # User Christian Urban # Date 1409752663 -3600 # Node ID 527fdb90fffebee030d7a2ac813a823f3f41db29 # Parent 370c0647a9bf3e74ec4972d32688fd529f19a92d updated diff -r 370c0647a9bf -r 527fdb90fffe handouts/scala-ho.pdf Binary file handouts/scala-ho.pdf has changed diff -r 370c0647a9bf -r 527fdb90fffe handouts/scala-ho.tex --- a/handouts/scala-ho.tex Wed Sep 03 11:01:49 2014 +0100 +++ b/handouts/scala-ho.tex Wed Sep 03 14:57:43 2014 +0100 @@ -33,7 +33,7 @@ lectures because its functional programming-style allows me to implement the functions we will discuss with very small code-snippets. If I had to do this in Java, for example, I -would first have to run through heaps of boilerplate code and +would first have to go through heaps of boilerplate code and the code-snippets would not look pretty. Since the Scala compiler is free, you can download the code-snippets and run every example I give. But if you prefer, you can also easily @@ -87,7 +87,7 @@ between a function that returns a result, like addition, and a function that causes a side-effect, like \code{print}. We shall come back to this point later, but if you are curious -now, the latter kind of functions always have as return type +now, the latter kind of functions always has as return type \code{Unit}. If you want to write a stand-alone app in Scala, you can @@ -120,13 +120,15 @@ hello world \end{lstlisting} +\noindent You might need to adapt the path to where you have +installed Scala. \subsection*{Inductive Datatypes} The elegance and conciseness of Scala programs are often a -result of inductive datatypes that can be easily defined. For -example in ``every-day mathematics'' we define regular -expressions simply by giving the grammar +result of inductive datatypes that can be easily defined in +Scala. For example in ``every-day mathematics'' we define +regular expressions simply by giving the grammar \begin{center} \begin{tabular}{r@{\hspace{2mm}}r@{\hspace{2mm}}l@{\hspace{13mm}}l} @@ -178,7 +180,7 @@ datatypes, feel free to define for example binary trees. Once you make a definition like the one above in Scala, you -can represent, for example, the regular expression for $a + b$ +can represent the regular expression for $a + b$, for example, as \code{ALT(CHAR('a'), CHAR('b'))}. Expressions such as \code{'a'} stand for ASCII characters, though in the output syntax, as you can see below, the quotes are omitted. In a @@ -284,7 +286,6 @@ right after the condition---there is no \code{then} keyword in Scala. - The real power of Scala comes, however, from the ability to define functions by \emph{pattern matching}. In the \code{collatz} function above we need to test each case using a @@ -353,10 +354,10 @@ \lstinputlisting[language=Scala]{../progs/collatz2.scala} -\noindent Although in this case the pattern-match does not -improve the code in any way. In cases like \code{rev} it -is really crucial. The underscore in the last case indicates -that we do not care what the pattern looks like. Thus Line 4 +\noindent Although in this particular case the pattern-match +does not improve the code in any way. In cases like \code{rev} +it is really crucial. The underscore in Line 4 indicates that +we do not care what the patter8n looks like. Thus this case acts like a default case whenever the cases above did not match. Cases are always tried out from top to bottom. @@ -365,13 +366,13 @@ Coming from Java or C, you might be surprised that Scala does not really have loops. It has instead, what is in functional programming called \emph{maps}. To illustrate how they work, -lets assume you have a list of numbers from 1 to 10 and want to -build the list of squares. The list of numbers from 1 to 10 +lets assume you have a list of numbers from 1 to 8 and want to +build the list of squares. The list of numbers from 1 to 8 can be constructed in Scala as follows: \begin{lstlisting}[language=Scala,numbers=none] -scala> (1 to 10).toList -res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) +scala> (1 to 8).toList +res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8) \end{lstlisting} \noindent Generating from this list the list of squares in a @@ -385,26 +386,26 @@ element is transformed (for example squared) and a list over which this function should work. There are two forms to express such maps in Scala. The first way is called a -for-comprehension. Squaring the numbers from 1 to 10 would -look in this form as follows: +\emph{for-comprehension}. Squaring the numbers from 1 to 8 +would look in this form as follows: \begin{lstlisting}[language=Scala,numbers=none] -scala> for (n <- (1 to 10).toList) yield n * n -res2: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100) +scala> for (n <- (1 to 8).toList) yield n * n +res2: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64) \end{lstlisting} \noindent The important keywords are \code{for} and \code{yield}. This for-comprehension roughly states that from the list of numbers we draw \code{n}s and compute the result of \code{n * n}. As you can see, we specified the list where -each \code{n} comes from, namely \code{(1 to 10).toList}, and +each \code{n} comes from, namely \code{(1 to 8).toList}, and how each element needs to be transformed. This can also be expressed in a second way in Scala by using directly \code{map}s as follows: \begin{lstlisting}[language=Scala,numbers=none] -scala> (1 to 10).toList.map(n => n * n) -res3 = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100) +scala> (1 to 8).toList.map(n => n * n) +res3 = List(1, 4, 9, 16, 25, 36, 49, 64) \end{lstlisting} \noindent In this way, the expression \code{n => n * n} stands @@ -424,16 +425,16 @@ this list, we can write \begin{lstlisting}[language=Scala,numbers=none] -scala> (1 to 10).toList.map(n => n % 3) -res4 = List(1, 2, 0, 1, 2, 0, 1, 2, 0, 1) +scala> (1 to 8).toList.map(n => n % 3) +res4 = List(1, 2, 0, 1, 2, 0, 1, 2) \end{lstlisting} -\noindent If we, however, transform the numbers 1 to 10 not +\noindent If we, however, transform the numbers 1 to 8 not into a list, but into a set, and then compute the reminders modulo 3 we obtain \begin{lstlisting}[language=Scala,numbers=none] -scala> (1 to 10).toSet[Int].map(n => n % 3) +scala> (1 to 8).toSet[Int].map(n => n % 3) res5 = Set(2, 1, 0) \end{lstlisting} @@ -446,7 +447,7 @@ written as \begin{lstlisting}[language=Scala,numbers=none] -scala> for (n <- (1 to 10).toSet[Int]) yield n % 3 +scala> for (n <- (1 to 8).toSet[Int]) yield n % 3 res5 = Set(2, 1, 0) \end{lstlisting} @@ -472,9 +473,8 @@ res7 = List((1,1), (1,3), (2,2), (3,1), (3,3)) \end{lstlisting} -\noindent -The \code{if}-condition filters out all pairs where the -sum is not even. +\noindent The \code{if}-condition in the for-comprehension +filters out all pairs where the sum is not even. While hopefully this all looks reasonable, there is one complication: In the examples above we always wanted to @@ -489,12 +489,8 @@ would look as follows \begin{lstlisting}[language=Scala,numbers=none] -scala> for (n <- (1 to 5).toList) println(n) -1 -2 -3 -4 -5 +scala> for (n <- (1 to 5).toList) print(n) +12345 \end{lstlisting} \noindent @@ -526,28 +522,18 @@ \begin{lstlisting}[language=Scala,numbers=none] -scala> (1 to 5).toList.foreach(n => println(n)) -1 -2 -3 -4 -5 +scala> (1 to 5).toList.foreach(n => print(n)) +12345 \end{lstlisting} \noindent or even just - \begin{lstlisting}[language=Scala,numbers=none] -scala> (1 to 5).toList.foreach(println) -1 -2 -3 -4 -5 +scala> (1 to 5).toList.foreach(print) +12345 \end{lstlisting} - \noindent Again I hope this reminds you a bit of your lambda-calculus lessons, where an explanation is given why both forms produce the same result. @@ -567,7 +553,7 @@ \code{String} and \code{BigInt}, but also user-defined ones, like \code{Rexp}. Unfortunately, types can be a thorny subject, especially in Scala. For example, why do we need to -give the type to \code{toSet[Int]} but not to \code{toList}? +give the type to \code{toSet[Int]}, but not to \code{toList}? The reason is the power of Scala, which sometimes means it cannot infer all necessary typing information. At the beginning while getting familiar with Scala, I recommend a @@ -591,11 +577,17 @@ There are a few special type-constructors that fall outside this pattern. One is for tuples, where the type is written -with parentheses. For example \code{(Int, Int, String)} for a -triple consisting of two integers and a string. Tuples are -helpful if you want to define functions with multiple -results, say the function returning the quotient and reminder -of two numbers. For this you might define: +with parentheses. For example + +\begin{lstlisting}[language=Scala, numbers=none] +(Int, Int, String) +\end{lstlisting} + +\noindent is for a triple (a tuple with three components---two +integers and a string). Tuples are helpful if you want to +define functions with multiple results, say the function +returning the quotient and reminder of two numbers. For this +you might define: \begin{lstlisting}[language=Scala, numbers=none] @@ -603,13 +595,13 @@ \end{lstlisting} -\noindent -Since this function returns a pair of integers, its -return type needs to be \code{(Int, Int)}. Incidentally, -this is also the input type of this function. Notice it takes -\emph{two} arguments, namely \code{m} and \code{n}, both -of which are integers. They are ``packaged'' in a pair. -Consequently the complete type of \code{quo_rem} is +\noindent Since this function returns a pair of integers, its +return type needs to be of type \code{(Int, Int)}. +Incidentally, this is also the input type of this function. +Notice this function takes \emph{two} arguments, namely +\code{m} and \code{n}, both of which are integers. They are +``packaged'' in a pair. Consequently the complete type of +\code{quo_rem} is \begin{lstlisting}[language=Scala, numbers=none] (Int, Int) => (Int, Int) @@ -620,7 +612,6 @@ \code{Int => String} is for a function that takes an integer as argument and produces a string. A function of this type is for instance - \begin{lstlisting}[language=Scala,numbers=none] def mk_string(n: Int) : String = n match { case 0 => "zero" @@ -630,7 +621,6 @@ } \end{lstlisting} - \noindent It takes an integer as argument and returns a string. Unlike other functional programming languages, there is in Scala no easy way to find out the types of existing @@ -648,36 +638,65 @@ \begin{lstlisting}[language=Scala,numbers=none] -def chk_string(n: Int, s: String) : Boolean = +def chk_string(n: Int)(s: String) : Boolean = mk_string(n) == s \end{lstlisting} -\noindent which checks whether the integer \code{n} corresponds -to the name \code{s} given by the function \code{mk\_string}. +\noindent which checks whether the integer \code{n} +corresponds to the name \code{s} given by the function +\code{mk\_string}. Notice the unusual way of specifying the +arguments of this function: the arguments are given one after +the other, instead of being in a pair (what would be the type +of this function then?). This way of specifying the arguments +can be useful, for example in situations like this + +\begin{lstlisting}[language=Scala,numbers=none] +scala> List("one", "two", "three", "many").map(chk_string(2)) +res4 = List(false, true, false, false) + +scala> List("one", "two", "three", "many").map(chk_string(3)) +res5 = List(false, false, false, true) +\end{lstlisting} + +\noindent In each case we can give to \code{map} a specialised +version of \code{chk_string}---once specialised to 2 and once +to 3. This kind of ``specialising'' a function is called +\emph{partial application}---we have not yet given to this +function all arguments it needs, but only one of them. Coming back to the type \code{Int => String => Boolean}. The rule about such function types is that the right-most type specifies what the function returns (a boolean in this case). The types before that specify how many arguments the function -expects and what is their type (in this case two arguments, +expects and what their type is (in this case two arguments, one of type \code{Int} and another of type \code{String}). Given this rule, what kind of function has type \mbox{\code{(Int => String) => Boolean}}? Well, it returns a boolean. More interestingly, though, it only takes a single argument (because of the parentheses). The single argument happens to be another function (taking an integer as input and -returning a string). +returning a string). Remember that \code{mk_string} is just +such a function. So how can we use it? For this define +the somewhat silly function \code{apply_3} + +\begin{lstlisting}[language=Scala,numbers=none] +def apply_3(f: Int => String): Bool = f(3) == "many" + +scala> apply_3(mk_string) +res6 = true +\end{lstlisting} -Now you might ask, what is the point of having function as -arguments to other functions? In Java there is no need of this -kind of feature. But in all functional programming languages, -including Scala, it is really essential. Above you already -seen \code{map} and \code{foreach} which need this. Consider -the functions \code{print} and \code{println}, which both -print out strings, but the latter adds a line break. You can -call \code{foreach} with either of them and thus changing how, -for example, five numbers are printed. +You might ask, apart from silly functions like above, what is +the point of having function as arguments to other functions? +In Java there is indeed no need of this kind of feature. But +in all functional programming languages, including Scala, it +is really essential. Above you already seen \code{map} and +\code{foreach} which need this. Consider the functions +\code{print} and \code{println}, which both print out strings, +but the latter adds a line break. You can call \code{foreach} +with either of them and thus changing how, for example, five +numbers are printed. \begin{lstlisting}[language=Scala,numbers=none] @@ -752,7 +771,7 @@ feature is about implicit type conversions. If you have regular expressions and want to use them for language processing you often want to recognise keywords in a language, -for example \code{for}, \code{if}, \code{yield} and so on. But +for example \code{for},{} \code{if},{} \code{yield} and so on. But the basic regular expression, \code{CHAR}, can only recognise a single character. In order to recognise a whole string, like \code{ for}, you have to put many of those together using @@ -773,7 +792,13 @@ In Scala you can do better by ``hiding'' the conversion functions. The keyword for doing this is \code{implicit} and -it needs a built-in library called \code{implicitConversions}. +it needs a built-in library called + +\begin{lstlisting}[language=Scala,numbers=none] +scala.language.implicitConversions +\end{lstlisting} + +\noindent Consider the code @@ -803,7 +828,7 @@ \begin{lstlisting}[language=Scala,numbers=none] scala> ALT("ab", "ac") -res9: ALT = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) +res9 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) \end{lstlisting} \noindent \code{ALT} expects two regular expressions @@ -815,7 +840,7 @@ some further syntactic sugar for regular expressions: -\begin{lstlisting}[language=Scala] +\begin{lstlisting}[language=Scala, numbers=none] implicit def RexpOps(r: Rexp) = new { def | (s: Rexp) = ALT(r, s) def ~ (s: Rexp) = SEQ(r, s) @@ -839,7 +864,7 @@ \begin{lstlisting}[language=Scala,numbers=none] scala> "ab" | "ac" -res10: ALT = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) +res10 = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) \end{lstlisting} diff -r 370c0647a9bf -r 527fdb90fffe langs.sty --- a/langs.sty Wed Sep 03 11:01:49 2014 +0100 +++ b/langs.sty Wed Sep 03 14:57:43 2014 +0100 @@ -1,4 +1,5 @@ \usepackage{listings} +\usepackage{etoolbox} \setmonofont[Scale=.95]{Consolas} \newfontfamily{\consolas}{Consolas} @@ -8,7 +9,8 @@ \definecolor{codedocblue}{rgb}{0.25,0.35,0.75} % doc \definecolor{codeblue}{rgb}{0.25,0.35,0.75} % types - +\BeforeBeginEnvironment{lstlisting}{\par\noindent\begin{minipage}{\linewidth}} +\AfterEndEnvironment{lstlisting}{\end{minipage}\par} \lstdefinelanguage{Scala}{ morekeywords={abstract,case,catch,class,def,%