24 Scala compiles to the Java Virtual Machine (JVM) and therefore |
27 Scala compiles to the Java Virtual Machine (JVM) and therefore |
25 Scala programs can run under MacOSX, Linux and |
28 Scala programs can run under MacOSX, Linux and |
26 Windows.\footnote{There are also experimental backends for |
29 Windows.\footnote{There are also experimental backends for |
27 Android and JavaScript.} Unlike Java, however, Scala often |
30 Android and JavaScript.} Unlike Java, however, Scala often |
28 allows programmers to write very concise and elegant code. |
31 allows programmers to write very concise and elegant code. |
29 Some therefore say Scala is the much better Java. Some |
32 Some therefore say Scala is the much better Java. A number of |
30 companies (The Guardian, Twitter, Coursera, LinkedIn to name a |
33 companies, The Guardian, Twitter, Coursera, LinkedIn to name a |
31 few) either use Scala excusively in production code, or some |
34 few, either use Scala exclusively in production code, or at |
32 part of it are written in Scala. If you want to try out Scala |
35 least to some substantial degree. If you want to try out Scala |
33 yourself, the Scala compiler can be downloaded from |
36 yourself, the Scala compiler can be downloaded from |
34 |
37 |
35 \begin{quote} |
38 \begin{quote} |
36 \url{http://www.scala-lang.org} |
39 \url{http://www.scala-lang.org} |
37 \end{quote} |
40 \end{quote} |
38 |
41 |
39 Why do I use Scala in the AFL module? Actually, you can do |
42 Why do I use Scala in the AFL module? Actually, you can do |
40 \emph{any} part of the programming coursework in \emph{any} |
43 \emph{any} part of the coursework in \emph{any} programming |
41 programming language you like. I use Scala for showing you |
44 language you like. I use Scala for showing you code during the |
42 code during the lectures because its functional |
45 lectures because its functional programming-style allows me to |
43 programming-style allows me to implement the functions we will |
46 implement the functions we will discuss with very small |
44 discuss with very small code-snippets. Since the compiler is |
47 code-snippets. If I had to do this in Java, for example, I |
45 free, you can download them and run every example I give. But |
48 would first have to run through heaps of boilerplate code. |
46 if you prefer, you can also easily translate the code-snippets |
49 Since the Scala compiler is free, you can download the |
47 into any other functional language, for example Haskell, |
50 code-snippets and run every example I give. But if you prefer, |
48 Standard ML, F\#, Ocaml and so on. |
51 you can also easily translate them into any other functional |
|
52 language, for example Haskell, Standard ML, F$^\#$, Ocaml and |
|
53 so on. |
49 |
54 |
50 Developing programs in Scala can be done with the Eclipse IDE |
55 Developing programs in Scala can be done with the Eclipse IDE |
51 and also with IntelliJ IDE, but for the small programs I will |
56 and also with IntelliJ IDE, but for the small programs I will |
52 develop the good old Emacs-editor is adequate for me and I |
57 develop the good old Emacs-editor is adequate for me and I |
53 will run the programs on the command line. One advantage of |
58 will run the programs on the command line. One advantage of |
54 Scala over Java is that it includes an interpreter (a REPL, or |
59 Scala over Java is that it includes an interpreter (a REPL, or |
55 Read-Eval-Print-Loop) with which you can run and test small |
60 \underline{R}ead-\underline{E}val-\underline{P}rint-\underline{L}oop) |
56 code-snippets without the need of the compiler. This helps a |
61 with which you can run and test small code-snippets without |
57 lot with interactively developing programs. Once you installed |
62 the need of the compiler. This helps a lot with interactively |
58 Scala correctly, you can start the interpreter by typing on |
63 developing programs. Once you installed Scala, you can start |
59 the command line: |
64 the interpreter by typing on the command line: |
60 |
65 |
61 \begin{quote} |
66 \begin{lstlisting}[language={},numbers=none,basicstyle=\ttfamily\small] |
62 \begin{alltt} |
67 $ scala |
63 $ scala\small |
|
64 Welcome to Scala version 2.11.2 (Java HotSpot(TM) 64-Bit Server VM). |
68 Welcome to Scala version 2.11.2 (Java HotSpot(TM) 64-Bit Server VM). |
65 Type in expressions to have them evaluated. |
69 Type in expressions to have them evaluated. |
66 Type :help for more information.\normalsize |
70 Type :help for more information. |
67 |
71 |
68 scala> |
72 scala> |
69 \end{alltt} |
73 \end{lstlisting} |
70 \end{quote} |
|
71 |
74 |
72 \noindent The precise response may vary due to the platform |
75 \noindent The precise response may vary due to the platform |
73 where you installed Scala. At the Scala prompt you can type |
76 where you installed Scala. At the Scala prompt you can type |
74 things like {\tt 2 + 3} \keys{Ret} and the output will be |
77 things like \code{2 + 3} \keys{Ret} and the output will be |
75 |
78 |
76 \begin{quote} |
79 \begin{lstlisting}[language=Scala,numbers=none] |
77 \begin{alltt} |
|
78 scala> 2 + 3 |
80 scala> 2 + 3 |
79 res0: Int = 5 |
81 res0: Int = 5 |
80 \end{alltt} |
82 \end{lstlisting} |
81 \end{quote} |
|
82 |
83 |
83 \noindent indicating that the result of the addition is of |
84 \noindent indicating that the result of the addition is of |
84 type {\tt Int} and the actual result is {\tt 5}. Another |
85 type \code{Int} and the actual result is 5. Another classic |
85 classic example you can try out is |
86 example you can try out is |
86 |
87 |
87 \begin{quote} |
88 \begin{lstlisting}[language=Scala,numbers=none] |
88 \begin{alltt} |
89 scala> print("hello world") |
89 scala> print ("hello world") |
|
90 hello world |
90 hello world |
91 \end{alltt} |
91 \end{lstlisting} |
92 \end{quote} |
|
93 |
92 |
94 \noindent Note that in this case there is no result. The |
93 \noindent Note that in this case there is no result. The |
95 reason is that {\tt print} does not actually produce a result |
94 reason is that \code{print} does not actually produce a result |
96 (there is no {\tt resXX}), rather it is a function that causes |
95 (there is no \code{resXX}), rather it is a function that |
97 the \emph{side-effect} of printing out a string. Once you are |
96 causes the \emph{side-effect} of printing out a string. Once |
98 more familiar with the functional programming-style, you will |
97 you are more familiar with the functional programming-style, |
99 know what the difference is between a function that returns a |
98 you will know what the difference is between a function that |
100 result, like addition, and a function that causes a |
99 returns a result, like addition, and a function that causes a |
101 side-effect, like {\tt print}. We shall come back to this |
100 side-effect, like \code{print}. We shall come back to this |
102 point later, but if you are curious now, the latter kind of |
101 point later, but if you are curious now, the latter kind of |
103 functions always have as return type {\tt Unit}. |
102 functions always have as return type \code{Unit}. |
104 |
103 |
105 If you want to write a stand-alone app in Scala, you can |
104 If you want to write a stand-alone app in Scala, you can |
106 implement an object that is an instance of {\tt App}, say |
105 implement an object that is an instance of \code{App}, say |
107 |
106 |
108 \begin{quote} |
|
109 \begin{lstlisting}[language=Scala,numbers=none] |
107 \begin{lstlisting}[language=Scala,numbers=none] |
110 object Hello extends App { |
108 object Hello extends App { |
111 println ("hello world") |
109 println ("hello world") |
112 } |
110 } |
113 \end{lstlisting} |
111 \end{lstlisting} |
114 \end{quote} |
112 |
115 |
113 \noindent save it in a file, say {\tt hello-world.scala}, and |
116 \noindent save it in a file, say {\tt hellow-world.scala}, and |
|
117 then run the compiler and runtime environment: |
114 then run the compiler and runtime environment: |
118 |
115 |
119 \begin{quote} |
116 \begin{lstlisting}[language={},numbers=none,basicstyle=\ttfamily\small] |
120 \begin{alltt} |
|
121 $ scalac hello-world.scala |
117 $ scalac hello-world.scala |
122 $ scala Hello |
118 $ scala Hello |
123 hello world |
119 hello world |
124 \end{alltt} |
120 \end{lstlisting} |
125 \end{quote} |
|
126 |
121 |
127 As mentioned above, Scala targets the JVM and consequently |
122 As mentioned above, Scala targets the JVM and consequently |
128 Scala programs can also be executed by the bog-standard Java |
123 Scala programs can also be executed by the bog-standard Java |
129 Runtime. This only requires the inclusion of {\tt |
124 Runtime. This only requires the inclusion of {\tt |
130 scala-library.jar}, which on my computer can be done as |
125 scala-library.jar}, which on my computer can be done as |
131 follows: |
126 follows: |
132 |
127 |
133 \begin{quote} |
128 \begin{lstlisting}[language={},numbers=none,basicstyle=\ttfamily\small] |
134 \begin{alltt} |
|
135 $ scalac hello-world.scala |
129 $ scalac hello-world.scala |
136 $ java -cp /usr/local/src/scala/lib/scala-library.jar:. Hello |
130 $ java -cp /usr/local/src/scala/lib/scala-library.jar:. Hello |
137 hello world |
131 hello world |
138 \end{alltt} |
132 \end{lstlisting} |
139 \end{quote} |
|
140 |
133 |
141 |
134 |
142 \subsection*{Inductive Datatypes} |
135 \subsection*{Inductive Datatypes} |
143 |
136 |
144 The elegance and conciseness of Scala programs are often a |
137 The elegance and conciseness of Scala programs are often a |
145 result of inductive datatypes that can be easily defined. For |
138 result of inductive datatypes that can be easily defined. For |
146 example in ``every-day mathematics'' we would define regular |
139 example in ``every-day mathematics'' we define regular |
147 expressions simply by giving the grammar |
140 expressions simply by giving the grammar |
148 |
141 |
149 \begin{center} |
142 \begin{center} |
150 \begin{tabular}{r@{\hspace{2mm}}r@{\hspace{2mm}}l@{\hspace{13mm}}l} |
143 \begin{tabular}{r@{\hspace{2mm}}r@{\hspace{2mm}}l@{\hspace{13mm}}l} |
151 $r$ & $::=$ & $\varnothing$ & null\\ |
144 $r$ & $::=$ & $\varnothing$ & null\\ |
161 (essentially a kind of tree-structure with three kinds of |
154 (essentially a kind of tree-structure with three kinds of |
162 inner nodes---sequence, alternative and star---and three kinds |
155 inner nodes---sequence, alternative and star---and three kinds |
163 of leave nodes---null, empty and character). If you are |
156 of leave nodes---null, empty and character). If you are |
164 familiar with Java, it might be an instructive exercise to |
157 familiar with Java, it might be an instructive exercise to |
165 define this kind of inductive datatypes in Java\footnote{Happy |
158 define this kind of inductive datatypes in Java\footnote{Happy |
166 programming! \Smiley} and then compare it how |
159 programming! \Smiley} and then compare it with how it can be |
167 it can be defined in Scala. |
160 defined in Scala. |
168 |
161 |
169 Implementing the regular expressions from above in Scala is |
162 Implementing the regular expressions from above in Scala is |
170 actually very simple: It first requires an \emph{abstract |
163 actually very simple: It first requires an \emph{abstract |
171 class}, say, {\tt Rexp}. This will act as the type for regular |
164 class}, say, \code{Rexp}. This will act as the type for |
172 expressions. Second, it requires a case for each clause in the |
165 regular expressions. Second, it requires a case for each |
173 grammar. The cases for $\varnothing$ and $\epsilon$ do not |
166 clause in the grammar. The cases for $\varnothing$ and |
174 have any arguments, while in all the other cases we do have |
167 $\epsilon$ do not have any arguments, while in all the other |
175 arguments. For example the character regular expression needs |
168 cases we do have arguments. For example the character regular |
176 to take as an argument the character it is supposed to |
169 expression needs to take as an argument the character it is |
177 recognise. In Scala, the cases without arguments are called |
170 supposed to recognise. In Scala, the cases without arguments |
178 \emph{case objects}, while the ones with arguments are |
171 are called \emph{case objects}, while the ones with arguments |
179 \emph{case classes}. The corresponding Scala code is as |
172 are \emph{case classes}. The corresponding Scala code is as |
180 follows: |
173 follows: |
181 |
174 |
182 \begin{quote} |
|
183 \begin{lstlisting}[language=Scala,numbers=none] |
175 \begin{lstlisting}[language=Scala,numbers=none] |
184 abstract class Rexp |
176 abstract class Rexp |
185 case object NULL extends Rexp |
177 case object NULL extends Rexp |
186 case object EMPTY extends Rexp |
178 case object EMPTY extends Rexp |
187 case class CHAR (c: Char) extends Rexp |
179 case class CHAR (c: Char) extends Rexp |
188 case class SEQ (r1: Rexp, r2: Rexp) extends Rexp |
180 case class SEQ (r1: Rexp, r2: Rexp) extends Rexp |
189 case class ALT (r1: Rexp, r2: Rexp) extends Rexp |
181 case class ALT (r1: Rexp, r2: Rexp) extends Rexp |
190 case class STAR (r: Rexp) extends Rexp |
182 case class STAR (r: Rexp) extends Rexp |
191 \end{lstlisting} |
183 \end{lstlisting} |
192 \end{quote} |
184 |
193 |
185 \noindent In order to be an instance of \code{Rexp}, each case |
194 \noindent In order to be an instance of {\tt Rexp}, each case |
186 object and case class needs to extend \code{Rexp}. Given the |
195 object and case class needs to extend {\tt Rexp}. Given the |
|
196 grammar above, I hope you can see the underlying pattern. If |
187 grammar above, I hope you can see the underlying pattern. If |
197 you want to play further with such definitions of inductive |
188 you want to play further with such definitions of inductive |
198 datatypes, feel free to define for example binary trees. |
189 datatypes, feel free to define for example binary trees. |
199 |
190 |
200 Once you make a definition like the one above, you can |
191 Once you make a definition like the one above in Scala, you |
201 represent, for example, the regular expression for $a + b$ in |
192 can represent, for example, the regular expression for $a + b$ |
202 Scala as {\tt ALT(CHAR('a'), CHAR('b'))}. Expressions such as |
193 as \code{ALT(CHAR('a'), CHAR('b'))}. Expressions such as |
203 {\tt 'a'} stand for ASCII characters, though in the output |
194 \code{'a'} stand for ASCII characters, though in the output |
204 syntax the quotes are omitted. If you want to assign this |
195 syntax the quotes are omitted. If you want to assign this |
205 regular expression to a variable, you can use the keyword {\tt |
196 regular expression to a variable, you can use the keyword |
206 val} and type |
197 \code{val} and type |
207 |
198 |
208 \begin{quote} |
199 \begin{lstlisting}[language=Scala,numbers=none] |
209 \begin{alltt} |
|
210 scala> val r = ALT(CHAR('a'), CHAR('b')) |
200 scala> val r = ALT(CHAR('a'), CHAR('b')) |
211 r: ALT = ALT(CHAR(a),CHAR(b)) |
201 r: ALT = ALT(CHAR(a),CHAR(b)) |
212 \end{alltt} |
202 \end{lstlisting} |
213 \end{quote} |
|
214 |
203 |
215 \noindent As you can see, in order to make such assignments, |
204 \noindent As you can see, in order to make such assignments, |
216 no constructor is required in the class (as in Java). However, |
205 no constructor is required in the class (as in Java). However, |
217 if there is the need for some non-standard initialisation, you |
206 if there is the need for some non-standard initialisation, you |
218 can of course define such a constructor in Scala too. But we |
207 can of course define such a constructor in Scala too. But we |
219 omit such ``tricks'' here. |
208 omit such ``tricks'' here. |
220 |
209 |
221 Note that Scala in its response says the variable {\tt r} is |
210 Note that Scala in its response says the variable \code{r} is |
222 of type {\tt ALT}, not {\tt Rexp}. This might be a bit |
211 of type \lstinline[emph={ALT}]!ALT!, not \code{Rexp}. This |
223 unexpected, but can be explained as follows: Scala always |
212 might be a bit unexpected, but can be explained as follows: |
224 tries to find the most general type that is needed for a |
213 Scala always tries to find the most general type that is |
225 variable or expression, but does not ``over-generalise''. In |
214 needed for a variable or expression, but does not |
226 our definition the type {\tt Rexp} is more general than {\tt |
215 ``over-generalise''. In our definition the type \code{Rexp} is |
227 ALT}, since it is the abstract class. But in this case there |
216 more general than \lstinline[emph={ALT}]!ALT!, since it is the |
228 is no need to give {\tt r} the more general type of {\tt |
217 abstract class. But in this case there is no need to give |
229 Rexp}. This is different if you want to form a list of regular |
218 \code{r} the more general type of \code{Rexp}. This is |
230 expressions, for example |
219 different if you want to form a list of regular expressions, |
231 |
220 for example |
232 \begin{quote} |
221 |
233 \begin{alltt} |
222 |
|
223 \begin{lstlisting}[language=Scala,numbers=none] |
234 scala> val ls = List(ALT(CHAR('a'), CHAR('b')), NULL) |
224 scala> val ls = List(ALT(CHAR('a'), CHAR('b')), NULL) |
235 ls: List[Rexp] = List(ALT(CHAR(a),CHAR(b)), NULL) |
225 ls: List[Rexp] = List(ALT(CHAR(a),CHAR(b)), NULL) |
236 \end{alltt} |
226 \end{lstlisting} |
237 \end{quote} |
227 |
238 |
228 |
239 \noindent In this case, Scala needs to assign a common type to |
229 \noindent In this case, Scala needs to assign a common type to |
240 the regular expressions so that it is compatible with the |
230 the regular expressions so that it is compatible with the |
241 fact that lists can only contain elements of a single type. In |
231 fact that lists can only contain elements of a single type. In |
242 this case the first common type is {\tt Rexp}.\footnote{If you |
232 this case the first common type is \code{Rexp}.\footnote{If you |
243 type in this example, you will notice that the type contains |
233 type in this example, you will notice that the type contains |
244 some further information, but lets ignore this for the |
234 some further information, but lets ignore this for the |
245 moment.} |
235 moment.} |
246 |
236 |
247 For compound types like {\tt List[...]}, the general rule is |
237 For compound types like \code{List[...]}, the general rule is |
248 that when a type takes another type as argument, then this |
238 that when a type takes another type as argument, then this |
249 argument type is written in angle-brackets. This can also |
239 argument type is written in angle-brackets. This can also |
250 contain nested types as in {\tt List[Set[Rexp]]}, which is a |
240 contain nested types as in \code{List[Set[Rexp]]}, which is a |
251 list of sets each of which contains regular expressions. |
241 list of sets each of which contains regular expressions. |
252 |
242 |
253 \subsection*{Functions and Pattern-Matching} |
243 \subsection*{Functions and Pattern-Matching} |
254 |
244 |
255 I mentioned above that Scala is a very elegant programming |
245 I mentioned above that Scala is a very elegant programming |
335 \noindent This function is defined by recursion analysing each |
325 \noindent This function is defined by recursion analysing each |
336 pattern of what the regular expression could look like. The |
326 pattern of what the regular expression could look like. The |
337 corresponding Scala code looks very similar to this |
327 corresponding Scala code looks very similar to this |
338 definition, thanks to pattern-matching. |
328 definition, thanks to pattern-matching. |
339 |
329 |
340 |
|
341 \begin{quote} |
|
342 \lstinputlisting[language=Scala]{../progs/rev.scala} |
330 \lstinputlisting[language=Scala]{../progs/rev.scala} |
343 \end{quote} |
331 |
344 |
332 \noindent The keyword for starting a pattern-match is |
345 \noindent The keyword for starting a pattern-match is {\tt |
333 \code{match} followed by a list of \code{case}s. Before the match |
346 match} followed by a list of {\tt case}s. Before the match |
|
347 keyword can be another pattern, but often as in the case |
334 keyword can be another pattern, but often as in the case |
348 above, it is just a variable you want to pattern-match |
335 above, it is just a variable you want to pattern-match |
349 (the {\tt r} after {\tt =} in Line 1). |
336 (the \code{r} after \code{=} in Line 1). |
350 |
337 |
351 Each case in this definition follows the structure of how we |
338 Each case in this definition follows the structure of how we |
352 defined regular expressions as inductive datatype. For example |
339 defined regular expressions as inductive datatype. For example |
353 the case in Line 3 you can read as: if the regular expression |
340 the case in Line 3 you can read as: if the regular expression |
354 {\tt r} is of the form {\tt EMPTY} then do whatever follows |
341 \code{r} is of the form \code{EMPTY} then do whatever follows |
355 the {\tt =>} (in this case just return {\tt EMPTY}). Line 5 |
342 the \code{=>} (in this case just return \code{EMPTY}). Line 5 |
356 reads as: if the regular expression {\tt r} is of the form |
343 reads as: if the regular expression \code{r} is of the form |
357 {\tt ALT(r1, r2)}, where the left-branch of the alternative is |
344 \code{ALT(r1, r2)}, where the left-branch of the alternative is |
358 matched by the variable {\tt r1} and the right-branch by {\tt |
345 matched by the variable \code{r1} and the right-branch by |
359 r2} then do ``something''. The ``something'' can now use the |
346 \code{r2} then do ``something''. The ``something'' can now use the |
360 variables {\tt r1} and {\tt r2} from the match. |
347 variables \code{r1} and \code{r2} from the match. |
361 |
348 |
362 If you want to play with this function, call it for example |
349 If you want to play with this function, call it for example |
363 with the regular expression $ab + ac$. This regular expression |
350 with the regular expression $ab + ac$. This regular expression |
364 can recognise the strings $ab$ and $ac$. The function {\tt |
351 can recognise the strings $ab$ and $ac$. The function |
365 rev} produces $ba + ca$, which can recognise the reversed |
352 \code{rev} produces $ba + ca$, which can recognise the reversed |
366 strings $ba$ and $ca$. |
353 strings $ba$ and $ca$. |
367 |
354 |
368 In Scala each pattern-match can also be guarded as in |
355 In Scala each pattern-match can also be guarded as in |
369 |
356 |
370 \begin{quote} |
|
371 \begin{lstlisting}[language=Scala, numbers=none] |
357 \begin{lstlisting}[language=Scala, numbers=none] |
372 case Pattern if Condition => Do_Something |
358 case Pattern if Condition => Do_Something |
373 \end{lstlisting} |
359 \end{lstlisting} |
374 \end{quote} |
360 |
375 |
361 \noindent This allows us, for example, to re-write the |
376 \noindent This allows us, for example, to re-write the {\tt |
362 \code{collatz}-function from above as follows: |
377 collatz}-function from above as follows: |
|
378 |
363 |
379 \begin{quote} |
|
380 \lstinputlisting[language=Scala]{../progs/collatz2.scala} |
364 \lstinputlisting[language=Scala]{../progs/collatz2.scala} |
381 \end{quote} |
365 |
382 |
366 |
383 \noindent Although in this case the pattern-match does not |
367 \noindent Although in this case the pattern-match does not |
384 improve the code in any way. The underscore in the last case |
368 improve the code in any way. In cases like \code{rev} it |
385 indicates that we do not care what the pattern looks like. |
369 is really crucial. The underscore in the last case indicates |
386 Thus Line 4 acts like a default case whenever the cases above |
370 that we do not care what the pattern looks like. Thus Line 4 |
387 did not match. Cases are always tried out from top to bottom. |
371 acts like a default case whenever the cases above did not |
|
372 match. Cases are always tried out from top to bottom. |
388 |
373 |
389 \subsection*{Loops} |
374 \subsection*{Loops, or the Absence of} |
390 |
375 |
391 Coming from Java or C, you might be surprised that Scala does |
376 Coming from Java or C, you might be surprised that Scala does |
392 not really have loops. It has instead, what is in functional |
377 not really have loops. It has instead, what is in functional |
393 programming called \emph{maps}. To illustrate how they work, |
378 programming called \emph{maps}. To illustrate how they work, |
394 lets assume you have a list of numbers from 1 to 10 and want to |
379 lets assume you have a list of numbers from 1 to 10 and want to |
395 build the list of squares. The list of numbers from 1 to 10 |
380 build the list of squares. The list of numbers from 1 to 10 |
396 can be constructed in Scala as follows: |
381 can be constructed in Scala as follows: |
397 |
382 |
398 \begin{quote} |
383 \begin{lstlisting}[language=Scala,numbers=none] |
399 \begin{alltt} |
|
400 scala> (1 to 10).toList |
384 scala> (1 to 10).toList |
401 res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) |
385 res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) |
402 \end{alltt} |
386 \end{lstlisting} |
403 \end{quote} |
|
404 |
387 |
405 \noindent Generating from this list the list of squares in a |
388 \noindent Generating from this list the list of squares in a |
406 non-functional programming language (e.g.~Java), you would |
389 programming language such as Java, you would assume the list |
407 assume the list is given as a kind of array. You would then |
390 is given as a kind of array. You would then iterate, or loop, |
408 iterate, or loop, an index over this array and replace each |
391 an index over this array and replace each entry in the array |
409 entry in the array by the square. Right? In Scala, and in |
392 by the square. Right? In Scala, and in other functional |
410 other functional programming languages, you use maps to |
393 programming languages, you use maps to achieve the same. |
411 achieve the same. |
394 |
412 |
395 A map essentially takes a function that describes how each |
413 Maps essentially take a function that describes how each |
396 element is transformed (for example squared) and a list over |
414 element is transformed (for example squaring) and a list over |
|
415 which this function should work. There are two forms to |
397 which this function should work. There are two forms to |
416 express such maps in Scala. The first way is in a {\tt |
398 express such maps in Scala. The first way is called a |
417 for}-construction. Squaring the numbers from 1 to 10 would |
399 for-comprehension. Squaring the numbers from 1 to 10 would |
418 look in this form as follows: |
400 look in this form as follows: |
419 |
401 |
420 \begin{quote} |
402 \begin{lstlisting}[language=Scala,numbers=none] |
421 \begin{alltt} |
|
422 scala> for (n <- (1 to 10).toList) yield n * n |
403 scala> for (n <- (1 to 10).toList) yield n * n |
423 res2: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100) |
404 res2: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100) |
424 \end{alltt} |
405 \end{lstlisting} |
425 \end{quote} |
406 |
426 |
407 \noindent The important keywords are \code{for} and |
427 \noindent The keywords are {\tt for} and {\tt yield}. This |
408 \code{yield}. This for-comprehension roughly states that from |
428 {\tt for}-construction roughly says that from the list of |
409 the list of numbers we draw \code{n}s and compute the result |
429 numbers we draw {\tt n}s and compute the result of {\tt n * |
410 of \code{n * n}. As you can see, we specified the list where |
430 n}. As you can see, we specified the list where each {\tt n} |
411 each \code{n} comes from, namely \code{(1 to 10).toList}, and |
431 comes from, namely {\tt (1 to 10).toList}, and how each |
412 how each element needs to be transformed. This can also be |
432 element needs to be transformed. This can also be expressed in |
413 expressed in a second way in Scala by using directly |
433 a second way in Scala by using directly {\tt map} as follows: |
414 \code{map}s as follows: |
434 |
415 |
435 \begin{quote} |
416 \begin{lstlisting}[language=Scala,numbers=none] |
436 \begin{alltt} |
|
437 scala> (1 to 10).toList.map(n => n * n) |
417 scala> (1 to 10).toList.map(n => n * n) |
438 res3 = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100) |
418 res3 = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100) |
439 \end{alltt} |
419 \end{lstlisting} |
440 \end{quote} |
420 |
441 |
421 \noindent In this way, the expression \code{n => n * n} stands |
442 \noindent In this way, the expression {\tt n => n * n} stands |
|
443 for the function that calculates the square (this is how the |
422 for the function that calculates the square (this is how the |
444 {\tt n}s are transformed). This expression for functions might |
423 \code{n}s are transformed). This expression for functions |
445 remind you of your lessons about the lambda-calculus where |
424 might remind you of your lessons about the lambda-calculus |
446 this would have been written as $\lambda n.\,n * n$. It might |
425 where this would have been written as $\lambda n.\,n * n$. It |
447 not be obvious, but {\tt for}-constructions are just syntactic |
426 might not be obvious, but for-comprehensions are just |
448 sugar: when compiling, Scala translates {\tt |
427 syntactic sugar: when compiling, Scala translates |
449 for}-constructions into equivalent maps. |
428 for-comprehensions into equivalent maps. This even works |
450 |
429 when for-comprehensions get more complicated (see below). |
451 |
430 |
452 The very charming feature of Scala is that such maps or {\tt |
431 The very charming feature of Scala is that such maps or |
453 for}-constructions can be written for any kind of data |
432 for-comprehensions can be written for any kind of data |
454 collection, such as lists, sets, vectors and so on. For |
433 collection, such as lists, sets, vectors, options and so on. |
455 example if we instead compute the reminders modulo $3$ of this |
434 For example if we instead compute the reminders modulo 3 of |
456 list, we can write |
435 this list, we can write |
457 |
436 |
458 \begin{quote} |
437 \begin{lstlisting}[language=Scala,numbers=none] |
459 \begin{alltt} |
438 scala> (1 to 10).toList.map(n => n % 3) |
460 scala> (1 to 10).toList.map(n => n \% 3) |
|
461 res4 = List(1, 2, 0, 1, 2, 0, 1, 2, 0, 1) |
439 res4 = List(1, 2, 0, 1, 2, 0, 1, 2, 0, 1) |
462 \end{alltt} |
440 \end{lstlisting} |
463 \end{quote} |
|
464 |
441 |
465 \noindent If we, however, transform the numbers 1 to 10 not |
442 \noindent If we, however, transform the numbers 1 to 10 not |
466 into a list, but into a set, and then compute the reminders |
443 into a list, but into a set, and then compute the reminders |
467 modulo $3$ we obtain |
444 modulo 3 we obtain |
468 |
445 |
469 \begin{quote} |
446 \begin{lstlisting}[language=Scala,numbers=none] |
470 \begin{alltt} |
447 scala> (1 to 10).toSet[Int].map(n => n % 3) |
471 scala> (1 to 10).toSet[Int].map(n => n \% 3) |
|
472 res5 = Set(2, 1, 0) |
448 res5 = Set(2, 1, 0) |
473 \end{alltt} |
449 \end{lstlisting} |
474 \end{quote} |
|
475 |
450 |
476 \noindent This is the correct result for sets, as there are |
451 \noindent This is the correct result for sets, as there are |
477 only three equivalence classes of integers modulo 3. Note that |
452 only three equivalence classes of integers modulo 3. Note that |
478 in this example we need to ``help'' Scala to transform the |
453 in this example we need to ``help'' Scala to transform the |
479 numbers into a set of integers by explicitly annotating the |
454 numbers into a set of integers by explicitly annotating the |
480 type {\tt Int}. Since maps and {\tt for}-constructions are |
455 type \code{Int}. Since maps and for-comprehensions are |
481 just syntactic variants of each other, the latter can also be |
456 just syntactic variants of each other, the latter can also be |
482 written as |
457 written as |
483 |
458 |
484 \begin{quote} |
459 \begin{lstlisting}[language=Scala,numbers=none] |
485 \begin{alltt} |
460 scala> for (n <- (1 to 10).toSet[Int]) yield n % 3 |
486 scala> for (n <- (1 to 10).toSet[Int]) yield n \% 3 |
|
487 res5 = Set(2, 1, 0) |
461 res5 = Set(2, 1, 0) |
488 \end{alltt} |
462 \end{lstlisting} |
489 \end{quote} |
463 |
|
464 For-comprehensions can also be nested and the selection of |
|
465 elements can be guarded. For example if we want to pair up |
|
466 the numbers 1 to 4 with the letters a to c, we can write |
|
467 |
|
468 \begin{lstlisting}[language=Scala,numbers=none] |
|
469 scala> for (n <- (1 to 4).toList; |
|
470 c <- ('a' to 'c').toList) yield (n, c) |
|
471 res6 = List((1,a), (1,b), (1,c), (2,a), (2,b), (2,c), |
|
472 (3,a), (3,b), (3,c), (4,a), (4,b), (4,c)) |
|
473 \end{lstlisting} |
|
474 |
|
475 \noindent |
|
476 Or if we want to find all pairs of numbers between 1 and 3 |
|
477 where the sum is an even number, we can write |
|
478 |
|
479 \begin{lstlisting}[language=Scala,numbers=none] |
|
480 scala> for (n <- (1 to 3).toList; |
|
481 m <- (1 to 3).toList; |
|
482 if (n + m) % 2 == 0) yield (n, m) |
|
483 res7 = List((1,1), (1,3), (2,2), (3,1), (3,3)) |
|
484 \end{lstlisting} |
|
485 |
|
486 \noindent |
|
487 The \code{if}-condition filters out all pairs where the |
|
488 sum is not even. |
490 |
489 |
491 While hopefully this all looks reasonable, there is one |
490 While hopefully this all looks reasonable, there is one |
492 complication: In the examples above we always wanted to |
491 complication: In the examples above we always wanted to |
493 transform one list into another list (e.g.~list of squares), |
492 transform one list into another list (e.g.~list of squares), |
494 or one set into another set (set of numbers into set of |
493 or one set into another set (set of numbers into set of |
495 reminders modulo 3). What happens if we just want to print out |
494 reminders modulo 3). What happens if we just want to print out |
496 a list of integers? Then actually the {\tt for}-construction |
495 a list of integers? Then actually the for-comprehension |
497 needs to be modified. The reason is that {\tt print}, you |
496 needs to be modified. The reason is that \code{print}, you |
498 guessed it, does not produce any result, but only produces |
497 guessed it, does not produce any result, but only produces |
499 what is in the functional-programming-lingo called a |
498 what is in the functional-programming-lingo called a |
500 side-effect. Printing out the list of numbers from 1 to 5 |
499 side-effect. Printing out the list of numbers from 1 to 5 |
501 would look as follows |
500 would look as follows |
502 |
501 |
503 \begin{quote} |
502 \begin{lstlisting}[language=Scala,numbers=none] |
504 \begin{alltt} |
|
505 scala> for (n <- (1 to 5).toList) println(n) |
503 scala> for (n <- (1 to 5).toList) println(n) |
506 1 |
504 1 |
507 2 |
505 2 |
508 3 |
506 3 |
509 4 |
507 4 |
510 5 |
508 5 |
511 \end{alltt} |
509 \end{lstlisting} |
512 \end{quote} |
|
513 |
510 |
514 \noindent |
511 \noindent |
515 where you need to omit the keyword {\tt yield}. You can |
512 where you need to omit the keyword \code{yield}. You can |
516 also do more elaborate calculations such as |
513 also do more elaborate calculations such as |
517 |
514 |
518 \begin{quote} |
515 \begin{lstlisting}[language=Scala,numbers=none] |
519 \begin{alltt} |
516 scala> for (n <- (1 to 5).toList) { |
520 scala> for (n <- (1 to 5).toList) \{ |
|
521 val square_n = n * n |
517 val square_n = n * n |
522 println(s"$n * $n = $square_n") |
518 println(s"$n * $n = $square_n") |
523 \} |
519 } |
524 1 * 1 = 1 |
520 1 * 1 = 1 |
525 2 * 2 = 4 |
521 2 * 2 = 4 |
526 3 * 3 = 9 |
522 3 * 3 = 9 |
527 4 * 4 = 16 |
523 4 * 4 = 16 |
528 5 * 5 = 25 |
524 5 * 5 = 25 |
529 \end{alltt} |
525 \end{lstlisting} |
530 \end{quote} |
526 |
531 |
527 \noindent In this code I use a variable assignment (\code{val |
532 \noindent In this code I use a variable assignment and a |
528 square_n = ...} ) and what is called a |
533 \emph{string interpolation}, written {\tt s"..."}, in order to |
529 \emph{string interpolation}, written \code{s"..."}, in order to |
534 print out an equation. The string interpolation allows me to |
530 print out an equation. The string interpolation allows me to |
535 refer to the integer values {\tt n} and {\tt square\_n} inside |
531 refer to the integer values \code{n} and \code{square\_n} inside |
536 a string. This is very convenient for printing out ``things''. |
532 a string. This is very convenient for printing out ``things''. |
537 |
533 |
538 The corresponding map construction for functions with |
534 The corresponding map construction for functions with |
539 side-effects is in Scala called {\tt foreach}. So you |
535 side-effects is in Scala called \code{foreach}. So you |
540 could also write |
536 could also write |
541 |
537 |
542 \begin{quote} |
538 |
543 \begin{alltt} |
539 \begin{lstlisting}[language=Scala,numbers=none] |
544 scala> (1 to 5).toList.foreach(n => println(n)) |
540 scala> (1 to 5).toList.foreach(n => println(n)) |
545 1 |
541 1 |
546 2 |
542 2 |
547 3 |
543 3 |
548 4 |
544 4 |
549 5 |
545 5 |
550 \end{alltt} |
546 \end{lstlisting} |
551 \end{quote} |
547 |
552 |
548 |
553 \noindent or even just |
549 \noindent or even just |
554 |
550 |
555 \begin{quote} |
551 |
556 \begin{alltt} |
552 \begin{lstlisting}[language=Scala,numbers=none] |
557 scala> (1 to 5).toList.foreach(println) |
553 scala> (1 to 5).toList.foreach(println) |
558 1 |
554 1 |
559 2 |
555 2 |
560 3 |
556 3 |
561 4 |
557 4 |
562 5 |
558 5 |
563 \end{alltt} |
559 \end{lstlisting} |
564 \end{quote} |
560 |
565 |
561 |
566 \noindent Again I hope this reminds you a bit of your |
562 \noindent Again I hope this reminds you a bit of your |
567 lambda-calculus lessons, where an explanation is given why |
563 lambda-calculus lessons, where an explanation is given why |
568 both forms produce the same result. |
564 both forms produce the same result. |
569 |
565 |
570 |
566 |
571 If you want to find out more about maps and functions with |
567 If you want to find out more about maps and functions with |
572 side-effects, you can ponder about the response Scala gives if |
568 side-effects, you can ponder about the response Scala gives if |
573 you replace {\tt foreach} by {\tt map} in the expression |
569 you replace \code{foreach} by \code{map} in the expression |
574 above. Scala will still allow {\tt map} with side-effect |
570 above. Scala will still allow \code{map} with side-effect |
575 functions, but then reacts with a slightly interesting result. |
571 functions, but then reacts with a slightly interesting result. |
576 |
572 |
577 \subsection*{Types} |
573 \subsection*{Types} |
578 |
574 |
579 In most functional programming languages types play an |
575 In most functional programming languages types play an |
580 important role. Scala is such a language. You have already |
576 important role. Scala is such a language. You have already |
581 seen built-in types, like {\tt Int}, {\tt Boolean}, {\tt |
577 seen built-in types, like \code{Int}, \code{Boolean}, |
582 String} and {\tt BigInt}, but also user-defined ones, like {\tt Rexp}. |
578 \code{String} and \code{BigInt}, but also user-defined ones, |
583 Unfortunately, types can be a thorny subject, especially in |
579 like \code{Rexp}. Unfortunately, types can be a thorny |
584 Scala. For example, why do we need to give the type to {\tt |
580 subject, especially in Scala. For example, why do we need to |
585 toSet[Int]} but not to {\tt toList}? The reason is the power |
581 give the type to \code{toSet[Int]} but not to \code{toList}? |
586 of Scala, which sometimes means it cannot infer all necessary |
582 The reason is the power of Scala, which sometimes means it |
587 typing information. At the beginning while getting familiar |
583 cannot infer all necessary typing information. At the |
588 with Scala, I recommend a ``play-it-by-ear-approach'' to |
584 beginning while getting familiar with Scala, I recommend a |
589 types. Fully understanding type-systems, especially complicated |
585 ``play-it-by-ear-approach'' to types. Fully understanding |
590 ones like in Scala, can take a module on their |
586 type-systems, especially complicated ones like in Scala, can |
591 own.\footnote{Still, such a study can be a rewarding training: |
587 take a module on their own.\footnote{Still, such a study can |
592 If you are in the business of designing new programming |
588 be a rewarding training: If you are in the business of |
593 languages, you will not be able to turn a blind eye to types. |
589 designing new programming languages, you will not be able to |
594 They essentially help programmers to avoid common programming |
590 turn a blind eye to types. They essentially help programmers |
595 errors and help with maintaining code.} |
591 to avoid common programming errors and help with maintaining |
|
592 code.} |
596 |
593 |
597 In Scala, types are needed whenever you define an inductive |
594 In Scala, types are needed whenever you define an inductive |
598 datatype and also whenever you define functions (their |
595 datatype and also whenever you define functions (their |
599 arguments and their results need a type). Base types are types |
596 arguments and their results need a type). Base types are types |
600 that do not take any (type)arguments, for example {\tt Int} |
597 that do not take any (type)arguments, for example \code{Int} |
601 and {\tt String}. Compound types take one or more arguments, |
598 and \code{String}. Compound types take one or more arguments, |
602 which as seen earlier need to be given in angle-brackets, for |
599 which as seen earlier need to be given in angle-brackets, for |
603 example {\tt List[Int]} or {\tt Set[List[String]]} or {\tt |
600 example \code{List[Int]} or \code{Set[List[String]]} or |
604 Map[Int, Int]}. |
601 \code{Map[Int, Int]}. |
605 |
602 |
606 There are a few special type-constructors that fall outside |
603 There are a few special type-constructors that fall outside |
607 this pattern. One is for tuples, where the type is written |
604 this pattern. One is for tuples, where the type is written |
608 with parentheses. For example {\tt (Int, Int, String)} for a |
605 with parentheses. For example \code{(Int, Int, String)} for a |
609 triple consisting of two integers and a string. Tuples are |
606 triple consisting of two integers and a string. Tuples are |
610 helpful if you want to define functions with multiple |
607 helpful if you want to define functions with multiple |
611 results, say the function returning the quotient and reminder |
608 results, say the function returning the quotient and reminder |
612 of two numbers. For this you might define: |
609 of two numbers. For this you might define: |
613 |
610 |
614 \begin{quote} |
611 |
615 \begin{lstlisting}[language=Scala, numbers=none] |
612 \begin{lstlisting}[language=Scala, numbers=none] |
616 def quo_rem(m: Int, n: Int) : (Int, Int) = (m / n, m \% n) |
613 def quo_rem(m: Int, n: Int) : (Int, Int) = (m / n, m \% n) |
617 \end{lstlisting} |
614 \end{lstlisting} |
618 \end{quote} |
615 |
619 |
616 |
620 \noindent |
617 \noindent |
621 Since this function returns a pair of integers, its type |
618 Since this function returns a pair of integers, its type |
622 needs to be {\tt (Int, Int)}. |
619 needs to be \code{(Int, Int)}. |
623 |
620 |
624 Another special type-constructor is for functions, written |
621 Another special type-constructor is for functions, written |
625 as the arrow {\tt =>}. For example, the type {\tt Int => |
622 as the arrow \code{=>}. For example, the type \code{Int => |
626 String} is for a function that takes an integer as argument |
623 String} is for a function that takes an integer as argument |
627 and produces a string. A function of this type is for instance |
624 and produces a string. A function of this type is for instance |
628 |
625 |
629 \begin{quote} |
626 |
630 \begin{lstlisting}[language=Scala,numbers=none] |
627 \begin{lstlisting}[language=Scala,numbers=none] |
631 def mk_string(n: Int) : String = n match { |
628 def mk_string(n: Int) : String = n match { |
632 case 0 => "zero" |
629 case 0 => "zero" |
633 case 1 => "one" |
630 case 1 => "one" |
634 case 2 => "two" |
631 case 2 => "two" |
635 case _ => "many" |
632 case _ => "many" |
636 } |
633 } |
637 \end{lstlisting} |
634 \end{lstlisting} |
638 \end{quote} |
635 |
639 |
636 |
640 \noindent Unlike other functional programming languages, there |
637 \noindent Unlike other functional programming languages, there |
641 is in Scala no easy way to find out the types of existing |
638 is in Scala no easy way to find out the types of existing |
642 functions, except by looking into the documentation |
639 functions, except by looking into the documentation |
643 |
640 |
644 \begin{quote} |
641 \begin{quote} |
645 \url{http://www.scala-lang.org/api/current/} |
642 \url{http://www.scala-lang.org/api/current/} |
646 \end{quote} |
643 \end{quote} |
647 |
644 |
648 The function arrow can also be iterated, as in {\tt |
645 The function arrow can also be iterated, as in |
649 Int => String => Boolean}. This is the type for a function |
646 \code{Int => String => Boolean}. This is the type for a function |
650 taking an integer as first argument and a string as second, |
647 taking an integer as first argument and a string as second, |
651 and the result of the function is a boolean. Though silly, a |
648 and the result of the function is a boolean. Though silly, a |
652 function of this type would be |
649 function of this type would be |
653 |
650 |
654 \begin{quote} |
651 |
655 \begin{lstlisting}[language=Scala,numbers=none] |
652 \begin{lstlisting}[language=Scala,numbers=none] |
656 def chk_string(n: Int, s: String) : Boolean = |
653 def chk_string(n: Int, s: String) : Boolean = |
657 mk_string(n) == s |
654 mk_string(n) == s |
658 \end{lstlisting} |
655 \end{lstlisting} |
659 \end{quote} |
656 |
660 |
657 |
661 \noindent which checks whether the integer {\tt n} corresponds |
658 \noindent which checks whether the integer \code{n} corresponds |
662 to the name {\tt s} given by the function {\tt mk\_string}. |
659 to the name \code{s} given by the function \code{mk\_string}. |
663 |
660 |
664 Coming back to the type {\tt Int => String => Boolean}. The |
661 Coming back to the type \code{Int => String => Boolean}. The |
665 rule about such function types is that the right-most type |
662 rule about such function types is that the right-most type |
666 specifies what the function returns (a boolean in this case). |
663 specifies what the function returns (a boolean in this case). |
667 The types before that specify how many arguments the function |
664 The types before that specify how many arguments the function |
668 expects and what is their type (in this case two arguments, |
665 expects and what is their type (in this case two arguments, |
669 one of type {\tt Int} and another of type {\tt String}). Given |
666 one of type \code{Int} and another of type \code{String}). |
670 this rule, what kind of function has type \mbox{\tt (Int => |
667 Given this rule, what kind of function has type |
671 String) => Boolean}? Well, it returns a boolean. More |
668 \mbox{\code{(Int => String) => Boolean}}? Well, it returns a |
672 interestingly, though, it only takes a single argument |
669 boolean. More interestingly, though, it only takes a single |
673 (because of the parentheses). The single argument happens to |
670 argument (because of the parentheses). The single argument |
674 be another function (taking an integer as input and returning |
671 happens to be another function (taking an integer as input and |
675 a string). |
672 returning a string). |
676 |
673 |
677 Now you might ask, what is the point of having function as |
674 Now you might ask, what is the point of having function as |
678 arguments to other functions? In Java there is no need of this |
675 arguments to other functions? In Java there is no need of this |
679 kind of feature. But in all functional programming languages, |
676 kind of feature. But in all functional programming languages, |
680 including Scala, it is really essential. Above you already |
677 including Scala, it is really essential. Above you already |
681 seen {\tt map} and {\tt foreach} which need this. Consider |
678 seen \code{map} and \code{foreach} which need this. Consider |
682 the functions {\tt print} and {\tt println}, which both |
679 the functions \code{print} and \code{println}, which both |
683 print out strings, but the latter adds a line break. You can |
680 print out strings, but the latter adds a line break. You can |
684 call {\tt foreach} with either of them and thus changing how, |
681 call \code{foreach} with either of them and thus changing how, |
685 for example, five numbers are printed. |
682 for example, five numbers are printed. |
686 |
683 |
687 \begin{quote} |
684 |
688 \begin{alltt} |
685 \begin{lstlisting}[language=Scala,numbers=none] |
689 scala> (1 to 5).toList.foreach(print) |
686 scala> (1 to 5).toList.foreach(print) |
690 12345 |
687 12345 |
691 scala> (1 to 5).toList.foreach(println) |
688 scala> (1 to 5).toList.foreach(println) |
692 1 |
689 1 |
693 2 |
690 2 |
694 3 |
691 3 |
695 4 |
692 4 |
696 5 |
693 5 |
697 \end{alltt} |
694 \end{lstlisting} |
698 \end{quote} |
695 |
699 |
696 |
700 \noindent This is actually one of the main design principles |
697 \noindent This is actually one of the main design principles |
701 in functional programming. You have generic functions like |
698 in functional programming. You have generic functions like |
702 {\tt map} and {\tt foreach} that can traverse data containers, |
699 \code{map} and \code{foreach} that can traverse data containers, |
703 like lists or sets. They then take a function to specify what |
700 like lists or sets. They then take a function to specify what |
704 should be done with each element during the traversal. This |
701 should be done with each element during the traversal. This |
705 requires that the generic traversal functions can cope with |
702 requires that the generic traversal functions can cope with |
706 any kind of function (not just functions that, for example, |
703 any kind of function (not just functions that, for example, |
707 take as input an integer and produce a string like above). |
704 take as input an integer and produce a string like above). |