4 \usepackage{alltt} |
4 \usepackage{alltt} |
5 \usepackage{menukeys} |
5 \usepackage{menukeys} |
6 \usepackage{amsmath} |
6 \usepackage{amsmath} |
7 \usepackage{../langs} |
7 \usepackage{../langs} |
8 \usepackage{mathpazo} |
8 \usepackage{mathpazo} |
9 \usepackage[scaled=.9]{helvet} |
9 \usepackage{marvosym} |
|
10 %%%\usepackage[scaled=.95]{helvet} |
10 |
11 |
11 \newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}% |
12 \newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}% |
12 \definecolor{codegray}{gray}{0.9} |
13 \definecolor{codegray}{gray}{0.9} |
13 \newcommand{\code}[1]{\colorbox{codegray}{\texttt{#1}}} |
14 \newcommand{\code}[1]{\colorbox{codegray}{\texttt{#1}}} |
14 |
15 |
15 \begin{document} |
16 \begin{document} |
16 |
17 |
17 \section*{A Crash-Course on Scala} |
18 \section*{A Crash-Course on Scala} |
18 |
19 |
19 Scala is programming language that combines functional and |
20 Scala is a programming language that combines functional and |
20 object-oriented programming-styles, and has received in the |
21 object-oriented programming-styles, and has received in the |
21 last five years quite a bit of attention. One reason for this |
22 last five years quite a bit of attention. One reason for this |
22 attention is that, like the Java programming language, Scala |
23 attention is that, like the Java programming language, Scala |
23 compiles to the Java Virtual Machine (JVM) and therefore can |
24 compiles to the Java Virtual Machine (JVM) and therefore can |
24 run under MacOSX, Linux and Windows.\footnote{There are also |
25 run under MacOSX, Linux and Windows.\footnote{There are also |
25 experimental backends for Android and JavaScript.} Unlike |
26 experimental backends for Android and JavaScript.} Unlike |
26 Java, however, Scala often allows programmers to write concise |
27 Java, however, Scala often allows programmers to write very |
27 and elegant code. Some therefore say Scala is the much better |
28 concise and elegant code. Some therefore say Scala is the much |
28 Java. If you want to try it out yourself, the Scala compiler |
29 better Java. The Guardian, Twitter, Coursera, LinkedIn to name |
29 can be downloaded from |
30 a few either rely entirely in their infrastructures on Scala, |
|
31 or some parts of there infrastructure uses it. If you want to |
|
32 try it out yourself, the Scala compiler can be downloaded from |
30 |
33 |
31 \begin{quote} |
34 \begin{quote} |
32 \url{http://www.scala-lang.org} |
35 \url{http://www.scala-lang.org} |
33 \end{quote} |
36 \end{quote} |
34 |
37 |
35 Why do I use Scala in the AFL course? Actually, you can do |
38 Why do I use Scala in the AFL module? Actually, you can do |
36 \emph{any} part of the programming coursework in \emph{any} |
39 \emph{any} part of the programming coursework in \emph{any} |
37 programming language you like. I use Scala for showing you |
40 programming language you like. I use Scala for showing you |
38 code during the lectures because its functional |
41 code during the lectures because its functional |
39 programming-style allows me to implement the functions we will |
42 programming-style allows me to implement the functions we will |
40 discuss with very small code-snippets. Since the compiler is |
43 discuss with very small code-snippets. Since the compiler is |
41 free, you can download them and run every example I give. But |
44 free, you can download them and run every example I give. But |
42 if you prefer, you can also easily translate the code-snippets |
45 if you prefer, you can also easily translate the code-snippets |
43 into any other functional language, for example Haskell, ML, |
46 into any other functional language, for example Haskell, |
44 F\# and so on. |
47 Standard ML, F\#, Ocaml and so on. |
45 |
48 |
46 Writing programs in Scala can be done with the Eclipse IDE and |
49 Developing programs in Scala can be done with the Eclipse IDE |
47 also with IntelliJ, but for the small programs I will develop |
50 and also with IntelliJ IDE, but for the small programs I will |
48 the good old Emacs-editor is adequate for me and I will run |
51 develop the good old Emacs-editor is adequate for me and I |
49 the programs on the command line. One advantage of Scala over Java |
52 will run the programs on the command line. One advantage of |
50 is that it includes an interpreter (a REPL, or |
53 Scala over Java is that it includes an interpreter (a REPL, or |
51 Read-Eval-Print-Loop) with which you can run and test small |
54 Read-Eval-Print-Loop) with which you can run and test small |
52 code-snippets without the need of the compiler. This helps a |
55 code-snippets without the need of the compiler. This helps a |
53 lot with interactively developing programs. Once you installed |
56 lot with interactively developing programs. Once you installed |
54 Scala correctly, you can start the interpreter by typing |
57 Scala correctly, you can start the interpreter by typing |
55 |
58 |
56 |
59 |
|
60 \begin{quote} |
57 \begin{alltt} |
61 \begin{alltt} |
58 $ scala\small |
62 $ scala\small |
59 Welcome to Scala version 2.11.2 (Java HotSpot(TM) 64-Bit Server VM). |
63 Welcome to Scala version 2.11.2 (Java HotSpot(TM) 64-Bit Server VM). |
60 Type in expressions to have them evaluated. |
64 Type in expressions to have them evaluated. |
61 Type :help for more information.\normalsize |
65 Type :help for more information.\normalsize |
62 |
66 |
63 scala> |
67 scala> |
64 \end{alltt} |
68 \end{alltt} |
|
69 \end{quote} |
65 |
70 |
66 \noindent The precise response may vary due to the platform |
71 \noindent The precise response may vary due to the platform |
67 where you installed Scala. At the scala prompt you can type |
72 where you installed Scala. At the Scala prompt you can type |
68 things like {\tt 2 + 3} \keys{Ret} and the output will be |
73 things like {\tt 2 + 3} \keys{Ret} and the output will be |
69 |
74 |
|
75 \begin{quote} |
70 \begin{alltt} |
76 \begin{alltt} |
71 scala> 2 + 3 |
77 scala> 2 + 3 |
72 res0: Int = 5 |
78 res0: Int = 5 |
73 \end{alltt} |
79 \end{alltt} |
|
80 \end{quote} |
74 |
81 |
75 \noindent indicating that the result of the addition is of |
82 \noindent indicating that the result of the addition is of |
76 type {\tt Int} and the actual result is {\tt 5}. Another |
83 type {\tt Int} and the actual result is {\tt 5}. Another |
77 classic example you can try out is |
84 classic example you can try out is |
78 |
85 |
|
86 \begin{quote} |
79 \begin{alltt} |
87 \begin{alltt} |
80 scala> print ("hello world") |
88 scala> print ("hello world") |
81 hello world |
89 hello world |
82 \end{alltt} |
90 \end{alltt} |
83 |
91 \end{quote} |
84 \noindent Note that in this case there is no result: the |
92 |
|
93 \noindent Note that in this case there is no result. The |
85 reason is that {\tt print} does not actually produce a result |
94 reason is that {\tt print} does not actually produce a result |
86 (there is no {\tt resXX}), rather it is a function that causes |
95 (there is no {\tt resXX}), rather it is a function that causes |
87 the \emph{side-effect} of printing out a string. Once you are |
96 the \emph{side-effect} of printing out a string. Once you are |
88 more familiar with the functional programming-style, you will |
97 more familiar with the functional programming-style, you will |
89 know what the difference is between a function that returns a |
98 know what the difference is between a function that returns a |
90 result, like addition, and a function that causes a |
99 result, like addition, and a function that causes a |
91 side-effect, like {\tt print}. We shall come back to this |
100 side-effect, like {\tt print}. We shall come back to this |
92 point in due course, but if you are curious now, the latter |
101 point later, but if you are curious now, the latter kind of |
93 kind of functions always have as return type {\tt Unit}. |
102 functions always have as return type {\tt Unit}. |
94 |
103 |
95 If you want to write a stand-alone app, you can implement |
104 If you want to write a stand-alone app in Scala, you can |
96 an object that is an instance of {\tt App}, say |
105 implement an object that is an instance of {\tt App}, say |
97 |
106 |
98 \begin{quote} |
107 \begin{quote} |
99 \begin{lstlisting}[language=Scala] |
108 \begin{lstlisting}[language=Scala,numbers=none] |
100 object Hello extends App { |
109 object Hello extends App { |
101 println ("hello world") |
110 println ("hello world") |
102 } |
111 } |
103 \end{lstlisting} |
112 \end{lstlisting} |
104 \end{quote} |
113 \end{quote} |
105 |
114 |
106 \noindent save it in a file, say {\tt hellow-world.scala}, and |
115 \noindent save it in a file, say {\tt hellow-world.scala}, and |
107 then run the compiler and runtime environment: |
116 then run the compiler and runtime environment: |
108 |
117 |
|
118 \begin{quote} |
109 \begin{alltt} |
119 \begin{alltt} |
110 $ scalac hello-world.scala |
120 $ scalac hello-world.scala |
111 $ scala Hello |
121 $ scala Hello |
112 hello world |
122 hello world |
113 \end{alltt} |
123 \end{alltt} |
114 |
124 \end{quote} |
115 As mentioned above, Scala targets the JVM and |
125 |
116 consequently Scala programs can also be executed by the |
126 As mentioned above, Scala targets the JVM and consequently |
117 bog-standard Java runtime. This only requires the inclusion of |
127 Scala programs can also be executed by the bog-standard Java |
118 {\tt scala-library.jar}, which on my computer can be done as |
128 Runtime. This only requires the inclusion of {\tt |
|
129 scala-library.jar}, which on my computer can be done as |
119 follows: |
130 follows: |
120 |
131 |
|
132 \begin{quote} |
121 \begin{alltt} |
133 \begin{alltt} |
122 $ scalac hello-world.scala |
134 $ scalac hello-world.scala |
123 $ java -cp /usr/local/src/scala/lib/scala-library.jar:. Hello |
135 $ java -cp /usr/local/src/scala/lib/scala-library.jar:. Hello |
124 hello world |
136 hello world |
125 \end{alltt} |
137 \end{alltt} |
|
138 \end{quote} |
126 |
139 |
127 |
140 |
128 \subsection*{Inductive Datatypes} |
141 \subsection*{Inductive Datatypes} |
129 |
142 |
130 The elegance and conciseness of Scala programs are often a |
143 The elegance and conciseness of Scala programs are often a |
146 \noindent This grammar specifies what regular expressions are |
159 \noindent This grammar specifies what regular expressions are |
147 (essentially a kind of tree-structure with three kinds of |
160 (essentially a kind of tree-structure with three kinds of |
148 inner nodes---sequence, alternative and star---and three kinds |
161 inner nodes---sequence, alternative and star---and three kinds |
149 of leave nodes---null, empty and character). If you are |
162 of leave nodes---null, empty and character). If you are |
150 familiar with Java, it might be an instructive exercise to |
163 familiar with Java, it might be an instructive exercise to |
151 define this kind of inductive datatypes in |
164 define this kind of inductive datatypes in Java\footnote{Happy |
152 Java.\footnote{Happy programming! ;o)} |
165 programming! \Smiley} and then compare it how |
|
166 it can be defined in Scala. |
153 |
167 |
154 Implementing the regular expressions from above in Scala is |
168 Implementing the regular expressions from above in Scala is |
155 very simple: It first requires an \emph{abstract class}, say, |
169 actually very simple: It first requires an \emph{abstract |
156 {\tt Rexp}. This will act as the type for regular expressions. |
170 class}, say, {\tt Rexp}. This will act as the type for regular |
157 Second, it requires some instances. The cases for |
171 expressions. Second, it requires a case for each clause in the |
158 $\varnothing$ and $\epsilon$ do not have any arguments, while |
172 grammar. The cases for $\varnothing$ and $\epsilon$ do not |
159 in all the other cases we do have arguments. For example the |
173 have any arguments, while in all the other cases we do have |
160 character regular expression needs to take as an argument the |
174 arguments. For example the character regular expression needs |
161 character it is supposed to recognise. In Scala, the cases |
175 to take as an argument the character it is supposed to |
162 without arguments are called \emph{case objects}, while the |
176 recognise. In Scala, the cases without arguments are called |
163 ones with arguments are \emph{case classes}. The corresponding |
177 \emph{case objects}, while the ones with arguments are |
164 code is as follows: |
178 \emph{case classes}. The corresponding Scala code is as |
165 |
179 follows: |
166 \begin{quote} |
180 |
167 \begin{lstlisting}[language=Scala] |
181 \begin{quote} |
|
182 \begin{lstlisting}[language=Scala,numbers=none] |
168 abstract class Rexp |
183 abstract class Rexp |
169 case object NULL extends Rexp |
184 case object NULL extends Rexp |
170 case object EMPTY extends Rexp |
185 case object EMPTY extends Rexp |
171 case class CHAR (c: Char) extends Rexp |
186 case class CHAR (c: Char) extends Rexp |
172 case class SEQ (r1: Rexp, r2: Rexp) extends Rexp |
187 case class SEQ (r1: Rexp, r2: Rexp) extends Rexp |
176 \end{quote} |
191 \end{quote} |
177 |
192 |
178 \noindent In order to be an instance of {\tt Rexp}, each case |
193 \noindent In order to be an instance of {\tt Rexp}, each case |
179 object and case class needs to extend {\tt Rexp}. Given the |
194 object and case class needs to extend {\tt Rexp}. Given the |
180 grammar above, I hope you can see the underlying pattern. If |
195 grammar above, I hope you can see the underlying pattern. If |
181 you want to play further with such definitions, feel free to |
196 you want to play further with such definitions of inductive |
182 define for example binary trees. |
197 datatypes, feel free to define for example binary trees. |
183 |
198 |
184 Once you make a definition like the one above, you can |
199 Once you make a definition like the one above, you can |
185 represent, for example, the regular expression for $a + b$ in |
200 represent, for example, the regular expression for $a + b$ in |
186 Scala as {\tt ALT(CHAR('a'), CHAR('b'))}. Expressions such as |
201 Scala as {\tt ALT(CHAR('a'), CHAR('b'))}. Expressions such as |
187 {\tt 'a'} stand for ASCII characters. If you want to assign |
202 {\tt 'a'} stand for ASCII characters, though in the output |
188 this regular expression to a variable, you can use the keyword |
203 syntax the quotes are omitted. If you want to assign this |
189 {\tt val} and type |
204 regular expression to a variable, you can use the keyword {\tt |
190 |
205 val} and type |
|
206 |
|
207 \begin{quote} |
191 \begin{alltt} |
208 \begin{alltt} |
192 scala> val r = ALT(CHAR('a'), CHAR('b')) |
209 scala> val r = ALT(CHAR('a'), CHAR('b')) |
193 r: ALT = ALT(CHAR(a),CHAR(b)) |
210 r: ALT = ALT(CHAR(a),CHAR(b)) |
194 \end{alltt} |
211 \end{alltt} |
|
212 \end{quote} |
195 |
213 |
196 \noindent As you can see, in order to make such assignments, |
214 \noindent As you can see, in order to make such assignments, |
197 no constructor is required in the class (as in Java). However, |
215 no constructor is required in the class (as in Java). However, |
198 if there is the need for some non-standard initialisation, you |
216 if there is the need for some non-standard initialisation, you |
199 can of course define such a constructor in Scala. But we omit |
217 can of course define such a constructor in Scala too. But we |
200 this here. |
218 omit such ``tricks'' here. |
201 |
219 |
202 Note that Scala in its response says the variable {\tt r} is |
220 Note that Scala in its response says the variable {\tt r} is |
203 of type {\tt ALT}, not {\tt Rexp}. This might be a bit |
221 of type {\tt ALT}, not {\tt Rexp}. This might be a bit |
204 unexpected, but can be explained as follows: Scala always |
222 unexpected, but can be explained as follows: Scala always |
205 tries to find the most general type that is needed for a |
223 tries to find the most general type that is needed for a |
206 variable or expression, but does not ``over-generalise''. In |
224 variable or expression, but does not ``over-generalise''. In |
207 this case there is no need to give {\tt r} the more general |
225 our definition the type {\tt Rexp} is more general than {\tt |
208 type of {\tt Rexp}. This is different if you want to form a |
226 ALT}, since it is the abstract class. But in this case there |
209 list of regular expressions, for example |
227 is no need to give {\tt r} the more general type of {\tt |
210 |
228 Rexp}. This is different if you want to form a list of regular |
|
229 expressions, for example |
|
230 |
|
231 \begin{quote} |
211 \begin{alltt} |
232 \begin{alltt} |
212 scala> val ls = List(ALT(CHAR('a'), CHAR('b')), NULL) |
233 scala> val ls = List(ALT(CHAR('a'), CHAR('b')), NULL) |
213 ls: List[Rexp] = List(ALT(CHAR(a),CHAR(b)), NULL) |
234 ls: List[Rexp] = List(ALT(CHAR(a),CHAR(b)), NULL) |
214 \end{alltt} |
235 \end{alltt} |
215 |
236 \end{quote} |
216 \noindent In this case Scala needs to assign a common type to |
237 |
217 the regular expressions, so that it is compatible with the |
238 \noindent In this case, Scala needs to assign a common type to |
218 fact that lists can only contain elements of a single type, in |
239 the regular expressions so that it is compatible with the |
219 this case the type is {\tt Rexp}.\footnote{If you type in this |
240 fact that lists can only contain elements of a single type. In |
220 example, you will notice that the type contains some further |
241 this case the first common type is {\tt Rexp}.\footnote{If you |
221 information, but lets ignore this for the moment.} |
242 type in this example, you will notice that the type contains |
222 |
243 some further information, but lets ignore this for the |
223 For types like {\tt List[...]} the general rule is that when a |
244 moment.} |
224 type takes another type as argument, then this is written in |
245 |
225 angle-brackets. This can also contain nested types as in {\tt |
246 For compound types like {\tt List[...]}, the general rule is |
226 List[Set[Rexp]]}, which is a list of sets each of which |
247 that when a type takes another type as argument, then this |
227 contains regular expressions. |
248 argument type is written in angle-brackets. This can also |
|
249 contain nested types as in {\tt List[Set[Rexp]]}, which is a |
|
250 list of sets each of which contains regular expressions. |
228 |
251 |
229 \subsection*{Functions and Pattern-Matching} |
252 \subsection*{Functions and Pattern-Matching} |
230 |
253 |
231 I mentioned above that Scala is a very elegant programming |
254 I mentioned above that Scala is a very elegant programming |
232 language for the code we will write in this module. This |
255 language for the code we will write in this module. This |
233 elegance mainly stems from the fact that functions can be |
256 elegance mainly stems from the fact that in addition to |
234 implemented very easily in Scala. Lets first consider a |
257 inductive datatypes, also functions can be implemented very |
|
258 easily in Scala. To show you this, lets first consider a |
235 problem from number theory, called the \emph{Collatz-series}, |
259 problem from number theory, called the \emph{Collatz-series}, |
236 which corresponds to a famous unsolved problem in |
260 which corresponds to a famous unsolved problem in |
237 mathematics.\footnote{See for example |
261 mathematics.\footnote{See for example |
238 \url{http://mathworld.wolfram.com/CollatzProblem.html}.} |
262 \url{http://mathworld.wolfram.com/CollatzProblem.html}.} |
239 Mathematician define this series as: |
263 Mathematician define this series as: |
254 |
278 |
255 If we want to avoid the head-scratching, we could implement |
279 If we want to avoid the head-scratching, we could implement |
256 this as the following function in Scala: |
280 this as the following function in Scala: |
257 |
281 |
258 \begin{quote} |
282 \begin{quote} |
259 \lstinputlisting[language=Scala]{../progs/collatz.scala} |
283 \lstinputlisting[language=Scala,numbers=none]{../progs/collatz.scala} |
260 \end{quote} |
284 \end{quote} |
261 |
285 |
262 \noindent The keyword for function definitions is {\tt def} |
286 \noindent The keyword for function definitions is {\tt def} |
263 followed by the name of the function. After that you have a |
287 followed by the name of the function. After that you have a |
264 list of arguments (enclosed in parentheses and separated by |
288 list of arguments (enclosed in parentheses and separated by |
265 commas). Each argument in this list needs its type annotated. |
289 commas). Each argument in this list needs its type annotated. |
266 In this case we only have one argument, which is of type {\tt |
290 In this case we only have one argument, which is of type {\tt |
267 BigInt}. This type stands for arbitrary precision integers. |
291 BigInt}. This type stands in Scala for arbitrary precision |
268 After the arguments comes the type of what the function |
292 integers (in case you want to try out the function on really |
269 returns---a Boolean in this case for indicating that the |
293 big numbers). After the arguments comes the type of what the |
270 function has reached {\tt 1}. Finally, after the {\tt =} comes |
294 function returns---a Boolean in this case for indicating that |
271 the \emph{body} of the function implementing what the function |
295 the function has reached {\tt 1}. Finally, after the {\tt =} |
272 is supposed to do. What the {\tt collatz} function does should |
296 comes the \emph{body} of the function implementing what the |
273 be pretty self-explanatory: the function first tests whether |
297 function is supposed to do. What the {\tt collatz} function |
274 {\tt n} is equal to $1$ in which case it returns {\tt true} |
298 does should be pretty self-explanatory: the function first |
275 and so on. |
299 tests whether {\tt n} is equal to $1$ in which case it returns |
|
300 {\tt true} and so on. |
276 |
301 |
277 Notice a quirk in Scala's syntax for {\tt if}s: The condition |
302 Notice a quirk in Scala's syntax for {\tt if}s: The condition |
278 needs to be enclosed in parentheses and the then-case comes |
303 needs to be enclosed in parentheses and the then-case comes |
279 right after the condition---there is no {\tt then} keyword in |
304 right after the condition---there is no {\tt then} keyword in |
280 Scala. |
305 Scala. |
|
306 |
281 |
307 |
282 The real power of Scala comes, however, from the ability to |
308 The real power of Scala comes, however, from the ability to |
283 define functions by \emph{pattern matching}. In the {\tt |
309 define functions by \emph{pattern matching}. In the {\tt |
284 collatz} function above we need to test each case using a |
310 collatz} function above we need to test each case using a |
285 sequence of {\tt if}s. This can be very cumbersome and brittle |
311 sequence of {\tt if}s. This can be very cumbersome and brittle |
286 if there are many cases. If we wanted to define a function |
312 if there are many cases. If we wanted to define a function |
287 over regular expressions in say Java, which does not have |
313 over regular expressions in Java, for example, which does not |
288 pattern-matching, the resulting code would just be awkward. |
314 have pattern-matching, the resulting code would be just |
289 |
315 awkward. |
290 Mathematicians already use the power of pattern-matching, for |
316 |
291 example, when they define the function that takes a regular |
317 Mathematicians already use the power of pattern-matching when |
292 expression and produces another regular expression that can |
318 they define the function that takes a regular expression and |
293 recognise the reversed strings. The resulting recursive |
319 produces another regular expression that can recognise the |
294 function is often defined as follows: |
320 reversed strings. The resulting recursive function is often |
|
321 defined as follows: |
295 |
322 |
296 \begin{center} |
323 \begin{center} |
297 \begin{tabular}{r@{\hspace{1mm}}c@{\hspace{1mm}}l} |
324 \begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l} |
298 $rev(\varnothing)$ & $\dn$ & $\varnothing$\\ |
325 $rev(\varnothing)$ & $\dn$ & $\varnothing$\\ |
299 $rev(\epsilon)$ & $\dn$ & $\epsilon$\\ |
326 $rev(\epsilon)$ & $\dn$ & $\epsilon$\\ |
300 $rev(c)$ & $\dn$ & $c$\\ |
327 $rev(c)$ & $\dn$ & $c$\\ |
301 $rev(r_1 + r_2)$ & $\dn$ & $rev(r_1) + rev(r_2)$\\ |
328 $rev(r_1 + r_2)$ & $\dn$ & $rev(r_1) + rev(r_2)$\\ |
302 $rev(r_1 \cdot r_2)$ & $\dn$ & $rev(r_2) \cdot rev(r_1)$\\ |
329 $rev(r_1 \cdot r_2)$ & $\dn$ & $rev(r_2) \cdot rev(r_1)$\\ |
303 $rev(r^*)$ & $\dn$ & $rev(r)^*$\\ |
330 $rev(r^*)$ & $\dn$ & $rev(r)^*$\\ |
304 \end{tabular} |
331 \end{tabular} |
305 \end{center} |
332 \end{center} |
306 |
333 |
307 \noindent The corresponding Scala code looks very similar |
334 \noindent This function is defined by recursion analysing each |
308 to this definition, thanks to pattern-matching. |
335 pattern of what the regular expression could look like. The |
|
336 corresponding Scala code looks very similar to this |
|
337 definition, thanks to pattern-matching. |
309 |
338 |
310 |
339 |
311 \begin{quote} |
340 \begin{quote} |
312 \lstinputlisting[language=Scala]{../progs/rev.scala} |
341 \lstinputlisting[language=Scala]{../progs/rev.scala} |
313 \end{quote} |
342 \end{quote} |
314 |
343 |
315 \noindent The keyword for starting a pattern-match is {\tt |
344 \noindent The keyword for starting a pattern-match is {\tt |
316 match} followed by a list of {\tt case}s. Before the match |
345 match} followed by a list of {\tt case}s. Before the match |
317 can be another pattern, but often as in the case above, |
346 keyword can be another pattern, but often as in the case |
318 it is just a variable you want to pattern-match. |
347 above, it is just a variable you want to pattern-match |
319 |
348 (the {\tt r} after {\tt =} in Line 1). |
320 In the {\tt rev}-function above, each case follows the |
349 |
321 structure of how we defined regular expressions as inductive |
350 Each case in this definition follows the structure of how we |
322 datatype. For example the case in Line 3 you can read as: if |
351 defined regular expressions as inductive datatype. For example |
323 the regular expression {\tt r} is of the form {\tt EMPTY} then |
352 the case in Line 3 you can read as: if the regular expression |
324 do whatever follows the {\tt =>} (in this case just return |
353 {\tt r} is of the form {\tt EMPTY} then do whatever follows |
325 {\tt EMPTY}). Line 5 reads as: if the regular expression |
354 the {\tt =>} (in this case just return {\tt EMPTY}). Line 5 |
326 {\tt r} is of the form {\tt ALT(r1, r2)}, where the |
355 reads as: if the regular expression {\tt r} is of the form |
327 left-branch of the alternative is matched by the variable {\tt |
356 {\tt ALT(r1, r2)}, where the left-branch of the alternative is |
328 r1} and the right-branch by {\tt r2} then do ``something''. |
357 matched by the variable {\tt r1} and the right-branch by {\tt |
329 The ``something'' can now use the variables {\tt r1} and {\tt |
358 r2} then do ``something''. The ``something'' can now use the |
330 r2} from the match. |
359 variables {\tt r1} and {\tt r2} from the match. |
331 |
360 |
332 If you want to play with this function, call, it for |
361 If you want to play with this function, call it for example |
333 example, with the regular expression $ab + ac$. This regular |
362 with the regular expression $ab + ac$. This regular expression |
334 expression can recognise the strings $ab$ and $ac$. The |
363 can recognise the strings $ab$ and $ac$. The function {\tt |
335 function {\tt rev} produces the result $ba + ca$, which |
364 rev} produces $ba + ca$, which can recognise the reversed |
336 can recognise the reversed strings $ba$ and $ca$. |
365 strings $ba$ and $ca$. |
337 |
366 |
338 In Scala each pattern-match can also be guarded as in |
367 In Scala each pattern-match can also be guarded as in |
339 |
368 |
340 \begin{quote} |
369 \begin{quote} |
341 \begin{lstlisting}[language=Scala, numbers=none] |
370 \begin{lstlisting}[language=Scala, numbers=none] |
363 programming called \emph{maps}. To illustrate how they work, |
392 programming called \emph{maps}. To illustrate how they work, |
364 lets assume you have a list of numbers from 1 to 10 and want to |
393 lets assume you have a list of numbers from 1 to 10 and want to |
365 build the list of squares. The list of numbers from 1 to 10 |
394 build the list of squares. The list of numbers from 1 to 10 |
366 can be constructed in Scala as follows: |
395 can be constructed in Scala as follows: |
367 |
396 |
|
397 \begin{quote} |
368 \begin{alltt} |
398 \begin{alltt} |
369 scala> (1 to 10).toList |
399 scala> (1 to 10).toList |
370 res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) |
400 res1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) |
371 \end{alltt} |
401 \end{alltt} |
|
402 \end{quote} |
372 |
403 |
373 \noindent Generating from this list the list of squares in a |
404 \noindent Generating from this list the list of squares in a |
374 non-functional language (e.g.~Java), you would assume the list |
405 non-functional programming language (e.g.~Java), you would |
375 is given as a kind of array. You would then iterate, or loop, |
406 assume the list is given as a kind of array. You would then |
376 an index over this array and replace each entry in the array |
407 iterate, or loop, an index over this array and replace each |
377 by the square. Right? In Scala, and in other functional |
408 entry in the array by the square. Right? In Scala, and in |
378 programming languages, you use maps to achieve the same. |
409 other functional programming languages, you use maps to |
|
410 achieve the same. |
379 |
411 |
380 Maps essentially take a function that describes how each |
412 Maps essentially take a function that describes how each |
381 element is transformed (for example squaring) and a list over |
413 element is transformed (for example squaring) and a list over |
382 which this function should work. There are two forms to |
414 which this function should work. There are two forms to |
383 express maps in Scala. The first way is in a {\tt |
415 express such maps in Scala. The first way is in a {\tt |
384 for}-construction. Squaring the numbers from 1 to 10 would |
416 for}-construction. Squaring the numbers from 1 to 10 would |
385 look in this form as follows: |
417 look in this form as follows: |
386 |
418 |
|
419 \begin{quote} |
387 \begin{alltt} |
420 \begin{alltt} |
388 scala> for (n <- (1 to 10).toList) yield n * n |
421 scala> for (n <- (1 to 10).toList) yield n * n |
389 res2: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100) |
422 res2: List[Int] = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100) |
390 \end{alltt} |
423 \end{alltt} |
|
424 \end{quote} |
391 |
425 |
392 \noindent The keywords are {\tt for} and {\tt yield}. This |
426 \noindent The keywords are {\tt for} and {\tt yield}. This |
393 {\tt for}-construction roughly says that from the list of |
427 {\tt for}-construction roughly says that from the list of |
394 numbers we draw {\tt n}s and compute the result of {\tt n * |
428 numbers we draw {\tt n}s and compute the result of {\tt n * |
395 n}. In consequence we specified the list where each {\tt n} |
429 n}. As you can see, we specified the list where each {\tt n} |
396 comes from, namely {\tt (1 to 10).toList}, and how each |
430 comes from, namely {\tt (1 to 10).toList}, and how each |
397 element needs to be transformed. This can also be expressed |
431 element needs to be transformed. This can also be expressed in |
398 in a second way by using directly {\tt map} as follows: |
432 a second way in Scala by using directly {\tt map} as follows: |
399 |
433 |
|
434 \begin{quote} |
400 \begin{alltt} |
435 \begin{alltt} |
401 scala> (1 to 10).toList.map(n => n * n) |
436 scala> (1 to 10).toList.map(n => n * n) |
402 res3 = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100) |
437 res3 = List(1, 4, 9, 16, 25, 36, 49, 64, 81, 100) |
403 \end{alltt} |
438 \end{alltt} |
404 |
439 \end{quote} |
405 \noindent In this way the expression {\tt n => n * n} stands |
440 |
|
441 \noindent In this way, the expression {\tt n => n * n} stands |
406 for the function that calculates the square (this is how the |
442 for the function that calculates the square (this is how the |
407 {\tt n}s are transformed). This expression for functions might |
443 {\tt n}s are transformed). This expression for functions might |
408 remind you on your lessons about the lambda-calculus where |
444 remind you of your lessons about the lambda-calculus where |
409 this would have been written as $\lambda n.\,n * n$. |
445 this would have been written as $\lambda n.\,n * n$. It might |
410 |
446 not be obvious, but {\tt for}-constructions are just syntactic |
411 It might not be obvious, but {\tt for}-constructions are just |
447 sugar: when compiling, Scala translates {\tt |
412 syntactic sugar: when compiling, Scala translates them into |
448 for}-constructions into equivalent maps. |
413 equivalent maps. |
|
414 |
449 |
415 |
450 |
416 The very charming feature of Scala is that such maps or {\tt |
451 The very charming feature of Scala is that such maps or {\tt |
417 for}-constructions can be written for any kind of data |
452 for}-constructions can be written for any kind of data |
418 collection, such as lists, sets, vectors and so on. For |
453 collection, such as lists, sets, vectors and so on. For |
419 example if we instead compute the reminders modulo $3$ of this |
454 example if we instead compute the reminders modulo $3$ of this |
420 list, we can write |
455 list, we can write |
421 |
456 |
|
457 \begin{quote} |
422 \begin{alltt} |
458 \begin{alltt} |
423 scala> (1 to 10).toList.map(n => n \% 3) |
459 scala> (1 to 10).toList.map(n => n \% 3) |
424 res4 = List(1, 2, 0, 1, 2, 0, 1, 2, 0, 1) |
460 res4 = List(1, 2, 0, 1, 2, 0, 1, 2, 0, 1) |
425 \end{alltt} |
461 \end{alltt} |
|
462 \end{quote} |
426 |
463 |
427 \noindent If we, however, transform the numbers 1 to 10 not |
464 \noindent If we, however, transform the numbers 1 to 10 not |
428 into a list, but into a set, and then compute the reminders |
465 into a list, but into a set, and then compute the reminders |
429 modulo $3$ we obtain |
466 modulo $3$ we obtain |
430 |
467 |
|
468 \begin{quote} |
431 \begin{alltt} |
469 \begin{alltt} |
432 scala> (1 to 10).toSet[Int].map(n => n \% 3) |
470 scala> (1 to 10).toSet[Int].map(n => n \% 3) |
433 res5 = Set(2, 1, 0) |
471 res5 = Set(2, 1, 0) |
434 \end{alltt} |
472 \end{alltt} |
|
473 \end{quote} |
435 |
474 |
436 \noindent This is the correct result for sets, as there are |
475 \noindent This is the correct result for sets, as there are |
437 only 3 equivalence classes of integers modulo 3. Note that we |
476 only three equivalence classes of integers modulo 3. Note that |
438 need to ``help'' Scala in this example to transform the |
477 in this example we need to ``help'' Scala to transform the |
439 numbers into a set of integers by explicitly annotating the |
478 numbers into a set of integers by explicitly annotating the |
440 type {\tt Int}. Since maps and {\tt for}-construction are just |
479 type {\tt Int}. Since maps and {\tt for}-constructions are |
441 syntactic variants of each other, the latter can also be |
480 just syntactic variants of each other, the latter can also be |
442 written as |
481 written as |
443 |
482 |
|
483 \begin{quote} |
444 \begin{alltt} |
484 \begin{alltt} |
445 scala> for (n <- (1 to 10).toSet[Int]) yield n \% 3 |
485 scala> for (n <- (1 to 10).toSet[Int]) yield n \% 3 |
446 res5 = Set(2, 1, 0) |
486 res5 = Set(2, 1, 0) |
447 \end{alltt} |
487 \end{alltt} |
|
488 \end{quote} |
448 |
489 |
449 While hopefully this all looks reasonable, there is one |
490 While hopefully this all looks reasonable, there is one |
450 complication: In the examples above we always wanted to |
491 complication: In the examples above we always wanted to |
451 transform one list into another list (e.g.~list of squares), |
492 transform one list into another list (e.g.~list of squares), |
452 or one set into another set (set of numbers into set of |
493 or one set into another set (set of numbers into set of |
453 reminders modulo 3). What happens if we just want to print out |
494 reminders modulo 3). What happens if we just want to print out |
454 a list of integers? Then actually the {\tt for}-construction |
495 a list of integers? Then actually the {\tt for}-construction |
455 needs to be modified. The reason is that {\tt print}, you |
496 needs to be modified. The reason is that {\tt print}, you |
456 guessed it, does not produce any result, but only produces, in |
497 guessed it, does not produce any result, but only produces |
457 the functional-programming-lingo, a side-effect. Printing out |
498 what is in the functional-programming-lingo called a |
458 the list of numbers from 1 to 5 would look as follows |
499 side-effect. Printing out the list of numbers from 1 to 5 |
459 |
500 would look as follows |
|
501 |
|
502 \begin{quote} |
460 \begin{alltt} |
503 \begin{alltt} |
461 scala> for (n <- (1 to 5).toList) println(n) |
504 scala> for (n <- (1 to 5).toList) println(n) |
462 1 |
505 1 |
463 2 |
506 2 |
464 3 |
507 3 |
465 4 |
508 4 |
466 5 |
509 5 |
467 \end{alltt} |
510 \end{alltt} |
|
511 \end{quote} |
468 |
512 |
469 \noindent |
513 \noindent |
470 where you need to omit the keyword {\tt yield}. You can |
514 where you need to omit the keyword {\tt yield}. You can |
471 also do more elaborate calculations such as |
515 also do more elaborate calculations such as |
472 |
516 |
|
517 \begin{quote} |
473 \begin{alltt} |
518 \begin{alltt} |
474 scala> for (n <- (1 to 5).toList) \{ |
519 scala> for (n <- (1 to 5).toList) \{ |
475 val square_n = n * n |
520 val square_n = n * n |
476 println(s"$n * $n = $square_n") |
521 println(s"$n * $n = $square_n") |
477 \} |
522 \} |
542 In Scala, types are needed whenever you define an inductive |
596 In Scala, types are needed whenever you define an inductive |
543 datatype and also whenever you define functions (their |
597 datatype and also whenever you define functions (their |
544 arguments and their results need a type). Base types are types |
598 arguments and their results need a type). Base types are types |
545 that do not take any (type)arguments, for example {\tt Int} |
599 that do not take any (type)arguments, for example {\tt Int} |
546 and {\tt String}. Compound types take one or more arguments, |
600 and {\tt String}. Compound types take one or more arguments, |
547 which need to be given in angle-brackets, for example {\tt |
601 which as seen earlier need to be given in angle-brackets, for |
548 List[Int]} or {\tt Set[List[String]]} or {\tt Map[Int, Int]}. |
602 example {\tt List[Int]} or {\tt Set[List[String]]} or {\tt |
549 |
603 Map[Int, Int]}. |
550 There are e few special type-constructors. One is for tuples, |
604 |
551 which is written with parentheses. For example {\tt (Int, Int, |
605 There are a few special type-constructors that fall outside |
552 String)} for a triple consisting of two integers and a string. |
606 this pattern. One is for tuples, where the type is written |
553 Tuples are helpful if you want to define a function with |
607 with parentheses. For example {\tt (Int, Int, String)} for a |
554 multiple results, say the function returning the quotient and |
608 triple consisting of two integers and a string. Tuples are |
555 reminder of two numbers. For this you might define: |
609 helpful if you want to define functions with multiple |
556 |
610 results, say the function returning the quotient and reminder |
557 \begin{alltt} |
611 of two numbers. For this you might define: |
558 def quo_rem(m: Int, n: Int) : (Int, Int) = (m \verb|/| n, m \% n) |
612 |
559 \end{alltt} |
613 \begin{quote} |
|
614 \begin{lstlisting}[language=Scala, numbers=none] |
|
615 def quo_rem(m: Int, n: Int) : (Int, Int) = (m / n, m \% n) |
|
616 \end{lstlisting} |
|
617 \end{quote} |
560 |
618 |
561 \noindent |
619 \noindent |
562 Since this function returns a pair of integers, its type |
620 Since this function returns a pair of integers, its type |
563 needs to be {\tt (Int, Int)}. |
621 needs to be {\tt (Int, Int)}. |
564 |
622 |
565 Another special type-constructor is for functions, written |
623 Another special type-constructor is for functions, written |
566 {\tt =>}. For example, the type {\tt Int => String} stands |
624 as the arrow {\tt =>}. For example, the type {\tt Int => |
567 for a function that takes an integer as argument and produces |
625 String} is for a function that takes an integer as argument |
568 a string. A function of this type is for instance |
626 and produces a string. A function of this type is for instance |
569 |
627 |
570 \begin{quote} |
628 \begin{quote} |
571 \begin{lstlisting}[language=Scala] |
629 \begin{lstlisting}[language=Scala,numbers=none] |
572 def mk_string(n: Int) : String = n match { |
630 def mk_string(n: Int) : String = n match { |
573 case 0 => "zero" |
631 case 0 => "zero" |
574 case 1 => "one" |
632 case 1 => "one" |
575 case 2 => "two" |
633 case 2 => "two" |
576 case _ => "many" |
634 case _ => "many" |
577 } |
635 } |
578 \end{lstlisting} |
636 \end{lstlisting} |
579 \end{quote} |
637 \end{quote} |
580 |
638 |
581 \noindent Unlike other functional programming languages, there |
639 \noindent Unlike other functional programming languages, there |
582 is no easy way to find out the types of existing functions |
640 is in Scala no easy way to find out the types of existing |
583 except for looking into the documentation |
641 functions, except by looking into the documentation |
584 |
642 |
585 \begin{quote} |
643 \begin{quote} |
586 \url{http://www.scala-lang.org/api/current/} |
644 \url{http://www.scala-lang.org/api/current/} |
587 \end{quote} |
645 \end{quote} |
588 |
646 |
589 \noindent The function arrow can also be iterated, as in {\tt |
647 The function arrow can also be iterated, as in {\tt |
590 Int => String => Boolean}. This is the type for a function |
648 Int => String => Boolean}. This is the type for a function |
591 taking an integer as first argument and a string as second, |
649 taking an integer as first argument and a string as second, |
592 and the result of the function is a boolean. Though silly, a |
650 and the result of the function is a boolean. Though silly, a |
593 function of this type is |
651 function of this type would be |
594 |
652 |
595 \begin{quote} |
653 \begin{quote} |
596 \begin{lstlisting}[language=Scala] |
654 \begin{lstlisting}[language=Scala,numbers=none] |
597 def chk_string(n: Int, s: String) : Boolean = |
655 def chk_string(n: Int, s: String) : Boolean = |
598 mk_string(n) == s |
656 mk_string(n) == s |
599 \end{lstlisting} |
657 \end{lstlisting} |
600 \end{quote} |
658 \end{quote} |
601 |
659 |
602 \noindent |
660 \noindent which checks whether the integer {\tt n} corresponds |
|
661 to the name {\tt s} given by the function {\tt mk\_string}. |
|
662 |
|
663 Coming back to the type {\tt Int => String => Boolean}. The |
|
664 rule about such function types is that the right-most type |
|
665 specifies what the function returns (a boolean in this case). |
|
666 The types before that specify how many arguments the function |
|
667 expects and what is their type (in this case two arguments, |
|
668 one of type {\tt Int} and another of type {\tt String}). Given |
|
669 this rule, what kind of function has type \mbox{\tt (Int => |
|
670 String) => Boolean}? Well, it returns a boolean. More |
|
671 interestingly, though, it only takes a single argument |
|
672 (because of the parentheses). The single argument happens to |
|
673 be another function (taking an integer as input and returning |
|
674 a string). |
|
675 |
|
676 Now you might ask, what is the point of having function as |
|
677 arguments to other functions? In Java there is no need of this |
|
678 kind of feature. But in all functional programming languages, |
|
679 including Scala, it is really essential. Above you already |
|
680 seen {\tt map} and {\tt foreach} which need this. Consider |
|
681 the functions {\tt print} and {\tt println}, which both |
|
682 print out strings, but the latter adds a line break. You can |
|
683 call {\tt foreach} with either of them and thus changing how, |
|
684 for example, five numbers are printed. |
|
685 |
|
686 \begin{quote} |
|
687 \begin{alltt} |
|
688 scala> (1 to 5).toList.foreach(print) |
|
689 12345 |
|
690 scala> (1 to 5).toList.foreach(println) |
|
691 1 |
|
692 2 |
|
693 3 |
|
694 4 |
|
695 5 |
|
696 \end{alltt} |
|
697 \end{quote} |
|
698 |
|
699 \noindent This is actually one of the main design principles |
|
700 in functional programming. You have generic functions like |
|
701 {\tt map} and {\tt foreach} that can traverse data containers, |
|
702 like lists or sets. They then take a function to specify what |
|
703 should be done with each element during the traversal. This |
|
704 requires that the generic traversal functions can cope with |
|
705 any kind of function (not just functions that, for example, |
|
706 take as input an integer and produce a string like above). |
|
707 This means we cannot fix the type of the generic traversal |
|
708 functions, but have to keep them |
|
709 \emph{polymorphic}.\footnote{Another interestic topic about |
|
710 types, but we omit it here for the sake of brevity.} |
|
711 |
|
712 There is one more type constructor that is rather special. It |
|
713 is called {\tt Unit}. Recall that {\tt Boolean} has two |
|
714 values, namely {\tt true} and {\tt false}. This can be used, |
|
715 for example, to test something and decide whether the test |
|
716 succeeds or not. In contrast the type {\tt Unit} has only a |
|
717 single value, written {\tt ()}. This seems like a completely |
|
718 useless type and return value for a function, but is actually |
|
719 quite useful. It indicates when the function does not return |
|
720 any result. The purpose of these functions is to cause |
|
721 something being written on the screen or written into a file, |
|
722 for example. This is what is called they cause some effect on |
|
723 the side, namely a new content displayed on the screen or some |
|
724 new data in a file. Scala uses the {\tt Unit} type to indicate |
|
725 that a function does not have a result, but potentially causes |
|
726 some side-effect. Typical examples are the printing functions, |
|
727 like {\tt print}. |
|
728 |
603 |
729 |
604 \subsection*{Cool Stuff} |
730 \subsection*{Cool Stuff} |
605 |
731 |
|
732 The first wow-moment I had with Scala when I came across the |
|
733 following code-snippet for reading a web-page. |
|
734 |
|
735 \begin{quote} |
|
736 \begin{lstlisting}[language=Scala, numbers=none] |
|
737 import io.Source |
|
738 val url = """http://www.inf.kcl.ac.uk/staff/urbanc/""" |
|
739 Source.fromURL(url).take(10000).mkString |
|
740 \end{lstlisting} |
|
741 \end{quote} |
|
742 |
|
743 \noindent These three lines return a string containing the |
|
744 HTML-code of my webpage. It actually already does something |
|
745 more sophisticated, namely only returns the first 10000 |
|
746 characters of a webpage in case a ``webpage'' is too large. |
|
747 Why is that code-snippet of any interest? Well, try |
|
748 implementing reading from a webpage in Java. I also like the |
|
749 possibility of triple-quoting strings, which I have only seen |
|
750 in Scala so far. The idea behind this is that in such a string |
|
751 all characters are interpreted literally---there are no |
|
752 escaped characters, like \verb|\n| for newlines. |
|
753 |
|
754 My second wow-moment I had with a feature of Scala that other |
|
755 functional programming languages do not have. This feature is |
|
756 about implicit type conversions. If you have regular |
|
757 expressions and want to use them for language processing you |
|
758 often want to recognise keywords in a language, for example |
|
759 {\tt for}, {\tt if}, {\tt yield} and so on. But the basic |
|
760 regular expression, {\tt CHAR}, can only recognise a single |
|
761 character. In order to recognise a whole string, like {\tt |
|
762 for}, you have to put many of those together using {\tt SEQ}: |
|
763 |
|
764 \begin{quote} |
|
765 \begin{alltt} |
|
766 SEQ(CHAR('f'), SEQ(CHAR('o'), CHAR('r'))) |
|
767 \end{alltt} |
|
768 \end{quote} |
|
769 |
|
770 \noindent This gets quickly unreadable when the strings and |
|
771 regular expressions get more complicated. In other functional |
|
772 programming language, you can explicitly write a conversion |
|
773 function that takes a string, say {\tt for}, and generates the |
|
774 regular expression above. But then your code is littered with |
|
775 such conversion function. |
|
776 |
|
777 In Scala you can do better by ``hiding'' the conversion |
|
778 functions. The keyword for doing this is {\tt implicit}. |
|
779 Consider the code |
|
780 |
|
781 \begin{quote} |
|
782 \begin{lstlisting}[language=Scala] |
|
783 import scala.language.implicitConversions |
|
784 |
|
785 def charlist2rexp(s: List[Char]) : Rexp = s match { |
|
786 case Nil => EMPTY |
|
787 case c::Nil => CHAR(c) |
|
788 case c::s => SEQ(CHAR(c), charlist2rexp(s)) |
|
789 } |
|
790 |
|
791 implicit def string2rexp(s: String) : Rexp = |
|
792 charlist2rexp(s.toList) |
|
793 \end{lstlisting} |
|
794 \end{quote} |
|
795 |
|
796 \noindent where the first seven lines implement a function |
|
797 that given a list of characters generates the corresponding |
|
798 regular expression. In Lines 9 and 10, this function is used |
|
799 for transforming a string into a regular expression. Since the |
|
800 {\tt string2rexp}-function is declared as {\tt implicit} the |
|
801 effect will be that whenever Scala expects a regular |
|
802 expression, but I only give it a string, it will automatically |
|
803 insert a call to the {\tt string2rexp}-function. I can now |
|
804 write for example |
|
805 |
|
806 \begin{quote} |
|
807 \begin{alltt} |
|
808 scala> ALT("ab", "ac") |
|
809 res9: ALT = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) |
|
810 \end{alltt} |
|
811 \end{quote} |
|
812 |
|
813 |
|
814 Using implicit definitions, Scala allows me to introduce |
|
815 some further syntactic sugar for regular expressions: |
|
816 |
|
817 \begin{quote} |
|
818 \begin{lstlisting}[language=Scala] |
|
819 implicit def RexpOps(r: Rexp) = new { |
|
820 def | (s: Rexp) = ALT(r, s) |
|
821 def ~ (s: Rexp) = SEQ(r, s) |
|
822 def % = STAR(r) |
|
823 } |
|
824 |
|
825 implicit def stringOps(s: String) = new { |
|
826 def | (r: Rexp) = ALT(s, r) |
|
827 def | (r: String) = ALT(s, r) |
|
828 def ~ (r: Rexp) = SEQ(s, r) |
|
829 def ~ (r: String) = SEQ(s, r) |
|
830 def % = STAR(s) |
|
831 |
|
832 } |
|
833 \end{lstlisting} |
|
834 \end{quote} |
|
835 |
|
836 \noindent This might seem a bit complicated, but its effect is |
|
837 that I can now write regular expressions such as $ab + ac$ |
|
838 even simpler as |
|
839 |
|
840 \begin{quote} |
|
841 \begin{alltt} |
|
842 scala> "ab" | "ac" |
|
843 res10: ALT = ALT(SEQ(CHAR(a),CHAR(b)),SEQ(CHAR(a),CHAR(c))) |
|
844 \end{alltt} |
|
845 \end{quote} |
|
846 |
|
847 \noindent I leave you to figure out what the other |
|
848 syntactic sugar in the code above stands for. |
|
849 |
|
850 One more useful feature of Scala is the ability to define |
|
851 functions with variable argument lists. This is a feature that |
|
852 is already present in old languages, like C, but seems to have |
|
853 been forgotten in the meantime---Java does not have it. In the |
|
854 context of regular expressions this feature comes in handy: |
|
855 Say you are fed up with writing many alternatives as |
|
856 |
|
857 \begin{quote} |
|
858 \begin{alltt} |
|
859 ALT(..., ALT(..., ALT(..., ...))) |
|
860 \end{alltt} |
|
861 \end{quote} |
|
862 |
|
863 \noindent To make it difficult, you do not know how deep such |
|
864 alternatives are nested. So you need something flexible that |
|
865 can take as many alternatives as needed. In Scala one can |
|
866 achieve this by adding a {\tt *} to the type of an argument. |
|
867 Consider the code |
|
868 |
|
869 \begin{quote} |
|
870 \begin{lstlisting}[language=Scala] |
|
871 def Alts(rs: List[Rexp]) : Rexp = rs match { |
|
872 case Nil => NULL |
|
873 case r::Nil => r |
|
874 case r::rs => ALT(r, Alts(rs)) |
|
875 } |
|
876 |
|
877 def ALTS(rs: Rexp*) = Alts(rs.toList) |
|
878 \end{lstlisting} |
|
879 \end{quote} |
|
880 |
|
881 \noindent The function in Lines 1 to 5 takes a list of regular |
|
882 expressions and converts it into an appropriate alternative |
|
883 regular expression. In Line 7 there is a wrapper for this |
|
884 function which uses the feature of varying argument lists. The |
|
885 effect of this code is that I can write the regular |
|
886 expression for keywords as |
|
887 |
|
888 \begin{quote} |
|
889 \begin{alltt} |
|
890 ALTS("for", "def", "yield", "implicit", "if", "match", "case") |
|
891 \end{alltt} |
|
892 \end{quote} |
|
893 |
|
894 \noindent Again I leave you to it how much this simplifies the |
|
895 regular expression in comparison if I had to write this by |
|
896 hand using only the ``plain'' regular expressions from the |
|
897 inductive datatype. |
|
898 |
606 \subsection*{More Info} |
899 \subsection*{More Info} |
|
900 |
|
901 |
607 |
902 |
608 \end{document} |
903 \end{document} |
609 |
904 |
610 %%% Local Variables: |
905 %%% Local Variables: |
611 %%% mode: latex |
906 %%% mode: latex |