coursework/cw02.tex
author Christian Urban <christian dot urban at kcl dot ac dot uk>
Mon, 19 Oct 2015 23:49:25 +0100
changeset 358 b3129cff41e9
parent 333 8890852e18b7
child 363 0d6deecdb2eb
permissions -rw-r--r--
updated
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     1
\documentclass{article}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
     2
\usepackage{../style}
216
f5ec7c597c5b updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 200
diff changeset
     3
\usepackage{../langs}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     4
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     5
\begin{document}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     6
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
     7
\section*{Coursework 2 (Strand 1)}
198
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 182
diff changeset
     8
358
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
     9
\noindent This coursework is worth 5\% and is due on 6
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    10
November at 16:00. You are asked to implement the Sulzmann \&
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    11
Lu tokeniser for the WHILE language. You can do the
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    12
implementation in any programming language you like, but you
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    13
need to submit the source code with which you answered the
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    14
questions, otherwise a mark of 0\% will be awarded. However,
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    15
the coursework will \emph{only} be judged according to the
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    16
answers. You can submit your answers in a txt-file or as pdf.
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    17
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    18
\subsection*{Disclaimer}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    19
358
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    20
It should be understood that the work you submit represents
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    21
your own effort. You have not copied from anyone else. An
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    22
exception is the Scala code I showed during the lectures,
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    23
which you can use. You can also use your own code from the
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    24
CW~1.
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    25
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    26
\subsection*{Question 1 (marked with 1\%)}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    27
358
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    28
To implement a tokeniser for the WHILE language, you first
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    29
need to design the appropriate regular expressions for the
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    30
following eight syntactic entities:
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    31
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    32
\begin{enumerate}
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    33
\item keywords are
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    34
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    35
\begin{quote}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    36
\texttt{while}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    37
\texttt{if}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    38
\texttt{then}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    39
\texttt{else}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    40
\texttt{do}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    41
\texttt{for}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    42
\texttt{to}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    43
\texttt{true}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    44
\texttt{false}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    45
\texttt{read}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    46
\texttt{write},
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    47
\texttt{skip}
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    48
\end{quote} 
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    49
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    50
\item operators are
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    51
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    52
\begin{quote}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    53
\texttt{+}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    54
\texttt{-}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    55
\texttt{*}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    56
\texttt{\%},
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    57
\texttt{/},
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    58
\texttt{==}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    59
\texttt{!=}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    60
\texttt{>}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    61
\texttt{<}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    62
\texttt{:=},
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    63
\texttt{\&\&},
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    64
\texttt{||}
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    65
\end{quote} 
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    66
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    67
\item strings are enclosed by \texttt{"\ldots"} 
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    68
\item parentheses are \texttt{(}, \texttt{\{}, \texttt{)} and \texttt{\}}
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    69
\item there are semicolons \texttt{;}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    70
\item whitespaces are either \texttt{" "} (one or more) or \texttt{$\backslash$n}
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    71
\item identifiers are letters followed by underscores \texttt{\_\!\_}, letters
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    72
or digits
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    73
\item numbers are \texttt{0}, \text{1}, \ldots
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    74
\end{enumerate}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    75
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    76
\noindent
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    77
You can use the basic regular expressions 
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    78
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    79
\[
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    80
\varnothing, \epsilon, c, r_1 + r_2, r_1 \cdot r_2, r^*
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    81
\]
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    82
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    83
\noindent
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    84
but also the following extended regular expressions
182
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
    85
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
    86
\begin{center}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    87
\begin{tabular}{ll}
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    88
$[c_1 c_2 \ldots c_n]$ & a range of characters\\
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    89
$r^+$ & one or more times $r$\\
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    90
$r^?$ & optional $r$\\
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    91
$r^{\{n\}}$ & n-times $r$\\
182
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
    92
\end{tabular}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
    93
\end{center}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
    94
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
    95
\noindent
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    96
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    97
\subsection*{Question 2 (marked with 3\%)}
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    98
358
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    99
Implement the Sulzmann \& Lu tokeniser from the lectures. For
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
   100
this you need to implement the functions $nullable$ and $der$
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
   101
(you can use your code from CW 1), as well as $mkeps$ and
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
   102
$inj$. These functions need to be appropriately extended for
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
   103
the extended regular expressions from Q1. Also add the record
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
   104
regular expression from the lectures and implement a function,
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
   105
say \pcode{env}, that returns all assignments from a value
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
   106
(such that you can extract easily the tokens from a value). 
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   107
358
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
   108
The functions $mkeps$ and $inj$ return values. Calculate the
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
   109
value for your regular expressions from Q1 and the string
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   110
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   111
\begin{center}
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   112
\code{"read n;"}
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   113
\end{center} 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   114
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   115
\noindent
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   116
and use your \pcode{env} function to give the token sequence.
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   117
333
8890852e18b7 updated coursework
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 328
diff changeset
   118
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   119
\subsection*{Question 3 (marked with 1\%)}
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   120
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   121
Extend your tokenizer from Q2 to also simplify regular expressions
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   122
after each derivation step and rectify the computed values after each
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   123
injection. Use this tokenizer to tokenize the programs in
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   124
Figure~\ref{fib} and \ref{loop}.
182
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
   125
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   126
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   127
\begin{figure}[p]
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   128
\begin{center}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   129
\mbox{\lstinputlisting[language=while]{../progs/fib.while}}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   130
\end{center}
181
1f98d215df71 added material
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 180
diff changeset
   131
\caption{Fibonacci program in the WHILE language.\label{fib}}
1f98d215df71 added material
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 180
diff changeset
   132
\end{figure}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   133
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   134
\begin{figure}[p]
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   135
\begin{center}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   136
\mbox{\lstinputlisting[language=while]{../progs/loops.while}}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   137
\end{center}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   138
\caption{The three-nested-loops program in the WHILE language. 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   139
Usually used for timing measurements.\label{loop}}
181
1f98d215df71 added material
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 180
diff changeset
   140
\end{figure}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   141
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   142
\end{document}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   143
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   144
%%% Local Variables: 
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   145
%%% mode: latex
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   146
%%% TeX-master: t
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   147
%%% End: