coursework/cw02.tex
author Christian Urban <christian dot urban at kcl dot ac dot uk>
Thu, 01 Oct 2015 23:50:11 +0100
changeset 340 c49122dbcdd1
parent 333 8890852e18b7
child 358 b3129cff41e9
permissions -rw-r--r--
updated
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     1
\documentclass{article}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
     2
\usepackage{../style}
216
f5ec7c597c5b updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 200
diff changeset
     3
\usepackage{../langs}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     4
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     5
\begin{document}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     6
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
     7
\section*{Coursework 2 (Strand 1)}
198
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 182
diff changeset
     8
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 182
diff changeset
     9
\noindent
328
bc03ff3d347c updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 275
diff changeset
    10
This coursework is worth 5\% and is due on 6 November at 16:00. You
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    11
are asked to implement the Sulzmann tokeniser for the WHILE language.
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    12
You need to submit a document containing the answers for the questions
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    13
below. You can do the implementation in any programming language you
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    14
like, but you need to submit the source code with which you answered
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    15
the questions. However, the coursework will \emph{only} be judged
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    16
according to the answers. You can submit your answers in a txt-file or
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    17
as pdf.
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    18
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    19
\subsection*{Disclaimer}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    20
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    21
It should be understood that the work you submit represents your own
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    22
effort.  You have not copied from anyone else. An exception is the
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    23
Scala code I showed during the lectures, which you can use.
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    24
You can also use your own code from the CW~1.
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    25
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    26
\subsection*{Question 1 (marked with 1\%)}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    27
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    28
To implement a tokeniser for the WHILE language, you first need to design 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    29
the appropriate regular expressions for the following eight syntactic entities:
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    30
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    31
\begin{enumerate}
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    32
\item keywords are
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    33
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    34
\begin{quote}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    35
\texttt{while}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    36
\texttt{if}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    37
\texttt{then}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    38
\texttt{else}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    39
\texttt{do}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    40
\texttt{for}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    41
\texttt{to}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    42
\texttt{true}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    43
\texttt{false}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    44
\texttt{read}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    45
\texttt{write},
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    46
\texttt{skip}
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    47
\end{quote} 
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    48
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    49
\item operators are
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    50
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    51
\begin{quote}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    52
\texttt{+}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    53
\texttt{-}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    54
\texttt{*}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    55
\texttt{\%},
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    56
\texttt{/},
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    57
\texttt{==}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    58
\texttt{!=}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    59
\texttt{>}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    60
\texttt{<}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    61
\texttt{:=},
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    62
\texttt{\&\&},
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    63
\texttt{||}
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    64
\end{quote} 
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    65
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    66
\item strings are enclosed by \texttt{"\ldots"} 
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    67
\item parentheses are \texttt{(}, \texttt{\{}, \texttt{)} and \texttt{\}}
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    68
\item there are semicolons \texttt{;}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    69
\item whitespaces are either \texttt{" "} (one or more) or \texttt{$\backslash$n}
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    70
\item identifiers are letters followed by underscores \texttt{\_\!\_}, letters
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    71
or digits
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    72
\item numbers are \texttt{0}, \text{1}, \ldots
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    73
\end{enumerate}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    74
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    75
\noindent
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    76
You can use the basic regular expressions 
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    77
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    78
\[
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    79
\varnothing, \epsilon, c, r_1 + r_2, r_1 \cdot r_2, r^*
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    80
\]
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    81
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    82
\noindent
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    83
but also the following extended regular expressions
182
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
    84
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
    85
\begin{center}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    86
\begin{tabular}{ll}
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    87
$[c_1 c_2 \ldots c_n]$ & a range of characters\\
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    88
$r^+$ & one or more times $r$\\
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    89
$r^?$ & optional $r$\\
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    90
$r^{\{n\}}$ & n-times $r$\\
182
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
    91
\end{tabular}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
    92
\end{center}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
    93
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
    94
\noindent
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    95
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    96
\subsection*{Question 2 (marked with 3\%)}
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    97
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    98
Implement the Sulzmann tokeniser from the lectures. For this you need
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    99
to implement the functions $nullable$ and $der$ (you can use your code
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   100
from CW 1), as well as $mkeps$ and $inj$. These functions need to be
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   101
appropriately extended for the extended regular expressions from
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   102
Q1. Also add the record regular expression from the lectures and
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   103
implement a function, say \pcode{env}, that returns all assignments
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   104
from a value (such that you can extract easily the tokens from a
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   105
value). 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   106
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   107
The functions $mkeps$ and $inj$ return values. Calculate
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   108
the value for your regular expressions from Q1 and the string
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   109
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   110
\begin{center}
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   111
\code{"read n;"}
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   112
\end{center} 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   113
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   114
\noindent
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   115
and use your \pcode{env} function to give the token sequence.
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   116
333
8890852e18b7 updated coursework
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 328
diff changeset
   117
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   118
\subsection*{Question 3 (marked with 1\%)}
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   119
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   120
Extend your tokenizer from Q2 to also simplify regular expressions
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   121
after each derivation step and rectify the computed values after each
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   122
injection. Use this tokenizer to tokenize the programs in
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   123
Figure~\ref{fib} and \ref{loop}.
182
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
   124
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   125
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   126
\begin{figure}[p]
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   127
\begin{center}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   128
\mbox{\lstinputlisting[language=while]{../progs/fib.while}}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   129
\end{center}
181
1f98d215df71 added material
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 180
diff changeset
   130
\caption{Fibonacci program in the WHILE language.\label{fib}}
1f98d215df71 added material
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 180
diff changeset
   131
\end{figure}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   132
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   133
\begin{figure}[p]
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   134
\begin{center}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   135
\mbox{\lstinputlisting[language=while]{../progs/loops.while}}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   136
\end{center}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   137
\caption{The three-nested-loops program in the WHILE language. 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   138
Usually used for timing measurements.\label{loop}}
181
1f98d215df71 added material
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 180
diff changeset
   139
\end{figure}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   140
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   141
\end{document}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   142
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   143
%%% Local Variables: 
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   144
%%% mode: latex
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   145
%%% TeX-master: t
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   146
%%% End: