% cws/cw02.tex
% author Christian Urban <christian.urban@kcl.ac.uk>
% Sat, 28 Oct 2023 21:00:11 +0100
% changeset 946 bee7c57c18c3
% parent 943 5365ef60707e
% child 968 d8d8911a3d6f
% permissions -rw-r--r--
% corrected
% !TEX program = xelatex
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}
\usepackage[normalem]{ulem}

\begin{document}

\section*{Coursework 2}

\noindent This coursework is worth 10\% and is due on \cwTWO{} at
16:00. You are asked to implement the Sulzmann \& Lu lexer for the
WHILE language. You can do the implementation in any programming
language you like, but you need to submit the source code with which
you answered the questions, otherwise a mark of 0\% will be
awarded. You need to submit your written answers as a PDF (see the attached
questionnaire). Code should be submitted as code. If you use Scala in your code, a
good place to start are the files \texttt{lexer.sc} and
\texttt{token.sc} uploaded to KEATS. The template file on Github is
called \texttt{cw02.sc}. Your code needs to be uploaded to Github by
the deadline.

\subsection*{Disclaimer\alert}

It should be understood that the work you submit represents
your own effort. You have not copied from anyone else,
including CoPilot, ChatGPT \& Co. An
exception is the Scala code from KEATS and the code I showed
during the lectures, both of which you can freely use. You can
also use your own code from CW~1.
%But do not
%be tempted to ask Github Copilot for help or do any other
%shenanigans like this!

\subsection*{Question 1}

To implement a lexer for the WHILE language, you first
need to design the appropriate regular expressions for the
following thirteen syntactic entities:

\begin{enumerate}
\item keywords are

\begin{center}
\texttt{while}, 
\texttt{if}, 
\texttt{then}, 
\texttt{else}, 
\texttt{do}, 
\texttt{for}, 
\texttt{to}, 
\texttt{true}, 
\texttt{false}, 
\texttt{read}, 
\texttt{write},
\texttt{skip}
\end{center} 

\item operators are:
\texttt{+}, 
\texttt{-}, 
\texttt{*}, 
\texttt{\%},
\texttt{/},
\texttt{==}, 
\texttt{!=}, 
\texttt{>}, 
\texttt{<},
\texttt{<=}, 
\texttt{>=},
\texttt{:=},
\texttt{\&\&},
\texttt{||}

\item letters are uppercase and lowercase letters

\item symbols are letters plus the characters
  \texttt{.},
  \texttt{\_},
  \texttt{>},
  \texttt{<},
  \texttt{=},
  \texttt{;},
  \texttt{,} (comma),
  \texttt{$\backslash$} and
  \texttt{:}

\item parentheses are \texttt{(}, \texttt{\{}, \texttt{)} and \texttt{\}}
\item digits are \pcode{0} to \pcode{9}
\item there are semicolons \texttt{;}
\item whitespaces are either \texttt{" "} (one or more) or \texttt{$\backslash$n} or
  \texttt{$\backslash$t} or \texttt{$\backslash$r}
\item identifiers start with a letter, followed by underscores \texttt{\_\!\_}, letters
  or digits  
\item numbers: give 
a regular expression that recognises \pcode{0}, but not numbers 
with leading zeroes, such as \pcode{001}
\item strings are enclosed by double quotes, like \texttt{"\ldots"}, and consist of
  symbols, digits, parentheses, whitespaces and \texttt{$\backslash$n} (note that the latter is not the escaped version but \texttt{$\backslash$} followed by \texttt{n}; otherwise we would not be able to indicate in our strings when to write a newline).
\item comments start with \texttt{//} and contain symbols, spaces, parentheses and digits up to the end-of-line marker
\item end-of-line markers are \texttt{$\backslash$n} and \texttt{$\backslash$r$\backslash$n}  
\end{enumerate}
\noindent
You can use the basic regular expressions 

\[
\ZERO,\; \ONE,\; c,\; r_1 + r_2,\; r_1 \cdot r_2,\; r^*
\]

\noindent
but also the following extended regular expressions

\begin{center}
\begin{tabular}{ll}
$[c_1,c_2,\ldots,c_n]$ & a set of characters\\
$r^+$ & one or more times $r$\\
$r^?$ & optional $r$\\
$r^{\{n\}}$ & $n$-times $r$\\
\end{tabular}
\end{center}
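\noindent
As a reminder of what the extended forms stand for, each of them
matches the same strings as some basic regular expression. The
equivalences below are only a sketch in the notation used above:
$\equiv$ is assumed here to mean that both sides match exactly the
same strings, and $r^{\{0\}}$ is taken to be equivalent to $\ONE$. Since
Q2 asks for the lexer functions to be extended, the extended forms are
presumably still meant to be constructors in their own right rather
than abbreviations.

\[
\begin{array}{lcl}
{[c_1,c_2,\ldots,c_n]} & \equiv & c_1 + c_2 + \ldots + c_n\\
r^+ & \equiv & r \cdot r^*\\
r^? & \equiv & r + \ONE\\
r^{\{n\}} & \equiv & r \cdot r \cdot \ldots \cdot r \quad (n \mbox{ copies of } r)
\end{array}
\]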
\noindent
Later on you will also need the record regular expression:

\begin{center}
\begin{tabular}{ll}
$REC(x:r)$ & record regular expression\\
\end{tabular}
\end{center}

\noindent Try to design your regular expressions to be as
small as possible. For example, you should use character sets
for identifiers and numbers. Feel free to use the general
character constructor \textit{CFUN} introduced in CW~1.

\subsection*{Question 2}

Implement the Sulzmann \& Lu lexer from the lectures. For
this you need to implement the functions $nullable$ and $der$
(you can use your code from CW~1), as well as $mkeps$ and
$inj$. These functions need to be appropriately extended for
the extended regular expressions from Q1. Write down in the
questionnaire at the end the 
clauses for

\begin{center}
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
$mkeps([c_1,c_2,\ldots,c_n])$  & $\dn$ & $?$\\
$mkeps(r^+)$                   & $\dn$ & $?$\\
$mkeps(r^?)$                   & $\dn$ & $?$\\
$mkeps(r^{\{n\}})$             & $\dn$ & $?$\medskip\\
$inj\, ([c_1,c_2,\ldots,c_n])\,c\,\ldots$  & $\dn$ & $?$\\
$inj\, (r^+)\,c\,\ldots$                   & $\dn$ & $?$\\
$inj\, (r^?)\,c\,\ldots$                   & $\dn$ & $?$\\
$inj\, (r^{\{n\}})\,c\,\ldots$             & $\dn$ & $?$\\
\end{tabular}
\end{center}

\noindent where $inj$ takes three arguments: a regular
expression, a character and a value. Test your lexer code
with at least the two small examples below:

\begin{center}
\begin{tabular}{ll}
regex: & string:\smallskip\\
$a^{\{3\}}$ & $aaa$\\
$(a + \ONE)^{\{3\}}$ & $aa$
\end{tabular}
\end{center}
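\noindent
To see why the second example is a useful test, it can help to write
out the strings each regular expression matches. The following is
only a sketch: $L(r)$ is assumed here to denote the set of strings
matched by $r$, and $[]$ the empty string (neither notation is
introduced in the text above). In the second case the string $aa$ can
only be matched if one of the three copies of $a + \ONE$ matches the
empty string.

\[
\begin{array}{lcl}
L(a^{\{3\}}) & = & \{aaa\}\\
L((a + \ONE)^{\{3\}}) & = & \{{[]},\; a,\; aa,\; aaa\}
\end{array}
\]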
\noindent Both strings should be successfully lexed by the
respective regular expression, that is, the lexer returns 
a value in both examples.


Also add the record regular expression from the
lectures to your lexer and implement a function, say
\pcode{env}, that returns all assignments from a value (so
that you can easily extract the tokens from a value).\medskip 

\noindent
Finally, give \textbf{all} the tokens for your regular expressions from Q1 and the
string

\begin{center}
\code{"read n;"}
\end{center} 

\noindent
and use your \pcode{env} function to give the token sequence.


\subsection*{Question 3}

Extend your lexer from Q2 to also simplify regular expressions after
each derivation step and rectify the computed values after each
injection. Use this lexer to tokenise six WHILE programs, some of which
are given in Figures~\ref{fib}--\ref{collatz}. You can also find these programs on
Github under the \texttt{cw2} directory. Give the tokens of these
programs where whitespaces and comments are
filtered out. Make sure you can tokenise \textbf{exactly} these
programs.\bigskip


\begin{figure}[h]
860
6f80e6df34f7 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 850
diff changeset
   208
\mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/fib.while}}
181
1f98d215df71 added material
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 180
diff changeset
   209
\caption{Fibonacci program in the WHILE language.\label{fib}}
1f98d215df71 added material
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 180
diff changeset
   210
\end{figure}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   211
578
6e5e3adc9eb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 567
diff changeset
   212
\begin{figure}[h]
860
6f80e6df34f7 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 850
diff changeset
   213
\mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/loops.while}}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   214
\caption{The three-nested-loops program in the WHILE language. 
578
6e5e3adc9eb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 567
diff changeset
   215
(Usually used for timing measurements.)\label{loop}}
181
1f98d215df71 added material
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 180
diff changeset
   216
\end{figure}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   217
659
15b69ca63b29 optional
Christian Urban <urbanc@in.tum.de>
parents: 657
diff changeset
   218
\begin{figure}[h]
860
6f80e6df34f7 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 850
diff changeset
   219
\mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/factors.while}}
659
15b69ca63b29 optional
Christian Urban <urbanc@in.tum.de>
parents: 657
diff changeset
   220
\caption{A program that calculates factors for numbers in the WHILE
15b69ca63b29 optional
Christian Urban <urbanc@in.tum.de>
parents: 657
diff changeset
   221
  language.\label{factors}}
15b69ca63b29 optional
Christian Urban <urbanc@in.tum.de>
parents: 657
diff changeset
   222
\end{figure}
15b69ca63b29 optional
Christian Urban <urbanc@in.tum.de>
parents: 657
diff changeset
   223
748
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
   224
\begin{figure}[h]
934
ee35eeb5831a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 918
diff changeset
   225
\mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/collatz2.while}}
748
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
   226
\caption{A program that calculates the Collatz series for numbers
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
   227
  between 1 and 100.\label{collatz}}
\end{figure}

\clearpage
\newpage
\section*{Answers}

\mbox{}

\noindent
\textbf{Question 2:}\\ (Use mathematical notation, such as $r^+$, rather than code, such as \code{PLUS(r)})

\begin{center}
  \def\arraystretch{1.6}  
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
$mkeps([c_1,c_2,\ldots,c_n])$  & $\dn$ & \uline{\hspace{8cm}}\\
$mkeps(r^+)$                   & $\dn$ & \uline{\hspace{8cm}}\\
$mkeps(r^?)$                   & $\dn$ & \uline{\hspace{8cm}}\\
$mkeps(r^{\{n\}})$             & $\dn$ & \uline{\hspace{8cm}}\bigskip\\
$inj\, ([c_1,c_2,\ldots,c_n])\,c\,\ldots$  & $\dn$ & \uline{\hspace{8cm}}\\
$inj\, (r^+)\,c\,\ldots$                   & $\dn$ & \uline{\hspace{8cm}}\\
$inj\, (r^?)\,c\,\ldots$                   & $\dn$ & \uline{\hspace{8cm}}\\
$inj\, (r^{\{n\}})\,c\,\ldots$             & $\dn$ & \uline{\hspace{8cm}}\\
\end{tabular}
\end{center}\bigskip

\noindent
Tokens for \code{"read n;"}\\

\noindent
\uline{\hfill}\medskip

\noindent
\uline{\hfill}\medskip

\noindent
\uline{\hfill}\medskip

\noindent
\uline{\hfill}\medskip


\end{document}
%%% Local Variables: 
%%% mode: latex
%%% TeX-master: t
%%% End: