cws/cw02.tex
author Christian Urban <christian.urban@kcl.ac.uk>
Fri, 22 Nov 2024 12:42:07 +0000
changeset 974 0cb4bf2469d1
parent 969 0dfa2923a7c6
permissions -rw-r--r--
updated
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
630
9b1c15c3eb6f updated
Christian Urban <urbanc@in.tum.de>
parents: 598
diff changeset
     1
% !TEX program = xelatex
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     2
\documentclass{article}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
     3
\usepackage{../style}
216
f5ec7c597c5b updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 200
diff changeset
     4
\usepackage{../langs}
918
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
     5
\usepackage[normalem]{ulem}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     6
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     7
\begin{document}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     8
748
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
     9
\section*{Coursework 2}
198
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 182
diff changeset
    10
835
08b157566a73 cwupdates
Christian Urban <christian.urban@kcl.ac.uk>
parents: 833
diff changeset
    11
\noindent This coursework is worth 10\% and is due on \cwTWO{} at
877
43460c7b2010 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 860
diff changeset
    12
16:00. You are asked to implement the Sulzmann \& Lu lexer for the
748
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    13
WHILE language. You can do the implementation in any programming
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    14
language you like, but you need to submit the source code with which
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    15
you answered the questions, otherwise a mark of 0\% will be
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
    16
awarded. %You need to submit your written answers as pdf---see attached
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
    17
% questionaire.  Code send as code.
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
    18
If you use Scala in your code, a
943
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 934
diff changeset
    19
good place to start is the file \texttt{lexer.sc} and
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 934
diff changeset
    20
\texttt{token.sc} uploaded to KEATS. The template file on Github is
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
    21
called \texttt{cw02.sc}. The example files are in the subdirectory
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
    22
\texttt{examples}. The main function that will be tested is
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
    23
called \texttt{tokenise}. The marks will be distributed such that
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
    24
3 marks are given for the correct \texttt{WHILE\_REGS} regular
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
    25
expression; 5 marks for the correct \texttt{inj} and \texttt{mkeps}
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
    26
definitions; and two marks when \texttt{tokenise} produces the correct
969
0dfa2923a7c6 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 968
diff changeset
    27
results for the example files.
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
    28
969
0dfa2923a7c6 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 968
diff changeset
    29
\subsection*{Testing\alert}
0dfa2923a7c6 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 968
diff changeset
    30
0dfa2923a7c6 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 968
diff changeset
    31
For the marking, the functions that will be tested are
0dfa2923a7c6 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 968
diff changeset
    32
\texttt{tokenise}, \texttt{inj} and \texttt{mkeps}.
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
    33
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    34
750
e93a9e74ca8e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 748
diff changeset
    35
\subsection*{Disclaimer\alert}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    36
358
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    37
It should be understood that the work you submit represents
918
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
    38
your own effort. You have not copied from anyone else
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
    39
including CoPilot, ChatGPT \& Co. An
363
0d6deecdb2eb updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 358
diff changeset
    40
exception is the Scala code from KEATS and the code I showed
419
4110ab35e5d8 updated courseworks
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 396
diff changeset
    41
during the lectures, which you can both freely use. You can
918
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
    42
also use your own code from the CW~1.
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
    43
%But do not
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
    44
%be tempted to ask Github Copilot for help or do any other
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
    45
%shenanigans like this!
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    46
419
4110ab35e5d8 updated courseworks
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 396
diff changeset
    47
\subsection*{Question 1}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    48
419
4110ab35e5d8 updated courseworks
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 396
diff changeset
    49
To implement a lexer for the WHILE language, you first
358
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
    50
need to design the appropriate regular expressions for the
748
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    51
following eleven syntactic entities:
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    52
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    53
\begin{enumerate}
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    54
\item keywords are
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    55
748
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    56
\begin{center}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    57
\texttt{while}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    58
\texttt{if}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    59
\texttt{then}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    60
\texttt{else}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    61
\texttt{do}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    62
\texttt{for}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    63
\texttt{to}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    64
\texttt{true}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    65
\texttt{false}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    66
\texttt{read}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    67
\texttt{write},
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    68
\texttt{skip}
748
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    69
\end{center} 
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    70
748
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    71
\item operators are:
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    72
\texttt{+}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    73
\texttt{-}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    74
\texttt{*}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    75
\texttt{\%},
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    76
\texttt{/},
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    77
\texttt{==}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    78
\texttt{!=}, 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    79
\texttt{>}, 
748
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    80
\texttt{<},
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    81
\texttt{<=}, 
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    82
\texttt{>=},
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    83
\texttt{:=},
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    84
\texttt{\&\&},
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
    85
\texttt{||}
748
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    86
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    87
\item letters are uppercase and lowercase
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
    88
748
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    89
\item symbols are letters plus the characters
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    90
  \texttt{.},
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    91
  \texttt{\_},
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    92
  \texttt{>},
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    93
  \texttt{<},
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    94
  \texttt{=},
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    95
  \texttt{;},
850
Christian Urban <christian.urban@kcl.ac.uk>
parents: 845
diff changeset
    96
  \texttt{,} (comma),
833
aad5957eb7e4 cwupdates
Christian Urban <christian.urban@kcl.ac.uk>
parents: 797
diff changeset
    97
  \texttt{$\backslash$} and
748
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    98
  \texttt{:}
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
    99
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
   100
\item parentheses are \texttt{(}, \texttt{\{}, \texttt{)} and \texttt{\}}
934
ee35eeb5831a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 918
diff changeset
   101
\item digits are \pcode{0} to \pcode{9}
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
   102
\item there are semicolons \texttt{;}
447
68769db65185 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 428
diff changeset
   103
\item whitespaces are either \texttt{" "} (one or more) or \texttt{$\backslash$n} or
845
ddd9659971ec updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 835
diff changeset
   104
  \texttt{$\backslash$t} or \texttt{$\backslash$r}
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
   105
\item identifiers are letters followed by underscores \texttt{\_\!\_}, letters
934
ee35eeb5831a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 918
diff changeset
   106
  or digits  
ee35eeb5831a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 918
diff changeset
   107
\item numbers for numbers give 
396
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   108
a regular expression that can recognise \pcode{0}, but not numbers 
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   109
with leading zeroes, such as \pcode{001}
934
ee35eeb5831a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 918
diff changeset
   110
\item strings are enclosed by double quotes, like \texttt{"\ldots"}, and consisting of
ee35eeb5831a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 918
diff changeset
   111
  symbols, digits, parentheses, whitespaces and \texttt{$\backslash$n} (note the latter is not the escaped version but \texttt{$\backslash$} followed by \texttt{n}, otherwise we would not be able to indicate in our strings when to write a newline).
946
bee7c57c18c3 corrected
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   112
\item comments start with \texttt{//} and contain symbols, spaces, parentheses and digits until the end-of-the-line markers
934
ee35eeb5831a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 918
diff changeset
   113
\item endo-of-line-markers are \texttt{$\backslash$n} and \texttt{$\backslash$r$\backslash$n}  
180
50e8dcd95ae3 added cw
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 179
diff changeset
   114
\end{enumerate}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   115
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   116
\noindent
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   117
You can use the basic regular expressions 
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   118
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   119
\[
419
4110ab35e5d8 updated courseworks
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 396
diff changeset
   120
\ZERO,\; \ONE,\; c,\; r_1 + r_2,\; r_1 \cdot r_2,\; r^*
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   121
\]
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   122
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   123
\noindent
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   124
but also the following extended regular expressions
182
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
   125
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
   126
\begin{center}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   127
\begin{tabular}{ll}
494
d0fc671bcbbf updated
Christian Urban <urbanc@in.tum.de>
parents: 473
diff changeset
   128
$[c_1,c_2,\ldots,c_n]$ & a set of characters\\
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   129
$r^+$ & one or more times $r$\\
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   130
$r^?$ & optional $r$\\
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   131
$r^{\{n\}}$ & n-times $r$\\
182
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
   132
\end{tabular}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
   133
\end{center}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
   134
458
896a5f91838d updated
Christian Urban <urbanc@in.tum.de>
parents: 447
diff changeset
   135
\noindent
473
dc528091eb70 updated
Christian Urban <urbanc@in.tum.de>
parents: 468
diff changeset
   136
Later on you will also need the record regular expression:
458
896a5f91838d updated
Christian Urban <urbanc@in.tum.de>
parents: 447
diff changeset
   137
896a5f91838d updated
Christian Urban <urbanc@in.tum.de>
parents: 447
diff changeset
   138
\begin{center}
896a5f91838d updated
Christian Urban <urbanc@in.tum.de>
parents: 447
diff changeset
   139
\begin{tabular}{ll}
896a5f91838d updated
Christian Urban <urbanc@in.tum.de>
parents: 447
diff changeset
   140
$REC(x:r)$ & record regular expression\\
896a5f91838d updated
Christian Urban <urbanc@in.tum.de>
parents: 447
diff changeset
   141
\end{tabular}
896a5f91838d updated
Christian Urban <urbanc@in.tum.de>
parents: 447
diff changeset
   142
\end{center}
896a5f91838d updated
Christian Urban <urbanc@in.tum.de>
parents: 447
diff changeset
   143
396
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   144
\noindent Try to design your regular expressions to be as
494
d0fc671bcbbf updated
Christian Urban <urbanc@in.tum.de>
parents: 473
diff changeset
   145
small as possible. For example you should use character sets
d0fc671bcbbf updated
Christian Urban <urbanc@in.tum.de>
parents: 473
diff changeset
   146
for identifiers and numbers. Feel free to use the general
d0fc671bcbbf updated
Christian Urban <urbanc@in.tum.de>
parents: 473
diff changeset
   147
character constructor \textit{CFUN} introduced in CW 1.
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   148
419
4110ab35e5d8 updated courseworks
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 396
diff changeset
   149
\subsection*{Question 2}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   150
419
4110ab35e5d8 updated courseworks
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 396
diff changeset
   151
Implement the Sulzmann \& Lu lexer from the lectures. For
358
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
   152
this you need to implement the functions $nullable$ and $der$
369
43c0ed473720 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 364
diff changeset
   153
(you can use your code from CW~1), as well as $mkeps$ and
358
b3129cff41e9 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 333
diff changeset
   154
$inj$. These functions need to be appropriately extended for
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   155
the extended regular expressions from Q1. The definitions
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   156
you need to create are:
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   157
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   158
369
43c0ed473720 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 364
diff changeset
   159
\begin{center}
43c0ed473720 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 364
diff changeset
   160
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
494
d0fc671bcbbf updated
Christian Urban <urbanc@in.tum.de>
parents: 473
diff changeset
   161
$mkeps([c_1,c_2,\ldots,c_n])$  & $\dn$ & $?$\\
369
43c0ed473720 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 364
diff changeset
   162
$mkeps(r^+)$                   & $\dn$ & $?$\\
43c0ed473720 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 364
diff changeset
   163
$mkeps(r^?)$                   & $\dn$ & $?$\\
43c0ed473720 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 364
diff changeset
   164
$mkeps(r^{\{n\}})$             & $\dn$ & $?$\medskip\\
494
d0fc671bcbbf updated
Christian Urban <urbanc@in.tum.de>
parents: 473
diff changeset
   165
$inj\, ([c_1,c_2,\ldots,c_n])\,c\,\ldots$  & $\dn$ & $?$\\
369
43c0ed473720 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 364
diff changeset
   166
$inj\, (r^+)\,c\,\ldots$                   & $\dn$ & $?$\\
43c0ed473720 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 364
diff changeset
   167
$inj\, (r^?)\,c\,\ldots$                   & $\dn$ & $?$\\
43c0ed473720 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 364
diff changeset
   168
$inj\, (r^{\{n\}})\,c\,\ldots$             & $\dn$ & $?$\\
43c0ed473720 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 364
diff changeset
   169
\end{tabular}
43c0ed473720 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 364
diff changeset
   170
\end{center}
43c0ed473720 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 364
diff changeset
   171
43c0ed473720 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 364
diff changeset
   172
\noindent where $inj$ takes three arguments: a regular
396
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   173
expression, a character and a value. Test your lexer code
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   174
with at least the two small examples below:
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   175
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   176
\begin{center}
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   177
\begin{tabular}{ll}
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   178
regex: & string:\smallskip\\
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   179
$a^{\{3\}}$ & $aaa$\\
458
896a5f91838d updated
Christian Urban <urbanc@in.tum.de>
parents: 447
diff changeset
   180
$(a + \ONE)^{\{3\}}$ & $aa$
396
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   181
\end{tabular}
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   182
\end{center}
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   183
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   184
598
e3ad67cd5123 updated
Christian Urban <urbanc@in.tum.de>
parents: 578
diff changeset
   185
\noindent Both strings should be successfully lexed by the
396
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   186
respective regular expression, that means the lexer returns 
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   187
in both examples a value.
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   188
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   189
4cd75c619e06 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 395
diff changeset
   190
Also add the record regular expression from the
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   191
lectures to your lexer and complete the function
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   192
\pcode{env} so that it returns all assignments from a value (this then 
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   193
allows you to extract easily the tokens from a value in the next
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   194
question).\medskip 
369
43c0ed473720 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 364
diff changeset
   195
43c0ed473720 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 364
diff changeset
   196
\noindent
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   197
Finally make that the function \texttt{lexing\_simp} generates
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   198
with the regular expression from Q1 for the string
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   199
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   200
\begin{center}
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   201
\code{"read n;"}
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   202
\end{center} 
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   203
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   204
\noindent
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   205
the following pairs:
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   206
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   207
\begin{center}
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   208
\texttt{List((k,read), (w, ), (i,n), (s,;))}
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   209
\end{center} 
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   210
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   211
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   212
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   213
333
8890852e18b7 updated coursework
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 328
diff changeset
   214
419
4110ab35e5d8 updated courseworks
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 396
diff changeset
   215
\subsection*{Question 3}
275
618c7640cf66 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 216
diff changeset
   216
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   217
Make sure your lexer from Q2 also simplifies regular expressions after
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   218
each derivation step and rectifies the computed values after each
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   219
injection. Use this lexer to tokenise the six WHILE programs
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   220
in the \texttt{examples} directory. Make sure that the \texttt{tokenise}
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   221
function filters out whitespaces and comments.\bigskip
182
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 181
diff changeset
   222
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   223
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   224
% \begin{figure}[h]
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   225
% \mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/fib.while}}
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   226
% \caption{Fibonacci program in the WHILE language.\label{fib}}
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   227
% \end{figure}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   228
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   229
% \begin{figure}[h]
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   230
% \mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/loops.while}}
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   231
% \caption{The three-nested-loops program in the WHILE language. 
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   232
% (Usually used for timing measurements.)\label{loop}}
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   233
% \end{figure}
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   234
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   235
% \begin{figure}[h]
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   236
% \mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/factors.while}}
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   237
% \caption{A program that calculates factors for numbers in the WHILE
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   238
%   language.\label{factors}}
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   239
% \end{figure}
659
15b69ca63b29 optional
Christian Urban <urbanc@in.tum.de>
parents: 657
diff changeset
   240
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   241
% \begin{figure}[h]
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   242
% \mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/collatz2.while}}
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   243
% \caption{A program that calculates the Collatz series for numbers
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   244
%   between 1 and 100.\label{collatz}}
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   245
% \end{figure}
748
383f2a5952ce updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 719
diff changeset
   246
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   247
% \clearpage
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   248
% \newpage
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   249
% \section*{Answers}
918
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
   250
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   251
% \mbox{}
918
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
   252
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   253
% \noindent
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   254
% \textbf{Question 2:}\\ (Use mathematical notation, such as $r^+$, rather than code, such as \code{PLUS(r)})
918
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
   255
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   256
% \begin{center}
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   257
%   \def\arraystretch{1.6}  
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   258
% \begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   259
% $mkeps([c_1,c_2,\ldots,c_n])$  & $\dn$ & \uline{\hspace{8cm}}\\
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   260
% $mkeps(r^+)$                   & $\dn$ & \uline{\hspace{8cm}}\\
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   261
% $mkeps(r^?)$                   & $\dn$ & \uline{\hspace{8cm}}\\
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   262
% $mkeps(r^{\{n\}})$             & $\dn$ & \uline{\hspace{8cm}}\bigskip\\
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   263
% $inj\, ([c_1,c_2,\ldots,c_n])\,c\,\ldots$  & $\dn$ & \uline{\hspace{8cm}}\\
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   264
% $inj\, (r^+)\,c\,\ldots$                   & $\dn$ & \uline{\hspace{8cm}}\\
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   265
% $inj\, (r^?)\,c\,\ldots$                   & $\dn$ & \uline{\hspace{8cm}}\\
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   266
% $inj\, (r^{\{n\}})\,c\,\ldots$             & $\dn$ & \uline{\hspace{8cm}}\\
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   267
% \end{tabular}
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   268
% \end{center}\bigskip
918
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
   269
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   270
% \noindent
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   271
% Tokens for \code{"read n;"}\\
918
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
   272
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   273
% \noindent
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   274
% \uline{\hfill}\medskip
918
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
   275
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   276
% \noindent
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   277
% \uline{\hfill}\medskip
918
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
   278
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   279
% \noindent
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   280
% \uline{\hfill}\medskip
918
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
   281
968
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   282
% \noindent
d8d8911a3d6f updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 946
diff changeset
   283
% \uline{\hfill}\medskip
918
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
   284
53e7da9f372a updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 886
diff changeset
   285
178
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   286
\end{document}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   287
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   288
%%% Local Variables: 
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   289
%%% mode: latex
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   290
%%% TeX-master: t
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   291
%%% End: