| author | Christian Urban <christian.urban@kcl.ac.uk> | 
| Tue, 31 Oct 2023 12:52:36 +0000 | |
| changeset 951 | a6a5ba526d73 | 
| parent 945 | 6cd55dfd3b7d | 
| child 967 | 258e18af6d14 | 
| permissions | -rw-r--r-- | 
| 630 | 1  | 
% !TEX program = xelatex  | 
| 
178
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
2  | 
\documentclass{article}
 | 
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
3  | 
\usepackage{../style}
 | 
| 
216
 
f5ec7c597c5b
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
200 
diff
changeset
 | 
4  | 
\usepackage{../langs}
 | 
| 918 | 5  | 
\usepackage[normalem]{ulem}
 | 
| 
178
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
6  | 
|
| 
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
7  | 
\begin{document}
 | 
| 
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
8  | 
|
| 748 | 9  | 
\section*{Coursework 2}
 | 
| 
198
 
f54972b0f641
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
182 
diff
changeset
 | 
10  | 
|
| 835 | 11  | 
\noindent This coursework is worth 10\% and is due on \cwTWO{} at
 | 
| 877 | 12  | 
16:00. You are asked to implement the Sulzmann \& Lu lexer for the  | 
| 748 | 13  | 
WHILE language. You can do the implementation in any programming  | 
14  | 
language you like, but you need to submit the source code with which  | 
|
15  | 
you answered the questions, otherwise a mark of 0\% will be  | 
|
| 942 | 16  | 
awarded. You need to submit your written answers as pdf---see attached  | 
17  | 
questionaire. Code send as code. If you use Scala in your code, a  | 
|
18  | 
good place to start is the file \texttt{lexer.sc} and
 | 
|
19  | 
\texttt{token.sc} uploaded to KEATS. The template file on Github is
 | 
|
20  | 
called \texttt{cw02.sc}. Your code needs to be uploaded to Github by
 | 
|
21  | 
the deadline.  | 
|
| 
180
 
50e8dcd95ae3
added cw
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
179 
diff
changeset
 | 
22  | 
|
| 750 | 23  | 
\subsection*{Disclaimer\alert}
 | 
| 
178
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
24  | 
|
| 
358
 
b3129cff41e9
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
333 
diff
changeset
 | 
25  | 
It should be understood that the work you submit represents  | 
| 918 | 26  | 
your own effort. You have not copied from anyone else  | 
27  | 
including CoPilot, ChatGPT \& Co. An  | 
|
| 
363
 
0d6deecdb2eb
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
358 
diff
changeset
 | 
28  | 
exception is the Scala code from KEATS and the code I showed  | 
| 
419
 
4110ab35e5d8
updated courseworks
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
396 
diff
changeset
 | 
29  | 
during the lectures, which you can both freely use. You can  | 
| 918 | 30  | 
also use your own code from the CW~1.  | 
31  | 
%But do not  | 
|
32  | 
%be tempted to ask Github Copilot for help or do any other  | 
|
33  | 
%shenanigans like this!  | 
|
| 
178
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
34  | 
|
| 
419
 
4110ab35e5d8
updated courseworks
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
396 
diff
changeset
 | 
35  | 
\subsection*{Question 1}
 | 
| 
178
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
36  | 
|
| 
419
 
4110ab35e5d8
updated courseworks
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
396 
diff
changeset
 | 
37  | 
To implement a lexer for the WHILE language, you first  | 
| 
358
 
b3129cff41e9
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
333 
diff
changeset
 | 
38  | 
need to design the appropriate regular expressions for the  | 
| 748 | 39  | 
following eleven syntactic entities:  | 
| 
178
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
40  | 
|
| 
180
 
50e8dcd95ae3
added cw
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
179 
diff
changeset
 | 
41  | 
\begin{enumerate}
 | 
| 
 
50e8dcd95ae3
added cw
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
179 
diff
changeset
 | 
42  | 
\item keywords are  | 
| 
 
50e8dcd95ae3
added cw
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
179 
diff
changeset
 | 
43  | 
|
| 748 | 44  | 
\begin{center}
 | 
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
45  | 
\texttt{while}, 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
46  | 
\texttt{if}, 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
47  | 
\texttt{then}, 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
48  | 
\texttt{else}, 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
49  | 
\texttt{do}, 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
50  | 
\texttt{for}, 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
51  | 
\texttt{to}, 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
52  | 
\texttt{true}, 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
53  | 
\texttt{false}, 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
54  | 
\texttt{read}, 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
55  | 
\texttt{write},
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
56  | 
\texttt{skip}
 | 
| 748 | 57  | 
\end{center} 
 | 
| 
180
 
50e8dcd95ae3
added cw
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
179 
diff
changeset
 | 
58  | 
|
| 748 | 59  | 
\item operators are:  | 
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
60  | 
\texttt{+}, 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
61  | 
\texttt{-}, 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
62  | 
\texttt{*}, 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
63  | 
\texttt{\%},
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
64  | 
\texttt{/},
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
65  | 
\texttt{==}, 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
66  | 
\texttt{!=}, 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
67  | 
\texttt{>}, 
 | 
| 748 | 68  | 
\texttt{<},
 | 
69  | 
\texttt{<=}, 
 | 
|
70  | 
\texttt{>=},
 | 
|
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
71  | 
\texttt{:=},
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
72  | 
\texttt{\&\&},
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
73  | 
\texttt{||}
 | 
| 748 | 74  | 
|
75  | 
\item letters are uppercase and lowercase  | 
|
| 
180
 
50e8dcd95ae3
added cw
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
179 
diff
changeset
 | 
76  | 
|
| 748 | 77  | 
\item symbols are letters plus the characters  | 
78  | 
  \texttt{.},
 | 
|
79  | 
  \texttt{\_},
 | 
|
80  | 
  \texttt{>},
 | 
|
81  | 
  \texttt{<},
 | 
|
82  | 
  \texttt{=},
 | 
|
83  | 
  \texttt{;},
 | 
|
| 850 | 84  | 
  \texttt{,} (comma),
 | 
| 833 | 85  | 
  \texttt{$\backslash$} and
 | 
| 748 | 86  | 
  \texttt{:}
 | 
87  | 
||
| 
180
 
50e8dcd95ae3
added cw
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
179 
diff
changeset
 | 
88  | 
\item parentheses are \texttt{(}, \texttt{\{}, \texttt{)} and \texttt{\}}
 | 
| 933 | 89  | 
\item digits are \pcode{0} to \pcode{9}
 | 
| 
180
 
50e8dcd95ae3
added cw
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
179 
diff
changeset
 | 
90  | 
\item there are semicolons \texttt{;}
 | 
| 
447
 
68769db65185
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
428 
diff
changeset
 | 
91  | 
\item whitespaces are either \texttt{" "} (one or more) or \texttt{$\backslash$n} or
 | 
| 845 | 92  | 
  \texttt{$\backslash$t} or \texttt{$\backslash$r}
 | 
| 
180
 
50e8dcd95ae3
added cw
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
179 
diff
changeset
 | 
93  | 
\item identifiers are letters followed by underscores \texttt{\_\!\_}, letters
 | 
| 933 | 94  | 
or digits  | 
95  | 
\item numbers for numbers give  | 
|
| 
396
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
96  | 
a regular expression that can recognise \pcode{0}, but not numbers 
 | 
| 
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
97  | 
with leading zeroes, such as \pcode{001}
 | 
| 933 | 98  | 
\item strings are enclosed by double quotes, like \texttt{"\ldots"}, and consisting of
 | 
99  | 
  symbols, digits, parentheses, whitespaces and \texttt{$\backslash$n} (note the latter is not the escaped version but \texttt{$\backslash$} followed by \texttt{n}, otherwise we would not be able to indicate in our strings when to write a newline).
 | 
|
| 945 | 100  | 
\item comments start with \texttt{//} and contain symbols, spaces, parentheses and digits until the end-of-the-line markers
 | 
| 933 | 101  | 
\item endo-of-line-markers are \texttt{$\backslash$n} and \texttt{$\backslash$r$\backslash$n}  
 | 
| 
180
 
50e8dcd95ae3
added cw
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
179 
diff
changeset
 | 
102  | 
\end{enumerate}
 | 
| 
178
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
103  | 
|
| 
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
104  | 
\noindent  | 
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
105  | 
You can use the basic regular expressions  | 
| 
178
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
106  | 
|
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
107  | 
\[  | 
| 
419
 
4110ab35e5d8
updated courseworks
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
396 
diff
changeset
 | 
108  | 
\ZERO,\; \ONE,\; c,\; r_1 + r_2,\; r_1 \cdot r_2,\; r^*  | 
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
109  | 
\]  | 
| 
178
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
110  | 
|
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
111  | 
\noindent  | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
112  | 
but also the following extended regular expressions  | 
| 
182
 
9ce2414e470e
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
181 
diff
changeset
 | 
113  | 
|
| 
 
9ce2414e470e
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
181 
diff
changeset
 | 
114  | 
\begin{center}
 | 
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
115  | 
\begin{tabular}{ll}
 | 
| 494 | 116  | 
$[c_1,c_2,\ldots,c_n]$ & a set of characters\\  | 
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
117  | 
$r^+$ & one or more times $r$\\  | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
118  | 
$r^?$ & optional $r$\\  | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
119  | 
$r^{\{n\}}$ & n-times $r$\\
 | 
| 
182
 
9ce2414e470e
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
181 
diff
changeset
 | 
120  | 
\end{tabular}
 | 
| 
 
9ce2414e470e
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
181 
diff
changeset
 | 
121  | 
\end{center}
 | 
| 
 
9ce2414e470e
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
181 
diff
changeset
 | 
122  | 
|
| 458 | 123  | 
\noindent  | 
| 473 | 124  | 
Later on you will also need the record regular expression:  | 
| 458 | 125  | 
|
126  | 
\begin{center}
 | 
|
127  | 
\begin{tabular}{ll}
 | 
|
128  | 
$REC(x:r)$ & record regular expression\\  | 
|
129  | 
\end{tabular}
 | 
|
130  | 
\end{center}
 | 
|
131  | 
||
| 
396
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
132  | 
\noindent Try to design your regular expressions to be as  | 
| 494 | 133  | 
small as possible. For example you should use character sets  | 
134  | 
for identifiers and numbers. Feel free to use the general  | 
|
135  | 
character constructor \textit{CFUN} introduced in CW 1.
 | 
|
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
136  | 
|
| 
419
 
4110ab35e5d8
updated courseworks
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
396 
diff
changeset
 | 
137  | 
\subsection*{Question 2}
 | 
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
138  | 
|
| 
419
 
4110ab35e5d8
updated courseworks
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
396 
diff
changeset
 | 
139  | 
Implement the Sulzmann \& Lu lexer from the lectures. For  | 
| 
358
 
b3129cff41e9
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
333 
diff
changeset
 | 
140  | 
this you need to implement the functions $nullable$ and $der$  | 
| 
369
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
141  | 
(you can use your code from CW~1), as well as $mkeps$ and  | 
| 
358
 
b3129cff41e9
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
333 
diff
changeset
 | 
142  | 
$inj$. These functions need to be appropriately extended for  | 
| 918 | 143  | 
the extended regular expressions from Q1. Write down in the  | 
144  | 
questionaire at the end the  | 
|
| 
369
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
145  | 
clauses for  | 
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
146  | 
|
| 
369
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
147  | 
\begin{center}
 | 
| 
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
148  | 
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
 | 
| 494 | 149  | 
$mkeps([c_1,c_2,\ldots,c_n])$ & $\dn$ & $?$\\  | 
| 
369
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
150  | 
$mkeps(r^+)$ & $\dn$ & $?$\\  | 
| 
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
151  | 
$mkeps(r^?)$ & $\dn$ & $?$\\  | 
| 
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
152  | 
$mkeps(r^{\{n\}})$             & $\dn$ & $?$\medskip\\
 | 
| 494 | 153  | 
$inj\, ([c_1,c_2,\ldots,c_n])\,c\,\ldots$ & $\dn$ & $?$\\  | 
| 
369
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
154  | 
$inj\, (r^+)\,c\,\ldots$ & $\dn$ & $?$\\  | 
| 
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
155  | 
$inj\, (r^?)\,c\,\ldots$ & $\dn$ & $?$\\  | 
| 
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
156  | 
$inj\, (r^{\{n\}})\,c\,\ldots$             & $\dn$ & $?$\\
 | 
| 
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
157  | 
\end{tabular}
 | 
| 
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
158  | 
\end{center}
 | 
| 
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
159  | 
|
| 
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
160  | 
\noindent where $inj$ takes three arguments: a regular  | 
| 
396
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
161  | 
expression, a character and a value. Test your lexer code  | 
| 
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
162  | 
with at least the two small examples below:  | 
| 
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
163  | 
|
| 
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
164  | 
\begin{center}
 | 
| 
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
165  | 
\begin{tabular}{ll}
 | 
| 
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
166  | 
regex: & string:\smallskip\\  | 
| 
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
167  | 
$a^{\{3\}}$ & $aaa$\\
 | 
| 458 | 168  | 
$(a + \ONE)^{\{3\}}$ & $aa$
 | 
| 
396
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
169  | 
\end{tabular}
 | 
| 
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
170  | 
\end{center}
 | 
| 
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
171  | 
|
| 
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
172  | 
|
| 598 | 173  | 
\noindent Both strings should be successfully lexed by the  | 
| 
396
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
174  | 
respective regular expression, that means the lexer returns  | 
| 
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
175  | 
in both examples a value.  | 
| 
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
176  | 
|
| 
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
177  | 
|
| 
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
178  | 
Also add the record regular expression from the  | 
| 
419
 
4110ab35e5d8
updated courseworks
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
396 
diff
changeset
 | 
179  | 
lectures to your lexer and implement a function, say  | 
| 
396
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
180  | 
\pcode{env}, that returns all assignments from a value (such
 | 
| 
 
4cd75c619e06
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
395 
diff
changeset
 | 
181  | 
that you can extract easily the tokens from a value).\medskip  | 
| 
369
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
182  | 
|
| 
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
183  | 
\noindent  | 
| 933 | 184  | 
Finally give \textbf{all} the tokens for your regular expressions from Q1 and the
 | 
| 
369
 
43c0ed473720
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
364 
diff
changeset
 | 
185  | 
string  | 
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
186  | 
|
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
187  | 
\begin{center}
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
188  | 
\code{"read n;"}
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
189  | 
\end{center} 
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
190  | 
|
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
191  | 
\noindent  | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
192  | 
and use your \pcode{env} function to give the token sequence.
 | 
| 
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
193  | 
|
| 
333
 
8890852e18b7
updated coursework
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
328 
diff
changeset
 | 
194  | 
|
| 
419
 
4110ab35e5d8
updated courseworks
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
396 
diff
changeset
 | 
195  | 
\subsection*{Question 3}
 | 
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
196  | 
|
| 748 | 197  | 
Extend your lexer from Q2 to also simplify regular expressions after  | 
198  | 
each derivation step and rectify the computed values after each  | 
|
| 933 | 199  | 
injection. Use this lexer to tokenize six WHILE programs some of which  | 
200  | 
are given in Figures~\ref{fib} -- \ref{collatz}. You can find these programms also on
 | 
|
201  | 
Github under the \texttt{cw2} directory. Give the tokens of these
 | 
|
202  | 
programs where whitespaces and comments are  | 
|
| 748 | 203  | 
filtered out. Make sure you can tokenise \textbf{exactly} these
 | 
204  | 
programs.\bigskip  | 
|
| 
182
 
9ce2414e470e
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
181 
diff
changeset
 | 
205  | 
|
| 
178
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
206  | 
|
| 578 | 207  | 
\begin{figure}[h]
 | 
| 860 | 208  | 
\mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/fib.while}}
 | 
| 
181
 
1f98d215df71
added material
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
180 
diff
changeset
 | 
209  | 
\caption{Fibonacci program in the WHILE language.\label{fib}}
 | 
| 
 
1f98d215df71
added material
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
180 
diff
changeset
 | 
210  | 
\end{figure}
 | 
| 
178
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
211  | 
|
| 578 | 212  | 
\begin{figure}[h]
 | 
| 860 | 213  | 
\mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/loops.while}}
 | 
| 
275
 
618c7640cf66
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
216 
diff
changeset
 | 
214  | 
\caption{The three-nested-loops program in the WHILE language. 
 | 
| 578 | 215  | 
(Usually used for timing measurements.)\label{loop}}
 | 
| 
181
 
1f98d215df71
added material
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
180 
diff
changeset
 | 
216  | 
\end{figure}
 | 
| 
178
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
217  | 
|
| 659 | 218  | 
\begin{figure}[h]
 | 
| 860 | 219  | 
\mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/factors.while}}
 | 
| 659 | 220  | 
\caption{A program that calculates factors for numbers in the WHILE
 | 
221  | 
  language.\label{factors}}
 | 
|
222  | 
\end{figure}
 | 
|
223  | 
||
| 748 | 224  | 
\begin{figure}[h]
 | 
| 933 | 225  | 
\mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/collatz2.while}}
 | 
| 748 | 226  | 
\caption{A program that calculates the Collatz series for numbers
 | 
227  | 
  between 1 and 100.\label{collatz}}
 | 
|
228  | 
\end{figure}
 | 
|
229  | 
||
| 918 | 230  | 
\clearpage  | 
231  | 
\newpage  | 
|
232  | 
\section*{Answers}
 | 
|
233  | 
||
234  | 
\mbox{}
 | 
|
235  | 
||
236  | 
\noindent  | 
|
| 933 | 237  | 
\textbf{Question 2:}\\ (Use mathematical notation, such as $r^+$, rather than code, such as \code{PLUS(r)})
 | 
| 918 | 238  | 
|
239  | 
\begin{center}
 | 
|
240  | 
  \def\arraystretch{1.6}  
 | 
|
241  | 
\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}
 | 
|
242  | 
$mkeps([c_1,c_2,\ldots,c_n])$  & $\dn$ & \uline{\hspace{8cm}}\\
 | 
|
243  | 
$mkeps(r^+)$                   & $\dn$ & \uline{\hspace{8cm}}\\
 | 
|
244  | 
$mkeps(r^?)$                   & $\dn$ & \uline{\hspace{8cm}}\\
 | 
|
245  | 
$mkeps(r^{\{n\}})$             & $\dn$ & \uline{\hspace{8cm}}\bigskip\\
 | 
|
246  | 
$inj\, ([c_1,c_2,\ldots,c_n])\,c\,\ldots$  & $\dn$ & \uline{\hspace{8cm}}\\
 | 
|
247  | 
$inj\, (r^+)\,c\,\ldots$                   & $\dn$ & \uline{\hspace{8cm}}\\
 | 
|
248  | 
$inj\, (r^?)\,c\,\ldots$                   & $\dn$ & \uline{\hspace{8cm}}\\
 | 
|
249  | 
$inj\, (r^{\{n\}})\,c\,\ldots$             & $\dn$ & \uline{\hspace{8cm}}\\
 | 
|
250  | 
\end{tabular}
 | 
|
251  | 
\end{center}\bigskip
 | 
|
252  | 
||
253  | 
\noindent  | 
|
254  | 
Tokens for \code{"read n;"}\\
 | 
|
255  | 
||
256  | 
\noindent  | 
|
257  | 
\uline{\hfill}\medskip
 | 
|
258  | 
||
259  | 
\noindent  | 
|
260  | 
\uline{\hfill}\medskip
 | 
|
261  | 
||
262  | 
\noindent  | 
|
263  | 
\uline{\hfill}\medskip
 | 
|
264  | 
||
265  | 
\noindent  | 
|
266  | 
\uline{\hfill}\medskip
 | 
|
267  | 
||
268  | 
||
| 
178
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
269  | 
\end{document}
 | 
| 
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
270  | 
|
| 
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
271  | 
%%% Local Variables:  | 
| 
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
272  | 
%%% mode: latex  | 
| 
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
273  | 
%%% TeX-master: t  | 
| 
 
d36363d648e3
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents:  
diff
changeset
 | 
274  | 
%%% End:  |