% !TEX program = xelatex\documentclass{article}\usepackage{../style}\usepackage{../langs}\begin{document}\section*{Coursework 2}\noindent This coursework is worth 10\% and is due on \cwTWO{} at16:00. You are asked to implement the Sulzmann \& Lu lexer for theWHILE language. You can do the implementation in any programminglanguage you like, but you need to submit the source code with whichyou answered the questions, otherwise a mark of 0\% will beawarded. You can submit your answers in a txt-file or as pdf. Codesubmit as code. Please package everything in a zip-file that creates adirectory with the name \texttt{YournameYourfamilyname} on my end. Thanks!\subsection*{Disclaimer\alert}It should be understood that the work you submit representsyour own effort. You have not copied from anyone else. Anexception is the Scala code from KEATS and the code I showedduring the lectures, which you can both freely use. You canalso use your own code from the CW~1. But do notbe tempted to ask Github Copilot for help or do any othershenanigans like this!\subsection*{Question 1}To implement a lexer for the WHILE language, you firstneed to design the appropriate regular expressions for thefollowing eleven syntactic entities:\begin{enumerate}\item keywords are\begin{center}\texttt{while}, \texttt{if}, \texttt{then}, \texttt{else}, \texttt{do}, \texttt{for}, \texttt{to}, \texttt{true}, \texttt{false}, \texttt{read}, \texttt{write},\texttt{skip}\end{center} \item operators are:\texttt{+}, \texttt{-}, \texttt{*}, \texttt{\%},\texttt{/},\texttt{==}, \texttt{!=}, \texttt{>}, \texttt{<},\texttt{<=}, \texttt{>=},\texttt{:=},\texttt{\&\&},\texttt{||}\item letters are uppercase and lowercase\item symbols are letters plus the characters \texttt{.}, \texttt{\_}, \texttt{>}, \texttt{<}, \texttt{=}, \texttt{;}, \texttt{,} (comma), \texttt{$\backslash$} and \texttt{:}\item strings are enclosed by double quotes, like \texttt{"\ldots"}, and consisting of symbols, whitespaces and digits\item parentheses are \texttt{(}, \texttt{\{}, \texttt{)} and \texttt{\}}\item there are semicolons \texttt{;}\item whitespaces are either \texttt{" "} (one or more) or \texttt{$\backslash$n} or \texttt{$\backslash$t} or \texttt{$\backslash$r}\item identifiers are letters followed by underscores \texttt{\_\!\_}, lettersor digits\item numbers are \pcode{0}, \pcode{1}, \ldots and so on; give a regular expression that can recognise \pcode{0}, but not numbers with leading zeroes, such as \pcode{001}\item comments start with \texttt{//} and contain symbols, spaces and digits until the end of the line\end{enumerate}\noindentYou can use the basic regular expressions \[\ZERO,\; \ONE,\; c,\; r_1 + r_2,\; r_1 \cdot r_2,\; r^*\]\noindentbut also the following extended regular expressions\begin{center}\begin{tabular}{ll}$[c_1,c_2,\ldots,c_n]$ & a set of characters\\$r^+$ & one or more times $r$\\$r^?$ & optional $r$\\$r^{\{n\}}$ & n-times $r$\\\end{tabular}\end{center}\noindentLater on you will also need the record regular expression:\begin{center}\begin{tabular}{ll}$REC(x:r)$ & record regular expression\\\end{tabular}\end{center}\noindent Try to design your regular expressions to be assmall as possible. For example you should use character setsfor identifiers and numbers. Feel free to use the generalcharacter constructor \textit{CFUN} introduced in CW 1.\subsection*{Question 2}Implement the Sulzmann \& Lu lexer from the lectures. Forthis you need to implement the functions $nullable$ and $der$(you can use your code from CW~1), as well as $mkeps$ and$inj$. These functions need to be appropriately extended forthe extended regular expressions from Q1. Write down the clauses for\begin{center}\begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}}$mkeps([c_1,c_2,\ldots,c_n])$ & $\dn$ & $?$\\$mkeps(r^+)$ & $\dn$ & $?$\\$mkeps(r^?)$ & $\dn$ & $?$\\$mkeps(r^{\{n\}})$ & $\dn$ & $?$\medskip\\$inj\, ([c_1,c_2,\ldots,c_n])\,c\,\ldots$ & $\dn$ & $?$\\$inj\, (r^+)\,c\,\ldots$ & $\dn$ & $?$\\$inj\, (r^?)\,c\,\ldots$ & $\dn$ & $?$\\$inj\, (r^{\{n\}})\,c\,\ldots$ & $\dn$ & $?$\\\end{tabular}\end{center}\noindent where $inj$ takes three arguments: a regularexpression, a character and a value. Test your lexer codewith at least the two small examples below:\begin{center}\begin{tabular}{ll}regex: & string:\smallskip\\$a^{\{3\}}$ & $aaa$\\$(a + \ONE)^{\{3\}}$ & $aa$\end{tabular}\end{center}\noindent Both strings should be successfully lexed by therespective regular expression, that means the lexer returns in both examples a value.Also add the record regular expression from thelectures to your lexer and implement a function, say\pcode{env}, that returns all assignments from a value (suchthat you can extract easily the tokens from a value).\medskip \noindentFinally give the tokens for your regular expressions from Q1 and thestring\begin{center}\code{"read n;"}\end{center} \noindentand use your \pcode{env} function to give the token sequence.\subsection*{Question 3}Extend your lexer from Q2 to also simplify regular expressions aftereach derivation step and rectify the computed values after eachinjection. Use this lexer to tokenize the programs inFigures~\ref{fib} -- \ref{collatz}. You can find the programms also onKEATS. Give the tokens of these programs where whitespaces arefiltered out. Make sure you can tokenise \textbf{exactly} theseprograms.\bigskip\begin{figure}[h]\mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/fib.while}}\caption{Fibonacci program in the WHILE language.\label{fib}}\end{figure}\begin{figure}[h]\mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/loops.while}}\caption{The three-nested-loops program in the WHILE language. (Usually used for timing measurements.)\label{loop}}\end{figure}\begin{figure}[h]\mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/factors.while}}\caption{A program that calculates factors for numbers in the WHILE language.\label{factors}}\end{figure}\begin{figure}[h]\mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../progs/while-tests/collatz2.while}}\caption{A program that calculates the Collatz series for numbers between 1 and 100.\label{collatz}}\end{figure}\end{document}%%% Local Variables: %%% mode: latex%%% TeX-master: t%%% End: