13 WHILE language. You can do the implementation in any programming |
13 WHILE language. You can do the implementation in any programming |
14 language you like, but you need to submit the source code with which |
14 language you like, but you need to submit the source code with which |
15 you answered the questions, otherwise a mark of 0\% will be |
15 you answered the questions, otherwise a mark of 0\% will be |
16 awarded. You need to submit your written |
16 awarded. You need to submit your written |
17 answers as pdf---see attached questionaire. Code send as code. If you use |
17 answers as pdf---see attached questionaire. Code send as code. If you use |
18 Scala in your code, a good place to start is the file \texttt{re3.sc} |
18 Scala in your code, a good place to start is the file \texttt{lexer.sc} |
19 that is uploaded to Github. |
19 and \texttt{token.sc} |
|
20 that are uploaded to Github. |
20 |
21 |
21 \subsection*{Disclaimer\alert} |
22 \subsection*{Disclaimer\alert} |
22 |
23 |
23 It should be understood that the work you submit represents |
24 It should be understood that the work you submit represents |
24 your own effort. You have not copied from anyone else |
25 your own effort. You have not copied from anyone else |
81 \texttt{;}, |
82 \texttt{;}, |
82 \texttt{,} (comma), |
83 \texttt{,} (comma), |
83 \texttt{$\backslash$} and |
84 \texttt{$\backslash$} and |
84 \texttt{:} |
85 \texttt{:} |
85 |
86 |
86 \item strings are enclosed by double quotes, like \texttt{"\ldots"}, and consisting of |
|
87 symbols, whitespaces and digits |
|
88 \item parentheses are \texttt{(}, \texttt{\{}, \texttt{)} and \texttt{\}} |
87 \item parentheses are \texttt{(}, \texttt{\{}, \texttt{)} and \texttt{\}} |
|
88 \item digits are \pcode{0} to \pcode{9} |
89 \item there are semicolons \texttt{;} |
89 \item there are semicolons \texttt{;} |
90 \item whitespaces are either \texttt{" "} (one or more) or \texttt{$\backslash$n} or |
90 \item whitespaces are either \texttt{" "} (one or more) or \texttt{$\backslash$n} or |
91 \texttt{$\backslash$t} or \texttt{$\backslash$r} |
91 \texttt{$\backslash$t} or \texttt{$\backslash$r} |
92 \item identifiers are letters followed by underscores \texttt{\_\!\_}, letters |
92 \item identifiers are letters followed by underscores \texttt{\_\!\_}, letters |
93 or digits |
93 or digits |
94 \item numbers are \pcode{0}, \pcode{1}, \ldots and so on; give |
94 \item numbers for numbers give |
95 a regular expression that can recognise \pcode{0}, but not numbers |
95 a regular expression that can recognise \pcode{0}, but not numbers |
96 with leading zeroes, such as \pcode{001} |
96 with leading zeroes, such as \pcode{001} |
97 \item comments start with \texttt{//} and contain symbols, spaces and digits until the end of the line |
97 \item strings are enclosed by double quotes, like \texttt{"\ldots"}, and consisting of |
|
98 symbols, digits, parentheses, whitespaces and \texttt{$\backslash$n} (note the latter is not the escaped version but \texttt{$\backslash$} followed by \texttt{n}, otherwise we would not be able to indicate in our strings when to write a newline). |
|
99 \item comments start with \texttt{//} and contain symbols, spaces and digits until the end-of-the-line markers |
|
100 \item endo-of-line-markers are \texttt{$\backslash$n} and \texttt{$\backslash$r$\backslash$n} |
98 \end{enumerate} |
101 \end{enumerate} |
99 |
102 |
100 \noindent |
103 \noindent |
101 You can use the basic regular expressions |
104 You can use the basic regular expressions |
102 |
105 |
175 lectures to your lexer and implement a function, say |
178 lectures to your lexer and implement a function, say |
176 \pcode{env}, that returns all assignments from a value (such |
179 \pcode{env}, that returns all assignments from a value (such |
177 that you can extract easily the tokens from a value).\medskip |
180 that you can extract easily the tokens from a value).\medskip |
178 |
181 |
179 \noindent |
182 \noindent |
180 Finally give the tokens for your regular expressions from Q1 and the |
183 Finally give \textbf{all} the tokens for your regular expressions from Q1 and the |
181 string |
184 string |
182 |
185 |
183 \begin{center} |
186 \begin{center} |
184 \code{"read n;"} |
187 \code{"read n;"} |
185 \end{center} |
188 \end{center} |
190 |
193 |
191 \subsection*{Question 3} |
194 \subsection*{Question 3} |
192 |
195 |
193 Extend your lexer from Q2 to also simplify regular expressions after |
196 Extend your lexer from Q2 to also simplify regular expressions after |
194 each derivation step and rectify the computed values after each |
197 each derivation step and rectify the computed values after each |
195 injection. Use this lexer to tokenize the programs in |
198 injection. Use this lexer to tokenize six WHILE programs some of which |
196 Figures~\ref{fib} -- \ref{collatz}. You can find the programms also on |
199 are given in Figures~\ref{fib} -- \ref{collatz}. You can find these programms also on |
197 KEATS. Give the tokens of these programs where whitespaces are |
200 Github under the \texttt{cw2} directory. Give the tokens of these |
|
201 programs where whitespaces and comments are |
198 filtered out. Make sure you can tokenise \textbf{exactly} these |
202 filtered out. Make sure you can tokenise \textbf{exactly} these |
199 programs.\bigskip |
203 programs.\bigskip |
200 |
204 |
201 |
205 |
202 \begin{figure}[h] |
206 \begin{figure}[h] |
215 \caption{A program that calculates factors for numbers in the WHILE |
219 \caption{A program that calculates factors for numbers in the WHILE |
216 language.\label{factors}} |
220 language.\label{factors}} |
217 \end{figure} |
221 \end{figure} |
218 |
222 |
219 \begin{figure}[h] |
223 \begin{figure}[h] |
220 \mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../progs/while-tests/collatz2.while}} |
224 \mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/collatz2.while}} |
221 \caption{A program that calculates the Collatz series for numbers |
225 \caption{A program that calculates the Collatz series for numbers |
222 between 1 and 100.\label{collatz}} |
226 between 1 and 100.\label{collatz}} |
223 \end{figure} |
227 \end{figure} |
224 |
228 |
225 \clearpage |
229 \clearpage |
227 \section*{Answers} |
231 \section*{Answers} |
228 |
232 |
229 \mbox{} |
233 \mbox{} |
230 |
234 |
231 \noindent |
235 \noindent |
232 \textbf{Question 2:} |
236 \textbf{Question 2:}\\ (Use mathematical notation, such as $r^+$, rather than code, such as \code{PLUS(r)}) |
233 |
237 |
234 \begin{center} |
238 \begin{center} |
235 \def\arraystretch{1.6} |
239 \def\arraystretch{1.6} |
236 \begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}} |
240 \begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}} |
237 $mkeps([c_1,c_2,\ldots,c_n])$ & $\dn$ & \uline{\hspace{8cm}}\\ |
241 $mkeps([c_1,c_2,\ldots,c_n])$ & $\dn$ & \uline{\hspace{8cm}}\\ |