11 \noindent This coursework is worth 10\% and is due on \cwTWO{} at |
11 \noindent This coursework is worth 10\% and is due on \cwTWO{} at |
12 16:00. You are asked to implement the Sulzmann \& Lu lexer for the |
12 16:00. You are asked to implement the Sulzmann \& Lu lexer for the |
13 WHILE language. You can do the implementation in any programming |
13 WHILE language. You can do the implementation in any programming |
14 language you like, but you need to submit the source code with which |
14 language you like, but you need to submit the source code with which |
15 you answered the questions, otherwise a mark of 0\% will be |
15 you answered the questions, otherwise a mark of 0\% will be |
16 awarded. You need to submit your written answers as pdf---see attached |
16 awarded. %You need to submit your written answers as pdf---see attached |
17 questionaire. Code send as code. If you use Scala in your code, a |
17 % questionaire. Code send as code. |
|
18 If you use Scala in your code, a |
18 good place to start is the file \texttt{lexer.sc} and |
19 good place to start is the file \texttt{lexer.sc} and |
19 \texttt{token.sc} uploaded to KEATS. The template file on Github is |
20 \texttt{token.sc} uploaded to KEATS. The template file on Github is |
20 called \texttt{cw02.sc}. Your code needs to be uploaded to Github by |
21 called \texttt{cw02.sc}. The example files are in the subdirectory |
21 the deadline. |
22 \texttt{examples}. The main function that will be tested is |
|
23 called \texttt{tokenise}. The marks will be distributed such that |
|
24 3 marks are given for the correct \texttt{WHILE\_REGS} regular |
|
25 expression; 5 marks for the correct \texttt{inj} and \texttt{mkeps} |
|
26 definitions; and two marks when \texttt{tokenise} produces the correct |
|
27 results for the example files. |
|
28 |
|
29 |
22 |
30 |
23 \subsection*{Disclaimer\alert} |
31 \subsection*{Disclaimer\alert} |
24 |
32 |
25 It should be understood that the work you submit represents |
33 It should be understood that the work you submit represents |
26 your own effort. You have not copied from anyone else |
34 your own effort. You have not copied from anyone else |
174 respective regular expression, that means the lexer returns |
182 respective regular expression, that means the lexer returns |
175 in both examples a value. |
183 in both examples a value. |
176 |
184 |
177 |
185 |
178 Also add the record regular expression from the |
186 Also add the record regular expression from the |
179 lectures to your lexer and implement a function, say |
187 lectures to your lexer and complete the function |
180 \pcode{env}, that returns all assignments from a value (such |
188 \pcode{env} so that it returns all assignments from a value (this then |
181 that you can extract easily the tokens from a value).\medskip |
189 allows you to extract easily the tokens from a value in the next |
182 |
190 question).\medskip |
183 \noindent |
191 |
184 Finally give \textbf{all} the tokens for your regular expressions from Q1 and the |
192 \noindent |
185 string |
193 Finally make that the function \texttt{lexing\_simp} generates |
|
194 with the regular expression from Q1 for the string |
186 |
195 |
187 \begin{center} |
196 \begin{center} |
188 \code{"read n;"} |
197 \code{"read n;"} |
189 \end{center} |
198 \end{center} |
190 |
199 |
191 \noindent |
200 \noindent |
192 and use your \pcode{env} function to give the token sequence. |
201 the following pairs: |
|
202 |
|
203 \begin{center} |
|
204 \texttt{List((k,read), (w, ), (i,n), (s,;))} |
|
205 \end{center} |
|
206 |
|
207 |
|
208 |
193 |
209 |
194 |
210 |
195 \subsection*{Question 3} |
211 \subsection*{Question 3} |
196 |
212 |
197 Extend your lexer from Q2 to also simplify regular expressions after |
213 Make sure your lexer from Q2 also simplifies regular expressions after |
198 each derivation step and rectify the computed values after each |
214 each derivation step and rectifies the computed values after each |
199 injection. Use this lexer to tokenize six WHILE programs some of which |
215 injection. Use this lexer to tokenise the six WHILE programs |
200 are given in Figures~\ref{fib} -- \ref{collatz}. You can find these programms also on |
216 in the \texttt{examples} directory. Make sure that the \texttt{tokenise} |
201 Github under the \texttt{cw2} directory. Give the tokens of these |
217 function filters out whitespaces and comments.\bigskip |
202 programs where whitespaces and comments are |
218 |
203 filtered out. Make sure you can tokenise \textbf{exactly} these |
219 |
204 programs.\bigskip |
220 % \begin{figure}[h] |
205 |
221 % \mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/fib.while}} |
206 |
222 % \caption{Fibonacci program in the WHILE language.\label{fib}} |
207 \begin{figure}[h] |
223 % \end{figure} |
208 \mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/fib.while}} |
224 |
209 \caption{Fibonacci program in the WHILE language.\label{fib}} |
225 % \begin{figure}[h] |
210 \end{figure} |
226 % \mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/loops.while}} |
211 |
227 % \caption{The three-nested-loops program in the WHILE language. |
212 \begin{figure}[h] |
228 % (Usually used for timing measurements.)\label{loop}} |
213 \mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/loops.while}} |
229 % \end{figure} |
214 \caption{The three-nested-loops program in the WHILE language. |
230 |
215 (Usually used for timing measurements.)\label{loop}} |
231 % \begin{figure}[h] |
216 \end{figure} |
232 % \mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/factors.while}} |
217 |
233 % \caption{A program that calculates factors for numbers in the WHILE |
218 \begin{figure}[h] |
234 % language.\label{factors}} |
219 \mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/factors.while}} |
235 % \end{figure} |
220 \caption{A program that calculates factors for numbers in the WHILE |
236 |
221 language.\label{factors}} |
237 % \begin{figure}[h] |
222 \end{figure} |
238 % \mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/collatz2.while}} |
223 |
239 % \caption{A program that calculates the Collatz series for numbers |
224 \begin{figure}[h] |
240 % between 1 and 100.\label{collatz}} |
225 \mbox{\lstinputlisting[language=While,xleftmargin=10mm]{../cwtests/cw02/collatz2.while}} |
241 % \end{figure} |
226 \caption{A program that calculates the Collatz series for numbers |
242 |
227 between 1 and 100.\label{collatz}} |
243 % \clearpage |
228 \end{figure} |
244 % \newpage |
229 |
245 % \section*{Answers} |
230 \clearpage |
246 |
231 \newpage |
247 % \mbox{} |
232 \section*{Answers} |
248 |
233 |
249 % \noindent |
234 \mbox{} |
250 % \textbf{Question 2:}\\ (Use mathematical notation, such as $r^+$, rather than code, such as \code{PLUS(r)}) |
235 |
251 |
236 \noindent |
252 % \begin{center} |
237 \textbf{Question 2:}\\ (Use mathematical notation, such as $r^+$, rather than code, such as \code{PLUS(r)}) |
253 % \def\arraystretch{1.6} |
238 |
254 % \begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}} |
239 \begin{center} |
255 % $mkeps([c_1,c_2,\ldots,c_n])$ & $\dn$ & \uline{\hspace{8cm}}\\ |
240 \def\arraystretch{1.6} |
256 % $mkeps(r^+)$ & $\dn$ & \uline{\hspace{8cm}}\\ |
241 \begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}} |
257 % $mkeps(r^?)$ & $\dn$ & \uline{\hspace{8cm}}\\ |
242 $mkeps([c_1,c_2,\ldots,c_n])$ & $\dn$ & \uline{\hspace{8cm}}\\ |
258 % $mkeps(r^{\{n\}})$ & $\dn$ & \uline{\hspace{8cm}}\bigskip\\ |
243 $mkeps(r^+)$ & $\dn$ & \uline{\hspace{8cm}}\\ |
259 % $inj\, ([c_1,c_2,\ldots,c_n])\,c\,\ldots$ & $\dn$ & \uline{\hspace{8cm}}\\ |
244 $mkeps(r^?)$ & $\dn$ & \uline{\hspace{8cm}}\\ |
260 % $inj\, (r^+)\,c\,\ldots$ & $\dn$ & \uline{\hspace{8cm}}\\ |
245 $mkeps(r^{\{n\}})$ & $\dn$ & \uline{\hspace{8cm}}\bigskip\\ |
261 % $inj\, (r^?)\,c\,\ldots$ & $\dn$ & \uline{\hspace{8cm}}\\ |
246 $inj\, ([c_1,c_2,\ldots,c_n])\,c\,\ldots$ & $\dn$ & \uline{\hspace{8cm}}\\ |
262 % $inj\, (r^{\{n\}})\,c\,\ldots$ & $\dn$ & \uline{\hspace{8cm}}\\ |
247 $inj\, (r^+)\,c\,\ldots$ & $\dn$ & \uline{\hspace{8cm}}\\ |
263 % \end{tabular} |
248 $inj\, (r^?)\,c\,\ldots$ & $\dn$ & \uline{\hspace{8cm}}\\ |
264 % \end{center}\bigskip |
249 $inj\, (r^{\{n\}})\,c\,\ldots$ & $\dn$ & \uline{\hspace{8cm}}\\ |
265 |
250 \end{tabular} |
266 % \noindent |
251 \end{center}\bigskip |
267 % Tokens for \code{"read n;"}\\ |
252 |
268 |
253 \noindent |
269 % \noindent |
254 Tokens for \code{"read n;"}\\ |
270 % \uline{\hfill}\medskip |
255 |
271 |
256 \noindent |
272 % \noindent |
257 \uline{\hfill}\medskip |
273 % \uline{\hfill}\medskip |
258 |
274 |
259 \noindent |
275 % \noindent |
260 \uline{\hfill}\medskip |
276 % \uline{\hfill}\medskip |
261 |
277 |
262 \noindent |
278 % \noindent |
263 \uline{\hfill}\medskip |
279 % \uline{\hfill}\medskip |
264 |
|
265 \noindent |
|
266 \uline{\hfill}\medskip |
|
267 |
280 |
268 |
281 |
269 \end{document} |
282 \end{document} |
270 |
283 |
271 %%% Local Variables: |
284 %%% Local Variables: |