
\begin{document}

\section*{Coursework 2 (Strand 1)}

\noindent This coursework is worth 4\% and is due on 3
November at 16:00. You are asked to implement the Sulzmann \&
Lu lexer for the WHILE language. You can do the
implementation in any programming language you like, but you
need to submit the source code with which you answered the
questions, otherwise a mark of 0\% will be awarded. You can
submit your answers in a txt-file or as a pdf; code, however,
should be submitted as plain source files, not as a pdf.

\subsection*{Disclaimer}

It should be understood that the work you submit represents
your own effort. You have not copied from anyone else. An
exception is the Scala code from KEATS and the code I showed
during the lectures, which you can both freely use. You can
also use your own code from CW~1.

\subsection*{Question 1}

To implement a lexer for the WHILE language, you first
need to design the appropriate regular expressions for the
following eight syntactic entities:

\begin{enumerate}
\item keywords are

\noindent
You can use the basic regular expressions

\[
\ZERO,\; \ONE,\; c,\; r_1 + r_2,\; r_1 \cdot r_2,\; r^*
\]

\noindent
but also the following extended regular expressions


\noindent Try to design your regular expressions to be as
small as possible. For example, you should use character ranges
for identifiers and numbers.

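\noindent
As one illustration of how such regular expressions could be kept
small, consider the following Scala sketch of a datatype with a
constructor for character ranges. The constructor names follow the
lecture conventions but are only illustrative; adapt them to your own
code.

\begin{lstlisting}[language=Scala]
// Sketch of a datatype for regular expressions, with a RANGE
// constructor for character ranges (names are illustrative).
abstract class Rexp
case object ZERO extends Rexp                    // matches no string
case object ONE extends Rexp                     // matches the empty string
case class CHAR(c: Char) extends Rexp            // matches a single character
case class RANGE(cs: Set[Char]) extends Rexp     // matches any character in cs
case class ALT(r1: Rexp, r2: Rexp) extends Rexp  // r1 + r2
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp  // r1 . r2
case class STAR(r: Rexp) extends Rexp            // r*

// Example: a compact regular expression for non-empty digit
// strings, comparable to [0-9][0-9]*
val digit  = RANGE(('0' to '9').toSet)
val number = SEQ(digit, STAR(digit))
\end{lstlisting}
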
\subsection*{Question 2}

Implement the Sulzmann \& Lu lexer from the lectures. For
this you need to implement the functions $nullable$ and $der$
(you can use your code from CW~1), as well as $mkeps$ and
$inj$. These functions need to be appropriately extended for
the extended regular expressions from Q1. Write down the
clauses for
respective regular expression, meaning the lexer returns
a value in both examples.


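\noindent
As a reminder of the shape these functions take, here is a
self-contained Scala sketch of $nullable$ and $der$ for the basic
regular expressions only (lecture-style names; you still need to add
the clauses for the extended constructors from Q1 yourself):

\begin{lstlisting}[language=Scala]
// Minimal datatype repeated here so the sketch runs on its own.
abstract class Rexp
case object ZERO extends Rexp
case object ONE extends Rexp
case class CHAR(c: Char) extends Rexp
case class ALT(r1: Rexp, r2: Rexp) extends Rexp
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp
case class STAR(r: Rexp) extends Rexp

// whether r can match the empty string
def nullable(r: Rexp): Boolean = r match {
  case ZERO => false
  case ONE => true
  case CHAR(_) => false
  case ALT(r1, r2) => nullable(r1) || nullable(r2)
  case SEQ(r1, r2) => nullable(r1) && nullable(r2)
  case STAR(_) => true
}

// Brzozowski derivative of r with respect to the character c
def der(c: Char, r: Rexp): Rexp = r match {
  case ZERO => ZERO
  case ONE => ZERO
  case CHAR(d) => if (c == d) ONE else ZERO
  case ALT(r1, r2) => ALT(der(c, r1), der(c, r2))
  case SEQ(r1, r2) =>
    if (nullable(r1)) ALT(SEQ(der(c, r1), r2), der(c, r2))
    else SEQ(der(c, r1), r2)
  case STAR(r1) => SEQ(der(c, r1), STAR(r1))
}

// a string matches r if the iterated derivative is nullable
def matches(r: Rexp, s: String): Boolean =
  nullable(s.foldLeft(r)((r1, c) => der(c, r1)))
\end{lstlisting}
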
Also add the record regular expression from the
lectures to your lexer and implement a function, say
\pcode{env}, that returns all assignments from a value (such
that you can easily extract the tokens from a value).\medskip

\noindent
Finally, give the tokens for your regular expressions from Q1 and the

\noindent
and use your \pcode{env} function to give the token sequence.


\subsection*{Question 3}

Extend your lexer from Q2 to also simplify regular expressions
after each derivation step and rectify the computed values after each
injection. Use this lexer to tokenise the programs in
Figures~\ref{fib} and \ref{loop}. Give the tokens of these
programs with whitespace filtered out.


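\noindent
The kind of simplification meant here can be sketched as follows in
Scala (illustrative rules only; the full lexer must additionally
return rectification functions that repair the computed value, as
covered in the lectures):

\begin{lstlisting}[language=Scala]
// Minimal datatype repeated here so the sketch runs on its own.
abstract class Rexp
case object ZERO extends Rexp
case object ONE extends Rexp
case class CHAR(c: Char) extends Rexp
case class ALT(r1: Rexp, r2: Rexp) extends Rexp
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp
case class STAR(r: Rexp) extends Rexp

// Standard simplification rules, applied bottom-up:
//   ZERO + r = r,  r + ZERO = r,  r + r = r,
//   ZERO . r = ZERO,  r . ZERO = ZERO,  ONE . r = r,  r . ONE = r
def simp(r: Rexp): Rexp = r match {
  case ALT(r1, r2) => (simp(r1), simp(r2)) match {
    case (ZERO, s2) => s2
    case (s1, ZERO) => s1
    case (s1, s2)   => if (s1 == s2) s1 else ALT(s1, s2)
  }
  case SEQ(r1, r2) => (simp(r1), simp(r2)) match {
    case (ZERO, _)  => ZERO
    case (_, ZERO)  => ZERO
    case (ONE, s2)  => s2
    case (s1, ONE)  => s1
    case (s1, s2)   => SEQ(s1, s2)
  }
  case r => r
}
\end{lstlisting}
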
\begin{figure}[p]