coursework/cw02.tex
changeset 419 4110ab35e5d8
parent 396 4cd75c619e06
child 428 a47c4227a0c6
equal deleted inserted replaced
418:010c5a03dca2 419:4110ab35e5d8
     4 
     4 
     5 \begin{document}
     5 \begin{document}
     6 
     6 
     7 \section*{Coursework 2 (Strand 1)}
     7 \section*{Coursework 2 (Strand 1)}
     8 
     8 
     9 \noindent This coursework is worth 5\% and is due on 6
     9 \noindent This coursework is worth 4\% and is due on 3
    10 November at 16:00. You are asked to implement the Sulzmann \&
    10 November at 16:00. You are asked to implement the Sulzmann \&
    11 Lu tokeniser for the WHILE language. You can do the
    11 Lu lexer for the WHILE language. You can do the
    12 implementation in any programming language you like, but you
    12 implementation in any programming language you like, but you
    13 need to submit the source code with which you answered the
    13 need to submit the source code with which you answered the
    14 questions, otherwise a mark of 0\% will be awarded. You can
    14 questions, otherwise a mark of 0\% will be awarded. You can
    15 submit your answers in a txt-file or as pdf.
     15 submit your answers in a txt-file or as pdf. Submit your
     16 code as source files.
    16 
    17 
    17 \subsection*{Disclaimer}
    18 \subsection*{Disclaimer}
    18 
    19 
    19 It should be understood that the work you submit represents
    20 It should be understood that the work you submit represents
    20 your own effort. You have not copied from anyone else. An
    21 your own effort. You have not copied from anyone else. An
    21 exception is the Scala code from KEATS and the code I showed
    22 exception is the Scala code from KEATS and the code I showed
    22 during the lectures, which you can both use. You can also use
    23 during the lectures, which you can both freely use. You can
     23 your own code from CW~1.
     24 also use your own code from CW~1.
    24 
    25 
    25 \subsection*{Question 1 (marked with 1\%)}
    26 \subsection*{Question 1}
    26 
    27 
    27 To implement a tokeniser for the WHILE language, you first
    28 To implement a lexer for the WHILE language, you first
    28 need to design the appropriate regular expressions for the
    29 need to design the appropriate regular expressions for the
    29 following eight syntactic entities:
    30 following eight syntactic entities:
    30 
    31 
    31 \begin{enumerate}
    32 \begin{enumerate}
    32 \item keywords are
    33 \item keywords are
    76 
    77 
    77 \noindent
    78 \noindent
    78 You can use the basic regular expressions 
    79 You can use the basic regular expressions 
    79 
    80 
    80 \[
    81 \[
    81 \varnothing, \epsilon, c, r_1 + r_2, r_1 \cdot r_2, r^*
    82 \ZERO,\; \ONE,\; c,\; r_1 + r_2,\; r_1 \cdot r_2,\; r^*
    82 \]
    83 \]
    83 
    84 
    84 \noindent
    85 \noindent
    85 but also the following extended regular expressions
    86 but also the following extended regular expressions
    86 
    87 
    95 
    96 
    96 \noindent Try to design your regular expressions to be as
    97 \noindent Try to design your regular expressions to be as
     97 small as possible. For example, you should use character ranges
     98 small as possible. For example, you should use character ranges
    98 for identifiers and numbers.
    99 for identifiers and numbers.
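A character-range constructor is what keeps such regular expressions small. One possible Scala sketch of the datatype (the constructor names, in particular \pcode{RANGE}, are assumptions for illustration, not necessarily the ones used in the lectures):

```scala
// Sketch of a regex datatype with a character-range constructor.
abstract class Rexp
case object ZERO extends Rexp                    // matches nothing
case object ONE extends Rexp                     // matches the empty string
case class CHAR(c: Char) extends Rexp            // matches one character
case class ALT(r1: Rexp, r2: Rexp) extends Rexp  // r1 + r2
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp  // r1 . r2
case class STAR(r: Rexp) extends Rexp            // r*
case class RANGE(cs: Set[Char]) extends Rexp     // character range, e.g. [a-z]

// e.g. a compact identifier regex: a letter followed by letters,
// digits or underscores
val LETTER = RANGE(('a' to 'z').toSet ++ ('A' to 'Z').toSet)
val DIGIT  = RANGE(('0' to '9').toSet)
val ID     = SEQ(LETTER, STAR(ALT(LETTER, ALT(DIGIT, CHAR('_')))))
```

With \pcode{RANGE} a whole alphabet needs one constructor instead of a chain of 26 alternatives.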
    99 
   100 
   100 \subsection*{Question 2 (marked with 3\%)}
   101 \subsection*{Question 2}
   101 
   102 
   102 Implement the Sulzmann \& Lu tokeniser from the lectures. For
   103 Implement the Sulzmann \& Lu lexer from the lectures. For
   103 this you need to implement the functions $nullable$ and $der$
   104 this you need to implement the functions $nullable$ and $der$
   104 (you can use your code from CW~1), as well as $mkeps$ and
   105 (you can use your code from CW~1), as well as $mkeps$ and
   105 $inj$. These functions need to be appropriately extended for
   106 $inj$. These functions need to be appropriately extended for
   106 the extended regular expressions from Q1. Write down the 
   107 the extended regular expressions from Q1. Write down the 
   107 clauses for
   108 clauses for
    136 respective regular expression; that means the lexer returns
    137 respective regular expression; that means the lexer returns
    137 a value in both examples.
    138 a value in both examples.
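As a rough guide, the clauses for the basic (non-extended) constructors might look as follows in Scala. This is only a sketch under assumed datatype names; the clauses for the extended constructors from Q1 still need to be added:

```scala
// Regular expressions and values (names are assumptions).
abstract class Rexp
case object ZERO extends Rexp
case object ONE extends Rexp
case class CHAR(c: Char) extends Rexp
case class ALT(r1: Rexp, r2: Rexp) extends Rexp
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp
case class STAR(r: Rexp) extends Rexp

abstract class Val
case object Empty extends Val
case class Chr(c: Char) extends Val
case class Sequ(v1: Val, v2: Val) extends Val
case class Left(v: Val) extends Val
case class Right(v: Val) extends Val
case class Stars(vs: List[Val]) extends Val

// does r match the empty string?
def nullable(r: Rexp): Boolean = r match {
  case ZERO => false
  case ONE => true
  case CHAR(_) => false
  case ALT(r1, r2) => nullable(r1) || nullable(r2)
  case SEQ(r1, r2) => nullable(r1) && nullable(r2)
  case STAR(_) => true
}

// derivative of r with respect to the character c
def der(c: Char, r: Rexp): Rexp = r match {
  case ZERO => ZERO
  case ONE => ZERO
  case CHAR(d) => if (c == d) ONE else ZERO
  case ALT(r1, r2) => ALT(der(c, r1), der(c, r2))
  case SEQ(r1, r2) =>
    if (nullable(r1)) ALT(SEQ(der(c, r1), r2), der(c, r2))
    else SEQ(der(c, r1), r2)
  case STAR(r1) => SEQ(der(c, r1), STAR(r1))
}

// a value witnessing how a nullable r matches the empty string
def mkeps(r: Rexp): Val = r match {
  case ONE => Empty
  case ALT(r1, r2) =>
    if (nullable(r1)) Left(mkeps(r1)) else Right(mkeps(r2))
  case SEQ(r1, r2) => Sequ(mkeps(r1), mkeps(r2))
  case STAR(_) => Stars(Nil)
}

// injects the character c back into the value v
def inj(r: Rexp, c: Char, v: Val): Val = (r, v) match {
  case (STAR(r1), Sequ(v1, Stars(vs))) => Stars(inj(r1, c, v1) :: vs)
  case (SEQ(r1, _), Sequ(v1, v2)) => Sequ(inj(r1, c, v1), v2)
  case (SEQ(r1, _), Left(Sequ(v1, v2))) => Sequ(inj(r1, c, v1), v2)
  case (SEQ(r1, r2), Right(v2)) => Sequ(mkeps(r1), inj(r2, c, v2))
  case (ALT(r1, _), Left(v1)) => Left(inj(r1, c, v1))
  case (ALT(_, r2), Right(v2)) => Right(inj(r2, c, v2))
  case (CHAR(_), Empty) => Chr(c)
}

// the unsimplified lexing function built from the pieces above
def lex(r: Rexp, s: List[Char]): Val = s match {
  case Nil => if (nullable(r)) mkeps(r) else throw new Exception("lexing error")
  case c :: cs => inj(r, c, lex(der(c, r), cs))
}
```

For instance, lexing \pcode{"ab"} with the regex $a\cdot b$ yields the value \pcode{Sequ(Chr('a'), Chr('b'))}.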
   138 
   139 
   139 
   140 
   140 Also add the record regular expression from the
   141 Also add the record regular expression from the
   141 lectures to your tokeniser and implement a function, say
   142 lectures to your lexer and implement a function, say
   142 \pcode{env}, that returns all assignments from a value (such
   143 \pcode{env}, that returns all assignments from a value (such
    143 that you can easily extract the tokens from a value).\medskip 
    144 that you can easily extract the tokens from a value).\medskip 
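A sketch of what such an \pcode{env} function could look like, assuming a value datatype with a record constructor (here called \pcode{Rec}; all names are illustrative assumptions in the style of the lectures):

```scala
// Values, including a record constructor Rec for labelled matches.
abstract class Val
case object Empty extends Val
case class Chr(c: Char) extends Val
case class Sequ(v1: Val, v2: Val) extends Val
case class Left(v: Val) extends Val
case class Right(v: Val) extends Val
case class Stars(vs: List[Val]) extends Val
case class Rec(x: String, v: Val) extends Val

// the string a value "flattens" to
def flatten(v: Val): String = v match {
  case Empty => ""
  case Chr(c) => c.toString
  case Left(v1) => flatten(v1)
  case Right(v1) => flatten(v1)
  case Sequ(v1, v2) => flatten(v1) ++ flatten(v2)
  case Stars(vs) => vs.map(flatten).mkString
  case Rec(_, v1) => flatten(v1)
}

// all record assignments in a value, in left-to-right order;
// these (label, string) pairs are essentially the tokens
def env(v: Val): List[(String, String)] = v match {
  case Empty => Nil
  case Chr(_) => Nil
  case Left(v1) => env(v1)
  case Right(v1) => env(v1)
  case Sequ(v1, v2) => env(v1) ::: env(v2)
  case Stars(vs) => vs.flatMap(env)
  case Rec(x, v1) => (x, flatten(v1)) :: env(v1)
}
```

For example, the value \pcode{Sequ(Rec("k", Chr('i')), Rec("id", Sequ(Chr('a'), Chr('b'))))} yields the environment \pcode{List(("k","i"), ("id","ab"))}.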
   144 
   145 
   145 \noindent
   146 \noindent
    146 Finally, give the tokens for your regular expressions from Q1 and the
    147 Finally, give the tokens for your regular expressions from Q1 and the
   152 
   153 
   153 \noindent
   154 \noindent
   154 and use your \pcode{env} function to give the token sequence.
   155 and use your \pcode{env} function to give the token sequence.
   155 
   156 
   156 
   157 
   157 \subsection*{Question 3 (marked with 1\%)}
   158 \subsection*{Question 3}
   158 
   159 
   159 Extend your tokenizer from Q2 to also simplify regular expressions
   160 Extend your lexer from Q2 to also simplify regular expressions
   160 after each derivation step and rectify the computed values after each
   161 after each derivation step and rectify the computed values after each
   161 injection. Use this tokenizer to tokenize the programs in
   162 injection. Use this lexer to tokenize the programs in
    162 Figures~\ref{fib} and \ref{loop}. Give the tokens of these
    163 Figures~\ref{fib} and \ref{loop}. Give the tokens of these
    163 programs where whitespace is filtered out.
    164 programs where whitespace is filtered out.
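One way to organise this is for the simplification function to return, next to the simplified regular expression, a rectification function that maps values of the simplified regex back to values of the original one. A sketch under assumed names, showing only the \pcode{ALT}/\pcode{SEQ} simplifications:

```scala
// Minimal regex and value datatypes for the sketch.
abstract class Rexp
case object ZERO extends Rexp
case object ONE extends Rexp
case class CHAR(c: Char) extends Rexp
case class ALT(r1: Rexp, r2: Rexp) extends Rexp
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp

abstract class Val
case object Empty extends Val
case class Chr(c: Char) extends Val
case class Sequ(v1: Val, v2: Val) extends Val
case class Left(v: Val) extends Val
case class Right(v: Val) extends Val

// simp returns a simplified regex together with a rectification
// function from values of the simplified regex back to values
// of the original regex
def simp(r: Rexp): (Rexp, Val => Val) = r match {
  case ALT(r1, r2) =>
    val (r1s, f1) = simp(r1)
    val (r2s, f2) = simp(r2)
    (r1s, r2s) match {
      case (ZERO, _) => (r2s, v => Right(f2(v)))      // 0 + r = r
      case (_, ZERO) => (r1s, v => Left(f1(v)))       // r + 0 = r
      case _ =>
        if (r1s == r2s) (r1s, v => Left(f1(v)))       // r + r = r
        else (ALT(r1s, r2s), {
          case Left(v)  => Left(f1(v))
          case Right(v) => Right(f2(v))
        })
    }
  case SEQ(r1, r2) =>
    val (r1s, f1) = simp(r1)
    val (r2s, f2) = simp(r2)
    (r1s, r2s) match {
      case (ZERO, _) => (ZERO, (v: Val) => v)         // 0 . r = 0
      case (_, ZERO) => (ZERO, (v: Val) => v)         // r . 0 = 0
      case (ONE, _) => (r2s, v => Sequ(f1(Empty), f2(v)))  // 1 . r = r
      case (_, ONE) => (r1s, v => Sequ(f1(v), f2(Empty)))  // r . 1 = r
      case _ => (SEQ(r1s, r2s),
        { case Sequ(v1, v2) => Sequ(f1(v1), f2(v2)) })
    }
  case other => (other, (v: Val) => v)
}
```

During lexing one would apply \pcode{simp} after each derivative and then use the returned function to rectify the value computed for the simplified regex before injecting.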
   164 
   165 
   165 
   166 
   166 \begin{figure}[p]
   167 \begin{figure}[p]