cws/cw04.tex
changeset 284 9a04eb6a2291
parent 275 eb1b4ad23941
child 289 08b5ddbc7e55
--- a/cws/cw04.tex	Tue Oct 29 23:56:13 2019 +0000
+++ b/cws/cw04.tex	Wed Oct 30 11:28:44 2019 +0000
@@ -99,21 +99,22 @@
 \mbox{}\hfill\textit{``[Google’s MapReduce] abstraction is inspired by the}\\
 \mbox{}\hfill\textit{map and reduce primitives present in Lisp and many}\\
 \mbox{}\hfill\textit{other functional language.''}\smallskip\\
-\mbox{}\hfill\textit{ --- Dean and Ghemawat, who designed this concept at Google}\bigskip
+\mbox{}\hfill\textit{ --- Dean and Ghemawat, who designed this concept at Google}
+\bigskip\medskip
 
 \noindent
-This coursework is worth 10\%. It is about a regular expression
-matcher and the shunting yard algorithm by Dijkstra. The first part is
-due on \cwNINE{} at 11pm; the second, more advanced part, is due on
-\cwNINEa{} at 11pm. In the first part, you are asked to implement a
-regular expression matcher based on derivatives of regular
-expressions. The background is that ``out-of-the-box'' regular
-expression matching in mainstream languages like Java, JavaScript and
-Python can sometimes be excruciatingly slow. You are supposed to implement
-an regular expression matcher that is much, much faster. The advanced part is
-about the shunting yard algorithm that transforms the usual infix
-notation of arithmetic expressions into the postfix notation, which is
-for example used in compilers.\bigskip
+This coursework is about the shunting yard algorithm by Dijkstra and a
+regular expression matcher by Brzozowski. The preliminary part is due on
+\cwNINE{} at 4pm; the core, more advanced part, is due on \cwNINEa{}
+at 4pm. The preliminary part is about the shunting yard algorithm that
+transforms the usual infix notation of arithmetic expressions into the
+postfix notation, which is for example used in compilers. In the core
+part, you are asked to implement a regular expression matcher based on
+derivatives of regular expressions. The background is that
+``out-of-the-box'' regular expression matching in mainstream languages
+like Java, JavaScript and Python can sometimes be excruciatingly slow.
+You are supposed to implement an regular expression matcher that is
+much, much faster. \bigskip
 
 \IMPORTANT{}
 
@@ -159,7 +160,116 @@
 \end{lstlisting}%$
 
 
-\subsection*{Part 1 (6 Marks)}
+\subsection*{Preliminary Part (4 Marks)}
+
+The \emph{Shunting Yard Algorithm} has been developed by Edsger Dijkstra,
+an influential computer scientist who developed many well-known
+algorithms. This algorithm transforms the usual infix notation of
+arithmetic expressions into the postfix notation, sometimes also
+called reverse Polish notation.
+
+Why on Earth do people use the postfix notation? It is much more
+convenient to work with the usual infix notation for arithmetic
+expressions. Most modern calculators (as opposed to the ones used 20
+years ago) understand infix notation. So why on Earth? \ldots{}Well,
+many computers under the hood, even nowadays, use postfix notation
+extensively. For example if you give to the Java compiler the
+expression $1 + ((2 * 3) + (4 - 3))$, it will generate the Java Byte
+code
+
+\begin{lstlisting}[language=JVMIS,numbers=none]
+ldc 1 
+ldc 2 
+ldc 3 
+imul 
+ldc 4 
+ldc 3 
+isub 
+iadd 
+iadd
+\end{lstlisting}
+
+\noindent
+where the command \texttt{ldc} loads a constant onto the stack, and \texttt{imul},
+\texttt{isub} and \texttt{iadd} are commands acting on the stack. Clearly this
+is the arithmetic expression in postfix notation.\bigskip
+
+\noindent
+The shunting yard algorithm processes an input token list using an
+operator stack and an output list. The input consists of numbers,
+operators ($+$, $-$, $*$, $/$) and parentheses, and for the purpose of
+the assignment we assume the input is always a well-formed expression
+in infix notation.  The calculation in the shunting yard algorithm uses
+information about the
+precedences of the operators (given in the template file). The
+algorithm processes the input token list as follows:
+
+\begin{itemize}
+\item If there is a number as input token, then this token is
+  transferred directly to the output list. Then the rest of the input is
+  processed.
+\item If there is an operator as input token, then you need to check
+  what is on top of the operator stack. If there are operators with
+  a higher or equal precedence, these operators are first popped off
+  from the stack and moved to the output list. Then the operator from the input
+  is pushed onto the stack and the rest of the input is processed.
+\item If the input is a left-parenthesis, you push it on to the stack
+  and continue processing the input.
+\item If the input is a right-parenthesis, then you pop off all operators
+  from the stack to the output list until you reach the left-parenthesis.
+  Then you discharge the $($ and $)$ from the input and stack, and continue
+  processing the input list.
+\item If the input is empty, then you move all remaining operators
+  from the stack to the output list.  
+\end{itemize}  
+
+\subsubsection*{Tasks (file postfix.scala)}
+
+\begin{itemize}
+\item[(1)] Implement the shunting yard algorithm described above. The
+  function, called \texttt{syard}, takes a list of tokens as first
+  argument. The second and third arguments are the stack and output
+  list represented as Scala lists. The most convenient way to
+  implement this algorithm is to analyse what the input list, stack
+  and output list look like in each step using pattern-matching. The
+  algorithm transforms for example the input
+
+  \[
+  \texttt{List(3, +, 4, *, (, 2, -, 1, ))}
+  \]
+
+  into the postfix output
+
+  \[
+  \texttt{List(3, 4, 2, 1, -, *, +)}
+  \]  
+  
+  You can assume the input list is always a  list representing
+  a well-formed infix arithmetic expression.\hfill[1 Mark]
+
+\item[(2)] Implement a compute function that takes a postfix expression
+  as argument and evaluates it generating an integer as result. It uses a
+  stack to evaluate the postfix expression. The operators $+$, $-$, $*$
+  are as usual; $/$ is division on integers, for example $7 / 3 = 2$.
+  \hfill[1 Mark]
+\end{itemize}
+
+\subsubsection*{Task (file postfix2.scala)}
+
+\begin{itemize}
+\item[(3)] Extend the code in (7) and (8) to include the power
+  operator.  This requires proper account of associativity of
+  the operators. The power operator is right-associative, whereas the
+  other operators are left-associative.  Left-associative operators
+  are popped off if the precedence is bigger or equal, while
+  right-associative operators are only popped off if the precedence is
+  bigger. The compute function in this task should use
+  \texttt{Long}s, rather than \texttt{Int}s.\hfill[2 Marks]
+\end{itemize}
+
+
+
+\subsection*{Core Part (6 Marks)}
 
 The task is to implement a regular expression matcher that is based on
 derivatives of regular expressions. Most of the functions are defined by
@@ -220,7 +330,7 @@
 
 
 \begin{itemize}
-\item[(1)] Implement a function, called \textit{nullable}, by
+\item[(5)] Implement a function, called \textit{nullable}, by
   recursion over regular expressions. This function tests whether a
   regular expression can match the empty string. This means given a
   regular expression it either returns true or false. The function
@@ -238,7 +348,7 @@
 \end{tabular}
 \end{center}~\hfill[1 Mark]
 
-\item[(2)] Implement a function, called \textit{der}, by recursion over
+\item[(6)] Implement a function, called \textit{der}, by recursion over
   regular expressions. It takes a character and a regular expression
   as arguments and calculates the derivative of a xregular expression according
   to the rules:
@@ -294,7 +404,7 @@
 Note, the last derivative can match the empty string, that is it is \textit{nullable}.\\
 \mbox{}\hfill\mbox{[1 Mark]}
 
-\item[(3)] Implement the function \textit{simp}, which recursively
+\item[(7)] Implement the function \textit{simp}, which recursively
   traverses a regular expression, and on the way up simplifies every
   regular expression on the left (see below) to the regular expression
   on the right, except it does not simplify inside ${}^*$-regular
@@ -330,7 +440,7 @@
   simplification in an inside-out fashion, it is always clear which
   simplification should be applied next.\hfill[1 Mark]
 
-\item[(4)] Implement two functions: The first, called \textit{ders},
+\item[(8)] Implement two functions: The first, called \textit{ders},
   takes a list of characters and a regular expression as arguments, and
   builds the derivative w.r.t.~the list as follows:
 
@@ -353,7 +463,7 @@
 true for the regular expression $(a\cdot b)\cdot c$ and the string
 $abc$, but false if you give it the string $ab$. \hfill[1 Mark]
 
-\item[(5)] Implement a function, called \textit{size}, by recursion
+\item[(9)] Implement a function, called \textit{size}, by recursion
   over regular expressions. If a regular expression is seen as a tree,
   then \textit{size} should return the number of nodes in such a
   tree. Therefore this function is defined as follows:
@@ -493,112 +603,6 @@
 \end{center}
 \newpage
 
-\subsection*{Part 2 (4 Marks)}
-
-The \emph{Shunting Yard Algorithm} has been developed by Edsger Dijkstra,
-an influential computer scientist who developed many well-known
-algorithms. This algorithm transforms the usual infix notation of
-arithmetic expressions into the postfix notation, sometimes also
-called reverse Polish notation.
-
-Why on Earth do people use the postfix notation? It is much more
-convenient to work with the usual infix notation for arithmetic
-expressions. Most modern calculators (as opposed to the ones used 20
-years ago) understand infix notation. So why on Earth? \ldots{}Well,
-many computers under the hood, even nowadays, use postfix notation
-extensively. For example if you give to the Java compiler the
-expression $1 + ((2 * 3) + (4 - 3))$, it will generate the Java Byte
-code
-
-\begin{lstlisting}[language=JVMIS,numbers=none]
-ldc 1 
-ldc 2 
-ldc 3 
-imul 
-ldc 4 
-ldc 3 
-isub 
-iadd 
-iadd
-\end{lstlisting}
-
-\noindent
-where the command \texttt{ldc} loads a constant onto the stack, and \texttt{imul},
-\texttt{isub} and \texttt{iadd} are commands acting on the stack. Clearly this
-is the arithmetic expression in postfix notation.\bigskip
-
-\noindent
-The shunting yard algorithm processes an input token list using an
-operator stack and an output list. The input consists of numbers,
-operators ($+$, $-$, $*$, $/$) and parentheses, and for the purpose of
-the assignment we assume the input is always a well-formed expression
-in infix notation.  The calculation in the shunting yard algorithm uses
-information about the
-precedences of the operators (given in the template file). The
-algorithm processes the input token list as follows:
-
-\begin{itemize}
-\item If there is a number as input token, then this token is
-  transferred directly to the output list. Then the rest of the input is
-  processed.
-\item If there is an operator as input token, then you need to check
-  what is on top of the operator stack. If there are operators with
-  a higher or equal precedence, these operators are first popped off
-  from the stack and moved to the output list. Then the operator from the input
-  is pushed onto the stack and the rest of the input is processed.
-\item If the input is a left-parenthesis, you push it on to the stack
-  and continue processing the input.
-\item If the input is a right-parenthesis, then you pop off all operators
-  from the stack to the output list until you reach the left-parenthesis.
-  Then you discharge the $($ and $)$ from the input and stack, and continue
-  processing the input list.
-\item If the input is empty, then you move all remaining operators
-  from the stack to the output list.  
-\end{itemize}  
-
-\subsubsection*{Tasks (file postfix.scala)}
-
-\begin{itemize}
-\item[(7)] Implement the shunting yard algorithm described above. The
-  function, called \texttt{syard}, takes a list of tokens as first
-  argument. The second and third arguments are the stack and output
-  list represented as Scala lists. The most convenient way to
-  implement this algorithm is to analyse what the input list, stack
-  and output list look like in each step using pattern-matching. The
-  algorithm transforms for example the input
-
-  \[
-  \texttt{List(3, +, 4, *, (, 2, -, 1, ))}
-  \]
-
-  into the postfix output
-
-  \[
-  \texttt{List(3, 4, 2, 1, -, *, +)}
-  \]  
-  
-  You can assume the input list is always a  list representing
-  a well-formed infix arithmetic expression.\hfill[1 Mark]
-
-\item[(8)] Implement a compute function that takes a postfix expression
-  as argument and evaluates it generating an integer as result. It uses a
-  stack to evaluate the postfix expression. The operators $+$, $-$, $*$
-  are as usual; $/$ is division on integers, for example $7 / 3 = 2$.
-  \hfill[1 Mark]
-\end{itemize}
-
-\subsubsection*{Task (file postfix2.scala)}
-
-\begin{itemize}
-\item[(9)] Extend the code in (7) and (8) to include the power
-  operator.  This requires proper account of associativity of
-  the operators. The power operator is right-associative, whereas the
-  other operators are left-associative.  Left-associative operators
-  are popped off if the precedence is bigger or equal, while
-  right-associative operators are only popped off if the precedence is
-  bigger. The compute function in this task should use
-  \texttt{Long}s, rather than \texttt{Int}s.\hfill[2 Marks]
-\end{itemize}