pep-material: comparison cws/cw03.tex


-:71e463b33a9e
+:85f2f75abeeb
 \documentclass{article}
 \usepackage{../style}
-%%\usepackage{../langs}
+\usepackage{../langs}
 \begin{document}
 \section*{Coursework 8 (Scala, Regular Expressions)}
 This coursework is worth 10\%. It is about regular expressions and
 pattern matching. The first part is due on 30 November at 11pm; the
-second, more advanced part, is due on 7 December at 11pm. The
+second, more advanced part, is due on 7 December at 11pm. You are
-second part is not yet included. For the first part you are
 asked to implement a regular expression matcher. Make sure the files
 you submit can be processed by just calling \texttt{scala
 <<filename.scala>>}.\bigskip
 \noindent
 $\textit{ders}\;(c::cs)\;r$  & $\dn$ &
 $\textit{ders}\;cs\;(\textit{simp}(\textit{der}\;c\;r))$\\
 \end{tabular}
 \end{center}
-The second, called \textit{matcher}, takes a string and a regular expression
+Note that this function is different from \textit{der}, which only
-as arguments. It builds first the derivatives according to \textit{ders}
+takes a single character.
-and after that tests whether the resulting derivative regular expression can match
-the empty string (using \textit{nullable}).
+The second function, called \textit{matcher}, takes a string and a
-For example the \textit{matcher} will produce true given the
+regular expression as arguments. It first builds the derivatives
-regular expression $(a\cdot b)\cdot c$ and the string $abc$.
+according to \textit{ders} and after that tests whether the resulting
-\hfill[1 Mark]
+derivative regular expression can match the empty string (using
+\textit{nullable}).  For example the \textit{matcher} will produce
+true given the regular expression $(a\cdot b)\cdot c$ and the string
+$abc$.\\  \mbox{}\hfill[1 Mark]
 \item[(1e)] Implement the function $\textit{replace}\;r\;s_1\;s_2$: it searches
 (from the left to
 right) in the string $s_1$ all the non-empty substrings that match the
 regular expression $r$---these substrings are assumed to be
 are replaced by $s_2$. For example given the regular expression
 \[(a \cdot a)^* + (b \cdot b)\]
 \noindent the string $s_1 = aabbbaaaaaaabaaaaabbaaaabb$ and
-replacement string $s_2 = c$ yields the string
+replacement the string $s_2 = c$ yields the string
 \[
 ccbcabcaccc
 \]
-\hfill[2 Mark]
+\hfill[2 Marks]
 \end{itemize}
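The definitions of \textit{ders} and \textit{matcher} above can be sketched in Scala roughly as follows. This is only a minimal sketch under assumptions: the constructor names (\texttt{ZERO}, \texttt{ONE}, \texttt{CHAR}, \texttt{ALT}, \texttt{SEQ}, \texttt{STAR}), the argument order of \texttt{matcher}, and the exact set of simplification rules are not taken from the coursework files and may differ from what \texttt{re.scala} expects.

```scala
// Assumed datatype for regular expressions (constructor names are hypothetical).
abstract class Rexp
case object ZERO extends Rexp                    // matches nothing
case object ONE extends Rexp                     // matches only the empty string
case class CHAR(c: Char) extends Rexp            // matches the single character c
case class ALT(r1: Rexp, r2: Rexp) extends Rexp  // r1 + r2
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp  // r1 . r2
case class STAR(r: Rexp) extends Rexp            // r*

// Can r match the empty string?
def nullable(r: Rexp): Boolean = r match {
  case ZERO => false
  case ONE => true
  case CHAR(_) => false
  case ALT(r1, r2) => nullable(r1) || nullable(r2)
  case SEQ(r1, r2) => nullable(r1) && nullable(r2)
  case STAR(_) => true
}

// Derivative of r with respect to a single character c.
def der(c: Char, r: Rexp): Rexp = r match {
  case ZERO => ZERO
  case ONE => ZERO
  case CHAR(d) => if (c == d) ONE else ZERO
  case ALT(r1, r2) => ALT(der(c, r1), der(c, r2))
  case SEQ(r1, r2) =>
    if (nullable(r1)) ALT(SEQ(der(c, r1), r2), der(c, r2))
    else SEQ(der(c, r1), r2)
  case STAR(r1) => SEQ(der(c, r1), STAR(r1))
}

// Assumed simplification rules: drop ZERO and ONE where they are redundant.
def simp(r: Rexp): Rexp = r match {
  case ALT(r1, r2) => (simp(r1), simp(r2)) match {
    case (ZERO, s2) => s2
    case (s1, ZERO) => s1
    case (s1, s2) => if (s1 == s2) s1 else ALT(s1, s2)
  }
  case SEQ(r1, r2) => (simp(r1), simp(r2)) match {
    case (ZERO, _) => ZERO
    case (_, ZERO) => ZERO
    case (ONE, s2) => s2
    case (s1, ONE) => s1
    case (s1, s2) => SEQ(s1, s2)
  }
  case _ => r
}

// Derivative with respect to a list of characters, simplifying after each step.
def ders(s: List[Char], r: Rexp): Rexp = s match {
  case Nil => r
  case c :: cs => ders(cs, simp(der(c, r)))
}

// A string matches if the final derivative can match the empty string.
def matcher(r: Rexp, s: String): Boolean = nullable(ders(s.toList, r))

val abc = SEQ(SEQ(CHAR('a'), CHAR('b')), CHAR('c'))
println(matcher(abc, "abc"))
```

With these assumed definitions, the last line corresponds to the example from the text: the regular expression $(a\cdot b)\cdot c$ and the string $abc$.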
+\subsection*{Part 2 (4 Marks)}
+You need to copy all the code from \texttt{re.scala} into
+\texttt{re2.scala} in order to complete Part 2.  Parts (2a) and (2b)
+give you another method and datapoints for testing the \textit{der}
+and \textit{simp} functions from Part~1.
+\subsection*{Tasks (file re2.scala)}
+\begin{itemize}
+\item[(2a)] Write a \textbf{polymorphic} function, called
+\textit{iterT}, that is \textbf{tail-recursive}(!) and takes an
+integer $n$, a function $f$ and an $x$ as arguments. This function
+should iterate $f$ $n$-times starting with the argument $x$, like
+\[\underbrace{f(\ldots (f}_{n\text{-times}}(x)))
+\]
+More formally that means \textit{iterT} behaves as follows:
+\begin{center}
+\begin{tabular}{lcl}
+$\textit{iterT}(n, f, x)$ & $\dn$ &
+$\begin{cases}
+\;x & \textit{if}\;n = 0\\
+\;f(\textit{iterT}(n - 1, f, x)) & \textit{otherwise}
+\end{cases}$
+\end{tabular}
+\end{center}
+Make sure you write a \textbf{tail-recursive} version of
+\textit{iterT}.  If you add the annotation \texttt{@tailrec} (see
+below) you should not get an error message.
+\begin{lstlisting}[language=Scala, numbers=none, xleftmargin=-1mm]
+import scala.annotation.tailrec
+@tailrec
+def iterT[A](n: Int, f: A => A, x: A): A = ...
+\end{lstlisting}
+You can assume that \textit{iterT} will only be called for non-negative
+integers $0 \le n$. Given the type variable \texttt{A}, the type of
+$f$ is \texttt{A => A} and the type of $x$ is \texttt{A}. This means
+\textit{iterT} can be used, for example, for functions from integers
+to integers, or strings to strings.  \\ \mbox{}\hfill[2 Marks]
+\item[(2b)] Implement a function, called \textit{size}, by recursion
+over regular expressions. If a regular expression is seen as a tree,
+then \textit{size} should return the number of nodes in such a
+tree. Therefore this function is defined as follows:
+\begin{center}
+\begin{tabular}{lcl}
+$\textit{size}(\ZERO)$ & $\dn$ & $1$\\
+$\textit{size}(\ONE)$  & $\dn$ & $1$\\
+$\textit{size}(c)$     & $\dn$ & $1$\\
+$\textit{size}(r_1 + r_2)$ & $\dn$ & $1 + \textit{size}(r_1) + \textit{size}(r_2)$\\
+$\textit{size}(r_1 \cdot r_2)$ & $\dn$ & $1 + \textit{size}(r_1) + \textit{size}(r_2)$\\
+$\textit{size}(r^*)$ & $\dn$ & $1 + \textit{size}(r)$\\
+\end{tabular}
+\end{center}
+You can use \textit{size} and \textit{iterT} in order to test how much
+the `evil' regular expression $(a^*)^* \cdot b$ grows when taking
+successive derivatives according to the letter $a$ and compare it to
+taking the derivative, but simplifying the derivative after each step.
+For example, the calls
+\begin{lstlisting}[language=Scala, numbers=none, xleftmargin=-1mm]
+size(iterT(20, (r: Rexp) => der('a', r), EVIL))
+size(iterT(20, (r: Rexp) => simp(der('a', r)), EVIL))
+\end{lstlisting}
+produce, without simplification, a regular expression of size
+7340068 after 20 iterations, while with simplification the size is
+just 8.\\ \mbox{}\hfill[1 Mark]
+\item[(2c)] Write a \textbf{polymorphic} function, called
+\textit{fixpT}, that takes
+a function $f$ and an $x$ as arguments. The purpose
+of \textit{fixpT} is to calculate a fixpoint of the function $f$
+starting from the argument $x$.
+A fixpoint, say $y$, is when $f(y) = y$ holds.
+That means \textit{fixpT} behaves as follows:
+\begin{center}
+\begin{tabular}{lcl}
+$\textit{fixpT}(f, x)$ & $\dn$ &
+$\begin{cases}
+\;x & \textit{if}\;f(x) = x\\
+\;\textit{fixpT}(f, f(x)) & \textit{otherwise}
+\end{cases}$
+\end{tabular}
+\end{center}
+Make sure you calculate in the code of $\textit{fixpT}$ the result
+of $f(x)$ only once. Given the type variable \texttt{A} in
+$\textit{fixpT}$, the type of $f$ is \texttt{A => A} and the type of
+$x$ is \texttt{A}. The file \texttt{re2.scala} gives two example
+functions: in one the fixpoint is 1 and in the other
+it is the string $a$.\\ \mbox{}\hfill[1 Mark]
+\end{itemize}\bigskip
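The shape of the two polymorphic functions from (2a) and (2c) could look roughly like the sketch below. It is one possible approach, not a reference solution, and the example functions are hypothetical (they are not the ones provided in \texttt{re2.scala}). The trick for tail recursion is to apply $f$ to the argument \emph{before} the recursive call, which still yields $f$ applied $n$ times to $x$.

```scala
import scala.annotation.tailrec

// Applies f to x exactly n times. The recursive call is the last action,
// so the @tailrec annotation is accepted by the compiler.
@tailrec
def iterT[A](n: Int, f: A => A, x: A): A =
  if (n == 0) x else iterT(n - 1, f, f(x))

// Iterates f starting from x until the value stops changing.
// f(x) is computed only once per step and stored in fx.
@tailrec
def fixpT[A](f: A => A, x: A): A = {
  val fx = f(x)
  if (fx == x) x else fixpT(f, fx)
}

// Hypothetical example functions, for illustration only.
val double: Int => Int = n => 2 * n
val exclaim: String => String = s => s + "!"
val halve: Int => Int = n => if (n > 1) n / 2 else n

println(iterT(10, double, 1))     // doubles 1 ten times
println(iterT(3, exclaim, "hi"))  // appends "!" three times
println(fixpT(halve, 100))        // halves until the value no longer changes
```

Because the type parameter \texttt{A} is inferred from the arguments, the same two functions work for integers, strings, and (as in (2b)) regular expressions.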
+\noindent
 \textbf{Background} Although easily implementable in Scala, the idea
 behind the derivative function might not be so easy to see. To
 understand its purpose better, assume a regular expression $r$ can
 match strings of the form $c::cs$ (that means strings which start with
-a character $c$ and have some rest $cs$). If you now take the
+a character $c$ and have some rest, or tail, $cs$). If you now take the
 derivative of $r$ with respect to the character $c$, then you obtain a
 regular expression that can match all the strings $cs$.  In other
-words the regular expression $\textit{der}\;c\;r$ can match the same
+words, the regular expression $\textit{der}\;c\;r$ can match the same
 strings $c::cs$ that can be matched by $r$, except that the $c$ is
 chopped off.
 Assume now $r$ can match the string $abc$. If you take the derivative
 according to $a$ then you obtain a regular expression that can match
 $bc$ (it is $abc$ where the $a$ has been chopped off). If you now
-build the derivative $\textit{der}\;b\;(\textit{der}\;a\;r))$ you obtain a regular
+build the derivative $\textit{der}\;b\;(\textit{der}\;a\;r)$ you
-expression that can match the string "c" (it is "bc" where 'b' is
+obtain a regular expression that can match the string $c$ (it is $bc$
-chopped off). If you finally build the derivative of this according
+where $b$ is chopped off). If you finally build the derivative of this
-'c', that is der('c', der('b', der('a', r))), you obtain a regular
+according to $c$, that is
-expression that can match the empty string. You can test this using
+$\textit{der}\;c\;(\textit{der}\;b\;(\textit{der}\;a\;r))$, you
-the function nullable, which is what your matcher is doing.
+obtain a regular expression that can match the empty string. You can
+test this using the function \textit{nullable}, which is what your matcher is
-The purpose of the simp function is to keep the regular expression small. Normally the derivative function makes the regular expression bigger (see the SEQ case) and the algorithm would be slower and slower over time. The simp function counters this increase in size and the result is that the algorithm is fast throughout.
+doing.
-By the way this whole idea is by Janusz Brzozowski who came up with this in 1964 in his PhD thesis.
-https://en.wikipedia.org/wiki/Janusz_Brzozowski_(computer_scientist)
+The purpose of the \textit{simp} function is to keep the regular expression
+small. Normally the derivative function makes the regular expression
+bigger (see the SEQ case and the example in (2b)) and the algorithm
+would be slower and slower over time. The \textit{simp} function counters this
-\subsection*{Part 2 (4 Marks)}
+increase in size and the result is that the algorithm is fast
+throughout.  By the way, this algorithm is by Janusz Brzozowski who
-Coming soon.
+came up with the idea of derivatives in 1964 in his PhD thesis.
+\begin{center}\small
+\url{https://en.wikipedia.org/wiki/Janusz_Brzozowski_(computer_scientist)}
+\end{center}
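As a small worked instance of the chain described above, take $r = (a\cdot b)\cdot c$ from the \textit{matcher} example. Assuming the usual derivative rules and that \textit{simp} removes a leading $\ONE$ from a sequence (neither is spelled out in this excerpt, so treat this as an illustration only), the successive simplified derivatives are:
\begin{center}
\begin{tabular}{lcl}
$\textit{simp}(\textit{der}\;a\;((a\cdot b)\cdot c))$ & $=$ & $\textit{simp}((\ONE\cdot b)\cdot c) \;=\; b\cdot c$\\
$\textit{simp}(\textit{der}\;b\;(b\cdot c))$ & $=$ & $\textit{simp}(\ONE\cdot c) \;=\; c$\\
$\textit{simp}(\textit{der}\;c\;c)$ & $=$ & $\textit{simp}(\ONE) \;=\; \ONE$\\
\end{tabular}
\end{center}
Since $\ONE$ can match the empty string, \textit{nullable} returns true for the final result, which is why the string $abc$ is accepted.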
 \end{document}
 %%% Local Variables:

changeset 78	85f2f75abeeb
parent 75	71e463b33a9e
child 79	2d57b0d43a0f