cws/main_cw03.tex
changeset 424 daf561a83ba6
parent 418 fa7f7144f2bb
child 425 957808dcb367
equal deleted inserted replaced
423:e9d14d58be3c 424:daf561a83ba6
   174 4000000: 0.99149 secs.
   174 4000000: 0.99149 secs.
   175 4500000: 1.15395 secs.
   175 4500000: 1.15395 secs.
   176 5000000: 1.29659 secs.
   176 5000000: 1.29659 secs.
   177 \end{lstlisting}%$
   177 \end{lstlisting}%$
   178 
   178 
       
   179 
   179 \subsection*{Preliminaries}
   180 \subsection*{Preliminaries}
   180 
   181 
   181 The task is to implement a regular expression matcher that is based on
   182 The task is to implement a regular expression matcher that is based on
   182 derivatives of regular expressions. Most of the functions are defined by
   183 derivatives of regular expressions. Most of the functions are defined by
   183 recursion over regular expressions and can be elegantly implemented
   184 recursion over regular expressions and can be elegantly implemented
   239 
   240 
   240 \noindent
   241 \noindent
   241 The alternative regular expressions comes in two versions: one is
   242 The alternative regular expressions comes in two versions: one is
   242 binary (+ / \texttt{ALT}) and the other is n-ary ($\sum$ /
   243 binary (+ / \texttt{ALT}) and the other is n-ary ($\sum$ /
   243 \texttt{ALTs}). The latter takes a list of regular expressions as
   244 \texttt{ALTs}). The latter takes a list of regular expressions as
   244 arguments.  In what follows we shall use $rs$ to stand for lists of
   245 argument.  In what follows we shall use $rs$ to stand for lists of
   245 regular expressions.  The binary alternative can be seen as an abbreviation,
   246 regular expressions. When the list is empty, we shall write $\sum\,[]$;
       
   247 if it is non-empty, we sometimes write $\sum\,[r_1,..., r_n]$.
       
   248 The binary alternative can be seen as an abbreviation,
   246 that is $r_1 + r_2 \dn \sum\,[r_1, r_2]$. As a result we can ignore the
   249 that is $r_1 + r_2 \dn \sum\,[r_1, r_2]$. As a result we can ignore the
   247 binary version and only implement the n-ary alternative.
   250 binary version and only implement the n-ary alternative.
   248 
   251 
   249 \begin{itemize}
   252 \begin{itemize}
   250 \item[(1)] Implement a function, called \textit{nullable}, by
   253 \item[(1)] Implement a function, called \textit{nullable}, by
   265 \end{tabular}
   268 \end{tabular}
   266 \end{center}~\hfill[0.5 Marks]
   269 \end{center}~\hfill[0.5 Marks]
   267 
   270 
   268 \item[(2)] Implement a function, called \textit{der}, by recursion over
   271 \item[(2)] Implement a function, called \textit{der}, by recursion over
   269   regular expressions. It takes a character and a regular expression
   272   regular expressions. It takes a character and a regular expression
   270   as arguments and calculates the derivative of a regular expression according
   273   as arguments and calculates the \emph{derivative} of a regular expression according
   271   to the rules:
   274   to the rules:
   272 
   275 
   273 \begin{center}
   276 \begin{center}
   274 \begin{tabular}{lcl}
   277 \begin{tabular}{lcl}
   275 $\textit{der}\;c\;(\ZERO)$ & $\dn$ & $\ZERO$\\
   278 $\textit{der}\;c\;(\ZERO)$ & $\dn$ & $\ZERO$\\
   335       3) &$rs = (\sum rs_1) :: rest$ & $\dn$ & $rs_1 ::: \textrm{flatten}\;rest$\\
   338       3) &$rs = (\sum rs_1) :: rest$ & $\dn$ & $rs_1 ::: \textrm{flatten}\;rest$\\
   336       4) &$rs = r :: rest$         & $\dn$ & $r :: \textrm{flatten}\;rest$ & (otherwise)\\
   339       4) &$rs = r :: rest$         & $\dn$ & $r :: \textrm{flatten}\;rest$ & (otherwise)\\
   337     \end{tabular}  
   340     \end{tabular}  
   338   \end{center}  
   341   \end{center}  
   339 
   342 
   340   The first clause just states that empty lists cannot be further
   343   The first clause states that empty lists cannot be further
   341   flattened. The second removes all $\ZERO$s from the list.
   344   flattened. The second removes the first $\ZERO$ from the list and recurses.
   342   The third is when the first regular expression is an \texttt{ALTs}, then
   345   The third is when the first regular expression is an \texttt{ALTs}, then
   343   the content of this alternative should be spilled out and appended
   346   the content of this alternative should be spilled out and appended
   344   with the flattened rest of the list. The last case is for all other
   347   with the flattened rest of the list. The last case is for all other
   345   cases where the head of the list is not $\ZERO$ and not an \texttt{ALTs},
   348   cases where the head of the list is not $\ZERO$ and not an \texttt{ALTs},
   346   then we just keep the head of the list and flatten the rest.
   349   then we just keep the head of the list and flatten the rest.
   364 
   367 
   365 The last case is as follows: first apply $simp$ to all regular
   368 The last case is as follows: first apply $simp$ to all regular
   366 expressions $r_1,.. ,r_n$; then flatten the resulting list using
   369 expressions $r_1,.. ,r_n$; then flatten the resulting list using
   367 \texttt{flts}; finally remove all duplicates in this list (this can be
   370 \texttt{flts}; finally remove all duplicates in this list (this can be
   368 done in Scala using the function
   371 done in Scala using the function
   369 \texttt{\_.distinct}). \textcolor{red}{When you perform these
   372 \texttt{\_.distinct}). When you perform these
   370   operations, you end up with three cases, namely where the list is
   373   operations, you end up with three cases, namely where the list is
   371   empty, contains a single element and ``otherwise''. These cases
   374   empty, contains a single element and ``otherwise''. These cases
   372   should be processed as follows}
   375   should be processed as follows
   373 \begin{center}
   376 \begin{center}
   374 \textcolor{red}{\begin{tabular}{l@{\hspace{4mm}}c@{\hspace{4mm}}ll}
   377 \begin{tabular}{l@{\hspace{4mm}}c@{\hspace{4mm}}ll}
   375 $\sum\;[]$ & $\mapsto$ & $\ZERO$\\ 
   378 $\sum\;[]$ & $\mapsto$ & $\ZERO$\\ 
   376 $\sum\;[r]$ & $\mapsto$ & $r$\\ 
   379 $\sum\;[r]$ & $\mapsto$ & $r$\\ 
   377 $\sum\;rs$ & $\mapsto$ & $\sum\;rs$ & ``otherwise''\\ 
   380 $\sum\;rs$ & $\mapsto$ & $\sum\;rs$ & ``otherwise''\\ 
   378 \end{tabular}}
   381 \end{tabular}
   379 \end{center}
   382 \end{center}
   380 
   383 
   381   
   384   
   382 
   385 
   383   For example the regular expression
   386   For example the regular expression
   513 than your off-the-shelf matcher in your bog-standard language.
   516 than your off-the-shelf matcher in your bog-standard language.
   514 
   517 
   515 \begin{center}
   518 \begin{center}
   516 \begin{tabular}{@{}cc@{}}
   519 \begin{tabular}{@{}cc@{}}
   517 \multicolumn{2}{c}{Graph: $(a^*)^*\cdot b$ and strings 
   520 \multicolumn{2}{c}{Graph: $(a^*)^*\cdot b$ and strings 
   518            $\underbrace{a\ldots a}_{n}$}\bigskip\\
   521            $\underbrace{a\ldots a}_{n}$}\medskip\\
   519   
   522   
   520 \begin{tikzpicture}
   523 \begin{tikzpicture}
   521 \begin{axis}[
   524 \begin{axis}[
   522     xlabel={$n$},
   525     xlabel={$n$},
   523     x label style={at={(1.05,0.0)}},
   526     x label style={at={(1.05,0.0)}},
   529     ymax=45,
   532     ymax=45,
   530     ytick={0,5,...,40},
   533     ytick={0,5,...,40},
   531     scaled ticks=false,
   534     scaled ticks=false,
   532     axis lines=left,
   535     axis lines=left,
   533     width=6cm,
   536     width=6cm,
   534     height=5.0cm, 
   537     height=5.5cm, 
   535     legend entries={Python, Java 8, JavaScript, Swift, Dart},  
   538     legend entries={Python, Java 8, JavaScript, Swift, Dart},  
   536     legend pos=north west,
   539     legend pos=north west,
   537     legend cell align=left]
   540     legend cell align=left]
   538 \addplot[blue,mark=*, mark options={fill=white}] table {re-python2.data};
   541 \addplot[blue,mark=*, mark options={fill=white}] table {re-python2.data};
   539 \addplot[cyan,mark=*, mark options={fill=white}] table {re-java.data};
   542 \addplot[cyan,mark=*, mark options={fill=white}] table {re-java.data};
   555     ymax=45,
   558     ymax=45,
   556     ytick={0,5,...,40},
   559     ytick={0,5,...,40},
   557     scaled ticks=false,
   560     scaled ticks=false,
   558     axis lines=left,
   561     axis lines=left,
   559     width=6cm,
   562     width=6cm,
   560     height=5.0cm, 
   563     height=5.5cm, 
   561     legend entries={Java 9},  
   564     legend entries={Java 9},  
   562     legend pos=north west]
   565     legend pos=north west]
   563 \addplot[cyan,mark=*, mark options={fill=white}] table {re-java9.data};
   566 \addplot[cyan,mark=*, mark options={fill=white}] table {re-java9.data};
   564 \end{axis}
   567 \end{axis}
   565 \end{tikzpicture}
   568 \end{tikzpicture}