560 Page~\pageref{simplecases}. |
560 Page~\pageref{simplecases}. |
561 \label{thompson1}} |
561 \label{thompson1}} |
562 \end{figure} |
562 \end{figure} |
563 |
563 |
564 \begin{figure}[p] |
564 \begin{figure}[p] |
565 \small |
565 \small |
566 \lstinputlisting[numbers=left,firstline=48,firstnumber=48,lastline=85]{../progs/automata/thompson.sc} |
566 \lstinputlisting[numbers=left,firstline=48,firstnumber=48,lastline=85]{../progs/automata/thompson.sc} |
567 \caption{The second part of the Thompson Construction implementing |
567 \caption{The second part of the Thompson Construction implementing |
568 the composition of NFAs according to $\cdot$, $+$ and ${}^*$. |
568 the composition of NFAs according to $\cdot$, $+$ and ${}^*$. |
569 The implicit class (Lines 48--54) about rich partial functions |
569 The extension (Lines 48--54) about rich partial functions |
570 implements the infix operation \texttt{+++} which |
570 implements the infix operation \texttt{+++} which |
571 combines an $\epsilon$NFA transition with an NFA transition |
571 combines an $\epsilon$NFA transition with an NFA transition |
572 (both are given as partial functions---but with different type!).\label{thompson2}} |
572 (both are given as partial functions---but with different type!).\label{thompson2}} |
573 \end{figure} |
573 \end{figure} |
574 |
574 |
910 |
910 |
911 \noindent |
911 \noindent |
912 OK\ldots now you know why regular expression matchers in those |
912 OK\ldots now you know why regular expression matchers in those |
913 languages are sometimes so slow. A bit surprising, don't you agree? |
913 languages are sometimes so slow. A bit surprising, don't you agree? |
914 Also it is still a mystery why Rust, which because of the reasons |
914 Also it is still a mystery why Rust, which because of the reasons |
915 above avoids NFAs and uses DFAs instead cannot compete in all cases |
915 above avoids NFAs and uses DFAs instead, cannot compete in all cases |
916 with our simple derivative-based regular expression matcher in Scala. |
916 with our simple derivative-based regular expression matcher in Scala. |
|
917 There is an explanation for this as well\ldots{}remember there the |
|
918 offending examples are of the form $r^{\{n\}}$. Why could they be |
|
919 a problem in Rust? |
917 |
920 |
918 \subsection*{Subset Construction} |
921 \subsection*{Subset Construction} |
919 |
922 |
920 So of course, some clever developers of regular expression matchers are aware of |
923 So of course, some clever developers of regular expression matchers are aware of |
921 these problems with sluggish NFAs and try to address them. One common |
924 these problems with sluggish NFAs and try to address them. One common |
1621 \node[draw=none,fill=none] at (7,0.1) {\Large\ldots}; |
1624 \node[draw=none,fill=none] at (7,0.1) {\Large\ldots}; |
1622 \end{tikzpicture} |
1625 \end{tikzpicture} |
1623 \end{center} |
1626 \end{center} |
1624 |
1627 |
1625 \noindent |
1628 \noindent |
1626 I let you implement this ``automaton''. |
1629 You might want to implement this ``automaton''. What do you get? |
1627 |
1630 |
1628 While this makes all sense (modulo the flaw with the infinite states), |
1631 While this makes all sense (modulo the flaw with the infinite states), |
1629 does this automaton teach us anything new? The answer is no, because it |
1632 does this automaton teach us anything new? The answer is no, because it |
1630 boils down to just another implementation of the Brzozowski |
1633 boils down to just another implementation of the Brzozowski |
1631 algorithm from Lecture 2. There \emph{is} however something interesting |
1634 algorithm from Lecture 2. There \emph{is} however something interesting |
1632 in this construction |
1635 in this construction |
1633 which Brzozowski already cleverly found out, because there is |
1636 which Brzozowski already cleverly found out, because there is |
1634 a way to restrict the number of states to something finite. |
1637 a way to restrict the number of states to something finite. |
1635 Meaning it would give us a real automaton. |
1638 Meaning it would give us a real automaton. |
1636 However, this would lead us far, far away from what we want |
1639 However, this would lead us far, far away from what we want |
1637 discuss here. |
1640 discuss here. The end. |
1638 |
1641 |
1639 |
1642 |
1640 |
1643 |
1641 %\section*{Further Reading} |
1644 %\section*{Further Reading} |
1642 |
1645 |