414 q\stackrel{\epsilon}{\longrightarrow}\ldots\stackrel{\epsilon}{\longrightarrow} |
414 q\stackrel{\epsilon}{\longrightarrow}\ldots\stackrel{\epsilon}{\longrightarrow} |
415 \;\stackrel{a}{\longrightarrow}\; |
415 \;\stackrel{a}{\longrightarrow}\; |
416 \stackrel{\epsilon}{\longrightarrow}\ldots\stackrel{\epsilon}{\longrightarrow} q' |
416 \stackrel{\epsilon}{\longrightarrow}\ldots\stackrel{\epsilon}{\longrightarrow} q' |
417 \] |
417 \] |
418 |
418 |
419 \noindent and replace them with $q \stackrel{a}{\longrightarrow} |
419 \noindent where somewhere in the ``middle'' is an $a$-transition. We |
420 q'$. Doing this to the $\epsilon$NFA on the right-hand side above gives |
420 replace them with $q \stackrel{a}{\longrightarrow} q'$. Doing this to |
421 the NFA |
421 the $\epsilon$NFA on the right-hand side above gives the NFA |
422 |
422 |
423 \begin{center} |
423 \begin{center} |
424 \begin{tikzpicture}[>=stealth',very thick, |
424 \begin{tikzpicture}[>=stealth',very thick, |
425 every state/.style={minimum size=0pt,draw=blue!50,very thick,fill=blue!20},] |
425 every state/.style={minimum size=0pt,draw=blue!50,very thick,fill=blue!20},] |
426 \node[state,initial] (r_1) {$R_1$}; |
426 \node[state,initial] (r_1) {$R_1$}; |
622 \node (b_1) [right=of A_0] {$\ldots$}; |
622 \node (b_1) [right=of A_0] {$\ldots$}; |
623 \node[state, accepting] (c_1) [right=of b_1] {$\mbox{}$}; |
623 \node[state, accepting] (c_1) [right=of b_1] {$\mbox{}$}; |
624 \node[state, accepting] (c_2) [above=of c_1] {$\mbox{}$}; |
624 \node[state, accepting] (c_2) [above=of c_1] {$\mbox{}$}; |
625 \node[state, accepting] (c_3) [below=of c_1] {$\mbox{}$}; |
625 \node[state, accepting] (c_3) [below=of c_1] {$\mbox{}$}; |
626 \path[->] (t_1) edge (A_01); |
626 \path[->] (t_1) edge (A_01); |
627 \path[->] (t_2) edge node [above] {$\epsilon$} (A_01); |
627 \path[->] (t_2) edge node [above] {$\epsilon$s} (A_01); |
628 \path[->] (t_3) edge (A_01); |
628 \path[->] (t_3) edge (A_01); |
629 \path[->] (t_1) edge (A_02); |
629 \path[->] (t_1) edge (A_02); |
630 \path[->] (t_2) edge (A_02); |
630 \path[->] (t_2) edge (A_02); |
631 \path[->] (t_3) edge node [below] {$\epsilon$} (A_02); |
631 \path[->] (t_3) edge node [below] {$\epsilon$s} (A_02); |
632 |
632 |
633 \begin{pgfonlayer}{background} |
633 \begin{pgfonlayer}{background} |
634 \node (3) [rounded corners, inner sep=1mm, thick, |
634 \node (3) [rounded corners, inner sep=1mm, thick, |
635 draw=black!60, fill=black!20, fit= (Q_0) (c_1) (c_2) (c_3)] {}; |
635 draw=black!60, fill=black!20, fit= (Q_0) (c_1) (c_2) (c_3)] {}; |
636 \node [yshift=2mm] at (3.north) {$r_1\cdot r_2$}; |
636 \node [yshift=2mm] at (3.north) {$r_1\cdot r_2$}; |
1075 accepting-states function for the DFA is true henevner at least one |
1075 accepting-states function for the DFA is true henevner at least one |
1076 state in the subset is accepting (that is true) in the NFA.\medskip |
1076 state in the subset is accepting (that is true) in the NFA.\medskip |
1077 |
1077 |
1078 \noindent |
1078 \noindent |
1079 You might be able to spend some quality tinkering with this code and |
1079 You might be able to spend some quality tinkering with this code and |
1080 time to ponder. Then you will probably notice it is actually |
1080 time to ponder about it. Then you will probably notice it is actually |
1081 silly. The whole point of translating the NFA into a DFA via the |
1081 a bit silly. The whole point of translating the NFA into a DFA via the |
1082 subset construction is to make the decision of whether a string is |
1082 subset construction is to make the decision of whether a string is |
1083 accepted or not faster. Given the code above, the generated DFA will |
1083 accepted or not faster. Given the code above, the generated DFA will |
1084 be exactly as fast, or as slow, as the NFA we started with (actually |
1084 be exactly as fast, or as slow, as the NFA we started with (actually |
1085 it will even be a tiny bit slower). The reason is that we just re-use |
1085 it will even be a tiny bit slower). The reason is that we just re-use |
1086 the \texttt{nexts} function from the NFA. This fucntion implements the |
1086 the \texttt{nexts} function from the NFA. This function implements the |
1087 non-deterministic breadth-first search. You might be thinking: That |
1087 non-deterministic breadth-first search. You might be thinking: This |
1088 is cheating! \ldots{} Well, not quite as you will see later, but in |
1088 is cheating! \ldots{} Well, not quite as you will see later, but in |
1089 terms of speed we still need to work a bit in order to get |
1089 terms of speed we still need to work a bit in order to get |
1090 sometimes(!) a faster DFA. Let's do this next. |
1090 sometimes(!) a faster DFA. Let's do this next. |
1091 |
1091 |
1092 \subsection*{DFA Minimisation} |
1092 \subsection*{DFA Minimisation} |
1240 \path[->] (Q_4) edge [loop above] node {$a, b$} (); |
1240 \path[->] (Q_4) edge [loop above] node {$a, b$} (); |
1241 \end{tikzpicture} |
1241 \end{tikzpicture} |
1242 \end{center} |
1242 \end{center} |
1243 |
1243 |
1244 |
1244 |
|
1245 By the way, we are not bothering with implementing the above |
|
1246 minimisation algorith: while up to now all the transformations used |
|
1247 some clever composition of functions, the minimisation algorithm |
|
1248 cannot be implemented by just composing some functions. For this we |
|
1249 would require a more concrete representation of the transition |
|
1250 function (like maps). If we did this, however, then many advantages of |
|
1251 the functions would be thrown away. So the compromise is to not being |
|
1252 able to minimise (easily) our DFAs. |
|
1253 |
1245 \subsection*{Brzozowski's Method} |
1254 \subsection*{Brzozowski's Method} |
1246 |
1255 |
1247 I know tyhis is already a long, long rant: but after all it is a topic |
1256 I know tyhis is already a long, long rant: but after all it is a topic |
1248 that has been researched for more than 60 years. If you reflect on |
1257 that has been researched for more than 60 years. If you reflect on |
1249 what you have read so far, the story you can take a regular |
1258 what you have read so far, the story is that you can take a regular |
1250 expression, translate it via the Thompson Construction into an |
1259 expression, translate it via the Thompson Construction into an |
1251 $\epsilon$NFA, then translate it into a NFA by removing all |
1260 $\epsilon$NFA, then translate it into a NFA by removing all |
1252 $\epsilon$-transitions, and then via the subset construction obtain a |
1261 $\epsilon$-transitions, and then via the subset construction obtain a |
1253 DFA. In all steps we made sure the language, or which strings can be |
1262 DFA. In all steps we made sure the language, or which strings can be |
1254 recognised, stays the same. After the last section, we can even |
1263 recognised, stays the same. After the last section, we can even |
1255 minimise the DFA. But again we made sure the same language is |
1264 minimise the DFA (maybe not in code). But again we made sure the same |
1256 recognised. You might be wondering: Can we go into the other |
1265 language is recognised. You might be wondering: Can we go into the |
1257 direction? Can we go from a DFA and obtain a regular expression that |
1266 other direction? Can we go from a DFA and obtain a regular expression |
1258 can recognise the same language as the DFA?\medskip |
1267 that can recognise the same language as the DFA?\medskip |
1259 |
1268 |
1260 \noindent |
1269 \noindent |
1261 The answer is yes. Again there are several methods for calculating a |
1270 The answer is yes. Again there are several methods for calculating a |
1262 regular expression for a DFA. I will show you Brzozowski's method |
1271 regular expression for a DFA. I will show you Brzozowski's method |
1263 because it calculates a regular expression using quite familiar |
1272 because it calculates a regular expression using quite familiar |
1506 language and also not an automaton. One can actually prove that there |
1515 language and also not an automaton. One can actually prove that there |
1507 is no regular expression nor automaton for this language, but again |
1516 is no regular expression nor automaton for this language, but again |
1508 that would lead us too far afield for what we want to do in this |
1517 that would lead us too far afield for what we want to do in this |
1509 module. |
1518 module. |
1510 |
1519 |
|
1520 |
|
1521 \subsection*{Where Have Derivatives Gone?} |
|
1522 |
|
1523 By now you are probably fed up by this text. It is now way too long |
|
1524 for one lecture, but there is still one aspect of the |
|
1525 automata-connection I like to highlight for you. Perhaps by now you |
|
1526 are asking yourself: Where have the derivatives gone? Did we just |
|
1527 forget them? Well, they have a place in the picture of calculating a |
|
1528 DFA from the regular expression. |
|
1529 |
|
1530 To be done |
|
1531 |
|
1532 |
1511 %\section*{Further Reading} |
1533 %\section*{Further Reading} |
1512 |
1534 |
1513 %Compare what a ``human expert'' would create as an automaton for the |
1535 %Compare what a ``human expert'' would create as an automaton for the |
1514 %regular expression $a\cdot (b + c)^*$ and what the Thomson |
1536 %regular expression $a\cdot (b + c)^*$ and what the Thomson |
1515 %algorithm generates. |
1537 %algorithm generates. |