diff -r a1544b804d1e -r 18bef085a7ca handouts/ho03.tex
--- a/handouts/ho03.tex	Fri Oct 10 16:59:22 2014 +0100
+++ b/handouts/ho03.tex	Sat Oct 11 01:13:13 2014 +0100
@@ -1,33 +1,32 @@
 \documentclass{article}
 \usepackage{../style}
 \usepackage{../langs}
-
-\usepackage{xcolor}
-\usepackage{tikz}
-\usetikzlibrary{arrows}
-\usetikzlibrary{automata}
-\usetikzlibrary{shapes}
-\usetikzlibrary{shadows}
-\usetikzlibrary{positioning}
-\usetikzlibrary{calc}
-\usetikzlibrary{fit}
-\usetikzlibrary{backgrounds}
+\usepackage{../graphics}
 
 
 \begin{document}
 
-\section*{Handout 3}
+\section*{Handout 3 (Automata)}
 
-Let us have a closer look at automata and their relation to
-regular expressions. This will help us to understand why the
+Every formal language course I know of bombards you first with
+automata and then to a much, much smaller extend with regular
+expressions. As you can see, this course is turned upside
+down: regular expressions come first. The reason is that
+regular expressions are easier to reason about and the notion
+of derivatives, although already quite old, only became more
+widely known rather recently. Still let us in this lecture
+have a closer look at automata and their relation to regular
+expressions. This will help us with understanding why the
 regular expression matchers in Python and Ruby are so slow
-with certain regular expressions. 
+with certain regular expressions. The central definition
+is:\medskip 
 
+\noindent 
 A \emph{deterministic finite automaton} (DFA), say $A$, is
 defined by a four-tuple written $A(Q, q_0, F, \delta)$ where
 
 \begin{itemize}
-\item $Q$ is a set of states,
+\item $Q$ is a finite set of states,
 \item $q_0 \in Q$ is the start state,
 \item $F \subseteq Q$ are the accepting states, and
 \item $\delta$ is the transition function.
@@ -35,15 +34,17 @@
 
 \noindent The transition function determines how to
 ``transition'' from one state to the next state with respect
-to a character. We have the assumption that these functions do
-not need to be defined everywhere: so it can be the case that
-given a character there is no next state, in which case we
-need to raise a kind of ``raise an exception''. A typical
-example of a DFA is
+to a character. We have the assumption that these transition
+functions do not need to be defined everywhere: so it can be
+the case that given a character there is no next state, in
+which case we need to raise a kind of ``failure exception''. A
+typical example of a DFA is
 
 \begin{center}
 \begin{tikzpicture}[>=stealth',very thick,auto,
-                             every state/.style={minimum size=0pt,inner sep=2pt,draw=blue!50,very thick,fill=blue!20},]
+                    every state/.style={minimum size=0pt,
+                    inner sep=2pt,draw=blue!50,very thick,
+                    fill=blue!20},scale=2]
 \node[state,initial]  (q_0)  {$q_0$};
 \node[state] (q_1) [right=of q_0] {$q_1$};
 \node[state] (q_2) [below right=of q_0] {$q_2$};
@@ -61,11 +62,13 @@
 \end{tikzpicture}
 \end{center}
 
-\noindent The accepting state $q_4$ is indicated with double
-circles. It is possible that a DFA has no accepting states at
-all, or that the starting state is also an accepting state. In
-the case above the transition function is defined everywhere
-and can be given as a table as follows:
+\noindent In this graphical notation, the accepting state
+$q_4$ is indicated with double circles. Note that there can be
+more than one accepting state. It is also possible that a DFA
+has no accepting states at all, or that the starting state is
+also an accepting state. In the case above the transition
+function is defined everywhere and can be given as a table as
+follows:
 
 \[
 \begin{array}{lcl}
@@ -82,35 +85,44 @@
 \end{array}
 \]
 
-\noindent We need to define the notion of what language is
-accepted by an automaton. For this we lift the transition
-function $\delta$ from characters to strings as follows:
+We need to define the notion of what language is accepted by
+an automaton. For this we lift the transition function
+$\delta$ from characters to strings as follows:
 
 \[
 \begin{array}{lcl}
-\hat{\delta}(q, "")        & \dn & q\\
+\hat{\delta}(q, [])        & \dn & q\\
 \hat{\delta}(q, c\!::\!s) & \dn & \hat{\delta}(\delta(q, c), s)\\
 \end{array}
 \]
 
-\noindent Given a string this means we start in the starting
-state and take the first character of the string, follow to
-the next sate, then take the second character and so on. Once
-the string is exhausted and we end up in an accepting state,
-then this string is accepted. Otherwise it is not accepted. So
-$s$ in the \emph{language accepted by the automaton} $A(Q,
-q_0, F, \delta)$ iff
+\noindent This lifted transition function is often called
+``delta-hat''. Given a string, we start in the starting state
+and take the first character of the string, follow to the next
+sate, then take the second character and so on. Once the
+string is exhausted and we end up in an accepting state, then
+this string is accepted by the automaton. Otherwise it is not
+accepted. So $s$ is in the \emph{language accepted by the
+automaton} $A(Q, q_0, F, \delta)$ iff
 
 \[
 \hat{\delta}(q_0, s) \in F 
 \]
+
+\noindent I let you think about a definition that describes
+the set of strings accepted by an automaton.
   
 
-While with DFA it will always clear that given a character
-what the next state is, it will be useful to relax this
-restriction. The resulting construction is called a
-\emph{non-deterministic finite automaton} (NFA) given as a
-four-tuple $A(Q, q_0, F, \rho)$ where
+While with DFAs it will always clear that given a character
+what the next state is (potentially none), it will be useful
+to relax this restriction. That means we have several
+potential successor states. We even allow ``silent
+transitions'', also called epsilon-transitions. They allow us
+to go from one state to the next without having a character
+consumed. We label such silent transition with the letter
+$\epsilon$. The resulting construction is called a
+\emph{non-deterministic finite automaton} (NFA) given also as
+a four-tuple $A(Q, q_0, F, \rho)$ where
 
 \begin{itemize}
 \item $Q$ is a finite set of states
@@ -149,21 +161,26 @@
 \end{tabular}
 \end{center}
 
-\noindent There are a number of points you should note. Every
-DFA is a NFA, but not vice versa. The $\rho$ in NFAs is a
-transition \emph{relation} (DFAs have a transition function).
-The difference between a function and a relation is that a
-function has always a single output, while a relation gives,
-roughly speaking, several outputs. Look at the NFA on the
-right-hand side above: if you are currently in the state $r_2$
-and you read a character $a$, then you can transition to $r_1$
-\emph{or} $r_3$. Which route you take is not determined. This
-means if we need to decide whether a string is accepted by a
-NFA, we might have to explore all possibilities. Also there is
-a special transition in NFAs which is called
-\emph{epsilon-transition} or \emph{silent transition}. This
-transition means you do not have to ``consume'' no part of the
-input string, but ``silently'' change to a different state.
+\noindent There are, however, a number of points you should
+note. Every DFA is a NFA, but not vice versa. The $\rho$ in
+NFAs is a transition \emph{relation} (DFAs have a transition
+function). The difference between a function and a relation is
+that a function has always a single output, while a relation
+gives, roughly speaking, several outputs. Look at the NFA on
+the right-hand side above: if you are currently in the state
+$r_2$ and you read a character $a$, then you can transition to
+either $r_1$ \emph{or} $r_3$. Which route you take is not
+determined. This means if we need to decide whether a string
+is accepted by a NFA, we might have to explore all
+possibilities. Also there is the special silent transition in
+NFAs. As mentioned already this transition means you do not
+have to ``consume'' any part of the input string, but
+``silently'' change to a different state. In the left picture,
+for example, if you are in the starting state, you can 
+silently move either to $q_1$ or $q_2$.
+
+
+\subsection*{Thompson Construction}
 
 The reason for introducing NFAs is that there is a relatively
 simple (recursive) translation of regular expressions into
@@ -328,7 +345,7 @@
 
 \noindent and connect its accepting states to a new starting
 state via $\epsilon$-transitions. This new starting state is
-also an accepting state, because $r^*$ can also recognise the
+also an accepting state, because $r^*$ can recognise the
 empty string. This gives the following automaton for $r^*$:
 
 \begin{center}
@@ -354,6 +371,99 @@
 \noindent This construction of a NFA from a regular expression
 was invented by Ken Thompson in 1968.
 
+
+\subsection*{Subset Construction}
+
+What is interesting that for every NFA we can find a DFA which
+recognises the same language. This can be done by the
+\emph{subset construction}. Consider again the NFA on the 
+left, consisting of nodes labeled $0$, $1$ and $2$. 
+
+\begin{center}
+\begin{tabular}{c@{\hspace{10mm}}c}
+\begin{tikzpicture}[scale=0.7,>=stealth',very thick,
+                    every state/.style={minimum size=0pt,
+                    draw=blue!50,very thick,fill=blue!20},
+                    baseline=0mm]
+\node[state,initial]  (q_0)  {$0$};
+\node[state] (q_1) [above=of q_0] {$1$};
+\node[state, accepting] (q_2) [below=of q_0] {$2$};
+\path[->] (q_0) edge node [left]  {$\epsilon$} (q_1);
+\path[->] (q_0) edge node [left]  {$\epsilon$} (q_2);
+\path[->] (q_0) edge [loop right] node  {$a$} ();
+\path[->] (q_1) edge [loop above] node  {$a$} ();
+\path[->] (q_2) edge [loop below] node  {$b$} ();
+\end{tikzpicture}
+&
+\begin{tabular}{r|cl}
+nodes & $a$ & $b$\\
+\hline
+$\varnothing\phantom{\star}$ & $\varnothing$ & $\varnothing$\\
+$\{0\}\phantom{\star}$       & $\{0,1,2\}$   & $\{2\}$\\
+$\{1\}\phantom{\star}$       & $\{1\}$       & $\varnothing$\\
+$\{2\}\star$  & $\varnothing$ & $\{2\}$\\
+$\{0,1\}\phantom{\star}$     & $\{0,1,2\}$   & $\{2\}$\\
+$\{0,2\}\star$ & $\{0,1,2\}$   & $\{2\}$\\
+$\{1,2\}\star$ & $\{1\}$       & $\{2\}$\\
+s: $\{0,1,2\}\star$ & $\{0,1,2\}$ & $\{2\}$\\
+\end{tabular}
+\end{tabular}
+\end{center}
+
+\noindent The nodes of the DFA are given by calculating all
+subsets of the set of nodes of the NFA (seen in the nodes
+column on the right). The table shows the transition function
+for the DFA. The first row states that $\varnothing$ is the
+sink node which has transitions for $a$ and $b$ to itself.
+The next three lines are calculated as follows: 
+
+\begin{itemize}
+\item suppose you calculate the entry for the transition for
+      $a$ and the node $\{0\}$
+\item start from the node $0$ in the NFA
+\item do as many $\epsilon$-transition as you can obtaining a
+      set of nodes, in this case $\{0,1,2\}$
+\item filter out all notes that do not allow an $a$-transition
+      from this set, this excludes $2$ which does not permit a
+      $a$-transition
+\item from the remaining set, do as many $\epsilon$-transition
+      as you can, this yields $\{0,1,2\}$     
+\item the resulting set specifies the transition from $\{0\}$
+      when given an $a$ 
+\end{itemize}
+
+\noindent Similarly for the other entries in the rows for
+$\{0\}$, $\{1\}$ and $\{2\}$. The other rows are calculated by
+just taking the union of the single node entries. For example
+for $a$ and $\{0,1\}$ we need to take the union of $\{0,1,2\}$
+(for $0$) and $\{1\}$ (for $1$). The starting state of the DFA
+can be calculated from the starting state of the NFA, that is
+$0$, and then do as many $\epsilon$-transitions as possible.
+This gives $\{0,1,2\}$ which is the starting state of the DFA.
+One terminal states in the DFA are given by all sets that
+contain a $2$, which is the terminal state of the NFA. This
+completes the subset construction.
+
+There are two points to note: One is that the resulting DFA
+contains a number of ``dead'' nodes that are never reachable
+from the starting state (that is that the calculated DFA in
+this example is not a minimal DFA). Such dead nodes can be
+safely removed without changing the language that is
+recognised by the DFA. Another point is that in some cases the
+subset construction produces a DFA that does \emph{not}
+contain any dead nodes\ldots{}that means it calculates a
+minimal DFA. Which in turn means that in some cases the number
+of nodes by going from NFAs to DFAs exponentially increases,
+namely by $2^n$ (which is the number of subsets you can form
+for $n$ nodes). 
+
+
+\subsection*{Brzozowski's Method}
+
+\subsection*{Automata Minimization}
+
+\subsection*{Regular Languages and Automata}
+
 \end{document}
 
 %%% Local Variables: