\documentclass{article}\usepackage{../style}\usepackage{../graphics}\begin{document}\section*{Homework 3}%\HEADER\begin{enumerate}\item The regular expression matchers in Java, Python and Ruby can be very slow with some (basic) regular expressions. What is the main reason for this inefficient computation? \solution{Many matchers employ DFS type of algorithms to check if a string is matched by the regex or not. Such algorithms require backtracking if have gone down the wrong path which can be very slow. There are also problems with bounded regular expressions and backreferences.}\item What is a regular language? Are there alternative ways to define this notion? If yes, give an explanation why they define the same notion. \solution{A regular language is a language for which every string can be recognized by some regular expression. Another definition is that it is a language for which a finite automaton can be constructed. Both define the same set of languages.} \item Why is every finite set of strings a regular language? \solution{Take a regex composed of all strings (works for finite languages)}\item Assume you have an alphabet consisting of the letters $a$, $b$ and $c$ only. (1) Find a regular expression that recognises the two strings $ab$ and $ac$. (2) Find a regular expression that matches all strings \emph{except} these two strings. Note, you can only use regular expressions of the form \begin{center} $r ::= \ZERO \;|\; \ONE \;|\; c \;|\; r_1 + r_2 \;|\; r_1 \cdot r_2 \;|\; r^*$ \end{center}%\item Define the function \textit{zeroable} which takes a% regular expression as argument and returns a boolean.% The function should satisfy the following property:%% \begin{center}% $\textit{zeroable(r)} \;\text{if and only if}\; % L(r) = \{\}$% \end{center} \solution{Done in the video but there I forgot to include the empty string.}\item Given the alphabet $\{a,b\}$. Draw the automaton that has two states, say $Q_0$ and $Q_1$. The starting state is $Q_0$ and the final state is $Q_1$. The transition function is given by \begin{center} \begin{tabular}{l} $(Q_0, a) \rightarrow Q_0$\\ $(Q_0, b) \rightarrow Q_1$\\ $(Q_1, b) \rightarrow Q_1$ \end{tabular} \end{center} What is the language recognised by this automaton?\item Give a non-deterministic finite automaton that can recognise the language $L(a\cdot (a + b)^* \cdot c)$.\item Given a deterministic finite automaton $A(\varSigma, Q, Q_0, F, \delta)$, define which language is recognised by this automaton. Can you define also the language defined by a non-deterministic automaton? \solution{ A formula for DFAs is \[L(A) \dn \{s \;|\; \hat{\delta}(start_q, s) \in F\}\] For NFAs you need to first define what $\hat{\rho}$ means. If $\rho$ is given as a relation, you can define: \[ \hat{\rho}(qs, []) \dn qs \qquad \hat{\rho}(qs, c::s) \dn \bigcup_{q\in qs} \{ q' \; | \; \rho(q, c, q')\} \] This ``collects'' all the states reachable in a breadth-first manner. Once you have all the states reachable by an NFA, you can define the language as \[ L(N) \dn \{s \;|\; \hat{\rho}(qs_{start}, s) \cap F \not= \emptyset\} \] Here you test whether the all states reachable (for $s$) contain at least a single accepting state. }\item Given the following deterministic finite automaton over the alphabet $\{a, b\}$, find an automaton that recognises the complement language. (Hint: Recall that for the algorithm from the lectures, the automaton needs to be in completed form, that is have a transition for every letter from the alphabet.) \solution{ Before exchanging accepting and non-accepting states, it is important that the automaton is completed (meamning has a transition for every letter of the alphabet). If not completed, you have to introduce a sink state. For fun you can try out the example with out completion: Then the original automaton can recognise strings of the form $a$, $ab...b$; but the ``uncompleted'' automaton would recognise only the empty string. } \begin{center} \begin{tikzpicture}[>=stealth',very thick,auto, every state/.style={minimum size=0pt, inner sep=2pt,draw=blue!50,very thick, fill=blue!20},scale=2] \node[state, initial] (q0) at ( 0,1) {$Q_0$}; \node[state, accepting] (q1) at ( 1,1) {$Q_1$}; \path[->] (q0) edge node[above] {$a$} (q1) (q1) edge [loop right] node {$b$} (); \end{tikzpicture} \end{center}%\item Given the following deterministic finite automaton%%\begin{center}%\begin{tikzpicture}[scale=3, line width=0.7mm]% \node[state, initial] (q0) at ( 0,1) {$q_0$};% \node[state,accepting] (q1) at ( 1,1) {$q_1$};% \node[state, accepting] (q2) at ( 2,1) {$q_2$};% \path[->] (q0) edge node[above] {$b$} (q1)% (q1) edge [loop above] node[above] {$a$} ()% (q2) edge [loop above] node[above] {$a, b$} ()% (q1) edge node[above] {$b$} (q2)% (q0) edge[bend right] node[below] {$a$} (q2)% ;%\end{tikzpicture}%\end{center}%find the corresponding minimal automaton. State clearly which nodes%can be merged.\item Given the following non-deterministic finite automaton over the alphabet $\{a, b\}$, find a deterministic finite automaton that recognises the same language: \begin{center} \begin{tikzpicture}[>=stealth',very thick,auto, every state/.style={minimum size=0pt, inner sep=2pt,draw=blue!50,very thick, fill=blue!20},scale=2] \node[state, initial] (q0) at ( 0,1) {$Q_0$}; \node[state] (q1) at ( 1,1) {$Q_1$}; \node[state, accepting] (q2) at ( 2,1) {$Q_2$}; \path[->] (q0) edge node[above] {$a$} (q1) (q0) edge [loop above] node[above] {$b$} () (q0) edge [loop below] node[below] {$a$} () (q1) edge node[above] {$a$} (q2); \end{tikzpicture} \end{center}\item %%\textbf{(Deleted for 2017, 2018, 2019)} Given the following deterministic finite automaton over the alphabet $\{0, 1\}$, find the corresponding minimal automaton. In case states can be merged, state clearly which states can be merged. \begin{center} \begin{tikzpicture}[>=stealth',very thick,auto, every state/.style={minimum size=0pt, inner sep=2pt,draw=blue!50,very thick, fill=blue!20},scale=2] \node[state, initial] (q0) at ( 0,1) {$Q_0$}; \node[state] (q1) at ( 1,1) {$Q_1$}; \node[state, accepting] (q4) at ( 2,1) {$Q_4$}; \node[state] (q2) at (0.5,0) {$Q_2$}; \node[state] (q3) at (1.5,0) {$Q_3$}; \path[->] (q0) edge node[above] {$0$} (q1) (q0) edge node[right] {$1$} (q2) (q1) edge node[above] {$0$} (q4) (q1) edge node[right] {$1$} (q2) (q2) edge node[above] {$0$} (q3) (q2) edge [loop below] node {$1$} () (q3) edge node[left] {$0$} (q4) (q3) edge [bend left=95, looseness = 2.2] node [left=2mm] {$1$} (q0) (q4) edge [loop right] node {$0, 1$} (); \end{tikzpicture} \end{center} \solution{Q0 and Q2 can be merged; and Q1 and Q3 as well}\item Given the following finite deterministic automaton over the alphabet $\{a, b\}$: \begin{center} \begin{tikzpicture}[scale=2,>=stealth',very thick,auto, every state/.style={minimum size=0pt, inner sep=2pt,draw=blue!50,very thick, fill=blue!20}] \node[state, initial, accepting] (q0) at ( 0,1) {$Q_0$}; \node[state, accepting] (q1) at ( 1,1) {$Q_1$}; \node[state] (q2) at ( 2,1) {$Q_2$}; \path[->] (q0) edge[bend left] node[above] {$a$} (q1) (q1) edge[bend left] node[above] {$b$} (q0) (q2) edge[bend left=50] node[below] {$b$} (q0) (q1) edge node[above] {$a$} (q2) (q2) edge [loop right] node {$a$} () (q0) edge [loop below] node {$b$} () ; \end{tikzpicture} \end{center} Give a regular expression that can recognise the same language as this automaton. (Hint: If you use Brzozwski's method, you can assume Arden's lemma which states that an equation of the form $q = q\cdot r + s$ has the unique solution $q = s \cdot r^*$.)\item If a non-deterministic finite automaton (NFA) has $n$ states. How many states does a deterministic automaton (DFA) that can recognise the same language as the NFA maximal need? \solution{ $2^n$ in the worst-case and for some regexes the worst case cannot be avoided. Other comments: $r^{\{n\}}$ can only be represented as $n$ copies of the automaton for $r$, which can explode the automaton for bounded regular expressions. Similarly, we have no idea how backreferences can be represented as automaton. }\item Prove that for all regular expressions $r$ we have\begin{center} $\textit{nullable}(r) \quad \text{if and only if} \quad [] \in L(r)$ \end{center} Write down clearly in each case what you need to prove and what are the assumptions. \item \POSTSCRIPT \end{enumerate}\end{document}%%% Local Variables: %%% mode: latex%%% TeX-master: t%%% End: