% Christian Urban <christian dot urban at kcl dot ac dot uk>
\documentclass{article}
\usepackage{../style}
\usepackage{../graphics}

\begin{document}

\section*{Homework 3}

%\HEADER

\begin{enumerate}
\item The regular expression matchers in Java, Python and Ruby can be
  very slow with some (basic) regular expressions. What is the main
  reason for this inefficient computation?

\solution{Many matchers use depth-first-search-style algorithms to
  check whether a string is matched by a regular expression. Such
  algorithms need to backtrack whenever they have gone down a wrong
  path, which can be very slow. There are also problems with bounded
  regular expressions and backreferences.}

\item What is a regular language? Are there alternative ways
  to define this notion? If yes, give an explanation why
  they define the same notion.

\solution{A regular language is a language for which there is a
  regular expression that matches exactly the strings of the
  language. Another definition is that it is a language for which a
  finite automaton can be constructed that recognises it. Both define
  the same set of languages, because every regular expression can be
  translated into an automaton recognising the same language, and
  vice versa.}

\item Why is every finite set of strings a regular language?

\solution{Take the regular expression that is the alternative of all
  the strings in the set. This works because the set is finite.}
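\solution{As a small illustration (the strings are chosen here only as
  an example): the finite set $\{ab, ac, b\}$ is the language of the
  regular expression
  \[
  (a \cdot b) + (a \cdot c) + b
  \]
  In general, a finite set $\{s_1, \ldots, s_n\}$ is recognised by
  $s_1 + \ldots + s_n$, where each string $s_i$ is written as the
  sequence of its characters (and as $\ONE$ if $s_i$ is the empty
  string). The empty set is recognised by $\ZERO$.}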

\item Assume you have an alphabet consisting of the letters
  $a$, $b$ and $c$ only. (1) Find a regular expression
  that recognises the two strings $ab$ and $ac$. (2) Find
  a regular expression that matches all strings
  \emph{except} these two strings. Note that you can only use
  regular expressions of the form

  \begin{center} $r ::=
    \ZERO \;|\; \ONE \;|\; c \;|\; r_1 + r_2 \;|\;
    r_1 \cdot r_2 \;|\; r^*$
  \end{center}

%\item Define the function \textit{zeroable} which takes a
%  regular expression as argument and returns a boolean.
%  The function should satisfy the following property:
%
%  \begin{center}
%    $\textit{zeroable(r)} \;\text{if and only if}\;
%    L(r) = \{\}$
%  \end{center}

\solution{Done in the video, but there I forgot to include the empty string.}
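\solution{One possible answer, given only as a sketch: write
  $\varSigma$ as an abbreviation for $a + b + c$ (this abbreviation is
  an assumption made here for readability). Then
  \begin{center}
  \begin{tabular}{ll}
  (1) & $a \cdot (b + c)$\\
  (2) & $\ONE + \varSigma + a \cdot a + (b + c) \cdot \varSigma
        + \varSigma \cdot \varSigma \cdot \varSigma \cdot \varSigma^*$
  \end{tabular}
  \end{center}
  The expression in (2) lists the strings by length: the empty
  string, all strings of length one, the strings of length two other
  than $ab$ and $ac$ (either $aa$ or a string starting with $b$ or
  $c$), and all strings of length three or more.}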

\item Given the alphabet $\{a,b\}$, draw the automaton that has two
  states, say $Q_0$ and $Q_1$. The starting state is $Q_0$ and the
  final state is $Q_1$. The transition function is given by

  \begin{center}
    \begin{tabular}{l}
      $(Q_0, a) \rightarrow Q_0$\\
      $(Q_0, b) \rightarrow Q_1$\\
      $(Q_1, b) \rightarrow Q_1$
    \end{tabular}
  \end{center}

  What is the language recognised by this automaton?

\solution{
  All strings consisting of zero or more $a$s followed by one or more
  $b$s, which is the language of the regular
  expression $a^* \cdot b \cdot b^*$.
}

\item Give a non-deterministic finite automaton that can
  recognise the language $L(a\cdot (a + b)^* \cdot c)$.

\solution{The automaton can be read off directly from the regular
  expression, without going through Thompson's construction.}
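\solution{A sketch of one such automaton, with state names chosen
  here only for illustration: take three states $Q_0$, $Q_1$ and
  $Q_2$, where $Q_0$ is the starting state and $Q_2$ is the only
  accepting state, together with the transitions
  \begin{center}
  \begin{tabular}{l}
  $(Q_0, a) \rightarrow Q_1$\\
  $(Q_1, a) \rightarrow Q_1$\\
  $(Q_1, b) \rightarrow Q_1$\\
  $(Q_1, c) \rightarrow Q_2$
  \end{tabular}
  \end{center}
  The first transition reads the initial $a$, the two loops on $Q_1$
  read $(a + b)^*$, and the last transition reads the final $c$. This
  particular automaton happens to be deterministic, which is fine
  because every deterministic automaton is also a non-deterministic
  one.}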

\item Given a deterministic finite automaton $A(\varSigma, Q, Q_0, F,
  \delta)$, define which language is recognised by this
  automaton. Can you also define the language recognised by a
  non-deterministic automaton?


\solution{
  A formula for DFAs is

  \[L(A) \dn \{s \;|\; \hat{\delta}(Q_0, s) \in F\}\]

  For NFAs you first need to define what $\hat{\rho}$ means. If
  $\rho$ is given as a relation, you can define:

  \[
  \hat{\rho}(qs, []) \dn qs \qquad
  \hat{\rho}(qs, c::s) \dn \bigcup_{q\in qs} \hat{\rho}(\{ q' \; | \; \rho(q, c, q')\}, s)
  \]

  This ``collects'' all the reachable states in a breadth-first
  manner. Once you have all the states reachable by an NFA, you can define
  the language as

  \[
  L(N) \dn \{s \;|\; \hat{\rho}(qs_{start}, s) \cap F \not= \emptyset\}
  \]

  Here you test whether the set of all states reachable (for $s$)
  contains at least one accepting state.

}
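\solution{The formula for DFAs uses $\hat{\delta}$, the extension of
  the transition function $\delta$ from single characters to strings.
  A sketch of the usual definition, written in the same style as
  $\hat{\rho}$ and assuming $\delta$ is given as a function:
  \[
  \hat{\delta}(q, []) \dn q \qquad
  \hat{\delta}(q, c::s) \dn \hat{\delta}(\delta(q, c), s)
  \]
  For example, for a two-character string $ab$ and a state $q$ this
  unfolds to $\hat{\delta}(q, ab) = \delta(\delta(q, a), b)$.}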



\item Given the following deterministic finite automaton over
  the alphabet $\{a, b\}$, find an automaton that
  recognises the complement language. (Hint: Recall that
  for the algorithm from the lectures, the automaton needs
  to be in completed form, that is have a transition for
  every letter from the alphabet.)

\solution{
  Before exchanging accepting and non-accepting states, it is important that
  the automaton is completed (meaning it has a transition for every letter
  of the alphabet). If it is not completed, you have to introduce a sink state.

  For fun you can try out the example without
  completion: the original automaton recognises
  strings of the form $a$, $ab\ldots b$; but the automaton obtained by
  exchanging the states without completion would
  recognise only the empty string.
}

  \begin{center}
    \begin{tikzpicture}[>=stealth',very thick,auto,
                        every state/.style={minimum size=0pt,
                        inner sep=2pt,draw=blue!50,very thick,
                        fill=blue!20},scale=2]
      \node[state, initial]        (q0) at ( 0,1) {$Q_0$};
      \node[state, accepting]  (q1) at ( 1,1) {$Q_1$};
      \path[->] (q0) edge node[above] {$a$} (q1)
                (q1) edge [loop right] node {$b$} ();
    \end{tikzpicture}
  \end{center}
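\solution{A sketch of the construction for this automaton, where the
  name $Q_2$ for the sink state is an assumption made here: completing
  the automaton adds the transitions
  \begin{center}
  \begin{tabular}{l}
  $(Q_0, b) \rightarrow Q_2$\\
  $(Q_1, a) \rightarrow Q_2$\\
  $(Q_2, a) \rightarrow Q_2$\\
  $(Q_2, b) \rightarrow Q_2$
  \end{tabular}
  \end{center}
  to the existing $(Q_0, a) \rightarrow Q_1$ and
  $(Q_1, b) \rightarrow Q_1$. Exchanging accepting and non-accepting
  states then makes $Q_0$ and $Q_2$ accepting and $Q_1$
  non-accepting. The resulting automaton accepts, for example, the
  empty string, $b$ and $aa$, but rejects $a$ and $ab$; that is, it
  accepts exactly the strings that are not of the form
  $a \cdot b^*$.}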



%\item Given the following deterministic finite automaton
%
%\begin{center}
%\begin{tikzpicture}[scale=3, line width=0.7mm]
%  \node[state, initial]        (q0) at ( 0,1) {$q_0$};
%  \node[state,accepting]                    (q1) at ( 1,1) {$q_1$};
%  \node[state, accepting] (q2) at ( 2,1) {$q_2$};
%  \path[->] (q0) edge node[above] {$b$} (q1)
%            (q1) edge [loop above] node[above] {$a$} ()
%            (q2) edge [loop above] node[above] {$a, b$} ()
%            (q1) edge node[above] {$b$} (q2)
%            (q0) edge[bend right] node[below] {$a$} (q2)
%            ;
%\end{tikzpicture}
%\end{center}
%find the corresponding minimal automaton. State clearly which nodes
%can be merged.

\item Given the following non-deterministic finite automaton
  over the alphabet $\{a, b\}$, find a deterministic
  finite automaton that recognises the same language:

  \begin{center}
    \begin{tikzpicture}[>=stealth',very thick,auto,
                        every state/.style={minimum size=0pt,
                        inner sep=2pt,draw=blue!50,very thick,
                        fill=blue!20},scale=2]
      \node[state, initial]        (q0) at ( 0,1) {$Q_0$};
      \node[state]                    (q1) at ( 1,1) {$Q_1$};
      \node[state, accepting] (q2) at ( 2,1) {$Q_2$};
      \path[->] (q0) edge node[above] {$a$} (q1)
                (q0) edge [loop above] node[above] {$b$} ()
                (q0) edge [loop below] node[below] {$a$} ()
                (q1) edge node[above] {$a$} (q2);
    \end{tikzpicture}
  \end{center}

\solution{
  The DFA has three states $Q_0$, $Q_1$ and $Q_2$, with $Q_0$ the
  starting state and $Q_2$ accepting. The transitions are
  $(Q_0, a) \rightarrow Q_1$, $(Q_0, b) \rightarrow Q_0$,
  $(Q_1, a) \rightarrow Q_2$, $(Q_1, b) \rightarrow Q_0$,
  $(Q_2, a) \rightarrow Q_2$ and $(Q_2, b) \rightarrow Q_0$.
}
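\solution{As a sketch of how such a DFA can be obtained, the subset
  construction applied to the NFA above gives the following table,
  where each row is a set of NFA states reachable from the starting
  set $\{Q_0\}$:
  \begin{center}
  \begin{tabular}{l|ll}
                        & $a$                   & $b$\\\hline
  $\{Q_0\}$             & $\{Q_0, Q_1\}$        & $\{Q_0\}$\\
  $\{Q_0, Q_1\}$        & $\{Q_0, Q_1, Q_2\}$   & $\{Q_0\}$\\
  $\{Q_0, Q_1, Q_2\}$   & $\{Q_0, Q_1, Q_2\}$   & $\{Q_0\}$
  \end{tabular}
  \end{center}
  Only the last set contains the accepting NFA state $Q_2$, so it is
  the only accepting state of the DFA. Renaming the three sets, in
  this order, to $Q_0$, $Q_1$ and $Q_2$ gives the three-state DFA
  described in the solution.}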

\item %%\textbf{(Deleted for 2017, 2018, 2019)}
  Given the following deterministic finite automaton over the
  alphabet $\{0, 1\}$, find the corresponding minimal automaton. In
  case states can be merged, state clearly which states can be merged.

  \begin{center}
    \begin{tikzpicture}[>=stealth',very thick,auto,
                        every state/.style={minimum size=0pt,
                        inner sep=2pt,draw=blue!50,very thick,
                        fill=blue!20},scale=2]
      \node[state, initial]        (q0) at ( 0,1) {$Q_0$};
      \node[state]                    (q1) at ( 1,1) {$Q_1$};
      \node[state, accepting] (q4) at ( 2,1) {$Q_4$};
      \node[state]                    (q2) at (0.5,0) {$Q_2$};
      \node[state]                    (q3) at (1.5,0) {$Q_3$};
      \path[->] (q0) edge node[above] {$0$} (q1)
                (q0) edge node[right] {$1$} (q2)
                (q1) edge node[above] {$0$} (q4)
                (q1) edge node[right] {$1$} (q2)
                (q2) edge node[above] {$0$} (q3)
                (q2) edge [loop below] node {$1$} ()
                (q3) edge node[left] {$0$} (q4)
                (q3) edge [bend left=95, looseness = 2.2] node [left=2mm] {$1$} (q0)
                (q4) edge [loop right] node {$0, 1$} ();
    \end{tikzpicture}
  \end{center}

\solution{$Q_0$ and $Q_2$ can be merged, and so can $Q_1$ and $Q_3$.}
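\solution{A sketch of why these states can be merged, using
  partition refinement (one of several equivalent ways to organise
  the argument): start with the two blocks $\{Q_4\}$ (accepting) and
  $\{Q_0, Q_1, Q_2, Q_3\}$ (non-accepting). With input $0$, the states
  $Q_1$ and $Q_3$ go to the accepting state $Q_4$, whereas $Q_0$ and
  $Q_2$ go to non-accepting states. This splits the second block into
  $\{Q_0, Q_2\}$ and $\{Q_1, Q_3\}$. The transitions are
  \begin{center}
  \begin{tabular}{l|ll}
        & $0$   & $1$\\\hline
  $Q_0$ & $Q_1$ & $Q_2$\\
  $Q_2$ & $Q_3$ & $Q_2$\\
  $Q_1$ & $Q_4$ & $Q_2$\\
  $Q_3$ & $Q_4$ & $Q_0$
  \end{tabular}
  \end{center}
  so the two states in each block always move into the same block,
  and no further split is possible. The minimal automaton therefore
  has the three states $\{Q_0, Q_2\}$ (starting), $\{Q_1, Q_3\}$ and
  $\{Q_4\}$ (accepting).}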

\item Given the following finite deterministic automaton over the alphabet $\{a, b\}$:

  \begin{center}
    \begin{tikzpicture}[scale=2,>=stealth',very thick,auto,
                        every state/.style={minimum size=0pt,
                        inner sep=2pt,draw=blue!50,very thick,
                        fill=blue!20}]
      \node[state, initial, accepting]        (q0) at ( 0,1) {$Q_0$};
      \node[state, accepting]                    (q1) at ( 1,1) {$Q_1$};
      \node[state] (q2) at ( 2,1) {$Q_2$};
      \path[->] (q0) edge[bend left] node[above] {$a$} (q1)
                (q1) edge[bend left] node[above] {$b$} (q0)
                (q2) edge[bend left=50] node[below] {$b$} (q0)
                (q1) edge node[above] {$a$} (q2)
                (q2) edge [loop right] node {$a$} ()
                (q0) edge [loop below] node {$b$} ()
                ;
    \end{tikzpicture}
  \end{center}

  Give a regular expression that can recognise the same language as
  this automaton. (Hint: If you use Brzozowski's method, you can assume
  Arden's lemma which states that an equation of the form $q = q\cdot r + s$
  has the unique solution $q = s \cdot r^*$.)

\solution{
  $(b + ab + aa(a^*)b)^* \cdot (\ONE + a)$
}
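\solution{A sketch of the derivation with Brzozowski's method, where
  the variables $q_0$, $q_1$ and $q_2$ (names chosen here) stand for
  the strings that lead from the starting state to $Q_0$, $Q_1$ and
  $Q_2$, respectively. Reading off the incoming transitions gives
  \begin{center}
  \begin{tabular}{lcl}
  $q_0$ & $=$ & $\ONE + q_0 \cdot b + q_1 \cdot b + q_2 \cdot b$\\
  $q_1$ & $=$ & $q_0 \cdot a$\\
  $q_2$ & $=$ & $q_1 \cdot a + q_2 \cdot a$
  \end{tabular}
  \end{center}
  Arden's lemma applied to the last equation gives
  $q_2 = q_1 \cdot a \cdot a^*$, and with $q_1 = q_0 \cdot a$ this is
  $q_2 = q_0 \cdot a \cdot a \cdot a^*$. Substituting into the first
  equation yields
  \begin{center}
  \begin{tabular}{lcl}
  $q_0$ & $=$ & $\ONE + q_0 \cdot (b + a \cdot b + a \cdot a \cdot a^* \cdot b)$\\
  $q_0$ & $=$ & $(b + a \cdot b + a \cdot a \cdot a^* \cdot b)^*$
  \end{tabular}
  \end{center}
  where the second line is again by Arden's lemma. The accepting
  states are $Q_0$ and $Q_1$, so the language is described by
  $q_0 + q_1 = q_0 + q_0 \cdot a$, which can be written as
  $q_0 \cdot (\ONE + a)$.}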

\item If a non-deterministic finite automaton (NFA) has
  $n$ states, how many states does a deterministic
  automaton (DFA) that can recognise the same language
  as the NFA need at most?

\solution{$2^n$ in the worst case, and for some regular expressions
  the worst case cannot be avoided.

  Other comments: $r^{\{n\}}$ can only be represented as $n$
  copies of the automaton for $r$, which can explode the automaton for bounded
  regular expressions. Similarly, we have no idea how backreferences can be
  represented as an automaton.
}

\item Rust implements a non-backtracking regular expression matcher
  based on the classic idea of DFAs. Still, some regular expressions
  take a surprising amount of time for matching problems. Explain the
  problem.

\solution{The problem has to do with bounded regular expressions,
  such as $r^{\{n\}}$. They are represented as $n$ copies of some
  automaton for $r$. If $n$ is large, then this can result in a
  large memory footprint and slow runtime.}

\item Prove that for all regular expressions $r$ we have

\begin{center}
  $\textit{nullable}(r) \quad \text{if and only if}
  \quad [] \in L(r)$
\end{center}

  Write down clearly in each case what you need to prove
  and what the assumptions are.
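\solution{A sketch of how the proof can be set up, assuming the usual
  recursive definition of $\textit{nullable}$ (it is not given in this
  sheet, so the clauses below are an assumption):
  \begin{center}
  \begin{tabular}{lcl}
  $\textit{nullable}(\ZERO)$ & $\dn$ & $\textit{false}$\\
  $\textit{nullable}(\ONE)$ & $\dn$ & $\textit{true}$\\
  $\textit{nullable}(c)$ & $\dn$ & $\textit{false}$\\
  $\textit{nullable}(r_1 + r_2)$ & $\dn$ &
     $\textit{nullable}(r_1) \vee \textit{nullable}(r_2)$\\
  $\textit{nullable}(r_1 \cdot r_2)$ & $\dn$ &
     $\textit{nullable}(r_1) \wedge \textit{nullable}(r_2)$\\
  $\textit{nullable}(r^*)$ & $\dn$ & $\textit{true}$
  \end{tabular}
  \end{center}
  The proof is by induction on $r$, with one case per clause. For
  example, in the case $r_1 \cdot r_2$ the assumptions are the two
  induction hypotheses, namely $\textit{nullable}(r_1)$ if and only if
  $[] \in L(r_1)$, and $\textit{nullable}(r_2)$ if and only if
  $[] \in L(r_2)$. What needs to be proved is
  $\textit{nullable}(r_1 \cdot r_2)$ if and only if
  $[] \in L(r_1 \cdot r_2)$. This follows because a concatenation of
  two strings is empty exactly when both strings are empty, so
  $[] \in L(r_1 \cdot r_2)$ holds exactly when $[] \in L(r_1)$ and
  $[] \in L(r_2)$.}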


\item \POSTSCRIPT
\end{enumerate}

\end{document}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: