6
|
1 |
\documentclass{article}
|
62
|
2 |
\usepackage{../style}
|
78
|
3 |
\usepackage{../langs}
|
118
|
4 |
\usepackage{../graphics}
|
6
|
5 |
|
|
6 |
\begin{document}
|
|
7 |
|
118
|
8 |
\section*{Replacement Coursework 2 (Automata)}
|
6
|
9 |
|
118
|
10 |
This coursework is worth 10\%. It is about deterministic and
|
121
|
11 |
non-deterministic finite automata. The coursework is due on 21 March
|
118
|
12 |
at 5pm. Make sure the files you submit can be processed by just
|
|
13 |
calling \texttt{scala <<filename.scala>>}.\bigskip
|
62
|
14 |
|
|
15 |
\noindent
|
|
16 |
\textbf{Important:} Do not use any mutable data structures in your
|
119
|
17 |
submission! They are not needed. This means you cannot use
|
62
|
18 |
\texttt{ListBuffer}s, for example. Do not use \texttt{return} in your
|
68
|
19 |
code! It has a different meaning in Scala, than in Java. Do not use
|
|
20 |
\texttt{var}! This declares a mutable variable. Make sure the
|
62
|
21 |
functions you submit are defined on the ``top-level'' of Scala, not
|
119
|
22 |
inside a class or object. Also note that when marking, the running time
|
|
23 |
will be restricted to a maximum of 360 seconds on my laptop.
|
86
|
24 |
|
6
|
25 |
|
100
|
26 |
\subsection*{Disclaimer}
|
6
|
27 |
|
105
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
28 |
It should be understood that the work you submit represents your own
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
29 |
effort! You have not copied from anyone else. An exception is the
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
30 |
Scala code I showed during the lectures or uploaded to KEATS, which
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
31 |
you can freely use.\bigskip
|
6
|
32 |
|
|
33 |
|
118
|
34 |
\subsection*{Part 1 (Deterministic Finite Automata)}
|
6
|
35 |
|
105
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
36 |
\noindent
|
118
|
37 |
There are many uses for Deterministic Finite Automata (DFAs), for
|
119
|
38 |
example for testing whether a string matches a pattern or not. DFAs
|
121
|
39 |
consist of some states (circles) and some transitions (edges) between
|
|
40 |
states. For example the DFA
|
118
|
41 |
|
|
42 |
\begin{center}
|
|
43 |
\begin{tikzpicture}[scale=1.5,>=stealth',very thick,auto,
|
|
44 |
every state/.style={minimum size=4pt,
|
|
45 |
inner sep=4pt,draw=blue!50,very thick,
|
|
46 |
fill=blue!20}]
|
|
47 |
\node[state, initial] (q0) at ( 0,1) {$Q_0$};
|
|
48 |
\node[state] (q1) at ( 1,1) {$Q_1$};
|
|
49 |
\node[state, accepting] (q2) at ( 2,1) {$Q_2$};
|
|
50 |
\path[->] (q0) edge[bend left] node[above] {$a$} (q1)
|
|
51 |
(q1) edge[bend left] node[above] {$b$} (q0)
|
|
52 |
(q2) edge[bend left=50] node[below] {$b$} (q0)
|
|
53 |
(q1) edge node[above] {$a$} (q2)
|
|
54 |
(q2) edge [loop right] node {$a$} ()
|
|
55 |
(q0) edge [loop below] node {$b$} ();
|
|
56 |
\end{tikzpicture}
|
|
57 |
\end{center}
|
|
58 |
|
|
59 |
\noindent
|
119
|
60 |
has three states ($Q_0$, $Q_1$ and $Q_2$), whereby $Q_0$ is the
|
|
61 |
starting state of the DFA and $Q_2$ is the accepting state. The latter
|
121
|
62 |
is indicated by double lines. In general, a DFA can have any number of
|
119
|
63 |
accepting states, but only a single starting state.
|
118
|
64 |
|
|
65 |
Transitions are edges between states labelled with a character. The
|
119
|
66 |
idea is that if we are in state $Q_0$, say, and get an $a$, we can go
|
|
67 |
to state $Q_1$. If we are in state $Q_2$ and get an $a$, we can stay
|
121
|
68 |
in state $Q_2$; if we get a $b$ in $Q_2$, then can go to state
|
119
|
69 |
$Q_0$. The main point of DFAs is that if we are in a state and get a
|
118
|
70 |
character, it is always clear which is the next state---there can only
|
|
71 |
be at most one. The task of Part 1 is to implement such DFAs in Scala
|
|
72 |
using partial functions for the transitions.
|
|
73 |
|
119
|
74 |
A string is accepted by a DFA, if we start in the starting state,
|
|
75 |
follow all transitions according to the string; if we end up in an
|
|
76 |
accepting state, then the string is accepted. If not, the string is
|
|
77 |
not accepted. The technical idea is that DFAs can be used to
|
|
78 |
accept strings from \emph{regular} languages.
|
|
79 |
|
118
|
80 |
\subsubsection*{Tasks}
|
105
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
81 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
82 |
\begin{itemize}
|
118
|
83 |
\item[(1)] Write a polymorphic function, called \texttt{share}, that
|
|
84 |
decides whether two sets share some elements (i.e.~the intersection
|
|
85 |
is not empty).\hfill[1 Mark]
|
|
86 |
|
119
|
87 |
\item[(2)] The transitions of DFAs will be implemented as partial
|
|
88 |
functions. These functions will have the type (state,
|
|
89 |
character)-pair to state, that is their input will be a (state,
|
|
90 |
character)-pair and they return a state. For example the transitions
|
|
91 |
of the DFA shown above can be defined as the following
|
|
92 |
partial function:
|
105
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
93 |
|
118
|
94 |
\begin{lstlisting}[language=Scala,numbers=none]
|
|
95 |
val dfa_trans : PartialFunction[(State,Char), State] =
|
|
96 |
{ case (Q0, 'a') => Q1
|
|
97 |
case (Q0, 'b') => Q0
|
|
98 |
case (Q1, 'a') => Q2
|
|
99 |
case (Q1, 'b') => Q0
|
|
100 |
case (Q2, 'a') => Q2
|
|
101 |
case (Q2, 'b') => Q0
|
|
102 |
}
|
|
103 |
\end{lstlisting}
|
105
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
104 |
|
119
|
105 |
The main point of partial functions (as opposed to ``normal''
|
|
106 |
functions) is that they do not have to be defined everywhere. For
|
|
107 |
example the transitions above only mention characters $a$ and $b$,
|
|
108 |
but leave out any other characters. Partial functions come with a
|
|
109 |
method \texttt{isDefinedAt} that can be used to check whether an
|
|
110 |
input produces a result or not. For example
|
105
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
111 |
|
121
|
112 |
\begin{lstlisting}[language=Scala,numbers=none]
|
118
|
113 |
dfa_trans.isDefinedAt((Q0, 'a'))
|
|
114 |
dfa_trans.isDefinedAt((Q0, 'c'))
|
121
|
115 |
\end{lstlisting}
|
118
|
116 |
|
|
117 |
\noindent
|
119
|
118 |
gives \texttt{true} in the first case and \texttt{false} in the
|
121
|
119 |
second. There is also a method \texttt{lift} that transforms a
|
119
|
120 |
partial function into a ``normal'' function returning an optional
|
|
121 |
value: if the partial function is defined on the input, the lifted
|
|
122 |
function will return \texttt{Some}; if it is not defined, then
|
|
123 |
\texttt{None}.
|
|
124 |
|
|
125 |
Write a function that takes a transition and a (state, character)-pair as arguments
|
118
|
126 |
and produces an optional state (the state specified by the partial transition
|
|
127 |
function whenever it is defined; if the transition function is undefined,
|
119
|
128 |
return \texttt{None}).\hfill\mbox{[1 Mark]}
|
105
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
129 |
|
118
|
130 |
\item[(3)]
|
|
131 |
Write a function that ``lifts'' the function in (2) from characters to strings. That
|
|
132 |
is, write a function that takes a transition, a state and a list of characters
|
|
133 |
as arguments and produces the state generated by following the transitions for
|
119
|
134 |
each character in the list. For example if you are in state $Q_0$ in the DFA above
|
|
135 |
and have the list \texttt{List(a,a,a,b,b,a)}, then you need to return the
|
|
136 |
state $Q_1$ (as option since there might not be such a state in general).\\
|
118
|
137 |
\mbox{}\hfill\mbox{[1 Mark]}
|
105
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
138 |
|
121
|
139 |
\item[(4)] DFAs are defined as a triple: (starting state, transitions,
|
119
|
140 |
set of accepting states). Write a function \texttt{accepts} that tests whether
|
|
141 |
a string is accepted by an DFA or not. For this start in the
|
|
142 |
starting state of the DFA, use the function under (3) to calculate
|
|
143 |
the state after following all transitions according to the
|
|
144 |
characters in the string. If the resulting state is an accepting state,
|
|
145 |
return \texttt{true}; otherwise \texttt{false}.\\\mbox{}\hfill\mbox{[1 Mark]}
|
105
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
146 |
\end{itemize}
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
147 |
|
Christian Urban <christian dot urban at kcl dot ac dot uk>
diff
changeset
|
148 |
|
118
|
149 |
\subsection*{Part 2 (Non-Deterministic Finite Automata)}
|
|
150 |
|
|
151 |
The main point of DFAs is that for every given state and character
|
|
152 |
there is at most one next state (one if the transition is defined;
|
|
153 |
none otherwise). However, this restriction to at most one state can be
|
|
154 |
quite limiting for some applications.\footnote{Though there is a
|
119
|
155 |
curious fact that every (less restricted) NFA can be translated into
|
|
156 |
an ``equivalent'' DFA, whereby accepting means accepting the same
|
|
157 |
set of strings. However this might increase drastically the number
|
|
158 |
of states in the DFA.} Non-Deterministic Automata (NFAs) remove
|
|
159 |
this restriction: there can be more than one starting state, and given
|
|
160 |
a state and a character there can be more than one next
|
|
161 |
state. Consider for example the NFA
|
118
|
162 |
|
|
163 |
\begin{center}
|
|
164 |
\begin{tikzpicture}[scale=0.7,>=stealth',very thick,
|
|
165 |
every state/.style={minimum size=0pt,
|
|
166 |
draw=blue!50,very thick,fill=blue!20},]
|
|
167 |
\node[state,initial] (R_1) {$R_1$};
|
|
168 |
\node[state,initial] (R_2) [above=of R_1] {$R_2$};
|
|
169 |
\node[state, accepting] (R_3) [right=of R_1] {$R_3$};
|
|
170 |
\path[->] (R_1) edge node [below] {$b$} (R_3);
|
|
171 |
\path[->] (R_2) edge [bend left] node [above] {$a$} (R_3);
|
|
172 |
\path[->] (R_1) edge [bend left] node [left] {$c$} (R_2);
|
|
173 |
\path[->] (R_2) edge [bend left] node [right] {$a$} (R_1);
|
|
174 |
\end{tikzpicture}
|
|
175 |
\end{center}
|
78
|
176 |
|
118
|
177 |
\noindent
|
119
|
178 |
where in state $R_2$ if we get an $a$, we can go to state $R_1$
|
|
179 |
\emph{or} $R_3$. If we want to find out whether an NFA accepts a
|
118
|
180 |
string, then we need to explore both possibilities. We will do this
|
119
|
181 |
``exploration'' in the tasks below in a breadth-first manner.
|
|
182 |
|
|
183 |
|
|
184 |
The feature of having more than one next state in NFAs will be
|
|
185 |
implemented by having a \emph{set} of partial transition functions
|
|
186 |
(DFAs had only one). For example the NFA shown above will be
|
|
187 |
represented by the set of partial functions
|
118
|
188 |
|
|
189 |
\begin{lstlisting}[language=Scala,numbers=none]
|
|
190 |
val nfa_trans : NTrans = Set(
|
|
191 |
{ case (R1, 'c') => R2 },
|
|
192 |
{ case (R1, 'b') => R3 },
|
|
193 |
{ case (R2, 'a') => R1 },
|
|
194 |
{ case (R2, 'a') => R3 }
|
|
195 |
)
|
|
196 |
\end{lstlisting}
|
|
197 |
|
|
198 |
\noindent
|
121
|
199 |
The point is that the 3rd element in this set makes sure that in state $R_2$ and
|
119
|
200 |
given an $a$, we can go to state $R_1$; and the 4th element, in $R_2$,
|
|
201 |
given an $a$, we can also go to state $R_3$. When following
|
|
202 |
transitions from a state, we have to look at all partial functions in
|
|
203 |
the set and generate the set of \emph{all} possible next states.
|
118
|
204 |
|
|
205 |
\subsubsection*{Tasks}
|
109
|
206 |
|
|
207 |
\begin{itemize}
|
118
|
208 |
\item[(5)]
|
|
209 |
Write a function \texttt{nnext} which takes a transition set, a state
|
|
210 |
and a character as arguments, and calculates all possible next states
|
|
211 |
(returned as set).\\
|
|
212 |
\mbox{}\hfill\mbox{[1 Mark]}
|
109
|
213 |
|
118
|
214 |
\item[(6)] Write a function \texttt{nnexts} which takes a transition
|
|
215 |
set, a \emph{set} of states and a character as arguments, and
|
|
216 |
calculates \emph{all} possible next states that can be reached from
|
119
|
217 |
any state in the set.\mbox{}\hfill\mbox{[1 Mark]}
|
109
|
218 |
|
118
|
219 |
\item[(7)] Like in (3), write a function \texttt{nnextss} that lifts
|
|
220 |
\texttt{nnexts} from (6) from single characters to lists of characters.
|
|
221 |
\mbox{}\hfill\mbox{[1 Mark]}
|
|
222 |
|
|
223 |
\item[(8)] NFAs are also defined as a triple: (set of staring states,
|
119
|
224 |
set of transitions, set of accepting states). Write a function
|
|
225 |
\texttt{naccepts} that tests whether a string is accepted by an NFA
|
118
|
226 |
or not. For this start in all starting states of the NFA, use the
|
|
227 |
function under (7) to calculate the set of states following all
|
119
|
228 |
transitions according to the characters in the string. If the
|
|
229 |
resulting set of states shares at least a single state with the set
|
|
230 |
of accepting states, return \texttt{true}; otherwise \texttt{false}.
|
|
231 |
Use the function under (1) in order to test whether these two sets
|
|
232 |
of states share any states or not.\mbox{}\hfill\mbox{[1 Mark]}
|
109
|
233 |
|
119
|
234 |
\item[(9)] Since we explore in functions (6) and (7) all possible next
|
|
235 |
states, we decide whether a string is accepted in a breadth-first
|
|
236 |
manner. (Depth-first would be to choose one state, follow all next
|
|
237 |
states of this single state; check whether it leads to an accepting
|
|
238 |
state. If not, we backtrack and choose another state). The
|
|
239 |
disadvantage of breadth-first search is that at every step a
|
|
240 |
non-empty set of states are ``active''\ldots{} states that need to
|
|
241 |
be followed at the same time. Write similar functions as in (7) and
|
|
242 |
(8), but instead of returning states or a boolean, calculate the
|
|
243 |
number of states that need to be followed in each step. The function
|
|
244 |
\texttt{max\_accept} should then return the maximum of all these
|
|
245 |
numbers.
|
109
|
246 |
|
119
|
247 |
As a test case, consider again the NFA shown above. At the beginning
|
|
248 |
the number of active states will be 2 (since there are two starting
|
|
249 |
states, namely $R_1$ and $R_2$). If we get an $a$, there will be
|
|
250 |
still 2 active states, namely $R_1$ and $R_3$ both reachable from
|
|
251 |
$R_2$. There is no transition for $a$ and $R_1$. So for a string,
|
|
252 |
say, $ab$ which is accepted by the NFA, the maximum number of active
|
|
253 |
states is 2 (it is not possible that all three states of this NFA
|
|
254 |
are active at the same time; is it possible that no state is
|
|
255 |
active?). \hfill\mbox{[2 Marks]}
|
118
|
256 |
|
109
|
257 |
|
|
258 |
\end{itemize}
|
|
259 |
|
6
|
260 |
|
|
261 |
\end{document}
|
|
262 |
|
68
|
263 |
|
6
|
264 |
%%% Local Variables:
|
|
265 |
%%% mode: latex
|
|
266 |
%%% TeX-master: t
|
|
267 |
%%% End:
|