author | Christian Urban <christian.urban@kcl.ac.uk> |
Thu, 17 Nov 2022 11:34:50 +0000 | |
changeset 445 | b73e7ce91c10 |
parent 221 | 9e7897f25e13 |
permissions | -rw-r--r-- |
6 | 1 |
\documentclass{article} |
62 | 2 |
\usepackage{../style} |
78 | 3 |
\usepackage{../langs} |
118 | 4 |
\usepackage{../graphics} |
6 | 5 |
|
6 |
\begin{document} |
|
7 |
||
118 | 8 |
\section*{Replacement Coursework 2 (Automata)} |
6 | 9 |
|
118 | 10 |
This coursework is worth 10\%. It is about deterministic and |
121 | 11 |
non-deterministic finite automata. The coursework is due on 21 March |
118 | 12 |
at 5pm. Make sure the files you submit can be processed by just |
13 |
calling \texttt{scala <<filename.scala>>}.\bigskip |
|
62 | 14 |
|
15 |
\noindent |
|
16 |
\textbf{Important:} Do not use any mutable data structures in your |
|
119 | 17 |
submission! They are not needed. This means you cannot use |
62 | 18 |
\texttt{ListBuffer}s, for example. Do not use \texttt{return} in your |
68 | 19 |
code! It has a different meaning in Scala, than in Java. Do not use |
20 |
\texttt{var}! This declares a mutable variable. Make sure the |
|
62 | 21 |
functions you submit are defined on the ``top-level'' of Scala, not |
119 | 22 |
inside a class or object. Also note that when marking, the running time |
23 |
will be restricted to a maximum of 360 seconds on my laptop. |
|
86 | 24 |
|
6 | 25 |
|
100 | 26 |
\subsection*{Disclaimer} |
6 | 27 |
|
105
67ce930b5935
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
100
diff
changeset
|
28 |
It should be understood that the work you submit represents your own |
67ce930b5935
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
100
diff
changeset
|
29 |
effort! You have not copied from anyone else. An exception is the |
67ce930b5935
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
100
diff
changeset
|
30 |
Scala code I showed during the lectures or uploaded to KEATS, which |
67ce930b5935
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
100
diff
changeset
|
31 |
you can freely use.\bigskip |
6 | 32 |
|
33 |
||
118 | 34 |
\subsection*{Part 1 (Deterministic Finite Automata)} |
6 | 35 |
|
105
67ce930b5935
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
100
diff
changeset
|
36 |
\noindent |
118 | 37 |
There are many uses for Deterministic Finite Automata (DFAs), for |
119 | 38 |
example for testing whether a string matches a pattern or not. DFAs |
121 | 39 |
consist of some states (circles) and some transitions (edges) between |
40 |
states. For example the DFA |
|
118 | 41 |
|
42 |
\begin{center} |
|
43 |
\begin{tikzpicture}[scale=1.5,>=stealth',very thick,auto, |
|
44 |
every state/.style={minimum size=4pt, |
|
45 |
inner sep=4pt,draw=blue!50,very thick, |
|
46 |
fill=blue!20}] |
|
47 |
\node[state, initial] (q0) at ( 0,1) {$Q_0$}; |
|
48 |
\node[state] (q1) at ( 1,1) {$Q_1$}; |
|
49 |
\node[state, accepting] (q2) at ( 2,1) {$Q_2$}; |
|
50 |
\path[->] (q0) edge[bend left] node[above] {$a$} (q1) |
|
51 |
(q1) edge[bend left] node[above] {$b$} (q0) |
|
52 |
(q2) edge[bend left=50] node[below] {$b$} (q0) |
|
53 |
(q1) edge node[above] {$a$} (q2) |
|
54 |
(q2) edge [loop right] node {$a$} () |
|
55 |
(q0) edge [loop below] node {$b$} (); |
|
56 |
\end{tikzpicture} |
|
57 |
\end{center} |
|
58 |
||
59 |
\noindent |
|
119 | 60 |
has three states ($Q_0$, $Q_1$ and $Q_2$), whereby $Q_0$ is the |
61 |
starting state of the DFA and $Q_2$ is the accepting state. The latter |
|
121 | 62 |
is indicated by double lines. In general, a DFA can have any number of |
119 | 63 |
accepting states, but only a single starting state. |
118 | 64 |
|
65 |
Transitions are edges between states labelled with a character. The |
|
119 | 66 |
idea is that if we are in state $Q_0$, say, and get an $a$, we can go |
67 |
to state $Q_1$. If we are in state $Q_2$ and get an $a$, we can stay |
|
121 | 68 |
in state $Q_2$; if we get a $b$ in $Q_2$, then can go to state |
119 | 69 |
$Q_0$. The main point of DFAs is that if we are in a state and get a |
118 | 70 |
character, it is always clear which is the next state---there can only |
71 |
be at most one. The task of Part 1 is to implement such DFAs in Scala |
|
72 |
using partial functions for the transitions. |
|
73 |
||
119 | 74 |
A string is accepted by a DFA, if we start in the starting state, |
75 |
follow all transitions according to the string; if we end up in an |
|
76 |
accepting state, then the string is accepted. If not, the string is |
|
77 |
not accepted. The technical idea is that DFAs can be used to |
|
78 |
accept strings from \emph{regular} languages. |
|
79 |
||
118 | 80 |
\subsubsection*{Tasks} |
105
67ce930b5935
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
100
diff
changeset
|
81 |
|
67ce930b5935
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
100
diff
changeset
|
82 |
\begin{itemize} |
118 | 83 |
\item[(1)] Write a polymorphic function, called \texttt{share}, that |
84 |
decides whether two sets share some elements (i.e.~the intersection |
|
85 |
is not empty).\hfill[1 Mark] |
|
86 |
||
119 | 87 |
\item[(2)] The transitions of DFAs will be implemented as partial |
88 |
functions. These functions will have the type (state, |
|
89 |
character)-pair to state, that is their input will be a (state, |
|
90 |
character)-pair and they return a state. For example the transitions |
|
91 |
of the DFA shown above can be defined as the following |
|
92 |
partial function: |
|
105
67ce930b5935
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
100
diff
changeset
|
93 |
|
118 | 94 |
\begin{lstlisting}[language=Scala,numbers=none] |
95 |
val dfa_trans : PartialFunction[(State,Char), State] = |
|
96 |
{ case (Q0, 'a') => Q1 |
|
97 |
case (Q0, 'b') => Q0 |
|
98 |
case (Q1, 'a') => Q2 |
|
99 |
case (Q1, 'b') => Q0 |
|
100 |
case (Q2, 'a') => Q2 |
|
101 |
case (Q2, 'b') => Q0 |
|
102 |
} |
|
103 |
\end{lstlisting} |
|
105
67ce930b5935
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
100
diff
changeset
|
104 |
|
119 | 105 |
The main point of partial functions (as opposed to ``normal'' |
106 |
functions) is that they do not have to be defined everywhere. For |
|
107 |
example the transitions above only mention characters $a$ and $b$, |
|
108 |
but leave out any other characters. Partial functions come with a |
|
109 |
method \texttt{isDefinedAt} that can be used to check whether an |
|
110 |
input produces a result or not. For example |
|
105
67ce930b5935
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
100
diff
changeset
|
111 |
|
121 | 112 |
\begin{lstlisting}[language=Scala,numbers=none] |
118 | 113 |
dfa_trans.isDefinedAt((Q0, 'a')) |
114 |
dfa_trans.isDefinedAt((Q0, 'c')) |
|
121 | 115 |
\end{lstlisting} |
118 | 116 |
|
117 |
\noindent |
|
119 | 118 |
gives \texttt{true} in the first case and \texttt{false} in the |
121 | 119 |
second. There is also a method \texttt{lift} that transforms a |
119 | 120 |
partial function into a ``normal'' function returning an optional |
121 |
value: if the partial function is defined on the input, the lifted |
|
122 |
function will return \texttt{Some}; if it is not defined, then |
|
123 |
\texttt{None}. |
|
124 |
||
125 |
Write a function that takes a transition and a (state, character)-pair as arguments |
|
118 | 126 |
and produces an optional state (the state specified by the partial transition |
127 |
function whenever it is defined; if the transition function is undefined, |
|
119 | 128 |
return \texttt{None}).\hfill\mbox{[1 Mark]} |
105
67ce930b5935
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
100
diff
changeset
|
129 |
|
118 | 130 |
\item[(3)] |
131 |
Write a function that ``lifts'' the function in (2) from characters to strings. That |
|
132 |
is, write a function that takes a transition, a state and a list of characters |
|
133 |
as arguments and produces the state generated by following the transitions for |
|
119 | 134 |
each character in the list. For example if you are in state $Q_0$ in the DFA above |
135 |
and have the list \texttt{List(a,a,a,b,b,a)}, then you need to return the |
|
136 |
state $Q_1$ (as option since there might not be such a state in general).\\ |
|
118 | 137 |
\mbox{}\hfill\mbox{[1 Mark]} |
105
67ce930b5935
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
100
diff
changeset
|
138 |
|
121 | 139 |
\item[(4)] DFAs are defined as a triple: (starting state, transitions, |
119 | 140 |
set of accepting states). Write a function \texttt{accepts} that tests whether |
141 |
a string is accepted by an DFA or not. For this start in the |
|
142 |
starting state of the DFA, use the function under (3) to calculate |
|
143 |
the state after following all transitions according to the |
|
144 |
characters in the string. If the resulting state is an accepting state, |
|
145 |
return \texttt{true}; otherwise \texttt{false}.\\\mbox{}\hfill\mbox{[1 Mark]} |
|
105
67ce930b5935
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
100
diff
changeset
|
146 |
\end{itemize} |
67ce930b5935
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
100
diff
changeset
|
147 |
|
67ce930b5935
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
100
diff
changeset
|
148 |
|
118 | 149 |
\subsection*{Part 2 (Non-Deterministic Finite Automata)} |
150 |
||
151 |
The main point of DFAs is that for every given state and character |
|
152 |
there is at most one next state (one if the transition is defined; |
|
153 |
none otherwise). However, this restriction to at most one state can be |
|
154 |
quite limiting for some applications.\footnote{Though there is a |
|
119 | 155 |
curious fact that every (less restricted) NFA can be translated into |
156 |
an ``equivalent'' DFA, whereby accepting means accepting the same |
|
157 |
set of strings. However this might increase drastically the number |
|
158 |
of states in the DFA.} Non-Deterministic Automata (NFAs) remove |
|
159 |
this restriction: there can be more than one starting state, and given |
|
160 |
a state and a character there can be more than one next |
|
161 |
state. Consider for example the NFA |
|
118 | 162 |
|
163 |
\begin{center} |
|
164 |
\begin{tikzpicture}[scale=0.7,>=stealth',very thick, |
|
165 |
every state/.style={minimum size=0pt, |
|
166 |
draw=blue!50,very thick,fill=blue!20},] |
|
167 |
\node[state,initial] (R_1) {$R_1$}; |
|
168 |
\node[state,initial] (R_2) [above=of R_1] {$R_2$}; |
|
169 |
\node[state, accepting] (R_3) [right=of R_1] {$R_3$}; |
|
170 |
\path[->] (R_1) edge node [below] {$b$} (R_3); |
|
171 |
\path[->] (R_2) edge [bend left] node [above] {$a$} (R_3); |
|
172 |
\path[->] (R_1) edge [bend left] node [left] {$c$} (R_2); |
|
173 |
\path[->] (R_2) edge [bend left] node [right] {$a$} (R_1); |
|
174 |
\end{tikzpicture} |
|
175 |
\end{center} |
|
78 | 176 |
|
118 | 177 |
\noindent |
119 | 178 |
where in state $R_2$ if we get an $a$, we can go to state $R_1$ |
179 |
\emph{or} $R_3$. If we want to find out whether an NFA accepts a |
|
118 | 180 |
string, then we need to explore both possibilities. We will do this |
119 | 181 |
``exploration'' in the tasks below in a breadth-first manner. |
182 |
||
183 |
||
184 |
The feature of having more than one next state in NFAs will be |
|
185 |
implemented by having a \emph{set} of partial transition functions |
|
186 |
(DFAs had only one). For example the NFA shown above will be |
|
187 |
represented by the set of partial functions |
|
118 | 188 |
|
189 |
\begin{lstlisting}[language=Scala,numbers=none] |
|
190 |
val nfa_trans : NTrans = Set( |
|
191 |
{ case (R1, 'c') => R2 }, |
|
192 |
{ case (R1, 'b') => R3 }, |
|
193 |
{ case (R2, 'a') => R1 }, |
|
194 |
{ case (R2, 'a') => R3 } |
|
195 |
) |
|
196 |
\end{lstlisting} |
|
197 |
||
198 |
\noindent |
|
121 | 199 |
The point is that the 3rd element in this set makes sure that in state $R_2$ and |
119 | 200 |
given an $a$, we can go to state $R_1$; and the 4th element, in $R_2$, |
201 |
given an $a$, we can also go to state $R_3$. When following |
|
202 |
transitions from a state, we have to look at all partial functions in |
|
203 |
the set and generate the set of \emph{all} possible next states. |
|
118 | 204 |
|
205 |
\subsubsection*{Tasks} |
|
109 | 206 |
|
207 |
\begin{itemize} |
|
118 | 208 |
\item[(5)] |
209 |
Write a function \texttt{nnext} which takes a transition set, a state |
|
210 |
and a character as arguments, and calculates all possible next states |
|
211 |
(returned as set).\\ |
|
212 |
\mbox{}\hfill\mbox{[1 Mark]} |
|
109 | 213 |
|
118 | 214 |
\item[(6)] Write a function \texttt{nnexts} which takes a transition |
215 |
set, a \emph{set} of states and a character as arguments, and |
|
216 |
calculates \emph{all} possible next states that can be reached from |
|
119 | 217 |
any state in the set.\mbox{}\hfill\mbox{[1 Mark]} |
109 | 218 |
|
118 | 219 |
\item[(7)] Like in (3), write a function \texttt{nnextss} that lifts |
220 |
\texttt{nnexts} from (6) from single characters to lists of characters. |
|
221 |
\mbox{}\hfill\mbox{[1 Mark]} |
|
222 |
||
223 |
\item[(8)] NFAs are also defined as a triple: (set of staring states, |
|
119 | 224 |
set of transitions, set of accepting states). Write a function |
225 |
\texttt{naccepts} that tests whether a string is accepted by an NFA |
|
118 | 226 |
or not. For this start in all starting states of the NFA, use the |
227 |
function under (7) to calculate the set of states following all |
|
119 | 228 |
transitions according to the characters in the string. If the |
229 |
resulting set of states shares at least a single state with the set |
|
230 |
of accepting states, return \texttt{true}; otherwise \texttt{false}. |
|
231 |
Use the function under (1) in order to test whether these two sets |
|
232 |
of states share any states or not.\mbox{}\hfill\mbox{[1 Mark]} |
|
109 | 233 |
|
119 | 234 |
\item[(9)] Since we explore in functions (6) and (7) all possible next |
235 |
states, we decide whether a string is accepted in a breadth-first |
|
236 |
manner. (Depth-first would be to choose one state, follow all next |
|
237 |
states of this single state; check whether it leads to an accepting |
|
238 |
state. If not, we backtrack and choose another state). The |
|
239 |
disadvantage of breadth-first search is that at every step a |
|
240 |
non-empty set of states are ``active''\ldots{} states that need to |
|
241 |
be followed at the same time. Write similar functions as in (7) and |
|
242 |
(8), but instead of returning states or a boolean, calculate the |
|
243 |
number of states that need to be followed in each step. The function |
|
244 |
\texttt{max\_accept} should then return the maximum of all these |
|
245 |
numbers. |
|
109 | 246 |
|
119 | 247 |
As a test case, consider again the NFA shown above. At the beginning |
248 |
the number of active states will be 2 (since there are two starting |
|
249 |
states, namely $R_1$ and $R_2$). If we get an $a$, there will be |
|
250 |
still 2 active states, namely $R_1$ and $R_3$ both reachable from |
|
251 |
$R_2$. There is no transition for $a$ and $R_1$. So for a string, |
|
252 |
say, $ab$ which is accepted by the NFA, the maximum number of active |
|
253 |
states is 2 (it is not possible that all three states of this NFA |
|
254 |
are active at the same time; is it possible that no state is |
|
255 |
active?). \hfill\mbox{[2 Marks]} |
|
118 | 256 |
|
109 | 257 |
|
258 |
\end{itemize} |
|
259 |
||
6 | 260 |
|
261 |
\end{document} |
|
262 |
||
68 | 263 |
|
6 | 264 |
%%% Local Variables: |
265 |
%%% mode: latex |
|
266 |
%%% TeX-master: t |
|
267 |
%%% End: |