| author | Christian Urban <christian.urban@kcl.ac.uk> | 
| Wed, 17 Sep 2025 16:30:09 +0100 | |
| changeset 986 | 0e4c5d802f9d | 
| parent 966 | d82c91f85391 | 
| permissions | -rw-r--r-- | 
| 631 | 1  | 
% !TEX program = xelatex  | 
| 0 | 2  | 
\documentclass{article}
 | 
| 
249
 
377c59df7297
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
227 
diff
changeset
 | 
3  | 
\usepackage{../style}
 | 
| 0 | 4  | 
|
5  | 
\begin{document}
 | 
|
6  | 
||
7  | 
\section*{Homework 1}
 | 
|
8  | 
||
| 963 | 9  | 
%%\HEADER  | 
| 
331
 
a2c18456c6b7
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
294 
diff
changeset
 | 
10  | 
|
| 0 | 11  | 
\begin{enumerate}
 | 
| 
249
 
377c59df7297
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
227 
diff
changeset
 | 
12  | 
|
| 
401
 
5d85dc9779b1
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
394 
diff
changeset
 | 
13  | 
\item {\bf (Optional)} If you want to run the code presented
 | 
| 
 
5d85dc9779b1
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
394 
diff
changeset
 | 
14  | 
in the lectures, install the Scala programming language  | 
| 
 
5d85dc9779b1
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
394 
diff
changeset
 | 
15  | 
available (for free) from  | 
| 
249
 
377c59df7297
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
227 
diff
changeset
 | 
16  | 
|
| 743 | 17  | 
      \begin{center}
 | 
18  | 
        \url{http://www.scala-lang.org}
 | 
|
19  | 
      \end{center}
 | 
|
20  | 
||
| 964 | 21  | 
and the Ammonite REPL from  | 
22  | 
||
23  | 
       \begin{center}
 | 
|
24  | 
       \url{https://ammonite.io}
 | 
|
25  | 
       \end{center}      
 | 
|
| 0 | 26  | 
|
| 
401
 
5d85dc9779b1
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
394 
diff
changeset
 | 
27  | 
If you want to follow the code I present during the  | 
| 961 | 28  | 
lectures, it might be useful to install VS Code or Codium.  | 
| 964 | 29  | 
Please have a look at the handout about Ammonite and  | 
30  | 
if you need a refresher for Scala - I linked on KEATS  | 
|
31  | 
the Scala handout from PEP.  | 
|
32  | 
      %I will be using Scala Version 3.5, which has the \texttt{scala-cli}
 | 
|
33  | 
%REPL used in PEP already built in.  | 
|
| 961 | 34  | 
|
35  | 
%handout about Scala.  | 
|
36  | 
%Make sure Ammonite  | 
|
37  | 
%uses the Scala 3 compiler.  | 
|
| 0 | 38  | 
|
| 639 | 39  | 
%\item {\bf (Optional)} Have a look at the crawler programs.
 | 
40  | 
% Can you find a usage for them in your daily programming  | 
|
41  | 
% life? Can you improve them? For example in cases there  | 
|
42  | 
% are links that appear on different recursion levels, the  | 
|
43  | 
% crawlers visit such web-pages several times. Can this be  | 
|
44  | 
% avoided? Also, the crawlers flag as problematic any page  | 
|
45  | 
% that gives an error, but probably only 404 Not Found  | 
|
46  | 
% errors should be flagged. Can you change that?)  | 
|
| 
104
 
ffde837b1db1
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
102 
diff
changeset
 | 
47  | 
|
| 640 | 48  | 
\item {\bf (Optional)} Have a look at the catastrophic backtracking
 | 
49  | 
programs uploaded on KEATS. Convince yourself that they really require  | 
|
50  | 
a lot of computation time. If you have similar examples in your own  | 
|
51  | 
favourite programming language, I am happy to hear about it.  | 
|
52  | 
||
53  | 
||
| 
401
 
5d85dc9779b1
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
394 
diff
changeset
 | 
54  | 
\item Read the handout of the first lecture and the handout  | 
| 
 
5d85dc9779b1
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
394 
diff
changeset
 | 
55  | 
about notation. Make sure you understand the concepts of  | 
| 498 | 56  | 
strings and languages. In the context of the CFL-course,  | 
| 
401
 
5d85dc9779b1
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
394 
diff
changeset
 | 
57  | 
      what is meant by the term \emph{language}?
 | 
| 9 | 58  | 
|
| 876 | 59  | 
      \solution{A language - in this context - is just a set of
 | 
60  | 
strings. Some of these sets can actually not be described by  | 
|
61  | 
regular expressions. Only regular​ languages can. This is  | 
|
62  | 
something for lecture 3.}  | 
|
63  | 
||
| 550 | 64  | 
\item Give the definition for regular expressions---this is an  | 
| 498 | 65  | 
inductive datatype. What is the  | 
| 
355
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
66  | 
meaning of a regular expression? (Hint: The meaning is  | 
| 
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
67  | 
defined recursively.)  | 
| 0 | 68  | 
|
| 876 | 69  | 
      \solution{Here I would also expect the grammar for basic regular
 | 
70  | 
expressions and the definition of the recursive L-function. Discuss  | 
|
71  | 
differences between $r_1 + r_2$ and $r^+$. Discuss differences between  | 
|
72  | 
``real-life regexes'' and regexes in this module.}  | 
|
73  | 
||
| 
355
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
74  | 
\item Assume the concatenation operation of two strings is  | 
| 
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
75  | 
written as $s_1 @ s_2$. Define the operation of  | 
| 
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
76  | 
      \emph{concatenating} two sets of strings. This operation
 | 
| 
394
 
2f9fe225ecc8
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
355 
diff
changeset
 | 
77  | 
is also written as $\_ \,@\, \_$. According to  | 
| 
 
2f9fe225ecc8
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
355 
diff
changeset
 | 
78  | 
      this definition, what is $A \,@\, \{\}$ equal to?
 | 
| 498 | 79  | 
Is in general $A\,@\,B$ equal to $B\,@\,A$?  | 
| 0 | 80  | 
|
| 876 | 81  | 
      \solution{ What is $A @ {[]}$? Are there special cases
 | 
| 961 | 82  | 
where $A @ B = B @ A$? Obviously when $A = B$ the stament is true.  | 
83  | 
        But there are also cases when $A \not= B$, for example $A = \{a\}$
 | 
|
84  | 
      and $B = \{aaa\}$.}
 | 
|
| 876 | 85  | 
|
| 
355
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
86  | 
\item Assume a set $A$ contains 4 strings and a set $B$  | 
| 
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
87  | 
contains 7 strings. None of the strings is the empty  | 
| 
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
88  | 
string. How many strings are in $A \,@\, B$?  | 
| 
249
 
377c59df7297
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
227 
diff
changeset
 | 
89  | 
|
| 961 | 90  | 
      \solution{Everyone will probably answer with 28, but there are corner cases where there are fewer
 | 
| 876 | 91  | 
than 28 elements. Can students think of such corner cases?  | 
92  | 
      For example $A = \{a, ab, \ldots\}$, $B = \{bc, c,\ldots\}$ }
 | 
|
93  | 
||
| 
267
 
a1544b804d1e
updated homeworks
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
258 
diff
changeset
 | 
94  | 
\item How is the power of a language defined? (Hint: There are two  | 
| 
 
a1544b804d1e
updated homeworks
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
258 
diff
changeset
 | 
95  | 
  rules, one for $\_^0$ and one for $\_^{n+1}$.)
 | 
| 
109
 
f2a90dda7e3b
added
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
104 
diff
changeset
 | 
96  | 
|
| 876 | 97  | 
     \solution{Two rules: 0-case and n+1 case.}
 | 
98  | 
||
| 
355
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
99  | 
\item Let $A = \{[a], [b], [c], [d]\}$. (1) How many strings
 | 
| 
438
 
84608b4b3578
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
416 
diff
changeset
 | 
100  | 
are in $A^4$? (2) Consider also the case of $A^4$ where one of  | 
| 
355
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
101  | 
the strings in $A$ is the empty string, for example $A =  | 
| 
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
102  | 
      \{[a], [b], [c], []\}$.
 | 
| 
293
 
ca349cfe3474
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
267 
diff
changeset
 | 
103  | 
|
| 876 | 104  | 
      \solution{121 is correct. But make sure you understand why it is 121
 | 
105  | 
in cases you do not have a computer at your fingertips.}  | 
|
106  | 
||
| 507 | 107  | 
\item (1) How many basic regular expressions are there to match  | 
| 776 | 108  | 
      \textbf{only} the string $abcd$? (2) How many if they cannot include
 | 
| 
444
 
3056a4c071b0
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
438 
diff
changeset
 | 
109  | 
$\ONE$ and $\ZERO$? (3) How many if they are also not  | 
| 
 
3056a4c071b0
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
438 
diff
changeset
 | 
110  | 
allowed to contain stars? (4) How many if they are also  | 
| 
401
 
5d85dc9779b1
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
394 
diff
changeset
 | 
111  | 
not allowed to contain $\_ + \_$?  | 
| 0 | 112  | 
|
| 961 | 113  | 
      \solution{1-3 are infinite (tell the idea why and give examples);
 | 
114  | 
4 is five - remember regexes are trees (that is the main point of the question.}  | 
|
| 876 | 115  | 
|
| 
355
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
116  | 
\item When are two regular expressions equivalent? Can you  | 
| 
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
117  | 
think of instances where two regular expressions match  | 
| 
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
118  | 
the same strings, but it is not so obvious that they do?  | 
| 
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
119  | 
For example $a + b$ and $b + a$ do not count\ldots they  | 
| 
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
120  | 
obviously match the same strings, namely $[a]$ and  | 
| 
 
a259eec25156
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
331 
diff
changeset
 | 
121  | 
$[b]$.  | 
| 
403
 
564f7584eff1
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
401 
diff
changeset
 | 
122  | 
|
| 876 | 123  | 
      \solution{for example $r^* = 1 + r \cdot r^*$ for any regular expression $r$.
 | 
| 966 | 124  | 
Can students think about why this is the case? - this would need a proof.\bigskip  | 
125  | 
||
126  | 
||
127  | 
        Lemma to prove: $A^* = \{[]\} \cup A \times A^*$ for any language $A$.\bigskip
 | 
|
128  | 
||
129  | 
        The definition of ${}^*$: $\bigcup n. A^n$\bigskip
 | 
|
130  | 
||
131  | 
We first show $\subseteq$ of the lemma: a string $s \in A^*$, must also be in  | 
|
132  | 
$\bigcup n. A^n$, which means there is an $n$ such that $s\in A^n$.  | 
|
133  | 
        If $n = 0$ then $s = []$ and $s$ is also in $\{[]\} \cup A \times A^*$.
 | 
|
134  | 
If $n$ is bigger than $0$, then $s\in A^n$, which means by  | 
|
135  | 
        definition of power that $s\in A \times A^{n - 1}$. But then
 | 
|
136  | 
also $s \in A \times A^*$. That is one direction.\bigskip  | 
|
137  | 
||
138  | 
        The other direction: Two cases: (i) $s\in \{[]\}$ then
 | 
|
139  | 
also $s\in A^*$. (ii) $s\in A \times A^*$ means there exists  | 
|
140  | 
an $n$ such that $s\in A\times A^n$. This in turn means  | 
|
141  | 
        $s\in A^{n + 1}$ (by definition of power). But this also means $s\in A^*$.
 | 
|
142  | 
        So $\{[]\} \cup A \times A^*$ is a subset of $A^*$. Done!
 | 
|
143  | 
}  | 
|
| 876 | 144  | 
|
| 416 | 145  | 
\item What is meant by the notions \emph{evil regular expressions}
 | 
| 726 | 146  | 
  and by \emph{catastrophic backtracking}?
 | 
147  | 
||
| 961 | 148  | 
  \solution{catastrophic backtracking also applies to other regexes,
 | 
149  | 
not just $(a^*)^*b$. Maybe  | 
|
150  | 
    \url{https://www.trevorlasn.com/blog/when-regex-goes-wrong/} is
 | 
|
151  | 
of help - even the CrowdStrike issue had an underlying problem  | 
|
152  | 
with a regex, though this one was not due to catastrophic  | 
|
153  | 
backtracking.}  | 
|
| 876 | 154  | 
|
| 726 | 155  | 
\item Given the regular expression $(a + b)^* \cdot b \cdot (a + b)^*$,  | 
| 841 | 156  | 
which of the following regular expressions are equivalent  | 
| 726 | 157  | 
|
158  | 
\begin{center}
 | 
|
159  | 
\begin{tabular}{ll}    
 | 
|
160  | 
1) & $(ab + bb)^* \cdot (a + b)^*$\\ % no  | 
|
161  | 
2) & $(a + b)^* \cdot (ba + bb + b) \cdot (a + b)^*$\\ % yes  | 
|
162  | 
3) & $(a + b)^* \cdot (a + b) \cdot (a + b)^*$ % no  | 
|
163  | 
\end{tabular}
 | 
|
164  | 
\end{center}
 | 
|
| 876 | 165  | 
|
166  | 
  \solution{no, yes (why?), no.}
 | 
|
| 921 | 167  | 
|
168  | 
||
169  | 
\item Given the extended regular expression \texttt{[b-d]a?e+},
 | 
|
170  | 
what does the equivalent basic regular expression look like?  | 
|
| 726 | 171  | 
|
| 934 | 172  | 
  \solution{$(b + c + d) \cdot (a + \ONE) \cdot (e \cdot e^*)$}
 | 
| 921 | 173  | 
|
174  | 
||
| 
403
 
564f7584eff1
updated
 
Christian Urban <christian dot urban at kcl dot ac dot uk> 
parents: 
401 
diff
changeset
 | 
175  | 
\item \POSTSCRIPT  | 
| 0 | 176  | 
\end{enumerate}
 | 
177  | 
||
178  | 
\end{document}
 | 
|
179  | 
||
180  | 
%%% Local Variables:  | 
|
181  | 
%%% mode: latex  | 
|
182  | 
%%% TeX-master: t  | 
|
183  | 
%%% End:  |