author | Christian Urban <christian.urban@kcl.ac.uk> |
Fri, 11 Oct 2024 19:13:00 +0100 | |
changeset 967 | ce5de01b9632 |
parent 965 | 94f5cce73a4f |
permissions | -rw-r--r-- |
631 | 1 |
% !TEX program = xelatex |
0 | 2 |
\documentclass{article} |
249
377c59df7297
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
227
diff
changeset
|
3 |
\usepackage{../style} |
0 | 4 |
|
5 |
\begin{document} |
|
6 |
||
7 |
\section*{Homework 1} |
|
8 |
||
964 | 9 |
%%\HEADER |
331
a2c18456c6b7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
294
diff
changeset
|
10 |
|
0 | 11 |
\begin{enumerate} |
249
377c59df7297
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
227
diff
changeset
|
12 |
|
401
5d85dc9779b1
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
394
diff
changeset
|
13 |
\item {\bf (Optional)} If you want to run the code presented |
5d85dc9779b1
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
394
diff
changeset
|
14 |
in the lectures, install the Scala programming language |
5d85dc9779b1
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
394
diff
changeset
|
15 |
available (for free) from |
249
377c59df7297
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
227
diff
changeset
|
16 |
|
743 | 17 |
\begin{center} |
18 |
\url{http://www.scala-lang.org} |
|
19 |
\end{center} |
|
20 |
||
965 | 21 |
and the Ammonite REPL from |
22 |
||
23 |
\begin{center} |
|
24 |
\url{https://ammonite.io} |
|
25 |
\end{center} |
|
0 | 26 |
|
401
5d85dc9779b1
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
394
diff
changeset
|
27 |
If you want to follow the code I present during the |
962 | 28 |
lectures, it might be useful to install VS Code or Codium. |
965 | 29 |
Please have a look at the handout about Ammonite and |
30 |
if you need a refresher for Scala - I linked on KEATS |
|
31 |
the Scala handout from PEP. |
|
32 |
%I will be using Scala Version 3.5, which has the \texttt{scala-cli} |
|
33 |
%REPL used in PEP already built in. |
|
962 | 34 |
|
35 |
%handout about Scala. |
|
36 |
%Make sure Ammonite |
|
37 |
%uses the Scala 3 compiler. |
|
0 | 38 |
|
639 | 39 |
%\item {\bf (Optional)} Have a look at the crawler programs. |
40 |
% Can you find a usage for them in your daily programming |
|
41 |
% life? Can you improve them? For example in cases there |
|
42 |
% are links that appear on different recursion levels, the |
|
43 |
% crawlers visit such web-pages several times. Can this be |
|
44 |
% avoided? Also, the crawlers flag as problematic any page |
|
45 |
% that gives an error, but probably only 404 Not Found |
|
46 |
% errors should be flagged. Can you change that?) |
|
104
ffde837b1db1
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
102
diff
changeset
|
47 |
|
640 | 48 |
\item {\bf (Optional)} Have a look at the catastrophic backtracking |
49 |
programs uploaded on KEATS. Convince yourself that they really require |
|
50 |
a lot of computation time. If you have similar examples in your own |
|
51 |
favourite programming language, I am happy to hear about it. |
|
52 |
||
53 |
||
401
5d85dc9779b1
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
394
diff
changeset
|
54 |
\item Read the handout of the first lecture and the handout |
5d85dc9779b1
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
394
diff
changeset
|
55 |
about notation. Make sure you understand the concepts of |
498 | 56 |
strings and languages. In the context of the CFL-course, |
401
5d85dc9779b1
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
394
diff
changeset
|
57 |
what is meant by the term \emph{language}? |
9 | 58 |
|
876 | 59 |
\solution{A language - in this context - is just a set of |
60 |
strings. Some of these sets can actually not be described by |
|
61 |
regular expressions. Only regular​ languages can. This is |
|
62 |
something for lecture 3.} |
|
63 |
||
550 | 64 |
\item Give the definition for regular expressions---this is an |
498 | 65 |
inductive datatype. What is the |
355
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
66 |
meaning of a regular expression? (Hint: The meaning is |
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
67 |
defined recursively.) |
0 | 68 |
|
876 | 69 |
\solution{Here I would also expect the grammar for basic regular |
70 |
expressions and the definition of the recursive L-function. Discuss |
|
71 |
differences between $r_1 + r_2$ and $r^+$. Discuss differences between |
|
72 |
``real-life regexes'' and regexes in this module.} |
|
73 |
||
355
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
74 |
\item Assume the concatenation operation of two strings is |
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
75 |
written as $s_1 @ s_2$. Define the operation of |
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
76 |
\emph{concatenating} two sets of strings. This operation |
394
2f9fe225ecc8
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
355
diff
changeset
|
77 |
is also written as $\_ \,@\, \_$. According to |
2f9fe225ecc8
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
355
diff
changeset
|
78 |
this definition, what is $A \,@\, \{\}$ equal to? |
498 | 79 |
Is in general $A\,@\,B$ equal to $B\,@\,A$? |
0 | 80 |
|
876 | 81 |
\solution{ What is $A @ {[]}$? Are there special cases |
962 | 82 |
where $A @ B = B @ A$? Obviously when $A = B$ the stament is true. |
83 |
But there are also cases when $A \not= B$, for example $A = \{a\}$ |
|
84 |
and $B = \{aaa\}$.} |
|
876 | 85 |
|
355
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
86 |
\item Assume a set $A$ contains 4 strings and a set $B$ |
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
87 |
contains 7 strings. None of the strings is the empty |
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
88 |
string. How many strings are in $A \,@\, B$? |
249
377c59df7297
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
227
diff
changeset
|
89 |
|
962 | 90 |
\solution{Everyone will probably answer with 28, but there are corner cases where there are fewer |
876 | 91 |
than 28 elements. Can students think of such corner cases? |
92 |
For example $A = \{a, ab, \ldots\}$, $B = \{bc, c,\ldots\}$ } |
|
93 |
||
267
a1544b804d1e
updated homeworks
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
258
diff
changeset
|
94 |
\item How is the power of a language defined? (Hint: There are two |
a1544b804d1e
updated homeworks
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
258
diff
changeset
|
95 |
rules, one for $\_^0$ and one for $\_^{n+1}$.) |
109
f2a90dda7e3b
added
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
104
diff
changeset
|
96 |
|
876 | 97 |
\solution{Two rules: 0-case and n+1 case.} |
98 |
||
355
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
99 |
\item Let $A = \{[a], [b], [c], [d]\}$. (1) How many strings |
438
84608b4b3578
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
416
diff
changeset
|
100 |
are in $A^4$? (2) Consider also the case of $A^4$ where one of |
355
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
101 |
the strings in $A$ is the empty string, for example $A = |
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
102 |
\{[a], [b], [c], []\}$. |
293
ca349cfe3474
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
267
diff
changeset
|
103 |
|
876 | 104 |
\solution{121 is correct. But make sure you understand why it is 121 |
105 |
in cases you do not have a computer at your fingertips.} |
|
106 |
||
507 | 107 |
\item (1) How many basic regular expressions are there to match |
776 | 108 |
\textbf{only} the string $abcd$? (2) How many if they cannot include |
444
3056a4c071b0
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
438
diff
changeset
|
109 |
$\ONE$ and $\ZERO$? (3) How many if they are also not |
3056a4c071b0
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
438
diff
changeset
|
110 |
allowed to contain stars? (4) How many if they are also |
401
5d85dc9779b1
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
394
diff
changeset
|
111 |
not allowed to contain $\_ + \_$? |
0 | 112 |
|
962 | 113 |
\solution{1-3 are infinite (tell the idea why and give examples); |
114 |
4 is five - remember regexes are trees (that is the main point of the question.} |
|
876 | 115 |
|
355
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
116 |
\item When are two regular expressions equivalent? Can you |
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
117 |
think of instances where two regular expressions match |
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
118 |
the same strings, but it is not so obvious that they do? |
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
119 |
For example $a + b$ and $b + a$ do not count\ldots they |
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
120 |
obviously match the same strings, namely $[a]$ and |
a259eec25156
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
331
diff
changeset
|
121 |
$[b]$. |
403
564f7584eff1
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
401
diff
changeset
|
122 |
|
876 | 123 |
\solution{for example $r^* = 1 + r \cdot r^*$ for any regular expression $r$. |
967 | 124 |
Can students think about why this is the case? - this would need a proof.\bigskip |
125 |
||
126 |
||
127 |
Lemma to prove: $A^* = \{[]\} \cup A \times A^*$ for any language $A$.\bigskip |
|
128 |
||
129 |
The definition of ${}^*$: $\bigcup n. A^n$\bigskip |
|
130 |
||
131 |
We first show $\subseteq$ of the lemma: a string $s \in A^*$, must also be in |
|
132 |
$\bigcup n. A^n$, which means there is an $n$ such that $s\in A^n$. |
|
133 |
If $n = 0$ then $s = []$ and $s$ is also in $\{[]\} \cup A \times A^*$. |
|
134 |
If $n$ is bigger than $0$, then $s\in A^n$, which means by |
|
135 |
definition of power that $s\in A \times A^{n - 1}$. But then |
|
136 |
also $s \in A \times A^*$. That is one direction.\bigskip |
|
137 |
||
138 |
The other direction: Two cases: (i) $s\in \{[]\}$ then |
|
139 |
also $s\in A^*$. (ii) $s\in A \times A^*$ means there exists |
|
140 |
an $n$ such that $s\in A\times A^n$. This in turn means |
|
141 |
$s\in A^{n + 1}$ (by definition of power). But this also means $s\in A^*$. |
|
142 |
So $\{[]\} \cup A \times A^*$ is a subset of $A^*$. Done! |
|
143 |
} |
|
876 | 144 |
|
416 | 145 |
\item What is meant by the notions \emph{evil regular expressions} |
726 | 146 |
and by \emph{catastrophic backtracking}? |
147 |
||
962 | 148 |
\solution{catastrophic backtracking also applies to other regexes, |
149 |
not just $(a^*)^*b$. Maybe |
|
150 |
\url{https://www.trevorlasn.com/blog/when-regex-goes-wrong/} is |
|
151 |
of help - even the CrowdStrike issue had an underlying problem |
|
152 |
with a regex, though this one was not due to catastrophic |
|
153 |
backtracking.} |
|
876 | 154 |
|
726 | 155 |
\item Given the regular expression $(a + b)^* \cdot b \cdot (a + b)^*$, |
841 | 156 |
which of the following regular expressions are equivalent |
726 | 157 |
|
158 |
\begin{center} |
|
159 |
\begin{tabular}{ll} |
|
160 |
1) & $(ab + bb)^* \cdot (a + b)^*$\\ % no |
|
161 |
2) & $(a + b)^* \cdot (ba + bb + b) \cdot (a + b)^*$\\ % yes |
|
162 |
3) & $(a + b)^* \cdot (a + b) \cdot (a + b)^*$ % no |
|
163 |
\end{tabular} |
|
164 |
\end{center} |
|
876 | 165 |
|
166 |
\solution{no, yes (why?), no.} |
|
922 | 167 |
|
168 |
||
169 |
\item Given the extended regular expression \texttt{[b-d]a?e+}, |
|
170 |
what does the equivalent basic regular expression look like? |
|
726 | 171 |
|
935 | 172 |
\solution{$(b + c + d) \cdot (a + \ONE) \cdot (e \cdot e^*)$} |
922 | 173 |
|
174 |
||
403
564f7584eff1
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
401
diff
changeset
|
175 |
\item \POSTSCRIPT |
0 | 176 |
\end{enumerate} |
177 |
||
178 |
\end{document} |
|
179 |
||
180 |
%%% Local Variables: |
|
181 |
%%% mode: latex |
|
182 |
%%% TeX-master: t |
|
183 |
%%% End: |