| author | Christian Urban <christian.urban@kcl.ac.uk> | 
| Sun, 28 Sep 2025 14:06:57 +0100 | |
| changeset 993 | 9a85924835c0 | 
| parent 967 | ce5de01b9632 | 
| permissions | -rw-r--r-- | 
| 631 | 1 | % !TEX program = xelatex | 
| 0 | 2 | \documentclass{article}
 | 
| 249 
377c59df7297
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
227diff
changeset | 3 | \usepackage{../style}
 | 
| 0 | 4 | |
| 5 | \begin{document}
 | |
| 6 | ||
| 7 | \section*{Homework 1}
 | |
| 8 | ||
| 964 | 9 | %%\HEADER | 
| 331 
a2c18456c6b7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
294diff
changeset | 10 | |
| 0 | 11 | \begin{enumerate}
 | 
| 249 
377c59df7297
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
227diff
changeset | 12 | |
| 401 
5d85dc9779b1
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
394diff
changeset | 13 | \item {\bf (Optional)} If you want to run the code presented
 | 
| 
5d85dc9779b1
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
394diff
changeset | 14 | in the lectures, install the Scala programming language | 
| 
5d85dc9779b1
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
394diff
changeset | 15 | available (for free) from | 
| 249 
377c59df7297
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
227diff
changeset | 16 | |
| 743 | 17 |       \begin{center}
 | 
| 18 |         \url{http://www.scala-lang.org}
 | |
| 19 |       \end{center}
 | |
| 20 | ||
| 965 | 21 | and the Ammonite REPL from | 
| 22 | ||
| 23 |        \begin{center}
 | |
| 24 |        \url{https://ammonite.io}
 | |
| 25 |        \end{center}      
 | |
| 0 | 26 | |
| 401 
5d85dc9779b1
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
394diff
changeset | 27 | If you want to follow the code I present during the | 
| 962 | 28 | lectures, it might be useful to install VS Code or Codium. | 
| 965 | 29 | Please have a look at the handout about Ammonite and | 
| 30 | if you need a refresher for Scala - I linked on KEATS | |
| 31 | the Scala handout from PEP. | |
| 32 |       %I will be using Scala Version 3.5, which has the \texttt{scala-cli}
 | |
| 33 | %REPL used in PEP already built in. | |
| 962 | 34 | |
| 35 | %handout about Scala. | |
| 36 | %Make sure Ammonite | |
| 37 | %uses the Scala 3 compiler. | |
| 0 | 38 | |
| 639 | 39 | %\item {\bf (Optional)} Have a look at the crawler programs.
 | 
| 40 | % Can you find a usage for them in your daily programming | |
| 41 | % life? Can you improve them? For example in cases there | |
| 42 | % are links that appear on different recursion levels, the | |
| 43 | % crawlers visit such web-pages several times. Can this be | |
| 44 | % avoided? Also, the crawlers flag as problematic any page | |
| 45 | % that gives an error, but probably only 404 Not Found | |
| 46 | % errors should be flagged. Can you change that?) | |
| 104 
ffde837b1db1
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
102diff
changeset | 47 | |
| 640 | 48 | \item {\bf (Optional)} Have a look at the catastrophic backtracking
 | 
| 49 | programs uploaded on KEATS. Convince yourself that they really require | |
| 50 | a lot of computation time. If you have similar examples in your own | |
| 51 | favourite programming language, I am happy to hear about it. | |
| 52 | ||
| 53 | ||
| 401 
5d85dc9779b1
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
394diff
changeset | 54 | \item Read the handout of the first lecture and the handout | 
| 
5d85dc9779b1
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
394diff
changeset | 55 | about notation. Make sure you understand the concepts of | 
| 498 | 56 | strings and languages. In the context of the CFL-course, | 
| 401 
5d85dc9779b1
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
394diff
changeset | 57 |       what is meant by the term \emph{language}?
 | 
| 9 | 58 | |
| 876 | 59 |       \solution{A language - in this context - is just a set of
 | 
| 60 | strings. Some of these sets can actually not be described by | |
| 61 | regular expressions. Only regular​ languages can. This is | |
| 62 | something for lecture 3.} | |
| 63 | ||
| 550 | 64 | \item Give the definition for regular expressions---this is an | 
| 498 | 65 | inductive datatype. What is the | 
| 355 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 66 | meaning of a regular expression? (Hint: The meaning is | 
| 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 67 | defined recursively.) | 
| 0 | 68 | |
| 876 | 69 |       \solution{Here I would also expect the grammar for basic regular
 | 
| 70 | expressions and the definition of the recursive L-function. Discuss | |
| 71 | differences between $r_1 + r_2$ and $r^+$. Discuss differences between | |
| 72 | ``real-life regexes'' and regexes in this module.} | |
| 73 | ||
| 355 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 74 | \item Assume the concatenation operation of two strings is | 
| 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 75 | written as $s_1 @ s_2$. Define the operation of | 
| 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 76 |       \emph{concatenating} two sets of strings. This operation
 | 
| 394 
2f9fe225ecc8
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
355diff
changeset | 77 | is also written as $\_ \,@\, \_$. According to | 
| 
2f9fe225ecc8
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
355diff
changeset | 78 |       this definition, what is $A \,@\, \{\}$ equal to?
 | 
| 498 | 79 | Is in general $A\,@\,B$ equal to $B\,@\,A$? | 
| 0 | 80 | |
| 876 | 81 |       \solution{ What is $A @ {[]}$? Are there special cases
 | 
| 962 | 82 | where $A @ B = B @ A$? Obviously when $A = B$ the stament is true. | 
| 83 |         But there are also cases when $A \not= B$, for example $A = \{a\}$
 | |
| 84 |       and $B = \{aaa\}$.}
 | |
| 876 | 85 | |
| 355 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 86 | \item Assume a set $A$ contains 4 strings and a set $B$ | 
| 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 87 | contains 7 strings. None of the strings is the empty | 
| 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 88 | string. How many strings are in $A \,@\, B$? | 
| 249 
377c59df7297
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
227diff
changeset | 89 | |
| 962 | 90 |       \solution{Everyone will probably answer with 28, but there are corner cases where there are fewer
 | 
| 876 | 91 | than 28 elements. Can students think of such corner cases? | 
| 92 |       For example $A = \{a, ab, \ldots\}$, $B = \{bc, c,\ldots\}$ }
 | |
| 93 | ||
| 267 
a1544b804d1e
updated homeworks
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
258diff
changeset | 94 | \item How is the power of a language defined? (Hint: There are two | 
| 
a1544b804d1e
updated homeworks
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
258diff
changeset | 95 |   rules, one for $\_^0$ and one for $\_^{n+1}$.)
 | 
| 109 
f2a90dda7e3b
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
104diff
changeset | 96 | |
| 876 | 97 |      \solution{Two rules: 0-case and n+1 case.}
 | 
| 98 | ||
| 355 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 99 | \item Let $A = \{[a], [b], [c], [d]\}$. (1) How many strings
 | 
| 438 
84608b4b3578
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
416diff
changeset | 100 | are in $A^4$? (2) Consider also the case of $A^4$ where one of | 
| 355 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 101 | the strings in $A$ is the empty string, for example $A = | 
| 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 102 |       \{[a], [b], [c], []\}$.
 | 
| 293 
ca349cfe3474
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
267diff
changeset | 103 | |
| 876 | 104 |       \solution{121 is correct. But make sure you understand why it is 121
 | 
| 105 | in cases you do not have a computer at your fingertips.} | |
| 106 | ||
| 507 | 107 | \item (1) How many basic regular expressions are there to match | 
| 776 | 108 |       \textbf{only} the string $abcd$? (2) How many if they cannot include
 | 
| 444 
3056a4c071b0
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
438diff
changeset | 109 | $\ONE$ and $\ZERO$? (3) How many if they are also not | 
| 
3056a4c071b0
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
438diff
changeset | 110 | allowed to contain stars? (4) How many if they are also | 
| 401 
5d85dc9779b1
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
394diff
changeset | 111 | not allowed to contain $\_ + \_$? | 
| 0 | 112 | |
| 962 | 113 |       \solution{1-3 are infinite (tell the idea why and give examples);
 | 
| 114 | 4 is five - remember regexes are trees (that is the main point of the question.} | |
| 876 | 115 | |
| 355 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 116 | \item When are two regular expressions equivalent? Can you | 
| 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 117 | think of instances where two regular expressions match | 
| 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 118 | the same strings, but it is not so obvious that they do? | 
| 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 119 | For example $a + b$ and $b + a$ do not count\ldots they | 
| 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 120 | obviously match the same strings, namely $[a]$ and | 
| 
a259eec25156
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
331diff
changeset | 121 | $[b]$. | 
| 403 
564f7584eff1
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
401diff
changeset | 122 | |
| 876 | 123 |       \solution{for example $r^* = 1 + r \cdot r^*$ for any regular expression $r$.
 | 
| 967 | 124 | Can students think about why this is the case? - this would need a proof.\bigskip | 
| 125 | ||
| 126 | ||
| 127 |         Lemma to prove: $A^* = \{[]\} \cup A \times A^*$ for any language $A$.\bigskip
 | |
| 128 | ||
| 129 |         The definition of ${}^*$: $\bigcup n. A^n$\bigskip
 | |
| 130 | ||
| 131 | We first show $\subseteq$ of the lemma: a string $s \in A^*$, must also be in | |
| 132 | $\bigcup n. A^n$, which means there is an $n$ such that $s\in A^n$. | |
| 133 |         If $n = 0$ then $s = []$ and $s$ is also in $\{[]\} \cup A \times A^*$.
 | |
| 134 | If $n$ is bigger than $0$, then $s\in A^n$, which means by | |
| 135 |         definition of power that $s\in A \times A^{n - 1}$. But then
 | |
| 136 | also $s \in A \times A^*$. That is one direction.\bigskip | |
| 137 | ||
| 138 |         The other direction: Two cases: (i) $s\in \{[]\}$ then
 | |
| 139 | also $s\in A^*$. (ii) $s\in A \times A^*$ means there exists | |
| 140 | an $n$ such that $s\in A\times A^n$. This in turn means | |
| 141 |         $s\in A^{n + 1}$ (by definition of power). But this also means $s\in A^*$.
 | |
| 142 |         So $\{[]\} \cup A \times A^*$ is a subset of $A^*$. Done!
 | |
| 143 | } | |
| 876 | 144 | |
| 416 | 145 | \item What is meant by the notions \emph{evil regular expressions}
 | 
| 726 | 146 |   and by \emph{catastrophic backtracking}?
 | 
| 147 | ||
| 962 | 148 |   \solution{catastrophic backtracking also applies to other regexes,
 | 
| 149 | not just $(a^*)^*b$. Maybe | |
| 150 |     \url{https://www.trevorlasn.com/blog/when-regex-goes-wrong/} is
 | |
| 151 | of help - even the CrowdStrike issue had an underlying problem | |
| 152 | with a regex, though this one was not due to catastrophic | |
| 153 | backtracking.} | |
| 876 | 154 | |
| 726 | 155 | \item Given the regular expression $(a + b)^* \cdot b \cdot (a + b)^*$, | 
| 841 | 156 | which of the following regular expressions are equivalent | 
| 726 | 157 | |
| 158 | \begin{center}
 | |
| 159 | \begin{tabular}{ll}    
 | |
| 160 | 1) & $(ab + bb)^* \cdot (a + b)^*$\\ % no | |
| 161 | 2) & $(a + b)^* \cdot (ba + bb + b) \cdot (a + b)^*$\\ % yes | |
| 162 | 3) & $(a + b)^* \cdot (a + b) \cdot (a + b)^*$ % no | |
| 163 | \end{tabular}
 | |
| 164 | \end{center}
 | |
| 876 | 165 | |
| 166 |   \solution{no, yes (why?), no.}
 | |
| 922 | 167 | |
| 168 | ||
| 169 | \item Given the extended regular expression \texttt{[b-d]a?e+},
 | |
| 170 | what does the equivalent basic regular expression look like? | |
| 726 | 171 | |
| 935 | 172 |   \solution{$(b + c + d) \cdot (a + \ONE) \cdot (e \cdot e^*)$}
 | 
| 922 | 173 | |
| 174 | ||
| 403 
564f7584eff1
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
401diff
changeset | 175 | \item \POSTSCRIPT | 
| 0 | 176 | \end{enumerate}
 | 
| 177 | ||
| 178 | \end{document}
 | |
| 179 | ||
| 180 | %%% Local Variables: | |
| 181 | %%% mode: latex | |
| 182 | %%% TeX-master: t | |
| 183 | %%% End: |