% !TEX program = xelatex\documentclass{article}\usepackage{../style}\begin{document}\section*{Homework 1}%%\HEADER\begin{enumerate}\item {\bf (Optional)} If you want to run the code presented in the lectures, install the Scala programming language available (for free) from \begin{center} \url{http://www.scala-lang.org} \end{center} and the Ammonite REPL from \begin{center} \url{https://ammonite.io} \end{center} If you want to follow the code I present during the lectures, it might be useful to install VS Code or Codium. Please have a look at the handout about Ammonite and if you need a refresher for Scala - I linked on KEATS the Scala handout from PEP. %I will be using Scala Version 3.5, which has the \texttt{scala-cli} %REPL used in PEP already built in. %handout about Scala. %Make sure Ammonite %uses the Scala 3 compiler.%\item {\bf (Optional)} Have a look at the crawler programs.% Can you find a usage for them in your daily programming% life? Can you improve them? For example in cases there% are links that appear on different recursion levels, the% crawlers visit such web-pages several times. Can this be% avoided? Also, the crawlers flag as problematic any page% that gives an error, but probably only 404 Not Found% errors should be flagged. Can you change that?)\item {\bf (Optional)} Have a look at the catastrophic backtracking programs uploaded on KEATS. Convince yourself that they really require a lot of computation time. If you have similar examples in your own favourite programming language, I am happy to hear about it.\item Read the handout of the first lecture and the handout about notation. Make sure you understand the concepts of strings and languages. In the context of the CFL-course, what is meant by the term \emph{language}? \solution{A language - in this context - is just a set of strings. Some of these sets can actually not be described by regular expressions. Only regular​ languages can. This is something for lecture 3.}\item Give the definition for regular expressions---this is an inductive datatype. What is the meaning of a regular expression? (Hint: The meaning is defined recursively.) \solution{Here I would also expect the grammar for basic regular expressions and the definition of the recursive L-function. Discuss differences between $r_1 + r_2$ and $r^+$. Discuss differences between ``real-life regexes'' and regexes in this module.}\item Assume the concatenation operation of two strings is written as $s_1 @ s_2$. Define the operation of \emph{concatenating} two sets of strings. This operation is also written as $\_ \,@\, \_$. According to this definition, what is $A \,@\, \{\}$ equal to? Is in general $A\,@\,B$ equal to $B\,@\,A$? \solution{ What is $A @ {[]}$? Are there special cases where $A @ B = B @ A$? Obviously when $A = B$ the stament is true. But there are also cases when $A \not= B$, for example $A = \{a\}$ and $B = \{aaa\}$.}\item Assume a set $A$ contains 4 strings and a set $B$ contains 7 strings. None of the strings is the empty string. How many strings are in $A \,@\, B$? \solution{Everyone will probably answer with 28, but there are corner cases where there are fewer than 28 elements. Can students think of such corner cases? For example $A = \{a, ab, \ldots\}$, $B = \{bc, c,\ldots\}$ }\item How is the power of a language defined? (Hint: There are two rules, one for $\_^0$ and one for $\_^{n+1}$.) \solution{Two rules: 0-case and n+1 case.}\item Let $A = \{[a], [b], [c], [d]\}$. (1) How many strings are in $A^4$? (2) Consider also the case of $A^4$ where one of the strings in $A$ is the empty string, for example $A = \{[a], [b], [c], []\}$. \solution{121 is correct. But make sure you understand why it is 121 in cases you do not have a computer at your fingertips.}\item (1) How many basic regular expressions are there to match \textbf{only} the string $abcd$? (2) How many if they cannot include $\ONE$ and $\ZERO$? (3) How many if they are also not allowed to contain stars? (4) How many if they are also not allowed to contain $\_ + \_$? \solution{1-3 are infinite (tell the idea why and give examples); 4 is five - remember regexes are trees (that is the main point of the question.}\item When are two regular expressions equivalent? Can you think of instances where two regular expressions match the same strings, but it is not so obvious that they do? For example $a + b$ and $b + a$ do not count\ldots they obviously match the same strings, namely $[a]$ and $[b]$. \solution{for example $r^* = 1 + r \cdot r^*$ for any regular expression $r$. Can students think about why this is the case? - this would need a proof.\bigskip Lemma to prove: $A^* = \{[]\} \cup A \times A^*$ for any language $A$.\bigskip The definition of ${}^*$: $\bigcup n. A^n$\bigskip We first show $\subseteq$ of the lemma: a string $s \in A^*$, must also be in $\bigcup n. A^n$, which means there is an $n$ such that $s\in A^n$. If $n = 0$ then $s = []$ and $s$ is also in $\{[]\} \cup A \times A^*$. If $n$ is bigger than $0$, then $s\in A^n$, which means by definition of power that $s\in A \times A^{n - 1}$. But then also $s \in A \times A^*$. That is one direction.\bigskip The other direction: Two cases: (i) $s\in \{[]\}$ then also $s\in A^*$. (ii) $s\in A \times A^*$ means there exists an $n$ such that $s\in A\times A^n$. This in turn means $s\in A^{n + 1}$ (by definition of power). But this also means $s\in A^*$. So $\{[]\} \cup A \times A^*$ is a subset of $A^*$. Done! }\item What is meant by the notions \emph{evil regular expressions} and by \emph{catastrophic backtracking}? \solution{catastrophic backtracking also applies to other regexes, not just $(a^*)^*b$. Maybe \url{https://www.trevorlasn.com/blog/when-regex-goes-wrong/} is of help - even the CrowdStrike issue had an underlying problem with a regex, though this one was not due to catastrophic backtracking.}\item Given the regular expression $(a + b)^* \cdot b \cdot (a + b)^*$, which of the following regular expressions are equivalent\begin{center}\begin{tabular}{ll} 1) & $(ab + bb)^* \cdot (a + b)^*$\\ % no 2) & $(a + b)^* \cdot (ba + bb + b) \cdot (a + b)^*$\\ % yes 3) & $(a + b)^* \cdot (a + b) \cdot (a + b)^*$ % no\end{tabular}\end{center} \solution{no, yes (why?), no.}\item Given the extended regular expression \texttt{[b-d]a?e+}, what does the equivalent basic regular expression look like? \solution{$(b + c + d) \cdot (a + \ONE) \cdot (e \cdot e^*)$}\item \POSTSCRIPT \end{enumerate}\end{document}%%% Local Variables: %%% mode: latex%%% TeX-master: t%%% End: