% !TEX program = xelatex+ −
\documentclass{article}+ −
\usepackage{../style}+ −
+ −
\begin{document}+ −
+ −
\section*{Homework 1}+ −
+ −
%%\HEADER+ −
+ −
\begin{enumerate}+ −
+ −
\item {\bf (Optional)} If you want to run the code presented+ −
in the lectures, install the Scala programming language+ −
available (for free) from+ −
+ −
\begin{center}+ −
\url{http://www.scala-lang.org}+ −
\end{center}+ −
+ −
and the Ammonite REPL from+ −
+ −
\begin{center}+ −
\url{https://ammonite.io}+ −
\end{center} + −
+ −
If you want to follow the code I present during the+ −
lectures, it might be useful to install VS Code or Codium.+ −
Please have a look at the handout about Ammonite and+ −
if you need a refresher for Scala - I linked on KEATS+ −
the Scala handout from PEP.+ −
%I will be using Scala Version 3.5, which has the \texttt{scala-cli}+ −
%REPL used in PEP already built in.+ −
+ −
%handout about Scala.+ −
%Make sure Ammonite+ −
%uses the Scala 3 compiler.+ −
+ −
%\item {\bf (Optional)} Have a look at the crawler programs.+ −
% Can you find a usage for them in your daily programming+ −
% life? Can you improve them? For example in cases there+ −
% are links that appear on different recursion levels, the+ −
% crawlers visit such web-pages several times. Can this be+ −
% avoided? Also, the crawlers flag as problematic any page+ −
% that gives an error, but probably only 404 Not Found+ −
% errors should be flagged. Can you change that?)+ −
+ −
\item {\bf (Optional)} Have a look at the catastrophic backtracking+ −
programs uploaded on KEATS. Convince yourself that they really require+ −
a lot of computation time. If you have similar examples in your own+ −
favourite programming language, I am happy to hear about it.+ −
+ −
+ −
\item Read the handout of the first lecture and the handout+ −
about notation. Make sure you understand the concepts of+ −
strings and languages. In the context of the CFL-course,+ −
what is meant by the term \emph{language}?+ −
+ −
\solution{A language - in this context - is just a set of+ −
strings. Some of these sets can actually not be described by+ −
regular expressions. Only regular​ languages can. This is+ −
something for lecture 3.}+ −
+ −
\item Give the definition for regular expressions---this is an+ −
inductive datatype. What is the+ −
meaning of a regular expression? (Hint: The meaning is+ −
defined recursively.)+ −
+ −
\solution{Here I would also expect the grammar for basic regular+ −
expressions and the definition of the recursive L-function. Discuss+ −
differences between $r_1 + r_2$ and $r^+$. Discuss differences between+ −
``real-life regexes'' and regexes in this module.}+ −
+ −
\item Assume the concatenation operation of two strings is+ −
written as $s_1 @ s_2$. Define the operation of+ −
\emph{concatenating} two sets of strings. This operation+ −
is also written as $\_ \,@\, \_$. According to + −
this definition, what is $A \,@\, \{\}$ equal to?+ −
Is in general $A\,@\,B$ equal to $B\,@\,A$?+ −
+ −
\solution{ What is $A @ {[]}$? Are there special cases+ −
where $A @ B = B @ A$? Obviously when $A = B$ the stament is true.+ −
But there are also cases when $A \not= B$, for example $A = \{a\}$+ −
and $B = \{aaa\}$.}+ −
+ −
\item Assume a set $A$ contains 4 strings and a set $B$+ −
contains 7 strings. None of the strings is the empty+ −
string. How many strings are in $A \,@\, B$?+ −
+ −
\solution{Everyone will probably answer with 28, but there are corner cases where there are fewer+ −
than 28 elements. Can students think of such corner cases?+ −
For example $A = \{a, ab, \ldots\}$, $B = \{bc, c,\ldots\}$ }+ −
+ −
\item How is the power of a language defined? (Hint: There are two+ −
rules, one for $\_^0$ and one for $\_^{n+1}$.)+ −
+ −
\solution{Two rules: 0-case and n+1 case.}+ −
+ −
\item Let $A = \{[a], [b], [c], [d]\}$. (1) How many strings+ −
are in $A^4$? (2) Consider also the case of $A^4$ where one of+ −
the strings in $A$ is the empty string, for example $A =+ −
\{[a], [b], [c], []\}$.+ −
+ −
\solution{121 is correct. But make sure you understand why it is 121+ −
in cases you do not have a computer at your fingertips.}+ −
+ −
\item (1) How many basic regular expressions are there to match+ −
\textbf{only} the string $abcd$? (2) How many if they cannot include+ −
$\ONE$ and $\ZERO$? (3) How many if they are also not+ −
allowed to contain stars? (4) How many if they are also+ −
not allowed to contain $\_ + \_$?+ −
+ −
\solution{1-3 are infinite (tell the idea why and give examples);+ −
4 is five - remember regexes are trees (that is the main point of the question.}+ −
+ −
\item When are two regular expressions equivalent? Can you+ −
think of instances where two regular expressions match+ −
the same strings, but it is not so obvious that they do?+ −
For example $a + b$ and $b + a$ do not count\ldots they+ −
obviously match the same strings, namely $[a]$ and+ −
$[b]$.+ −
+ −
\solution{for example $r^* = 1 + r \cdot r^*$ for any regular expression $r$.+ −
Can students think about why this is the case? - this would need a proof.\bigskip+ −
+ −
+ −
Lemma to prove: $A^* = \{[]\} \cup A \times A^*$ for any language $A$.\bigskip+ −
+ −
The definition of ${}^*$: $\bigcup n. A^n$\bigskip+ −
+ −
We first show $\subseteq$ of the lemma: a string $s \in A^*$, must also be in+ −
$\bigcup n. A^n$, which means there is an $n$ such that $s\in A^n$.+ −
If $n = 0$ then $s = []$ and $s$ is also in $\{[]\} \cup A \times A^*$.+ −
If $n$ is bigger than $0$, then $s\in A^n$, which means by+ −
definition of power that $s\in A \times A^{n - 1}$. But then+ −
also $s \in A \times A^*$. That is one direction.\bigskip+ −
+ −
The other direction: Two cases: (i) $s\in \{[]\}$ then+ −
also $s\in A^*$. (ii) $s\in A \times A^*$ means there exists+ −
an $n$ such that $s\in A\times A^n$. This in turn means+ −
$s\in A^{n + 1}$ (by definition of power). But this also means $s\in A^*$.+ −
So $\{[]\} \cup A \times A^*$ is a subset of $A^*$. Done!+ −
}+ −
+ −
\item What is meant by the notions \emph{evil regular expressions}+ −
and by \emph{catastrophic backtracking}?+ −
+ −
\solution{catastrophic backtracking also applies to other regexes,+ −
not just $(a^*)^*b$. Maybe+ −
\url{https://www.trevorlasn.com/blog/when-regex-goes-wrong/} is+ −
of help - even the CrowdStrike issue had an underlying problem+ −
with a regex, though this one was not due to catastrophic+ −
backtracking.}+ −
+ −
\item Given the regular expression $(a + b)^* \cdot b \cdot (a + b)^*$,+ −
which of the following regular expressions are equivalent+ −
+ −
\begin{center}+ −
\begin{tabular}{ll} + −
1) & $(ab + bb)^* \cdot (a + b)^*$\\ % no+ −
2) & $(a + b)^* \cdot (ba + bb + b) \cdot (a + b)^*$\\ % yes+ −
3) & $(a + b)^* \cdot (a + b) \cdot (a + b)^*$ % no+ −
\end{tabular}+ −
\end{center}+ −
+ −
\solution{no, yes (why?), no.}+ −
+ −
+ −
\item Given the extended regular expression \texttt{[b-d]a?e+},+ −
what does the equivalent basic regular expression look like?+ −
+ −
\solution{$(b + c + d) \cdot (a + \ONE) \cdot (e \cdot e^*)$}+ −
+ −
+ −
\item \POSTSCRIPT + −
\end{enumerate}+ −
+ −
\end{document}+ −
+ −
%%% Local Variables: + −
%%% mode: latex+ −
%%% TeX-master: t+ −
%%% End: + −