% !TEX program = xelatex
\documentclass{article}
\usepackage{../style}
\begin{document}
\section*{Homework 1}
%%\HEADER
\begin{enumerate}
\item {\bf (Optional)} If you want to run the code presented
in the lectures, install the Scala programming language
available (for free) from
\begin{center}
\url{http://www.scala-lang.org}
\end{center}
and the Ammonite REPL from
\begin{center}
\url{https://ammonite.io}
\end{center}
If you want to follow the code I present during the
lectures, it might be useful to install VS Code or Codium.
Please have a look at the handout about Ammonite and
if you need a refresher for Scala - I linked on KEATS
the Scala handout from PEP.
%I will be using Scala Version 3.5, which has the \texttt{scala-cli}
%REPL used in PEP already built in.
%handout about Scala.
%Make sure Ammonite
%uses the Scala 3 compiler.
%\item {\bf (Optional)} Have a look at the crawler programs.
% Can you find a usage for them in your daily programming
% life? Can you improve them? For example in cases there
% are links that appear on different recursion levels, the
% crawlers visit such web-pages several times. Can this be
% avoided? Also, the crawlers flag as problematic any page
% that gives an error, but probably only 404 Not Found
% errors should be flagged. Can you change that?)
\item {\bf (Optional)} Have a look at the catastrophic backtracking
programs uploaded on KEATS. Convince yourself that they really require
a lot of computation time. If you have similar examples in your own
favourite programming language, I am happy to hear about it.
\item Read the handout of the first lecture and the handout
about notation. Make sure you understand the concepts of
strings and languages. In the context of the CFL-course,
what is meant by the term \emph{language}?
\solution{A language - in this context - is just a set of
strings. Some of these sets can actually not be described by
regular expressions. Only regular​ languages can. This is
something for lecture 3.}
\item Give the definition for regular expressions---this is an
inductive datatype. What is the
meaning of a regular expression? (Hint: The meaning is
defined recursively.)
\solution{Here I would also expect the grammar for basic regular
expressions and the definition of the recursive L-function. Discuss
differences between $r_1 + r_2$ and $r^+$. Discuss differences between
``real-life regexes'' and regexes in this module.}
\item Assume the concatenation operation of two strings is
written as $s_1 @ s_2$. Define the operation of
\emph{concatenating} two sets of strings. This operation
is also written as $\_ \,@\, \_$. According to
this definition, what is $A \,@\, \{\}$ equal to?
Is in general $A\,@\,B$ equal to $B\,@\,A$?
\solution{ What is $A @ {[]}$? Are there special cases
where $A @ B = B @ A$? Obviously when $A = B$ the stament is true.
But there are also cases when $A \not= B$, for example $A = \{a\}$
and $B = \{aaa\}$.}
\item Assume a set $A$ contains 4 strings and a set $B$
contains 7 strings. None of the strings is the empty
string. How many strings are in $A \,@\, B$?
\solution{Everyone will probably answer with 28, but there are corner cases where there are fewer
than 28 elements. Can students think of such corner cases?
For example $A = \{a, ab, \ldots\}$, $B = \{bc, c,\ldots\}$ }
\item How is the power of a language defined? (Hint: There are two
rules, one for $\_^0$ and one for $\_^{n+1}$.)
\solution{Two rules: 0-case and n+1 case.}
\item Let $A = \{[a], [b], [c], [d]\}$. (1) How many strings
are in $A^4$? (2) Consider also the case of $A^4$ where one of
the strings in $A$ is the empty string, for example $A =
\{[a], [b], [c], []\}$.
\solution{121 is correct. But make sure you understand why it is 121
in cases you do not have a computer at your fingertips.}
\item (1) How many basic regular expressions are there to match
\textbf{only} the string $abcd$? (2) How many if they cannot include
$\ONE$ and $\ZERO$? (3) How many if they are also not
allowed to contain stars? (4) How many if they are also
not allowed to contain $\_ + \_$?
\solution{1-3 are infinite (tell the idea why and give examples);
4 is five - remember regexes are trees (that is the main point of the question.}
\item When are two regular expressions equivalent? Can you
think of instances where two regular expressions match
the same strings, but it is not so obvious that they do?
For example $a + b$ and $b + a$ do not count\ldots they
obviously match the same strings, namely $[a]$ and
$[b]$.
\solution{for example $r^* = 1 + r \cdot r^*$ for any regular expression $r$.
Can students think about why this is the case? - this would need a proof.\bigskip
Lemma to prove: $A^* = \{[]\} \cup A \times A^*$ for any language $A$.\bigskip
The definition of ${}^*$: $\bigcup n. A^n$\bigskip
We first show $\subseteq$ of the lemma: a string $s \in A^*$, must also be in
$\bigcup n. A^n$, which means there is an $n$ such that $s\in A^n$.
If $n = 0$ then $s = []$ and $s$ is also in $\{[]\} \cup A \times A^*$.
If $n$ is bigger than $0$, then $s\in A^n$, which means by
definition of power that $s\in A \times A^{n - 1}$. But then
also $s \in A \times A^*$. That is one direction.\bigskip
The other direction: Two cases: (i) $s\in \{[]\}$ then
also $s\in A^*$. (ii) $s\in A \times A^*$ means there exists
an $n$ such that $s\in A\times A^n$. This in turn means
$s\in A^{n + 1}$ (by definition of power). But this also means $s\in A^*$.
So $\{[]\} \cup A \times A^*$ is a subset of $A^*$. Done!
}
\item What is meant by the notions \emph{evil regular expressions}
and by \emph{catastrophic backtracking}?
\solution{catastrophic backtracking also applies to other regexes,
not just $(a^*)^*b$. Maybe
\url{https://www.trevorlasn.com/blog/when-regex-goes-wrong/} is
of help - even the CrowdStrike issue had an underlying problem
with a regex, though this one was not due to catastrophic
backtracking.}
\item Given the regular expression $(a + b)^* \cdot b \cdot (a + b)^*$,
which of the following regular expressions are equivalent
\begin{center}
\begin{tabular}{ll}
1) & $(ab + bb)^* \cdot (a + b)^*$\\ % no
2) & $(a + b)^* \cdot (ba + bb + b) \cdot (a + b)^*$\\ % yes
3) & $(a + b)^* \cdot (a + b) \cdot (a + b)^*$ % no
\end{tabular}
\end{center}
\solution{no, yes (why?), no.}
\item Given the extended regular expression \texttt{[b-d]a?e+},
what does the equivalent basic regular expression look like?
\solution{$(b + c + d) \cdot (a + \ONE) \cdot (e \cdot e^*)$}
\item \POSTSCRIPT
\end{enumerate}
\end{document}
%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: