afl-material: comparison handouts/notation.tex

equal deleted inserted replaced

-:fc2e3609d5e5
+:d3e477fe6c66
 \usepackage{../langs}
 \begin{document}
-\fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016, 2017, 2018}
+\fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016, 2017, 2018, 2020}
 \section*{A Crash-Course on Notation}
 There are innumerable books available about compilers, automata theory
 readable and understandable), but other times we need pedantic
 precision for actual programs.
 \subsubsection*{Characters and Strings}
-The most important concept in this module are strings. Strings
+The most basic concept in this module are strings. Strings
 are composed of \defn{characters}. While characters are surely
 a familiar concept, we will make one subtle distinction in
 this module. If we want to refer to concrete characters, like
-\code{a}, \code{b}, \code{c} and so on, we use a typewriter font.
+\code{a}, \code{b}, \code{c} and so on, we will use a typewriter font.
 Accordingly if we want to refer to the concrete characters of
 my email address we shall write
 \begin{center}
 \pcode{christian.urban@kcl.ac.uk}
 \noindent In this string, we do not really care what the
 characters stand for, except we do care about the fact that
 for example the character $a$ is not equal to $b$ and so on.
 Why do I make this distinction? Because we often need to
 define functions using variables ranging over characters. We
-need to somehow say this is a variable, say $c$, ranging over
+need to somehow say ``this-is-a-variable'' and give it a name.
-characters, while this is the actual character \pcode{c}.
+In such cases we use the italic font.
 An \defn{alphabet} is a (non-empty) finite set of characters.
 Often the letter $\Sigma$ is used to refer to an alphabet. For
 example the ASCII characters \pcode{a} to \pcode{z} form an
 alphabet. The digits $0$ to $9$ are another alphabet. The
 typed programming languages, such as Scala, strings have a special
 type---namely \pcode{String} which is different from the type
 for lists of characters. This is because strings can be
 efficiently represented in memory, unlike lists. Since
 \code{String} and the type of lists of characters
-(\code{List[Char]}) are not the same, we need to explicitly
+(namely \code{List[Char]}) are not the same, we need to explicitly
 coerce elements between the two types, for example
 \begin{lstlisting}[numbers=none]
 scala> "abc".toList
 res01: List[Char] = List(a, b, c)
 say \dq{\textit{ello}} when making definitions about strings.
 There are also some subtleties with the empty string,
 sometimes written as \dq{} but also as the empty list of
 characters $[\,]$.\footnote{In the literature you can also
 often find that $\varepsilon$ or $\lambda$ is used to
-represent the empty string.}
+represent the empty string. But we are not going to use this notation.}
 Two strings, say $s_1$ and $s_2$, can be \defn{concatenated},
 which we write as $s_1 @ s_2$. If we regard $s_1$ and $s_2$ as
 lists of characters, then $@$ is the list-append function.
 Suppose we are given two strings \dq{\textit{foo}} and
 \dq{\textit{bar}}, then their concatenation, written
 \dq{\textit{foo}} $@$ \dq{\textit{bar}}, gives
 \dq{\textit{foobar}}. But as said above, we will often
 simplify our life and just drop the double quotes whenever it
-is clear we are talking about strings. So we will often just
+is clear we are talking about strings. So we will just
-write \textit{foo}, \textit{bar}, \textit{foobar} or
+write \textit{foo}, \textit{bar}, \textit{foobar}
-\textit{foo $@$ bar}.
+\textit{foo $@$ bar} and so on.
 Occasionally we will use the notation $a^n$ for strings, which stands
 for the string of $n$ repeated $a$s. So $a^{n}b^{n}$ is a string that
-has some number of $a$s followed by the same number of $b$s.  A simple
+has some number of $a$s followed by the same number of $b$s.
-property of string concatenation is \emph{associativity}, meaning
+Confusingly, in Scala the notation is ``times'' for this opration.
+So you can write
+\begin{lstlisting}[numbers=none]
+scala> "a" * 13
+val res02: String = aaaaaaaaaaaaa
+\end{lstlisting}
+\noindent
+A simple property of string concatenation is \emph{associativity}, meaning
 \[(s_1 @ s_2) @ s_3 = s_1 @ (s_2 @ s_3)\]
 \noindent are always equal strings. The empty string behaves
 like a \emph{unit element}, therefore
 A \backslash B & \dn & \{x\;|\; x \in A \wedge x \not\in B\}
 \end{eqnarray*}
 \noindent In general set comprehensions are of the form
 $\{a\;|\;P\}$ which stands for the set of all elements $a$
-(from some set) for which some property $P$ holds.
+(from some set) for which some property $P$ holds. If programming
+is more your-kind-of-thing, you might recognise the similarities
+with for-comprehensions, for example for the silly set above you
+could write
+\begin{lstlisting}[numbers=none]
+scala> for (n <- (0 to 10).toSet; if n * n < 9) yield n
+val res03: Set[Int] = Set(0, 1, 2)
+\end{lstlisting}
+\noindent
+This is pretty much the same as $\{n\;|\;  n \in \mathbb{N} \wedge n^2 < 9\}$
+just in Scala syntax.
 For defining sets, we will also often use the notion of the
 ``big union''. An example is as follows:
 \begin{equation}\label{bigunion}
 \noindent
 contain actually the same amount of elements. Does this make sense?
 Though this might all look strange, infinite sets will be a
 topic that is very relevant to the material of this module. It tells
-us what we can compute with a computer (actually algorithm) and what
+us what we can compute with a computer (actually an algorithm) and what
 we cannot. But during the first 9 lectures we can go by without this
 ``weird'' stuff. End of aside.\smallskip
 Another important notion in this module are \defn{languages}, which
 are sets of strings. One of the main goals for us will be how to
 hardness is meant in terms of Turing decidable, for example.}
 Note that the language containing the empty string $\{\dq{}\}$
 is not equal to $\varnothing$, the empty language (or empty
 set): The former contains one element, namely \dq{} (also
 written $[\,]$), but the latter does not contain any
-element.
+element at all! Make sure you see the difference.
 For languages we define the operation of \defn{language
 concatenation}, written like in the string case as $A @ B$:
 \begin{equation}\label{langconc}
 \[
 \{abzzz, abqq, abr, aczzz, acqq, acr\}
 \]
-\noindent Recall the properties for string concatenation. For
+\noindent The cool thing about Scala is that we can define language
+concatenation very elegantly as
+\begin{lstlisting}[numbers=none]
+def concat(A: Set[String], B: Set[String]) =
+for (x <- A ; y <- B) yield x ++ y
+\end{lstlisting}
+\noindent
+where \code{++} is string concatenation in Scala.
+Recall the properties for string concatenation. For
 language concatenation we have the following properties
 \begin{center}
 \begin{tabular}{ll}
 associativity: & $(A @ B) @ C = A @ (B @ C)$\\
 \varnothing$
 \end{tabular}
 \end{center}
 \noindent Note the difference in the last two lines: the empty
-set behaves like $0$ for multiplication and the set $\{[]\}$
+set behaves like $0$ for multiplication, whereas the set $\{[]\}$
-like $1$ for multiplication ($n * 1 = n$ and $n * 0 = 0$).
+behaves like $1$ for multiplication ($n * 1 = n$ and $n * 0 = 0$).
+Again this is a subtletly you need to get compfortable with.
 Using the operation of language concatenation, we can define a
 \defn{language power} operation as follows:
 \begin{eqnarray*}
 \begin{equation}\label{star}
 A\star \dn \bigcup_{0\le n}\; A^n
 \end{equation}
 \noindent This star operation is often also called
-\emph{Kleene-star}. Unfolding the definition in~\eqref{star}
+\emph{Kleene-star} after the mathematician/computer scientist
+Stephen Cole Kleene. Unfolding the definition in~\eqref{star}
 gives
 \[
 A\star \dn A^0 \cup A^1 \cup A^2 \cup A^3 \cup \ldots
 \]
 what $\{[]\}\star$ and $\varnothing\star$ are?
 Recall that an alphabet is often referred to by the letter
 $\Sigma$. We can now write for the set of \emph{all} strings
 over this alphabet as $\Sigma\star$. In doing so we also include the
-empty string as a possible string over $\Sigma$. So if $\Sigma
+empty string as a possible string (over $\Sigma$). Assuming $\Sigma
 = \{a, b\}$, then $\Sigma\star$ is
 \[
 \{[], a, b, aa, ab, ba, bb, aaa, aab, aba, abb, baa, bab, \ldots\}
 \]
-\noindent or in other words all strings containing $a$s and
+\noindent or in words all strings containing $a$s and
 $b$s only, plus the empty string.
+\bigskip
+\noindent
+Thanks for making it until here! There are also some personal conventions
+about regular expressions. But I will explain them in the handout for the
+first week. An exercise you can do: Implement the power operation for languages
+and try out some examples.
 \end{document}
 %%% Local Variables:
 %%% mode: latex
 %%% TeX-master: t

changeset 736	d3e477fe6c66
parent 550	71fc4a7a7039
child 830	dbf9d710ce65