afl-material: comparison handouts/notation.tex

-:564f7584eff1
+:245d302791c7
 \begin{document}
 \section*{A Crash-Course on Notation}
-There are innumerable books available about compiler, automata
+There are innumerable books available about compilers,
-and formal languages. Unfortunately, they often use their own
+automata theory and formal languages. Unfortunately, they
-notational conventions and their own symbols. This handout is
+often use their own notational conventions and their own
-meant to clarify some of the notation I will use.
+symbols. This handout is meant to clarify some of the notation
+I will use. I apologise in advance that sometimes I will be a
+bit fuzzy\ldots the problem is that often we want to have
+convenience in our mathematical definitions (to make them
+readable and understandable), but sometimes we need precision
+for actual programs.
 \subsubsection*{Characters and Strings}
 The most important concept in this module is that of strings. Strings
 are composed of \defn{characters}. While characters are surely
 \noindent But often we do not care which particular characters
 we use. In such cases we use the italic font and write $a$,
 $b$, $c$ and so on for characters. Therefore if we need a
 representative string, we might write
-\begin{equation}\label{abracadabra}
+\[
 abracadabra
-\end{equation}
+\]
 \noindent In this string, we do not really care what the
 characters stand for, except we do care about the fact that
 for example the character $a$ is not equal to $b$ and so on.
+Why do I make this distinction? Because we often need to
+define functions using variables ranging over characters. We
+need to somehow say this is a variable, say $c$, ranging over
+characters, while this is the actual character \pcode{c}.
 An \defn{alphabet} is a (non-empty) finite set of characters.
 Often the letter $\Sigma$ is used to refer to an alphabet. For
 example the ASCII characters \pcode{a} to \pcode{z} form an
 alphabet. The digits $0$ to $9$ are another alphabet. The
 case we will state that the alphabet is the set $\{a, b\}$.
 \defn{Strings} are lists of characters. Unfortunately, there
 are many ways how we can write down strings. In programming
 languages, they are usually written as \dq{$hello$} where the
-double quotes indicate that we are dealing with a string. But
+double quotes indicate that we are dealing with a string. In
-since we regard strings as lists of characters we could also
+programming languages, such as Scala, strings have a special
-write this string as
+type---namely \pcode{String} which is different from the type
+for lists of characters. This is because strings can be
-\[
+efficiently represented in memory, unlike general lists. Since
-[\text{\it h, e, l, l, o}] \;\;\text{or simply}\;\; \textit{hello}
+\code{String} and the type of lists of characters,
+\code{List[Char]} are not the same, we need to explicitly
+coerce elements between the two types, for example
+\begin{lstlisting}[numbers=none]
+scala> "abc".toList
+res01: List[Char] = List(a, b, c)
+\end{lstlisting}
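+\noindent Going in the other direction is just as easy. As a
+small sketch (the names \pcode{cs} and \pcode{s} are only chosen
+for illustration), the standard method \pcode{mkString} glues a
+list of characters back together into a \pcode{String}:
+\begin{lstlisting}[numbers=none]
+val cs: List[Char] = "abc".toList  // the list of characters a, b, c
+val s: String = cs.mkString        // the string "abc" again
+\end{lstlisting}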
+\noindent Since in our (mathematical) definitions we regard
+strings as lists of characters, we will also write
+\dq{$hello$} as
+\[
+[\text{\it h, e, l, l, o}] \qquad\text{or simply}\qquad \textit{hello}
 \]
 \noindent The important point is that we can always decompose
 such strings. For example, we will often consider the first
 character of a string, say $h$, and the ``rest'' of a string
 characters $[\,]$.\footnote{In the literature you can also
 often find that $\varepsilon$ or $\lambda$ is used to
 represent the empty string.}
 Two strings, say $s_1$ and $s_2$, can be \defn{concatenated},
-which we write as $s_1 @ s_2$. Suppose we are given two
+which we write as $s_1 @ s_2$. If we regard $s_1$ and $s_2$ as
-strings \dq{\textit{foo}} and \dq{\textit{bar}}, then their
+lists of characters, then $@$ is the list-append function.
-concatenation, writen \dq{\textit{foo}} $@$ \dq{\textit{bar}},
+Suppose we are given two strings \dq{\textit{foo}} and
-gives \dq{\textit{foobar}}. Often we will simplify our life
+\dq{\textit{bar}}, then their concatenation, written
-and just drop the double quotes whenever it is clear we are
+\dq{\textit{foo}} $@$ \dq{\textit{bar}}, gives
-talking about strings, writing as already in
+\dq{\textit{foobar}}. But as said above, we will often
-\eqref{abracadabra} just \textit{foo}, \textit{bar},
+simplify our life and just drop the double quotes whenever it
-\textit{foobar} or \textit{foo $@$ bar}.
+is clear we are talking about strings. So we will often just
+write \textit{foo}, \textit{bar}, \textit{foobar} or
-Some simple properties of string concatenation hold. For
+\textit{foo $@$ bar}.
-example the concatenation operation is \emph{associative},
-meaning
-\[(s_1 @ s_2) @ s_3 = s_1 @ (s_2 @ s_3)\]
-\noindent are always equal strings. The empty string behaves
-like a unit element, therefore
-\[s \,@\, [] = [] \,@\, s = s\]
 Occasionally we will use the notation $a^n$ for strings, which
 stands for the string of $n$ repeated $a$s. So $a^{n}b^{n}$ is
-a string that has as many $a$s as $b$s.
+a string of $n$ $a$s followed by equally many $b$s.
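+In Scala this notation can be mimicked, as a rough sketch, with
+the \code{*} operation on strings, which repeats a string a
+given number of times (the function name \pcode{anbn} is made
+up for this illustration):
+\begin{lstlisting}[numbers=none]
+// builds the string a^n b^n for a given n
+def anbn(n: Int): String = "a" * n + "b" * n
+anbn(3)   // "aaabbb"
+\end{lstlisting}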
-Note however that while for us strings are just lists of
+A simple property of string concatenation is
-characters, programming languages often differentiate between
+\emph{associativity}, meaning
-the two concepts. In Scala, for example, there is the type of
-\code{String} and the type of lists of characters,
+\[(s_1 @ s_2) @ s_3 = s_1 @ (s_2 @ s_3)\]
-\code{List[Char]}. They are not the same and we need to
-explicitly coerce elements between the two types, for example
+\noindent are always equal strings. The empty string behaves
+like a \emph{unit element}, therefore
-\begin{lstlisting}[numbers=none]
-scala> "abc".toList
+\[s \,@\, [] = [] \,@\, s = s\]
-res01: List[Char] = List(a, b, c)
-\end{lstlisting}
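+\noindent As a hedged illustration, if we model strings as
+\code{List[Char]} in Scala, the role of $@$ is played by the
+list-append operator \code{:::} and the empty string is
+\code{Nil}. Both properties can then be checked on sample
+values (\pcode{s1}, \pcode{s2} and \pcode{s3} are arbitrary
+examples chosen here; a check on samples is of course not a
+proof):
+\begin{lstlisting}[numbers=none]
+val s1 = "foo".toList
+val s2 = "bar".toList
+val s3 = "baz".toList
+// associativity: the bracketing does not matter
+assert(((s1 ::: s2) ::: s3) == (s1 ::: (s2 ::: s3)))
+// the empty list Nil behaves like a unit element
+assert((s1 ::: Nil) == s1 && (Nil ::: s1) == s1)
+\end{lstlisting}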
 \subsubsection*{Sets and Languages}
 We will use the familiar operations $\cup$, $\cap$, $\subset$
 \in \{1, 2, 3\}$ is true and $4 \in \{1, 2, 3\}$ is false.
 Sets can potentially have infinitely many elements. For
 example the set of all natural numbers $\{0, 1, 2, \ldots\}$
 is infinite. This set is often also abbreviated as
 $\mathbb{N}$. We can define sets by giving all elements, for
-example $\{0, 1\}$, but also by \defn{set comprehensions}. For
+example $\{0, 1\}$ for the set containing just $0$ and $1$,
-example the set of all even natural numbers can be defined as
+but also by \defn{set comprehensions}. For example the set of
+all even natural numbers can be defined as
 \[
 \{n\;|\;n\in\mathbb{N} \wedge n\;\text{is even}\}
 \]
 \ldots
 \]
 \noindent but using the big union notation is more concise.
-An important notion in this module are \defn{languages}, which
+While this stuff about sets might all look trivial or even
+needlessly pedantic, \emph{Nature} is never simple. If you
+want to be amazed how complicated sets can get, watch out for
+the last lecture just before Christmas where I want to
+convince you of the fact that some sets are more infinite than
+others. Actually that will be a fact that is very relevant to
+the material of this module.
+Another important notion in this module is that of \defn{languages}, which
 are sets of strings. One of the main goals for us will be how to
 (formally) specify languages and to find out whether a string
 is in a language or not.\footnote{You might wish to ponder
 whether this is in general a hard or easy problem, where
 hardness is meant in terms of Turing decidable, for example.}
 another hint about a connection between the $@$-operation and
 multiplication: How is $x^n$ defined recursively and what is
 $x^0$?)
 Next we can define the \defn{star operation} for languages:
-$A^*$ is the union of all powers of $A$, or short
+$A\star$ is the union of all powers of $A$, or short
 \begin{equation}\label{star}
-A^* \dn \bigcup_{0\le n}\; A^n
+A\star \dn \bigcup_{0\le n}\; A^n
 \end{equation}
 \noindent This star operation is often also called
 \emph{Kleene-star}. Unfolding the definition in \eqref{star}
 gives
 \[
 \{[]\} \,\cup\, A \,\cup\, A @ A \,\cup\, A @ A @ A \,\cup\, \ldots
 \]
-\noindent We can see that the empty string is always in $A^*$,
+\noindent We can see that the empty string is always in $A\star$,
 no matter what $A$ is. This is because $[] \in A^0$. To make
 sure you understand these definitions, I leave you to answer
-what $\{[]\}^*$ and $\varnothing^*$ are.
+what $\{[]\}\star$ and $\varnothing\star$ are.
 Recall that an alphabet is often referred to by the letter
-$\Sigma$. We can now write for the set of all strings over
+$\Sigma$. We can now write for the set of \emph{all} strings
-this alphabet $\Sigma^*$. In doing so we also include the
+over this alphabet $\Sigma\star$. In doing so we also include the
-empty string as a possible string over $\Sigma$. So if
+empty string as a possible string over $\Sigma$. So if $\Sigma
-$\Sigma = \{a, b\}$, then $\Sigma^*$ is
+= \{a, b\}$, then $\Sigma\star$ is
 \[
 \{[], a, b, aa, ab, ba, bb, aaa, aab, aba, abb, baa, bab, \ldots\}
 \]
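+\noindent Since this set is infinite, a program can only ever
+list a finite part of it. The following Scala sketch computes
+the powers $\Sigma^0, \ldots, \Sigma^n$ and takes their union.
+The function names \pcode{pow} and \pcode{upTo} are made up
+for this illustration, strings are represented by
+\pcode{String} for brevity, and the powers are assumed to be
+defined by $A^0 = \{[]\}$ and $A^{n+1} = A \,@\, A^n$:
+\begin{lstlisting}[numbers=none]
+// A^n: all concatenations of n strings taken from A
+def pow(A: Set[String], n: Int): Set[String] =
+  if (n == 0) Set("")
+  else for (s1 <- A; s2 <- pow(A, n - 1)) yield s1 + s2
+// union of A^0, ..., A^n: a finite approximation of the star
+def upTo(A: Set[String], n: Int): Set[String] =
+  (0 to n).flatMap(pow(A, _)).toSet
+upTo(Set("a", "b"), 2)
+// the set containing "", a, b, aa, ab, ba and bb
+\end{lstlisting}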

changeset 404	245d302791c7
parent 398	c8ce95067c1a
child 476	d922cc83b70c