diff -r 3b9496db3fb9 -r 3390e863d796 handouts/notation.tex --- a/handouts/notation.tex Tue Sep 26 14:07:29 2017 +0100 +++ b/handouts/notation.tex Tue Sep 26 14:08:49 2017 +0100 @@ -8,15 +8,14 @@ \section*{A Crash-Course on Notation} -There are innumerable books available about compilers, -automata theory and formal languages. Unfortunately, they -often use their own notational conventions and their own -symbols. This handout is meant to clarify some of the notation -I will use. I appologise in advance that sometimes I will be a -bit fuzzy\ldots the problem is that often we want to have -convenience in our mathematical definitions (to make them -readable and understandable), but sometimes we need precision -for actual programs. +There are innumerable books available about compilers, automata theory +and formal languages. Unfortunately, they often use their own +notational conventions and their own symbols. This handout is meant to +clarify some of the notation I will use. I apologise in advance that +sometimes I will be a bit fuzzy\ldots the problem is that often we +want to have convenience in our mathematical definitions (to make them +readable and understandable), but other times we need pedantic +precision for actual programs. \subsubsection*{Characters and Strings} @@ -55,7 +54,7 @@ Why do I make this distinction? Because we often need to define functions using variables ranging over characters. We need to somehow say this is a variable, say $c$, ranging over -characters, while this is the atual character \pcode{c}. +characters, while this is the actual character \pcode{c}. An \defn{alphabet} is a (non-empty) finite set of characters. Often the letter $\Sigma$ is used to refer to an alphabet. For @@ -70,14 +69,14 @@ \defn{Strings} are lists of characters. Unfortunately, there are many ways how we can write down strings. In programming -languages, they are usually written as \dq{$hello$} where the +languages, they are usually written as \dq{\texttt{hello}} where the double quotes indicate that we are dealing with a string. In -programming languages, such as Scala, strings have a special +typed programming languages, such as Scala, strings have a special type---namely \pcode{String} which is different from the type -for lists of chatacters. This is because strings can be -efficiently represented in memory, unlike general lists. Since -\code{String} and the type of lists of characters, -\code{List[Char]} are not the same, we need to explicitly +for lists of characters. This is because strings can be +efficiently represented in memory, unlike lists. Since +\code{String} and the type of lists of characters +(\code{List[Char]}) are not the same, we need to explicitly coerce elements between the two types, for example \begin{lstlisting}[numbers=none] @@ -85,9 +84,11 @@ res01: List[Char] = List(a, b, c) \end{lstlisting} -\noindent Since in our (mathematical) definitions we regard -strings as lists of characters, we will also write -\dq{$hello$} as +\noindent +However, we do not want to do this kind of explicit coercion in our +pencil-and-paper, everyday arguments. So in our (mathematical) +definitions we regard strings as lists of characters, we will also +write \dq{$hello$} as \[ [\text{\it h, e, l, l, o}] \qquad\text{or simply}\qquad \textit{hello} @@ -107,7 +108,7 @@ which we write as $s_1 @ s_2$. If we regard $s_1$ and $s_2$ as lists of characters, then $@$ is the list-append function. Suppose we are given two strings \dq{\textit{foo}} and -\dq{\textit{bar}}, then their concatenation, writen +\dq{\textit{bar}}, then their concatenation, written \dq{\textit{foo}} $@$ \dq{\textit{bar}}, gives \dq{\textit{foobar}}. But as said above, we will often simplify our life and just drop the double quotes whenever it @@ -115,12 +116,10 @@ write \textit{foo}, \textit{bar}, \textit{foobar} or \textit{foo $@$ bar}. -Occasionally we will use the notation $a^n$ for strings, which -stands for the string of $n$ repeated $a$s. So $a^{n}b^{n}$ is -a string that has as many $a$s by as many $b$s. - -A simple property of string concatenation is -\emph{associativity}, meaning +Occasionally we will use the notation $a^n$ for strings, which stands +for the string of $n$ repeated $a$s. So $a^{n}b^{n}$ is a string that +has some number of $a$s followed by the same number of $b$s. A simple +property of string concatenation is \emph{associativity}, meaning \[(s_1 @ s_2) @ s_3 = s_1 @ (s_2 @ s_3)\] @@ -142,15 +141,20 @@ \{1, 2, 3\} \] -\noindent The notation $\in$ means \emph{element of}, so $1 -\in \{1, 2, 3\}$ is true and $4 \in \{1, 2, 3\}$ is false. -Sets can potentially have infinitely many elements. For -example the set of all natural numbers $\{0, 1, 2, \ldots\}$ -is infinite. This set is often also abbreviated as -$\mathbb{N}$. We can define sets by giving all elements, for -example $\{0, 1\}$ for the set containing just $0$ and $1$, -but also by \defn{set comprehensions}. For example the set of -all even natural numbers can be defined as +\noindent The notation $\in$ means \emph{element of}, so $1 \in \{1, +2, 3\}$ is true and $4 \in \{1, 2, 3\}$ is false. Note that the +\emph{list} $[1, 2, 3]$ is something different from the \emph{set} +$\{1, 2, 3\}$: in the former we care about the order and potentially +several occurrences of a number; while with the latter we do not. +Also sets can potentially have infinitely many elements, whereas lists +cannot. For example +the set of all natural numbers $\{0, 1, 2, \ldots\}$ is infinite. This +set is often also abbreviated as $\mathbb{N}$. Lists can be very large, but they cannot contain infinitely many elements. + +We can define sets by giving all elements, for example $\{0, 1\}$ for +the set containing just $0$ and $1$, but also by \defn{set + comprehensions}. For example the set of all even natural numbers can +be defined as \[ \{n\;|\;n\in\mathbb{N} \wedge n\;\text{is even}\} @@ -163,8 +167,9 @@ \{n\;|\; n^2 < 9 \wedge n \in \mathbb{N}\} \] -\noindent Notice that set comprehensions could be used -to define set union, intersection and difference: +\noindent Can you see why this defines the set $\{0, 1, 2\}$? Notice +that set comprehensions could be used to define set union, +intersection and difference: \begin{eqnarray*} A \cup B & \dn & \{x\;|\; x \in A \vee x \in B\}\\ @@ -201,13 +206,30 @@ \noindent but using the big union notation is more concise. -While this stuff about sete might all look trivial or even -needlessly pedantic, \emph{Nature} is never simple. If you -want to be amazed how complicated sets can get, watch out for -the last lecture just before Christmas where I want you to -convince you of the fact that some sets are more infinite than -others. Actually that will be a fact that is very relevant to -the material of this module. +As an aside: While this stuff about sets might all look trivial or even needlessly +pedantic, \emph{Nature} is never simple. If you want to be amazed how +complicated sets can get, watch out for the last lecture just before +Christmas where I want to convince you of the fact that some sets are +more infinite than others. Yes, you read correctly, there can be sets +that are ``more infinite'' then others. If you think this is obvious: +say you have the infinite set $\mathbb{N}\backslash\{0\} = \{1, 2, 3, 4, \ldots\}$ which is all +the natural numbers except $0$, and then compare it to the set +$\{0, 1, 2, 3, 4, \ldots\}$ which contains the $0$. If you think, +the second must be more infinite\ldots{} well, then think again. Because the two +infinite sets + +\begin{center} + $\{1, 2, 3, 4, \ldots\}$ and + $\{0, 1, 2, 3, 4, \ldots\}$ +\end{center} + +\noindent +contain actually the same number of elements. Does this make sense? +Though this might all look strange this about infinite sets will be a +topic that is very relevant to the material of this module. It tells +us what we can compute with a computer (actually algorithm) and what +we cannot. But during the first 9 lectures we can go by without this +``weird'' stuff. Another important notion in this module are \defn{languages}, which are sets of strings. One of the main goals for us will be how to @@ -254,7 +276,7 @@ set behaves like $0$ for multiplication and the set $\{[]\}$ like $1$ for multiplication ($n * 1 = n$ and $n * 0 = 0$). -Following the language concatenation, we can define a +Using the operation of language concatenation, we can define a \defn{language power} operation as follows: \begin{eqnarray*} @@ -267,7 +289,7 @@ set, but the set containing the empty string. So no matter what the set $A$ is, $A^0$ will always be $\{[]\}$. (There is another hint about a connection between the $@$-operation and -multiplication: How is $x^n$ defined recursively and what is +multiplication: How is $x^n$ defined in mathematics and what is $x^0$?) Next we can define the \defn{star operation} for languages: @@ -282,14 +304,14 @@ gives \[ -A^0 \cup A^1 \cup A^2 \cup A^3 \cup \ldots +A\star \dn A^0 \cup A^1 \cup A^2 \cup A^3 \cup \ldots \] \noindent which is equal to \[ -\{[]\} \,\cup\, A \,\cup\, A @ A \,\cup\, A @ A @ A \,\cup\, \ldots +A\star \dn \{[]\} \,\cup\, A \,\cup\, A @ A \,\cup\, A @ A @ A \,\cup\, \ldots \] \noindent We can see that the empty string is always in $A\star$, @@ -299,7 +321,7 @@ Recall that an alphabet is often referred to by the letter $\Sigma$. We can now write for the set of \emph{all} strings -over this alphabet $\Sigma\star$. In doing so we also include the +over this alphabet as $\Sigma\star$. In doing so we also include the empty string as a possible string over $\Sigma$. So if $\Sigma = \{a, b\}$, then $\Sigma\star$ is