diff -r b718b9770dae -r bf1c472b244e handouts/notation.tex --- a/handouts/notation.tex Tue Sep 26 13:10:56 2017 +0100 +++ b/handouts/notation.tex Tue Sep 26 13:46:45 2017 +0100 @@ -11,7 +11,7 @@ There are innumerable books available about compilers, automata theory and formal languages. Unfortunately, they often use their own notational conventions and their own symbols. This handout is meant to -clarify some of the notation I will use. I appologise in advance that +clarify some of the notation I will use. I apologise in advance that sometimes I will be a bit fuzzy\ldots the problem is that often we want to have convenience in our mathematical definitions (to make them readable and understandable), but other times we need pedantic @@ -54,7 +54,7 @@ Why do I make this distinction? Because we often need to define functions using variables ranging over characters. We need to somehow say this is a variable, say $c$, ranging over -characters, while this is the atual character \pcode{c}. +characters, while this is the actual character \pcode{c}. An \defn{alphabet} is a (non-empty) finite set of characters. Often the letter $\Sigma$ is used to refer to an alphabet. For @@ -73,10 +73,10 @@ double quotes indicate that we are dealing with a string. In typed programming languages, such as Scala, strings have a special type---namely \pcode{String} which is different from the type -for lists of chatacters. This is because strings can be -efficiently represented in memory, unlike general lists. Since -\code{String} and the type of lists of characters, -\code{List[Char]} are not the same, we need to explicitly +for lists of characters. This is because strings can be +efficiently represented in memory, unlike lists. Since +\code{String} and the type of lists of characters +(\code{List[Char]}) are not the same, we need to explicitly coerce elements between the two types, for example \begin{lstlisting}[numbers=none] @@ -84,9 +84,11 @@ res01: List[Char] = List(a, b, c) \end{lstlisting} -\noindent Since in our (mathematical) definitions we regard -strings as lists of characters, we will also write -\dq{$hello$} as +\noindent +However, we do not want to do this kind of explicit coercion in our +pencil-and-paper, everyday arguments. So in our (mathematical) +definitions we regard strings as lists of characters, we will also +write \dq{$hello$} as \[ [\text{\it h, e, l, l, o}] \qquad\text{or simply}\qquad \textit{hello} @@ -106,7 +108,7 @@ which we write as $s_1 @ s_2$. If we regard $s_1$ and $s_2$ as lists of characters, then $@$ is the list-append function. Suppose we are given two strings \dq{\textit{foo}} and -\dq{\textit{bar}}, then their concatenation, writen +\dq{\textit{bar}}, then their concatenation, written \dq{\textit{foo}} $@$ \dq{\textit{bar}}, gives \dq{\textit{foobar}}. But as said above, we will often simplify our life and just drop the double quotes whenever it @@ -116,8 +118,8 @@ Occasionally we will use the notation $a^n$ for strings, which stands for the string of $n$ repeated $a$s. So $a^{n}b^{n}$ is a string that -has as many $a$s by as many $b$s. A simple property of string -concatenation is \emph{associativity}, meaning +has some number of $a$s followed by the same number of $b$s. A simple +property of string concatenation is \emph{associativity}, meaning \[(s_1 @ s_2) @ s_3 = s_1 @ (s_2 @ s_3)\] @@ -140,11 +142,12 @@ \] \noindent The notation $\in$ means \emph{element of}, so $1 \in \{1, -2, 3\}$ is true and $4 \in \{1, 2, 3\}$ is false. Not that the +2, 3\}$ is true and $4 \in \{1, 2, 3\}$ is false. Note that the \emph{list} $[1, 2, 3]$ is something different from the \emph{set} $\{1, 2, 3\}$: in the former we care about the order and potentially several occurrences of a number; while with the latter we do not. -Also sets can potentially have infinitely many elements. For example +Also sets can potentially have infinitely many elements, whereas lists +cannot. For example the set of all natural numbers $\{0, 1, 2, \ldots\}$ is infinite. This set is often also abbreviated as $\mathbb{N}$. Lists can be very large, but they cannot contain infinitely many elements. @@ -203,15 +206,16 @@ \noindent but using the big union notation is more concise. -While this stuff about sets might all look trivial or even needlessly +As an aside: While this stuff about sets might all look trivial or even needlessly pedantic, \emph{Nature} is never simple. If you want to be amazed how complicated sets can get, watch out for the last lecture just before Christmas where I want to convince you of the fact that some sets are more infinite than others. Yes, you read correctly, there can be sets that are ``more infinite'' then others. If you think this is obvious: -say you have the infinite set $\{1, 2, 3, 4, \ldots\}$ which is all +say you have the infinite set $\mathbb{N}\backslash\{0\} = \{1, 2, 3, 4, \ldots\}$ which is all the natural numbers except $0$, and then compare it to the set -$\{0, 1, 2, 3, 4, \ldots\}$ which contains the $0$. Yes, the second must be more infinite\ldots{} well, then think again. Because the two +$\{0, 1, 2, 3, 4, \ldots\}$ which contains the $0$. If you think, +the second must be more infinite\ldots{} well, then think again. Because the two infinite sets \begin{center} @@ -224,7 +228,8 @@ Though this might all look strange this about infinite sets will be a topic that is very relevant to the material of this module. It tells us what we can compute with a computer (actually algorithm) and what -we cannot. +we cannot. But during the first 9 lectures we can go by without this +``weird'' stuff. Another important notion in this module are \defn{languages}, which are sets of strings. One of the main goals for us will be how to @@ -271,7 +276,7 @@ set behaves like $0$ for multiplication and the set $\{[]\}$ like $1$ for multiplication ($n * 1 = n$ and $n * 0 = 0$). -Following the language concatenation, we can define a +Using the operation of language concatenation, we can define a \defn{language power} operation as follows: \begin{eqnarray*} @@ -284,7 +289,7 @@ set, but the set containing the empty string. So no matter what the set $A$ is, $A^0$ will always be $\{[]\}$. (There is another hint about a connection between the $@$-operation and -multiplication: How is $x^n$ defined recursively and what is +multiplication: How is $x^n$ defined in mathematics and what is $x^0$?) Next we can define the \defn{star operation} for languages: @@ -299,14 +304,14 @@ gives \[ -A^0 \cup A^1 \cup A^2 \cup A^3 \cup \ldots +A\star \dn A^0 \cup A^1 \cup A^2 \cup A^3 \cup \ldots \] \noindent which is equal to \[ -\{[]\} \,\cup\, A \,\cup\, A @ A \,\cup\, A @ A @ A \,\cup\, \ldots +A\star \dn \{[]\} \,\cup\, A \,\cup\, A @ A \,\cup\, A @ A @ A \,\cup\, \ldots \] \noindent We can see that the empty string is always in $A\star$, @@ -316,7 +321,7 @@ Recall that an alphabet is often referred to by the letter $\Sigma$. We can now write for the set of \emph{all} strings -over this alphabet $\Sigma\star$. In doing so we also include the +over this alphabet as $\Sigma\star$. In doing so we also include the empty string as a possible string over $\Sigma$. So if $\Sigma = \{a, b\}$, then $\Sigma\star$ is