| author | Christian Urban <christian.urban@kcl.ac.uk> | 
| Tue, 29 Sep 2020 00:51:43 +0100 | |
| changeset 765 | b66602e0b42d | 
| parent 736 | 0494995fd979 | 
| child 830 | c602edae2978 | 
| permissions | -rw-r--r-- | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 1 | \documentclass{article}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 2 | \usepackage{../style}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 3 | \usepackage{../langs}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 4 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 5 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 6 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 7 | \begin{document}
 | 
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 8 | \fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016, 2017, 2018, 2020}
 | 
| 505 | 9 | |
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 10 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 11 | \section*{A Crash-Course on Notation}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 12 | |
| 476 | 13 | There are innumerable books available about compilers, automata theory | 
| 14 | and formal languages. Unfortunately, they often use their own | |
| 15 | notational conventions and their own symbols. This handout is meant to | |
| 502 | 16 | clarify some of the notation I will use. I apologise in advance that | 
| 476 | 17 | sometimes I will be a bit fuzzy\ldots the problem is that often we | 
| 18 | want to have convenience in our mathematical definitions (to make them | |
| 19 | readable and understandable), but other times we need pedantic | |
| 20 | precision for actual programs. | |
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 21 | |
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 22 | \subsubsection*{Characters and Strings}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 23 | |
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 24 | The most basic concept in this module are strings. Strings | 
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 25 | are composed of \defn{characters}. While characters are surely
 | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 26 | a familiar concept, we will make one subtle distinction in | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 27 | this module. If we want to refer to concrete characters, like | 
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 28 | \code{a}, \code{b}, \code{c} and so on, we will use a typewriter font.
 | 
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 29 | Accordingly if we want to refer to the concrete characters of | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 30 | my email address we shall write | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 31 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 32 | \begin{center}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 33 | \pcode{christian.urban@kcl.ac.uk}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 34 | \end{center}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 35 | |
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 36 | \noindent If we also need to explicitly indicate the ``space'' | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 37 | character, we write \VS{}\hspace{1mm}. For example
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 38 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 39 | \begin{center}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 40 | \tt{}hello\VS\hspace{0.5mm}world
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 41 | \end{center}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 42 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 43 | |
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 44 | \noindent But often we do not care which particular characters | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 45 | we use. In such cases we use the italic font and write $a$, | 
| 332 
4755ad4b457b
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
266diff
changeset | 46 | $b$, $c$ and so on for characters. Therefore if we need a | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 47 | representative string, we might write | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 48 | |
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 49 | \[ | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 50 | abracadabra | 
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 51 | \] | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 52 | |
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 53 | \noindent In this string, we do not really care what the | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 54 | characters stand for, except we do care about the fact that | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 55 | for example the character $a$ is not equal to $b$ and so on. | 
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 56 | Why do I make this distinction? Because we often need to | 
| 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 57 | define functions using variables ranging over characters. We | 
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 58 | need to somehow say ``this-is-a-variable'' and give it a name. | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 59 | In such cases we use the italic font. | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 60 | |
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 61 | |
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 62 | An \defn{alphabet} is a (non-empty) finite set of characters.
 | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 63 | Often the letter $\Sigma$ is used to refer to an alphabet. For | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 64 | example the ASCII characters \pcode{a} to \pcode{z} form an
 | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 65 | alphabet. The digits $0$ to $9$ are another alphabet. The | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 66 | Greek letters $\alpha$ to $\omega$ also form an alphabet. If | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 67 | nothing else is specified, we usually assume the alphabet | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 68 | consists of just the lower-case letters $a$, $b$, \ldots, $z$. | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 69 | Sometimes, however, we explicitly want to restrict strings to | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 70 | contain only the letters $a$ and $b$, for example. In this | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 71 | case we will state that the alphabet is the set $\{a, b\}$. 
 | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 72 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 73 | \defn{Strings} are lists of characters. Unfortunately, there
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 74 | are many ways how we can write down strings. In programming | 
| 496 | 75 | languages, they are usually written as \dq{\texttt{hello}} where the
 | 
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 76 | double quotes indicate that we are dealing with a string. In | 
| 496 | 77 | typed programming languages, such as Scala, strings have a special | 
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 78 | type---namely \pcode{String} which is different from the type
 | 
| 502 | 79 | for lists of characters. This is because strings can be | 
| 80 | efficiently represented in memory, unlike lists. Since | |
| 81 | \code{String} and the type of lists of characters
 | |
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 82 | (namely \code{List[Char]}) are not the same, we need to explicitly
 | 
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 83 | coerce elements between the two types, for example | 
| 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 84 | |
| 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 85 | \begin{lstlisting}[numbers=none]
 | 
| 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 86 | scala> "abc".toList | 
| 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 87 | res01: List[Char] = List(a, b, c) | 
| 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 88 | \end{lstlisting}
 | 
| 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 89 | |
| 502 | 90 | \noindent | 
| 91 | However, we do not want to do this kind of explicit coercion in our | |
| 92 | pencil-and-paper, everyday arguments. So in our (mathematical) | |
| 550 | 93 | definitions we regard strings as lists of characters and we will also | 
| 94 | write \dq{$hello$} as list
 | |
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 95 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 96 | \[ | 
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 97 | [\text{\it h, e, l, l, o}] \qquad\text{or simply}\qquad \textit{hello}
 | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 98 | \] | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 99 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 100 | \noindent The important point is that we can always decompose | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 101 | such strings. For example, we will often consider the first | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 102 | character of a string, say $h$, and the ``rest'' of a string | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 103 | say \dq{\textit{ello}} when making definitions about strings.
 | 
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 104 | There are also some subtleties with the empty string, | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 105 | sometimes written as \dq{} but also as the empty list of
 | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 106 | characters $[\,]$.\footnote{In the literature you can also
 | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 107 | often find that $\varepsilon$ or $\lambda$ is used to | 
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 108 | represent the empty string. But we are not going to use this notation.} | 
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 109 | |
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 110 | Two strings, say $s_1$ and $s_2$, can be \defn{concatenated},
 | 
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 111 | which we write as $s_1 @ s_2$. If we regard $s_1$ and $s_2$ as | 
| 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 112 | lists of characters, then $@$ is the list-append function. | 
| 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 113 | Suppose we are given two strings \dq{\textit{foo}} and
 | 
| 502 | 114 | \dq{\textit{bar}}, then their concatenation, written
 | 
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 115 | \dq{\textit{foo}} $@$ \dq{\textit{bar}}, gives
 | 
| 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 116 | \dq{\textit{foobar}}. But as said above, we will often
 | 
| 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 117 | simplify our life and just drop the double quotes whenever it | 
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 118 | is clear we are talking about strings. So we will just | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 119 | write \textit{foo}, \textit{bar}, \textit{foobar} 
 | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 120 | \textit{foo $@$ bar} and so on.
 | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 121 | |
| 496 | 122 | Occasionally we will use the notation $a^n$ for strings, which stands | 
| 123 | for the string of $n$ repeated $a$s. So $a^{n}b^{n}$ is a string that
 | |
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 124 | has some number of $a$s followed by the same number of $b$s. | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 125 | Confusingly, in Scala the notation is ``times'' for this opration. | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 126 | So you can write | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 127 | |
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 128 | \begin{lstlisting}[numbers=none]
 | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 129 | scala> "a" * 13 | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 130 | val res02: String = aaaaaaaaaaaaa | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 131 | \end{lstlisting}
 | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 132 | |
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 133 | \noindent | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 134 | A simple property of string concatenation is \emph{associativity}, meaning
 | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 135 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 136 | \[(s_1 @ s_2) @ s_3 = s_1 @ (s_2 @ s_3)\] | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 137 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 138 | \noindent are always equal strings. The empty string behaves | 
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 139 | like a \emph{unit element}, therefore
 | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 140 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 141 | \[s \,@\, [] = [] \,@\, s = s\] | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 142 | |
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 143 | |
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 144 | \subsubsection*{Sets and Languages}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 145 | |
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 146 | We will use the familiar operations $\cup$, $\cap$, $\subset$ | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 147 | and $\subseteq$ for sets. For the empty set we will either | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 148 | write $\varnothing$ or $\{\,\}$. The set containing the
 | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 149 | natural numbers $1$, $2$ and $3$, for example, we will write | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 150 | with curly braces as | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 151 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 152 | \[ | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 153 | \{1, 2, 3\}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 154 | \] | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 155 | |
| 496 | 156 | \noindent The notation $\in$ means \emph{element of}, so $1 \in \{1,
 | 
| 502 | 157 | 2, 3\}$ is true and $4 \in \{1, 2, 3\}$ is false.  Note that the
 | 
| 496 | 158 | \emph{list} $[1, 2, 3]$ is something different from the \emph{set}
 | 
| 159 | $\{1, 2, 3\}$: in the former we care about the order and potentially
 | |
| 160 | several occurrences of a number; while with the latter we do not. | |
| 502 | 161 | Also sets can potentially have infinitely many elements, whereas lists | 
| 162 | cannot. For example | |
| 496 | 163 | the set of all natural numbers $\{0, 1, 2, \ldots\}$ is infinite. This
 | 
| 164 | set is often also abbreviated as $\mathbb{N}$. Lists can be very large, but they cannot contain infinitely many elements.
 | |
| 165 | ||
| 550 | 166 | We can define sets by giving all their elements, for example $\{0, 1\}$ for
 | 
| 167 | the set containing just $0$ and $1$. But often we need to use \defn{set
 | |
| 168 |   comprehensions} to define sets. For example the set of all \emph{even}
 | |
| 169 | natural numbers can be defined as | |
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 170 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 171 | \[ | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 172 | \{n\;|\;n\in\mathbb{N} \wedge n\;\text{is even}\}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 173 | \] | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 174 | |
| 550 | 175 | \noindent Set comprehensions consist of a ``base set'' (in this case | 
| 176 | all the natural numbers) and a predicate (here eveness). | |
| 177 | ||
| 178 | Though silly, but the set $\{0, 1, 2\}$ could also be
 | |
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 179 | defined by the following set comprehension | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 180 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 181 | \[ | 
| 550 | 182 | \{n\;|\;  n \in \mathbb{N} \wedge n^2 < 9\}
 | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 183 | \] | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 184 | |
| 496 | 185 | \noindent Can you see why this defines the set $\{0, 1, 2\}$?  Notice
 | 
| 550 | 186 | that set comprehensions are quite powerful constructions. For example they | 
| 187 | could be used to define set union, | |
| 188 | set intersection and set difference: | |
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 189 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 190 | \begin{eqnarray*}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 191 | A \cup B & \dn & \{x\;|\; x \in A \vee x \in B\}\\
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 192 | A \cap B & \dn & \{x\;|\; x \in A \wedge x \in B\}\\
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 193 | A \backslash B & \dn & \{x\;|\; x \in A \wedge x \not\in B\} 
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 194 | \end{eqnarray*}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 195 | |
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 196 | \noindent In general set comprehensions are of the form | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 197 | $\{a\;|\;P\}$ which stands for the set of all elements $a$
 | 
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 198 | (from some set) for which some property $P$ holds. If programming | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 199 | is more your-kind-of-thing, you might recognise the similarities | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 200 | with for-comprehensions, for example for the silly set above you | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 201 | could write | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 202 | |
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 203 | \begin{lstlisting}[numbers=none]
 | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 204 | scala> for (n <- (0 to 10).toSet; if n * n < 9) yield n | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 205 | val res03: Set[Int] = Set(0, 1, 2) | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 206 | \end{lstlisting}
 | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 207 | |
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 208 | \noindent | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 209 | This is pretty much the same as $\{n\;|\;  n \in \mathbb{N} \wedge n^2 < 9\}$
 | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 210 | just in Scala syntax. | 
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 211 | |
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 212 | For defining sets, we will also often use the notion of the | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 213 | ``big union''. An example is as follows: | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 214 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 215 | \begin{equation}\label{bigunion}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 216 | \bigcup_{0\le n}\; \{n^2, n^2 + 1\}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 217 | \end{equation}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 218 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 219 | \noindent which is the set of all squares and their immediate | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 220 | successors, so | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 221 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 222 | \[ | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 223 | \{0, 1, 2, 4, 5, 9, 10, 16, 17, \ldots\}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 224 | \] | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 225 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 226 | \noindent A big union is a sequence of unions which are | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 227 | indexed typically by a natural number. So the big union in | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 228 | \eqref{bigunion} could equally be written as
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 229 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 230 | \[ | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 231 | \{0, 1\} \cup \{1, 2\} \cup \{4, 5\} \cup \{9, 10\} \cup 
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 232 | \ldots | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 233 | \] | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 234 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 235 | \noindent but using the big union notation is more concise. | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 236 | |
| 550 | 237 | As an aside: While this stuff about sets might all look trivial or | 
| 238 | even needlessly pedantic, \emph{Nature} is never simple. If you want
 | |
| 239 | to be amazed how complicated sets can get, watch out for the last | |
| 240 | lecture just before Christmas where I want to convince you of the fact | |
| 241 | that some sets are more infinite than other sets. Yes, you read | |
| 242 | correctly, there can be sets that are ``more infinite'' than | |
| 243 | others. If you think this is obvious: say you have the infinite set | |
| 244 | $\mathbb{N}\backslash\{0\} = \{1, 2, 3, 4, \ldots\}$ which is all the
 | |
| 245 | natural numbers except $0$, and then compare it to the set | |
| 246 | $\{0, 1, 2, 3, 4, \ldots\}$ which contains the $0$ and all other
 | |
| 247 | numbers. If you think, the second must be more infinite\ldots{} well,
 | |
| 248 | then think again. Because the two infinite sets | |
| 496 | 249 | |
| 250 | \begin{center}
 | |
| 251 |   $\{1, 2, 3, 4, \ldots\}$ and
 | |
| 252 |   $\{0, 1, 2, 3, 4, \ldots\}$
 | |
| 253 | \end{center}
 | |
| 254 | ||
| 255 | \noindent | |
| 550 | 256 | contain actually the same amount of elements. Does this make sense? | 
| 257 | Though this might all look strange, infinite sets will be a | |
| 496 | 258 | topic that is very relevant to the material of this module. It tells | 
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 259 | us what we can compute with a computer (actually an algorithm) and what | 
| 502 | 260 | we cannot. But during the first 9 lectures we can go by without this | 
| 550 | 261 | ``weird'' stuff. End of aside.\smallskip | 
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 262 | |
| 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 263 | Another important notion in this module are \defn{languages}, which
 | 
| 398 
c8ce95067c1a
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
332diff
changeset | 264 | are sets of strings. One of the main goals for us will be how to | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 265 | (formally) specify languages and to find out whether a string | 
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 266 | is in a language or not.\footnote{You might wish to ponder
 | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 267 | whether this is in general a hard or easy problem, where | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 268 | hardness is meant in terms of Turing decidable, for example.} | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 269 | Note that the language containing the empty string $\{\dq{}\}$
 | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 270 | is not equal to $\varnothing$, the empty language (or empty | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 271 | set): The former contains one element, namely \dq{} (also
 | 
| 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 272 | written $[\,]$), but the latter does not contain any | 
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 273 | element at all! Make sure you see the difference. | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 274 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 275 | For languages we define the operation of \defn{language
 | 
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 276 | concatenation}, written like in the string case as $A @ B$: | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 277 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 278 | \begin{equation}\label{langconc}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 279 | A @ B \dn \{s_1 @ s_2\;|\; s_1\in A \wedge s_2\in B\}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 280 | \end{equation}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 281 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 282 | \noindent Be careful to understand the difference: the $@$ | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 283 | in $s_1 @ s_2$ is string concatenation, while $A @ B$ refers | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 284 | to the concatenation of two languages (or sets of strings). | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 285 | As an example suppose $A=\{ab, ac\}$ and $B=\{zzz, qq, r\}$,
 | 
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 286 | then $A \,@\, B$ is the language | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 287 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 288 | \[ | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 289 | \{abzzz, abqq, abr, aczzz, acqq, acr\}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 290 | \] | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 291 | |
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 292 | \noindent The cool thing about Scala is that we can define language | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 293 | concatenation very elegantly as | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 294 | |
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 295 | \begin{lstlisting}[numbers=none]
 | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 296 | def concat(A: Set[String], B: Set[String]) = | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 297 | for (x <- A ; y <- B) yield x ++ y | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 298 | \end{lstlisting}
 | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 299 | |
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 300 | \noindent | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 301 | where \code{++} is string concatenation in Scala.
 | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 302 | |
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 303 | Recall the properties for string concatenation. For | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 304 | language concatenation we have the following properties | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 305 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 306 | \begin{center}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 307 | \begin{tabular}{ll}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 308 | associativity: & $(A @ B) @ C = A @ (B @ C)$\\ | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 309 | unit element:  & $A \,@\, \{[]\} = \{[]\} \,@\, A = A$\\
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 310 | zero element: & $A \,@\, \varnothing = \varnothing \,@\, A = | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 311 | \varnothing$ | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 312 | \end{tabular}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 313 | \end{center}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 314 | |
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 315 | \noindent Note the difference in the last two lines: the empty | 
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 316 | set behaves like $0$ for multiplication, whereas the set $\{[]\}$
 | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 317 | behaves like $1$ for multiplication ($n * 1 = n$ and $n * 0 = 0$). | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 318 | Again this is a subtletly you need to get compfortable with. | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 319 | |
| 502 | 320 | Using the operation of language concatenation, we can define a | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 321 | \defn{language power} operation as follows:
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 322 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 323 | \begin{eqnarray*}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 324 | A^0     & \dn & \{[]\}\\
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 325 | A^{n+1} & \dn & A \,@\, A^n
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 326 | \end{eqnarray*}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 327 | |
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 328 | \noindent This definition is by recursion on natural numbers. | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 329 | Note carefully that the zero-case is not defined as the empty | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 330 | set, but the set containing the empty string. So no matter | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 331 | what the set $A$ is, $A^0$ will always be $\{[]\}$. (There is
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 332 | another hint about a connection between the $@$-operation and | 
| 502 | 333 | multiplication: How is $x^n$ defined in mathematics and what is | 
| 242 
35104ee14f87
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
241diff
changeset | 334 | $x^0$?) | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 335 | |
| 242 
35104ee14f87
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
241diff
changeset | 336 | Next we can define the \defn{star operation} for languages:
 | 
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 337 | $A\star$ is the union of all powers of $A$, or short | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 338 | |
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 339 | \begin{equation}\label{star}
 | 
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 340 | A\star \dn \bigcup_{0\le n}\; A^n
 | 
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 341 | \end{equation}
 | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 342 | |
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 343 | \noindent This star operation is often also called | 
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 344 | \emph{Kleene-star} after the mathematician/computer scientist
 | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 345 | Stephen Cole Kleene. Unfolding the definition in~\eqref{star}
 | 
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 346 | gives | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 347 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 348 | \[ | 
| 502 | 349 | A\star \dn A^0 \cup A^1 \cup A^2 \cup A^3 \cup \ldots | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 350 | \] | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 351 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 352 | \noindent | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 353 | which is equal to | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 354 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 355 | \[ | 
| 502 | 356 | A\star \dn \{[]\} \,\cup\, A \,\cup\, A @ A \,\cup\, A @ A @ A \,\cup\, \ldots
 | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 357 | \] | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 358 | |
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 359 | \noindent We can see that the empty string is always in $A\star$, | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 360 | no matter what $A$ is. This is because $[] \in A^0$. To make | 
| 241 
10f02605a46a
polished
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
239diff
changeset | 361 | sure you understand these definitions, I leave you to answer | 
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 362 | what $\{[]\}\star$ and $\varnothing\star$ are?
 | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 363 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 364 | Recall that an alphabet is often referred to by the letter | 
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 365 | $\Sigma$. We can now write for the set of \emph{all} strings
 | 
| 502 | 366 | over this alphabet as $\Sigma\star$. In doing so we also include the | 
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 367 | empty string as a possible string (over $\Sigma$). Assuming $\Sigma | 
| 404 
245d302791c7
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
398diff
changeset | 368 | = \{a, b\}$, then $\Sigma\star$ is
 | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 369 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 370 | \[ | 
| 246 
baf41b05210f
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
242diff
changeset | 371 | \{[], a, b, aa, ab, ba, bb, aaa, aab, aba, abb, baa, bab, \ldots\}
 | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 372 | \] | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 373 | |
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 374 | \noindent or in words all strings containing $a$s and | 
| 246 
baf41b05210f
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
242diff
changeset | 375 | $b$s only, plus the empty string. | 
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 376 | \bigskip | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 377 | |
| 736 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 378 | \noindent | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 379 | Thanks for making it until here! There are also some personal conventions | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 380 | about regular expressions. But I will explain them in the handout for the | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 381 | first week. An exercise you can do: Implement the power operation for languages | 
| 
0494995fd979
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
550diff
changeset | 382 | and try out some examples. | 
| 239 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 383 | \end{document}
 | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 384 | |
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 385 | %%% Local Variables: | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 386 | %%% mode: latex | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 387 | %%% TeX-master: t | 
| 
68d98140b90b
added notation handout
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 388 | %%% End: |