author | Christian Urban <christian.urban@kcl.ac.uk> |
Fri, 29 Nov 2024 18:59:32 +0000 | |
changeset 976 | e9eac62928f5 |
parent 875 | 49d21814a633 |
permissions | -rw-r--r-- |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
1 |
\documentclass{article} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
2 |
\usepackage{../style} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
3 |
\usepackage{../langs} |
830 | 4 |
\usepackage{../graphics} |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
5 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
6 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
7 |
\begin{document} |
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
8 |
\fnote{\copyright{} Christian Urban, King's College London, 2014, 2015, 2016, 2017, 2018, 2020} |
505 | 9 |
|
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
10 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
11 |
\section*{A Crash-Course on Notation} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
12 |
|
875 | 13 |
There are an innumerable number of books available on compilers, automata theory |
476 | 14 |
and formal languages. Unfortunately, they often use their own |
15 |
notational conventions and their own symbols. This handout is meant to |
|
502 | 16 |
clarify some of the notation I will use. I apologise in advance that |
476 | 17 |
sometimes I will be a bit fuzzy\ldots the problem is that often we |
18 |
want to have convenience in our mathematical definitions (to make them |
|
19 |
readable and understandable), but other times we need pedantic |
|
20 |
precision for actual programs. |
|
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
21 |
|
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
22 |
\subsubsection*{Characters and Strings} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
23 |
|
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
24 |
The most basic concept in this module are strings. Strings |
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
25 |
are composed of \defn{characters}. While characters are surely |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
26 |
a familiar concept, we will make one subtle distinction in |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
27 |
this module. If we want to refer to concrete characters, like |
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
28 |
\code{a}, \code{b}, \code{c} and so on, we will use a typewriter font. |
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
29 |
Accordingly if we want to refer to the concrete characters of |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
30 |
my email address we shall write |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
31 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
32 |
\begin{center} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
33 |
\pcode{christian.urban@kcl.ac.uk} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
34 |
\end{center} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
35 |
|
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
36 |
\noindent If we also need to explicitly indicate the ``space'' |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
37 |
character, we write \VS{}\hspace{1mm}. For example |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
38 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
39 |
\begin{center} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
40 |
\tt{}hello\VS\hspace{0.5mm}world |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
41 |
\end{center} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
42 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
43 |
|
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
44 |
\noindent But often we do not care which particular characters |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
45 |
we use. In such cases we use the italic font and write $a$, |
332
4755ad4b457b
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
266
diff
changeset
|
46 |
$b$, $c$ and so on for characters. Therefore if we need a |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
47 |
representative string, we might write |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
48 |
|
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
49 |
\[ |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
50 |
abracadabra |
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
51 |
\] |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
52 |
|
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
53 |
\noindent In this string, we do not really care what the |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
54 |
characters stand for, except we do care about the fact that |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
55 |
for example the character $a$ is not equal to $b$ and so on. |
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
56 |
Why do I make this distinction? Because we often need to |
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
57 |
define functions using variables ranging over characters. We |
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
58 |
need to somehow say ``this-is-a-variable'' and give it a name. |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
59 |
In such cases we use the italic font. |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
60 |
|
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
61 |
|
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
62 |
An \defn{alphabet} is a (non-empty) finite set of characters. |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
63 |
Often the letter $\Sigma$ is used to refer to an alphabet. For |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
64 |
example the ASCII characters \pcode{a} to \pcode{z} form an |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
65 |
alphabet. The digits $0$ to $9$ are another alphabet. The |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
66 |
Greek letters $\alpha$ to $\omega$ also form an alphabet. If |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
67 |
nothing else is specified, we usually assume the alphabet |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
68 |
consists of just the lower-case letters $a$, $b$, \ldots, $z$. |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
69 |
Sometimes, however, we explicitly want to restrict strings to |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
70 |
contain only the letters $a$ and $b$, for example. In this |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
71 |
case we will state that the alphabet is the set $\{a, b\}$. |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
72 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
73 |
\defn{Strings} are lists of characters. Unfortunately, there |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
74 |
are many ways how we can write down strings. In programming |
496 | 75 |
languages, they are usually written as \dq{\texttt{hello}} where the |
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
76 |
double quotes indicate that we are dealing with a string. In |
496 | 77 |
typed programming languages, such as Scala, strings have a special |
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
78 |
type---namely \pcode{String} which is different from the type |
502 | 79 |
for lists of characters. This is because strings can be |
80 |
efficiently represented in memory, unlike lists. Since |
|
81 |
\code{String} and the type of lists of characters |
|
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
82 |
(namely \code{List[Char]}) are not the same, we need to explicitly |
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
83 |
coerce elements between the two types, for example |
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
84 |
|
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
85 |
\begin{lstlisting}[numbers=none] |
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
86 |
scala> "abc".toList |
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
87 |
res01: List[Char] = List(a, b, c) |
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
88 |
\end{lstlisting} |
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
89 |
|
502 | 90 |
\noindent |
91 |
However, we do not want to do this kind of explicit coercion in our |
|
92 |
pencil-and-paper, everyday arguments. So in our (mathematical) |
|
550 | 93 |
definitions we regard strings as lists of characters and we will also |
94 |
write \dq{$hello$} as list |
|
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
95 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
96 |
\[ |
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
97 |
[\text{\it h, e, l, l, o}] \qquad\text{or simply}\qquad \textit{hello} |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
98 |
\] |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
99 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
100 |
\noindent The important point is that we can always decompose |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
101 |
such strings. For example, we will often consider the first |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
102 |
character of a string, say $h$, and the ``rest'' of a string |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
103 |
say \dq{\textit{ello}} when making definitions about strings. |
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
104 |
There are also some subtleties with the empty string, |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
105 |
sometimes written as \dq{} but also as the empty list of |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
106 |
characters $[\,]$.\footnote{In the literature you can also |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
107 |
often find that $\varepsilon$ or $\lambda$ is used to |
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
108 |
represent the empty string. But we are not going to use this notation.} |
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
109 |
|
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
110 |
Two strings, say $s_1$ and $s_2$, can be \defn{concatenated}, |
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
111 |
which we write as $s_1 @ s_2$. If we regard $s_1$ and $s_2$ as |
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
112 |
lists of characters, then $@$ is the list-append function. |
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
113 |
Suppose we are given two strings \dq{\textit{foo}} and |
502 | 114 |
\dq{\textit{bar}}, then their concatenation, written |
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
115 |
\dq{\textit{foo}} $@$ \dq{\textit{bar}}, gives |
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
116 |
\dq{\textit{foobar}}. But as said above, we will often |
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
117 |
simplify our life and just drop the double quotes whenever it |
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
118 |
is clear we are talking about strings. So we will just |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
119 |
write \textit{foo}, \textit{bar}, \textit{foobar} |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
120 |
\textit{foo $@$ bar} and so on. |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
121 |
|
496 | 122 |
Occasionally we will use the notation $a^n$ for strings, which stands |
123 |
for the string of $n$ repeated $a$s. So $a^{n}b^{n}$ is a string that |
|
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
124 |
has some number of $a$s followed by the same number of $b$s. |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
125 |
Confusingly, in Scala the notation is ``times'' for this opration. |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
126 |
So you can write |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
127 |
|
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
128 |
\begin{lstlisting}[numbers=none] |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
129 |
scala> "a" * 13 |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
130 |
val res02: String = aaaaaaaaaaaaa |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
131 |
\end{lstlisting} |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
132 |
|
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
133 |
\noindent |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
134 |
A simple property of string concatenation is \emph{associativity}, meaning |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
135 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
136 |
\[(s_1 @ s_2) @ s_3 = s_1 @ (s_2 @ s_3)\] |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
137 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
138 |
\noindent are always equal strings. The empty string behaves |
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
139 |
like a \emph{unit element}, therefore |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
140 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
141 |
\[s \,@\, [] = [] \,@\, s = s\] |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
142 |
|
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
143 |
|
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
144 |
\subsubsection*{Sets and Languages} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
145 |
|
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
146 |
We will use the familiar operations $\cup$, $\cap$, $\subset$ |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
147 |
and $\subseteq$ for sets. For the empty set we will either |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
148 |
write $\varnothing$ or $\{\,\}$. The set containing the |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
149 |
natural numbers $1$, $2$ and $3$, for example, we will write |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
150 |
with curly braces as |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
151 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
152 |
\[ |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
153 |
\{1, 2, 3\} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
154 |
\] |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
155 |
|
496 | 156 |
\noindent The notation $\in$ means \emph{element of}, so $1 \in \{1, |
502 | 157 |
2, 3\}$ is true and $4 \in \{1, 2, 3\}$ is false. Note that the |
496 | 158 |
\emph{list} $[1, 2, 3]$ is something different from the \emph{set} |
159 |
$\{1, 2, 3\}$: in the former we care about the order and potentially |
|
160 |
several occurrences of a number; while with the latter we do not. |
|
502 | 161 |
Also sets can potentially have infinitely many elements, whereas lists |
162 |
cannot. For example |
|
496 | 163 |
the set of all natural numbers $\{0, 1, 2, \ldots\}$ is infinite. This |
164 |
set is often also abbreviated as $\mathbb{N}$. Lists can be very large, but they cannot contain infinitely many elements. |
|
165 |
||
550 | 166 |
We can define sets by giving all their elements, for example $\{0, 1\}$ for |
167 |
the set containing just $0$ and $1$. But often we need to use \defn{set |
|
168 |
comprehensions} to define sets. For example the set of all \emph{even} |
|
169 |
natural numbers can be defined as |
|
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
170 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
171 |
\[ |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
172 |
\{n\;|\;n\in\mathbb{N} \wedge n\;\text{is even}\} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
173 |
\] |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
174 |
|
550 | 175 |
\noindent Set comprehensions consist of a ``base set'' (in this case |
176 |
all the natural numbers) and a predicate (here eveness). |
|
177 |
||
178 |
Though silly, but the set $\{0, 1, 2\}$ could also be |
|
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
179 |
defined by the following set comprehension |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
180 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
181 |
\[ |
550 | 182 |
\{n\;|\; n \in \mathbb{N} \wedge n^2 < 9\} |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
183 |
\] |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
184 |
|
496 | 185 |
\noindent Can you see why this defines the set $\{0, 1, 2\}$? Notice |
550 | 186 |
that set comprehensions are quite powerful constructions. For example they |
187 |
could be used to define set union, |
|
188 |
set intersection and set difference: |
|
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
189 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
190 |
\begin{eqnarray*} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
191 |
A \cup B & \dn & \{x\;|\; x \in A \vee x \in B\}\\ |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
192 |
A \cap B & \dn & \{x\;|\; x \in A \wedge x \in B\}\\ |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
193 |
A \backslash B & \dn & \{x\;|\; x \in A \wedge x \not\in B\} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
194 |
\end{eqnarray*} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
195 |
|
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
196 |
\noindent In general set comprehensions are of the form |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
197 |
$\{a\;|\;P\}$ which stands for the set of all elements $a$ |
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
198 |
(from some set) for which some property $P$ holds. If programming |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
199 |
is more your-kind-of-thing, you might recognise the similarities |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
200 |
with for-comprehensions, for example for the silly set above you |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
201 |
could write |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
202 |
|
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
203 |
\begin{lstlisting}[numbers=none] |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
204 |
scala> for (n <- (0 to 10).toSet; if n * n < 9) yield n |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
205 |
val res03: Set[Int] = Set(0, 1, 2) |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
206 |
\end{lstlisting} |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
207 |
|
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
208 |
\noindent |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
209 |
This is pretty much the same as $\{n\;|\; n \in \mathbb{N} \wedge n^2 < 9\}$ |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
210 |
just in Scala syntax. |
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
211 |
|
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
212 |
For defining sets, we will also often use the notion of the |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
213 |
``big union''. An example is as follows: |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
214 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
215 |
\begin{equation}\label{bigunion} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
216 |
\bigcup_{0\le n}\; \{n^2, n^2 + 1\} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
217 |
\end{equation} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
218 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
219 |
\noindent which is the set of all squares and their immediate |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
220 |
successors, so |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
221 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
222 |
\[ |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
223 |
\{0, 1, 2, 4, 5, 9, 10, 16, 17, \ldots\} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
224 |
\] |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
225 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
226 |
\noindent A big union is a sequence of unions which are |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
227 |
indexed typically by a natural number. So the big union in |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
228 |
\eqref{bigunion} could equally be written as |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
229 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
230 |
\[ |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
231 |
\{0, 1\} \cup \{1, 2\} \cup \{4, 5\} \cup \{9, 10\} \cup |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
232 |
\ldots |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
233 |
\] |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
234 |
|
830 | 235 |
\noindent but using the big union notation is more concise.\medskip |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
236 |
|
550 | 237 |
As an aside: While this stuff about sets might all look trivial or |
238 |
even needlessly pedantic, \emph{Nature} is never simple. If you want |
|
239 |
to be amazed how complicated sets can get, watch out for the last |
|
240 |
lecture just before Christmas where I want to convince you of the fact |
|
241 |
that some sets are more infinite than other sets. Yes, you read |
|
242 |
correctly, there can be sets that are ``more infinite'' than |
|
243 |
others. If you think this is obvious: say you have the infinite set |
|
244 |
$\mathbb{N}\backslash\{0\} = \{1, 2, 3, 4, \ldots\}$ which is all the |
|
245 |
natural numbers except $0$, and then compare it to the set |
|
246 |
$\{0, 1, 2, 3, 4, \ldots\}$ which contains the $0$ and all other |
|
247 |
numbers. If you think, the second must be more infinite\ldots{} well, |
|
248 |
then think again. Because the two infinite sets |
|
496 | 249 |
|
250 |
\begin{center} |
|
251 |
$\{1, 2, 3, 4, \ldots\}$ and |
|
252 |
$\{0, 1, 2, 3, 4, \ldots\}$ |
|
253 |
\end{center} |
|
254 |
||
255 |
\noindent |
|
830 | 256 |
contain actually the same amount of elements. Does this make sense to you? |
257 |
If yes, good. If not, then something to learn about. |
|
258 |
||
550 | 259 |
Though this might all look strange, infinite sets will be a |
496 | 260 |
topic that is very relevant to the material of this module. It tells |
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
261 |
us what we can compute with a computer (actually an algorithm) and what |
502 | 262 |
we cannot. But during the first 9 lectures we can go by without this |
550 | 263 |
``weird'' stuff. End of aside.\smallskip |
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
264 |
|
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
265 |
Another important notion in this module are \defn{languages}, which |
398
c8ce95067c1a
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
332
diff
changeset
|
266 |
are sets of strings. One of the main goals for us will be how to |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
267 |
(formally) specify languages and to find out whether a string |
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
268 |
is in a language or not.\footnote{You might wish to ponder |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
269 |
whether this is in general a hard or easy problem, where |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
270 |
hardness is meant in terms of Turing decidable, for example.} |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
271 |
Note that the language containing the empty string $\{\dq{}\}$ |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
272 |
is not equal to $\varnothing$, the empty language (or empty |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
273 |
set): The former contains one element, namely \dq{} (also |
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
274 |
written $[\,]$), but the latter does not contain any |
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
275 |
element at all! Make sure you see the difference. |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
276 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
277 |
For languages we define the operation of \defn{language |
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
278 |
concatenation}, written like in the string case as $A @ B$: |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
279 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
280 |
\begin{equation}\label{langconc} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
281 |
A @ B \dn \{s_1 @ s_2\;|\; s_1\in A \wedge s_2\in B\} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
282 |
\end{equation} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
283 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
284 |
\noindent Be careful to understand the difference: the $@$ |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
285 |
in $s_1 @ s_2$ is string concatenation, while $A @ B$ refers |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
286 |
to the concatenation of two languages (or sets of strings). |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
287 |
As an example suppose $A=\{ab, ac\}$ and $B=\{zzz, qq, r\}$, |
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
288 |
then $A \,@\, B$ is the language |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
289 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
290 |
\[ |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
291 |
\{abzzz, abqq, abr, aczzz, acqq, acr\} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
292 |
\] |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
293 |
|
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
294 |
\noindent The cool thing about Scala is that we can define language |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
295 |
concatenation very elegantly as |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
296 |
|
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
297 |
\begin{lstlisting}[numbers=none] |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
298 |
def concat(A: Set[String], B: Set[String]) = |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
299 |
for (x <- A ; y <- B) yield x ++ y |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
300 |
\end{lstlisting} |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
301 |
|
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
302 |
\noindent |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
303 |
where \code{++} is string concatenation in Scala. |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
304 |
|
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
305 |
Recall the properties for string concatenation. For |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
306 |
language concatenation we have the following properties |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
307 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
308 |
\begin{center} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
309 |
\begin{tabular}{ll} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
310 |
associativity: & $(A @ B) @ C = A @ (B @ C)$\\ |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
311 |
unit element: & $A \,@\, \{[]\} = \{[]\} \,@\, A = A$\\ |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
312 |
zero element: & $A \,@\, \varnothing = \varnothing \,@\, A = |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
313 |
\varnothing$ |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
314 |
\end{tabular} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
315 |
\end{center} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
316 |
|
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
317 |
\noindent Note the difference in the last two lines: the empty |
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
318 |
set behaves like $0$ for multiplication, whereas the set $\{[]\}$ |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
319 |
behaves like $1$ for multiplication ($n * 1 = n$ and $n * 0 = 0$). |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
320 |
Again this is a subtletly you need to get compfortable with. |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
321 |
|
502 | 322 |
Using the operation of language concatenation, we can define a |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
323 |
\defn{language power} operation as follows: |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
324 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
325 |
\begin{eqnarray*} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
326 |
A^0 & \dn & \{[]\}\\ |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
327 |
A^{n+1} & \dn & A \,@\, A^n |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
328 |
\end{eqnarray*} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
329 |
|
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
330 |
\noindent This definition is by recursion on natural numbers. |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
331 |
Note carefully that the zero-case is not defined as the empty |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
332 |
set, but the set containing the empty string. So no matter |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
333 |
what the set $A$ is, $A^0$ will always be $\{[]\}$. (There is |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
334 |
another hint about a connection between the $@$-operation and |
502 | 335 |
multiplication: How is $x^n$ defined in mathematics and what is |
242
35104ee14f87
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
241
diff
changeset
|
336 |
$x^0$?) |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
337 |
|
242
35104ee14f87
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
241
diff
changeset
|
338 |
Next we can define the \defn{star operation} for languages: |
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
339 |
$A\star$ is the union of all powers of $A$, or short |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
340 |
|
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
341 |
\begin{equation}\label{star} |
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
342 |
A\star \dn \bigcup_{0\le n}\; A^n |
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
343 |
\end{equation} |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
344 |
|
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
345 |
\noindent This star operation is often also called |
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
346 |
\emph{Kleene-star} after the mathematician/computer scientist |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
347 |
Stephen Cole Kleene. Unfolding the definition in~\eqref{star} |
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
348 |
gives |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
349 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
350 |
\[ |
502 | 351 |
A\star \dn A^0 \cup A^1 \cup A^2 \cup A^3 \cup \ldots |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
352 |
\] |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
353 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
354 |
\noindent |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
355 |
which is equal to |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
356 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
357 |
\[ |
502 | 358 |
A\star \dn \{[]\} \,\cup\, A \,\cup\, A @ A \,\cup\, A @ A @ A \,\cup\, \ldots |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
359 |
\] |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
360 |
|
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
361 |
\noindent We can see that the empty string is always in $A\star$, |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
362 |
no matter what $A$ is. This is because $[] \in A^0$. To make |
241
10f02605a46a
polished
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
239
diff
changeset
|
363 |
sure you understand these definitions, I leave you to answer |
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
364 |
what $\{[]\}\star$ and $\varnothing\star$ are? |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
365 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
366 |
Recall that an alphabet is often referred to by the letter |
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
367 |
$\Sigma$. We can now write for the set of \emph{all} strings |
502 | 368 |
over this alphabet as $\Sigma\star$. In doing so we also include the |
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
369 |
empty string as a possible string (over $\Sigma$). Assuming $\Sigma |
404
245d302791c7
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
398
diff
changeset
|
370 |
= \{a, b\}$, then $\Sigma\star$ is |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
371 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
372 |
\[ |
246
baf41b05210f
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
242
diff
changeset
|
373 |
\{[], a, b, aa, ab, ba, bb, aaa, aab, aba, abb, baa, bab, \ldots\} |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
374 |
\] |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
375 |
|
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
376 |
\noindent or in words all strings containing $a$s and |
246
baf41b05210f
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
242
diff
changeset
|
377 |
$b$s only, plus the empty string. |
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
378 |
\bigskip |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
379 |
|
736
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
380 |
\noindent |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
381 |
Thanks for making it until here! There are also some personal conventions |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
382 |
about regular expressions. But I will explain them in the handout for the |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
383 |
first week. An exercise you can do: Implement the power operation for languages |
d3e477fe6c66
updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
550
diff
changeset
|
384 |
and try out some examples. |
239
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
385 |
\end{document} |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
386 |
|
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
387 |
%%% Local Variables: |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
388 |
%%% mode: latex |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
389 |
%%% TeX-master: t |
68d98140b90b
added notation handout
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff
changeset
|
390 |
%%% End: |