handouts/notation.tex
changeset 502 bf1c472b244e
parent 496 5c9de27a5b30
child 505 5b9cf7fbd51a
equal deleted inserted replaced
501:b718b9770dae 502:bf1c472b244e
     9 \section*{A Crash-Course on Notation}
     9 \section*{A Crash-Course on Notation}
    10 
    10 
    11 There are innumerable books available about compilers, automata theory
    11 There are innumerable books available about compilers, automata theory
    12 and formal languages. Unfortunately, they often use their own
    12 and formal languages. Unfortunately, they often use their own
    13 notational conventions and their own symbols. This handout is meant to
    13 notational conventions and their own symbols. This handout is meant to
    14 clarify some of the notation I will use. I appologise in advance that
    14 clarify some of the notation I will use. I apologise in advance that
    15 sometimes I will be a bit fuzzy\ldots the problem is that often we
    15 sometimes I will be a bit fuzzy\ldots the problem is that often we
    16 want to have convenience in our mathematical definitions (to make them
    16 want to have convenience in our mathematical definitions (to make them
    17 readable and understandable), but other times we need pedantic
    17 readable and understandable), but other times we need pedantic
    18 precision for actual programs.
    18 precision for actual programs.
    19 
    19 
    52 characters stand for, except we do care about the fact that
    52 characters stand for, except we do care about the fact that
    53 for example the character $a$ is not equal to $b$ and so on.
    53 for example the character $a$ is not equal to $b$ and so on.
    54 Why do I make this distinction? Because we often need to
    54 Why do I make this distinction? Because we often need to
    55 define functions using variables ranging over characters. We
    55 define functions using variables ranging over characters. We
    56 need to somehow say this is a variable, say $c$, ranging over
    56 need to somehow say this is a variable, say $c$, ranging over
    57 characters, while this is the atual character \pcode{c}.
    57 characters, while this is the actual character \pcode{c}.
    58 
    58 
    59 An \defn{alphabet} is a (non-empty) finite set of characters.
    59 An \defn{alphabet} is a (non-empty) finite set of characters.
    60 Often the letter $\Sigma$ is used to refer to an alphabet. For
    60 Often the letter $\Sigma$ is used to refer to an alphabet. For
    61 example the ASCII characters \pcode{a} to \pcode{z} form an
    61 example the ASCII characters \pcode{a} to \pcode{z} form an
    62 alphabet. The digits $0$ to $9$ are another alphabet. The
    62 alphabet. The digits $0$ to $9$ are another alphabet. The
    71 are many ways how we can write down strings. In programming
    71 are many ways how we can write down strings. In programming
    72 languages, they are usually written as \dq{\texttt{hello}} where the
    72 languages, they are usually written as \dq{\texttt{hello}} where the
    73 double quotes indicate that we are dealing with a string. In
    73 double quotes indicate that we are dealing with a string. In
    74 typed programming languages, such as Scala, strings have a special
    74 typed programming languages, such as Scala, strings have a special
    75 type---namely \pcode{String} which is different from the type
    75 type---namely \pcode{String} which is different from the type
    76 for lists of chatacters. This is because strings can be
    76 for lists of characters. This is because strings can be
    77 efficiently represented in memory, unlike general lists. Since
    77 efficiently represented in memory, unlike lists. Since
    78 \code{String} and the type of lists of characters,
    78 \code{String} and the type of lists of characters
    79 \code{List[Char]} are not the same, we need to explicitly
    79 (\code{List[Char]}) are not the same, we need to explicitly
    80 coerce elements between the two types, for example
    80 coerce elements between the two types, for example
    81 
    81 
    82 \begin{lstlisting}[numbers=none]
    82 \begin{lstlisting}[numbers=none]
    83 scala> "abc".toList
    83 scala> "abc".toList
    84 res01: List[Char] = List(a, b, c)
    84 res01: List[Char] = List(a, b, c)
    85 \end{lstlisting}
    85 \end{lstlisting}
    86 
    86 
    87 \noindent Since in our (mathematical) definitions we regard
    87 \noindent
    88 strings as lists of characters, we will also write
    88 However, we do not want to do this kind of explicit coercion in our
    89 \dq{$hello$} as
    89 pencil-and-paper, everyday arguments.  So in our (mathematical)
       
    90 definitions we regard strings as lists of characters, we will also
       
    91 write \dq{$hello$} as
    90 
    92 
    91 \[
    93 \[
    92 [\text{\it h, e, l, l, o}] \qquad\text{or simply}\qquad \textit{hello}
    94 [\text{\it h, e, l, l, o}] \qquad\text{or simply}\qquad \textit{hello}
    93 \]
    95 \]
    94 
    96 
   104 
   106 
   105 Two strings, say $s_1$ and $s_2$, can be \defn{concatenated},
   107 Two strings, say $s_1$ and $s_2$, can be \defn{concatenated},
   106 which we write as $s_1 @ s_2$. If we regard $s_1$ and $s_2$ as
   108 which we write as $s_1 @ s_2$. If we regard $s_1$ and $s_2$ as
   107 lists of characters, then $@$ is the list-append function.
   109 lists of characters, then $@$ is the list-append function.
   108 Suppose we are given two strings \dq{\textit{foo}} and
   110 Suppose we are given two strings \dq{\textit{foo}} and
   109 \dq{\textit{bar}}, then their concatenation, writen
   111 \dq{\textit{bar}}, then their concatenation, written
   110 \dq{\textit{foo}} $@$ \dq{\textit{bar}}, gives
   112 \dq{\textit{foo}} $@$ \dq{\textit{bar}}, gives
   111 \dq{\textit{foobar}}. But as said above, we will often
   113 \dq{\textit{foobar}}. But as said above, we will often
   112 simplify our life and just drop the double quotes whenever it
   114 simplify our life and just drop the double quotes whenever it
   113 is clear we are talking about strings, So we will often just
   115 is clear we are talking about strings, So we will often just
   114 write \textit{foo}, \textit{bar}, \textit{foobar} or
   116 write \textit{foo}, \textit{bar}, \textit{foobar} or
   115 \textit{foo $@$ bar}.
   117 \textit{foo $@$ bar}.
   116 
   118 
   117 Occasionally we will use the notation $a^n$ for strings, which stands
   119 Occasionally we will use the notation $a^n$ for strings, which stands
   118 for the string of $n$ repeated $a$s. So $a^{n}b^{n}$ is a string that
   120 for the string of $n$ repeated $a$s. So $a^{n}b^{n}$ is a string that
   119 has as many $a$s by as many $b$s.  A simple property of string
   121 has some number of $a$s followed by the same number of $b$s.  A simple
   120 concatenation is \emph{associativity}, meaning
   122 property of string concatenation is \emph{associativity}, meaning
   121 
   123 
   122 \[(s_1 @ s_2) @ s_3 = s_1 @ (s_2 @ s_3)\]  
   124 \[(s_1 @ s_2) @ s_3 = s_1 @ (s_2 @ s_3)\]  
   123 
   125 
   124 \noindent are always equal strings. The empty string behaves
   126 \noindent are always equal strings. The empty string behaves
   125 like a \emph{unit element}, therefore
   127 like a \emph{unit element}, therefore
   138 \[
   140 \[
   139 \{1, 2, 3\}
   141 \{1, 2, 3\}
   140 \]
   142 \]
   141 
   143 
   142 \noindent The notation $\in$ means \emph{element of}, so $1 \in \{1,
   144 \noindent The notation $\in$ means \emph{element of}, so $1 \in \{1,
   143 2, 3\}$ is true and $4 \in \{1, 2, 3\}$ is false.  Not that the
   145 2, 3\}$ is true and $4 \in \{1, 2, 3\}$ is false.  Note that the
   144 \emph{list} $[1, 2, 3]$ is something different from the \emph{set}
   146 \emph{list} $[1, 2, 3]$ is something different from the \emph{set}
   145 $\{1, 2, 3\}$: in the former we care about the order and potentially
   147 $\{1, 2, 3\}$: in the former we care about the order and potentially
   146 several occurrences of a number; while with the latter we do not.
   148 several occurrences of a number; while with the latter we do not.
   147 Also sets can potentially have infinitely many elements. For example
   149 Also sets can potentially have infinitely many elements, whereas lists
       
   150 cannot. For example
   148 the set of all natural numbers $\{0, 1, 2, \ldots\}$ is infinite. This
   151 the set of all natural numbers $\{0, 1, 2, \ldots\}$ is infinite. This
   149 set is often also abbreviated as $\mathbb{N}$. Lists can be very large, but they cannot contain infinitely many elements.
   152 set is often also abbreviated as $\mathbb{N}$. Lists can be very large, but they cannot contain infinitely many elements.
   150 
   153 
   151 We can define sets by giving all elements, for example $\{0, 1\}$ for
   154 We can define sets by giving all elements, for example $\{0, 1\}$ for
   152 the set containing just $0$ and $1$, but also by \defn{set
   155 the set containing just $0$ and $1$, but also by \defn{set
   201 \ldots
   204 \ldots
   202 \]
   205 \]
   203 
   206 
   204 \noindent but using the big union notation is more concise.
   207 \noindent but using the big union notation is more concise.
   205 
   208 
   206 While this stuff about sets might all look trivial or even needlessly
   209 As an aside: While this stuff about sets might all look trivial or even needlessly
   207 pedantic, \emph{Nature} is never simple. If you want to be amazed how
   210 pedantic, \emph{Nature} is never simple. If you want to be amazed how
   208 complicated sets can get, watch out for the last lecture just before
   211 complicated sets can get, watch out for the last lecture just before
   209 Christmas where I want to convince you of the fact that some sets are
   212 Christmas where I want to convince you of the fact that some sets are
   210 more infinite than others. Yes, you read correctly, there can be sets
   213 more infinite than others. Yes, you read correctly, there can be sets
   211 that are ``more infinite'' then others. If you think this is obvious:
   214 that are ``more infinite'' then others. If you think this is obvious:
   212 say you have the infinite set $\{1, 2, 3, 4, \ldots\}$ which is all
   215 say you have the infinite set $\mathbb{N}\backslash\{0\} = \{1, 2, 3, 4, \ldots\}$ which is all
   213 the natural numbers except $0$, and then compare it to the set
   216 the natural numbers except $0$, and then compare it to the set
   214 $\{0, 1, 2, 3, 4, \ldots\}$ which contains the $0$. Yes, the second must be more infinite\ldots{} well, then think again. Because the two
   217 $\{0, 1, 2, 3, 4, \ldots\}$ which contains the $0$. If you think,
       
   218 the second must be more infinite\ldots{} well, then think again. Because the two
   215 infinite sets
   219 infinite sets
   216 
   220 
   217 \begin{center}
   221 \begin{center}
   218   $\{1, 2, 3, 4, \ldots\}$ and
   222   $\{1, 2, 3, 4, \ldots\}$ and
   219   $\{0, 1, 2, 3, 4, \ldots\}$
   223   $\{0, 1, 2, 3, 4, \ldots\}$
   222 \noindent
   226 \noindent
   223 contain actually the same number of elements. Does this make sense?
   227 contain actually the same number of elements. Does this make sense?
   224 Though this might all look strange this about infinite sets will be a
   228 Though this might all look strange this about infinite sets will be a
   225 topic that is very relevant to the material of this module. It tells
   229 topic that is very relevant to the material of this module. It tells
   226 us what we can compute with a computer (actually algorithm) and what
   230 us what we can compute with a computer (actually algorithm) and what
   227 we cannot.
   231 we cannot. But during the first 9 lectures we can go by without this
       
   232 ``weird'' stuff.
   228 
   233 
   229 Another important notion in this module are \defn{languages}, which
   234 Another important notion in this module are \defn{languages}, which
   230 are sets of strings. One of the main goals for us will be how to
   235 are sets of strings. One of the main goals for us will be how to
   231 (formally) specify languages and to find out whether a string
   236 (formally) specify languages and to find out whether a string
   232 is in a language or not.\footnote{You might wish to ponder
   237 is in a language or not.\footnote{You might wish to ponder
   269 
   274 
   270 \noindent Note the difference in the last two lines: the empty
   275 \noindent Note the difference in the last two lines: the empty
   271 set behaves like $0$ for multiplication and the set $\{[]\}$
   276 set behaves like $0$ for multiplication and the set $\{[]\}$
   272 like $1$ for multiplication ($n * 1 = n$ and $n * 0 = 0$).
   277 like $1$ for multiplication ($n * 1 = n$ and $n * 0 = 0$).
   273 
   278 
   274 Following the language concatenation, we can define a
   279 Using the operation of language concatenation, we can define a
   275 \defn{language power} operation as follows:
   280 \defn{language power} operation as follows:
   276 
   281 
   277 \begin{eqnarray*}
   282 \begin{eqnarray*}
   278 A^0     & \dn & \{[]\}\\
   283 A^0     & \dn & \{[]\}\\
   279 A^{n+1} & \dn & A \,@\, A^n
   284 A^{n+1} & \dn & A \,@\, A^n
   282 \noindent This definition is by recursion on natural numbers.
   287 \noindent This definition is by recursion on natural numbers.
   283 Note carefully that the zero-case is not defined as the empty
   288 Note carefully that the zero-case is not defined as the empty
   284 set, but the set containing the empty string. So no matter
   289 set, but the set containing the empty string. So no matter
   285 what the set $A$ is, $A^0$ will always be $\{[]\}$. (There is
   290 what the set $A$ is, $A^0$ will always be $\{[]\}$. (There is
   286 another hint about a connection between the $@$-operation and
   291 another hint about a connection between the $@$-operation and
   287 multiplication: How is $x^n$ defined recursively and what is
   292 multiplication: How is $x^n$ defined in mathematics and what is
   288 $x^0$?)
   293 $x^0$?)
   289 
   294 
   290 Next we can define the \defn{star operation} for languages:
   295 Next we can define the \defn{star operation} for languages:
   291 $A\star$ is the union of all powers of $A$, or short
   296 $A\star$ is the union of all powers of $A$, or short
   292 
   297 
   297 \noindent This star operation is often also called
   302 \noindent This star operation is often also called
   298 \emph{Kleene-star}. Unfolding the definition in \eqref{star}
   303 \emph{Kleene-star}. Unfolding the definition in \eqref{star}
   299 gives
   304 gives
   300 
   305 
   301 \[
   306 \[
   302 A^0 \cup A^1 \cup A^2 \cup A^3 \cup \ldots
   307 A\star \dn A^0 \cup A^1 \cup A^2 \cup A^3 \cup \ldots
   303 \]
   308 \]
   304 
   309 
   305 \noindent
   310 \noindent
   306 which is equal to 
   311 which is equal to 
   307 
   312 
   308 \[
   313 \[
   309 \{[]\} \,\cup\, A \,\cup\, A @ A \,\cup\, A @ A @ A \,\cup\, \ldots
   314 A\star \dn \{[]\} \,\cup\, A \,\cup\, A @ A \,\cup\, A @ A @ A \,\cup\, \ldots
   310 \]
   315 \]
   311 
   316 
   312 \noindent We can see that the empty string is always in $A\star$,
   317 \noindent We can see that the empty string is always in $A\star$,
   313 no matter what $A$ is. This is because $[] \in A^0$. To make
   318 no matter what $A$ is. This is because $[] \in A^0$. To make
   314 sure you understand these definitions, I leave you to answer
   319 sure you understand these definitions, I leave you to answer
   315 what $\{[]\}\star$ and $\varnothing\star$ are?
   320 what $\{[]\}\star$ and $\varnothing\star$ are?
   316 
   321 
   317 Recall that an alphabet is often referred to by the letter
   322 Recall that an alphabet is often referred to by the letter
   318 $\Sigma$. We can now write for the set of \emph{all} strings
   323 $\Sigma$. We can now write for the set of \emph{all} strings
   319 over this alphabet $\Sigma\star$. In doing so we also include the
   324 over this alphabet as $\Sigma\star$. In doing so we also include the
   320 empty string as a possible string over $\Sigma$. So if $\Sigma
   325 empty string as a possible string over $\Sigma$. So if $\Sigma
   321 = \{a, b\}$, then $\Sigma\star$ is
   326 = \{a, b\}$, then $\Sigma\star$ is
   322 
   327 
   323 \[
   328 \[
   324 \{[], a, b, aa, ab, ba, bb, aaa, aab, aba, abb, baa, bab, \ldots\}
   329 \{[], a, b, aa, ab, ba, bb, aaa, aab, aba, abb, baa, bab, \ldots\}