handouts/ho09.tex
author Christian Urban <christian dot urban at kcl dot ac dot uk>
Sat, 20 Dec 2014 14:48:15 +0000
changeset 359 c90f803dc7ea
parent 357 5b91f5ad2772
child 360 eb2004430215
permissions -rw-r--r--
updated
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
276
d7109c6e721d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     1
\documentclass{article}
d7109c6e721d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     2
\usepackage{../style}
d7109c6e721d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     3
\usepackage{../langs}
346
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
     4
\usepackage{../graphics}
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
     5
\usepackage{../grammar}
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
     6
\usepackage{multicol}
276
d7109c6e721d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     7
d7109c6e721d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     8
\begin{document}
d7109c6e721d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     9
335
06d5fc15594d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 276
diff changeset
    10
\section*{Handout 9 (Static Analysis)}
06d5fc15594d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 276
diff changeset
    11
06d5fc15594d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 276
diff changeset
    12
If we want to improve the safety and security of our programs,
337
92a718b88e14 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 335
diff changeset
    13
we need a more principled approach to programming. Testing is
359
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
    14
good, but as Edsger Dijkstra famously wrote: 
346
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    15
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    16
\begin{quote}\it 
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    17
``Program testing can be a very effective way to show the
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    18
\underline{\smash{presence}} of bugs, but it is hopelessly
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    19
inadequate for showing their \underline{\smash{absence}}.''
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    20
\end{quote}
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    21
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    22
\noindent While such a more principled approach has been the
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    23
subject of intense study for a long, long time, only in the
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    24
past few years some impressive results have been achieved. One
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    25
is the complete formalisation and (mathematical) verification
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    26
of a microkernel operating system called seL4.
337
92a718b88e14 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 335
diff changeset
    27
92a718b88e14 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 335
diff changeset
    28
\begin{center}
92a718b88e14 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 335
diff changeset
    29
\url{http://sel4.systems}
92a718b88e14 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 335
diff changeset
    30
\end{center}
92a718b88e14 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 335
diff changeset
    31
350
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
    32
\noindent In 2011 this work was included in the MIT Technology
346
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    33
Review in the annual list of the world’s ten most important
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    34
emerging
337
92a718b88e14 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 335
diff changeset
    35
technologies.\footnote{\url{http://www2.technologyreview.com/tr10/?year=2011}}
338
f1491e0d7be0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 337
diff changeset
    36
While this work is impressive, its technical details are too
346
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    37
enormous for an explanation here. Therefore let us look at
337
92a718b88e14 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 335
diff changeset
    38
something much simpler, namely finding out properties about
338
f1491e0d7be0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 337
diff changeset
    39
programs using \emph{static analysis}.
335
06d5fc15594d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 276
diff changeset
    40
350
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
    41
Static analysis is a technique that checks properties of a
346
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    42
program without actually running the program. This should
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    43
raise alarm bells with you---because almost all interesting
351
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
    44
properties about programs  are equivalent to the halting
346
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    45
problem, which we know is undecidable. For example estimating
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    46
the memory consumption of programs is in general undecidable,
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    47
just like the halting problem. Static analysis circumvents
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    48
this undecidability-problem by essentially allowing answers
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    49
\emph{yes} and \emph{no}, but also \emph{don't know}. With
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    50
this ``trick'' even the halting problem becomes
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    51
decidable\ldots{}for example we could always say \emph{don't
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    52
know}. Of course this would be silly. The point is that we
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    53
should be striving for a method that answers as often as
347
efad8155513f updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 346
diff changeset
    54
possible either \emph{yes} or \emph{no}---just in cases when
efad8155513f updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 346
diff changeset
    55
it is too difficult we fall back on the
efad8155513f updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 346
diff changeset
    56
\emph{don't-know}-answer. This might sound all like abstract
efad8155513f updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 346
diff changeset
    57
nonsense. Therefore let us look at a concrete example.
346
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    58
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    59
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    60
\subsubsection*{A Simple, Idealised Programming Language}
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    61
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    62
Our starting point is a small, idealised programming language.
351
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
    63
It is idealised because we cut several corners in comparison
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
    64
with real programming languages. The language we will study
352
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
    65
contains, amongst other things, variables holding integers.
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
    66
Using static analysis, we want to find out what the sign of
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
    67
these integers (positive or negative) will be when the program
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
    68
runs. This sign-analysis seems like a very simple problem. But
355
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
    69
even such simple problems, if approached naively, are in
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
    70
general undecidable, just like Turing's halting problem. I let
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
    71
you think why?
347
efad8155513f updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 346
diff changeset
    72
efad8155513f updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 346
diff changeset
    73
efad8155513f updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 346
diff changeset
    74
Is sign-analysis of variables an interesting problem? Well,
efad8155513f updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 346
diff changeset
    75
yes---if a compiler can find out that for example a variable
efad8155513f updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 346
diff changeset
    76
will never be negative and this variable is used as an index
355
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
    77
for an array, then the compiler does not need to generate code
357
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
    78
for an underflow-check. Remember some languages are immune to
351
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
    79
buffer-overflow attacks, but they need to add underflow and
359
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
    80
overflow checks everywhere. According to John Regher, an
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
    81
expert in the field of compilers, overflow checks can cause
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
    82
5-10\% slowdown, in some languages even 100\% for tight
357
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
    83
loops.\footnote{\url{http://blog.regehr.org/archives/1154}} If
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
    84
the compiler can omit the underflow check, for example, then
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
    85
this can potentially drastically speed up the generated code. 
346
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    86
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
    87
What do programs in our simple programming language look like?
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
    88
The following grammar gives a first specification:
346
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    89
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    90
\begin{multicols}{2}
350
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
    91
\begin{plstx}[rhs style=,one per line,left margin=9mm]
346
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    92
: \meta{Stmt} ::= \meta{label} \texttt{:}
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    93
                | \meta{var} \texttt{:=} \meta{Exp}
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    94
                | \texttt{jmp?} \meta{Exp} \meta{label}
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    95
                | \texttt{goto} \meta{label}\\
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    96
: \meta{Prog} ::= \meta{Stmt} \ldots{} \meta{Stmt}\\
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
    97
\end{plstx}
350
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
    98
\columnbreak
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
    99
\begin{plstx}[rhs style=,one per line]
: \meta{Exp} ::= \meta{Exp} \texttt{+} \meta{Exp}
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   100
               | \meta{Exp} \texttt{*} \meta{Exp}
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   101
               | \meta{Exp} \texttt{=} \meta{Exp} 
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   102
               | \meta{num}
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   103
               | \meta{var}\\
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   104
\end{plstx}
346
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   105
\end{multicols}
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   106
350
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   107
\noindent I assume you are familiar with such
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   108
grammars.\footnote{\url{http://en.wikipedia.org/wiki/Backus–Naur_Form}}
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   109
There are three main syntactic categories: \emph{statments}
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   110
and \emph{expressions} as well as \emph{programs}, which are
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   111
sequences of statements. Statements are either labels,
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   112
variable assignments, conditional jumps (\pcode{jmp?}) and
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   113
unconditional jumps (\pcode{goto}). Labels are just strings,
352
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   114
which can be used as the target of a jump. We assume that in
355
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   115
every program the labels are unique---if there is a clash,
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   116
then we do not know where to jump to. The conditional jumps
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   117
and variable assignments involve (arithmetic) expressions.
352
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   118
Expressions are either numbers, variables or compound
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   119
expressions built up from \pcode{+}, \pcode{*} and \emph{=}
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   120
(for simplicity reasons we do not consider any other
350
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   121
operations). We assume we have negative and positive numbers,
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   122
\ldots \pcode{-2}, \pcode{-1}, \pcode{0}, \pcode{1},
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   123
\pcode{2}\ldots{} An example program that calculates the
359
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   124
factorial of 5 is in our programming language as follows:
350
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   125
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   126
\begin{lstlisting}[language={},xleftmargin=10mm]
346
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   127
      a := 1
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   128
      n := 5 
350
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   129
top:  
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   130
      jmp? n = 0 done 
346
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   131
      a := a * n 
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   132
      n := n + -1 
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   133
      goto top 
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   134
done:
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   135
\end{lstlisting}
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   136
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   137
\noindent As can be seen each line of the program contains a
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   138
statement. In the first two lines we assign values to the
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   139
variables \pcode{a} and \pcode{n}. In line 4 we test whether
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   140
\pcode{n} is zero, in which case we jump to the end of the
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   141
program marked with the label \pcode{done}. If \pcode{n} is
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   142
not zero, we multiply the content of \pcode{a} by \pcode{n},
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   143
decrease \pcode{n} by one and jump back to the beginning of
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   144
the loop, marked with the label \pcode{top}. Another program
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   145
in our language is shown in Figure~\ref{fib}. I let you think
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   146
what it calculates.
350
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   147
 
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   148
\begin{figure}[t]
346
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   149
\begin{lstlisting}[numbers=none,
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   150
                   language={},xleftmargin=10mm]
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   151
      n := 6
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   152
      m1 := 0
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   153
      m2 := 1
350
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   154
loop: 
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   155
      jmp? n = 0 done
346
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   156
      tmp := m2
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   157
      m2 := m1 + m2
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   158
      m1 := tmp
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   159
      n := n + -1
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   160
      goto top
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   161
done:
5a6e8b7d20f7 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 338
diff changeset
   162
\end{lstlisting}
351
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   163
\caption{A mystery program in our idealised programming language.
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   164
Try to find out what it calculates! \label{fib}}
350
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   165
\end{figure}
338
f1491e0d7be0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 337
diff changeset
   166
350
54d6fc856950 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 347
diff changeset
   167
Even if our language is rather small, it is still Turing
351
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   168
complete---meaning quite powerful. However, discussing this
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   169
fact in more detail would lead us too far astray. Clearly, our
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   170
programming is rather low-level and not very comfortable for
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   171
writing programs. It is inspired by real machine code, which
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   172
is the code that is executed by a CPU. So a more interesting
351
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   173
question is what is missing in comparison with real machine
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   174
code? Well, not much\ldots{}in principle. Real machine code,
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   175
of course, contains many more arithmetic instructions (not
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   176
just addition and multiplication) and many more conditional
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   177
jumps. We could add these to our language if we wanted, but
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   178
complexity is really beside the point here. Furthermore, real
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   179
machine code has many instructions for manipulating memory. We
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   180
do not have this at all. This is actually a more serious
352
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   181
simplification because we assume numbers to be arbitrary small
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   182
or large, which is not the case with real machine code. In
351
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   183
real code basic number formats have a range and might
352
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   184
over-flow or under-flow from this range. Also the number of
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   185
variables in our programs is potentially unlimited, while
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   186
memory in an actual computer, of course, is always limited
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   187
somehow on any actual. To sum up, our language might look
355
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   188
ridiculously simple, but it is not far removed from
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   189
practically relevant issues.
276
d7109c6e721d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   190
351
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   191
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   192
\subsubsection*{An Interpreter}
cd8d18f7b7ac updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 350
diff changeset
   193
352
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   194
Designing a language is like playing god: you can say what
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   195
names for variables you allow; what programs should look like;
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   196
most importantly you can decide what each part of the program
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   197
should mean and do. While our language is rather simple and
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   198
the meaning of statements, for example, is rather
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   199
straightforward, there are still places where we need to make
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   200
real choices. For example consider the conditional jumps, say
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   201
the one in the factorial program: 
352
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   202
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   203
\begin{center}
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   204
\code{jmp? n = 0 done}
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   205
\end{center}
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   206
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   207
\noindent How should they work? We could introduce Booleans
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   208
(\pcode{true} and \pcode{false}) and then jump only when the
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   209
condition is \pcode{true}. However, since we have numbers in
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   210
our language anyway, why not just encoding \pcode{true} as
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   211
one, and \pcode{false} as zero? In this way we can dispense
355
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   212
with the additional concept of Booleans.
352
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   213
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   214
I hope the above discussion makes it already clear we need to
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   215
be a bit more careful with our programs. Below we shall
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   216
describe an interpreter for our programming language, which
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   217
specifies exactly how programs are supposed to be
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   218
run\ldots{}at least we will specify this for all \emph{good}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   219
programs. By good programs I mean where all variables are
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   220
initialised, for example. Our interpreter will just crash if
359
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   221
it cannot find out the value for a variable when it is not
355
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   222
initialised. Also, we will assume that labels in good programs
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   223
are unique, otherwise our programs will calculate ``garbage''.
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   224
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   225
First we will pre-process our programs. This will simplify the
359
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   226
definition of our interpreter later on. By pre-processing our
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   227
programs we will transform programs into \emph{snippets}.
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   228
Their purpose is to simplify the definition of what the
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   229
interpreter should do in case of a jump. A snippet is a label
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   230
and all the code that comes after the label. This essentially
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   231
means a snippet is a \emph{map} from labels to
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   232
code.\footnote{Be sure you know what maps are. In a
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   233
programming context they are often represented as association
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   234
list where some data is associated with a key.} 
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   235
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   236
Given that programs are sequences (or lists) of statements, we
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   237
can easily calculate the snippets by just traversing this
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   238
sequence and recursively generating the map. Suppose a program
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   239
is of the general form
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   240
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   241
\begin{center}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   242
$stmt_1\;stmt_2\; \ldots\; stmt_n$
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   243
\end{center}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   244
355
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   245
\noindent The idea is to go through this sequence of
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   246
statements one by one and check whether they are a label. If
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   247
yes, we add the label and the remaining statements to our map.
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   248
If no, we just continue with the next statement. To come up
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   249
with a recursive definition for generating snippets, let us
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   250
write $[]$ for the program that does not contain any
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   251
statement. Consider the following definition:
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   252
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   253
\begin{center}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   254
\begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   255
$\textit{snippets}([])$ & $\dn$ & $\varnothing$\\
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   256
$\textit{snippets}(stmt\;\; rest)$ & $\dn$ &
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   257
$\begin{cases}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   258
   \textit{snippets}(rest)[label := rest] & \text{if}\;stmt = \textit{label:}\\
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   259
   \textit{snippets}(rest)                & \text{otherwise}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   260
\end{cases}$   
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   261
\end{tabular}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   262
\end{center}
352
da5713bcdbb0 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 351
diff changeset
   263
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   264
\noindent In the first clause we just return the empty map for
355
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   265
the program that does not contain any statement. In the second
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   266
clause, we have to distinguish the case where the first
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   267
statement is a label or not. As said before, if not, then we
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   268
just ``throw away'' the label and recursively calculate the
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   269
snippets for the rest of the program. If yes, then we do the
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   270
same, but also update the map so that $label$ now points to
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   271
the rest of the statements. There is one small problem we need
357
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   272
to overcome: our two programs shown so far have no label as
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   273
\emph{entry point}---that is where the execution is supposed
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   274
to start. We usually assume that the first statement will be
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   275
run first. To make this the default, it is convenient if we
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   276
add to all our programs a default label, say \pcode{""} (the
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   277
empty string). With this we can define our pre-processing of
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   278
programs as follows
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   279
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   280
\begin{center}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   281
$\textit{preproc}(prog) \dn \textit{snippets}(\pcode{"":}\;\; prog)$ 
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   282
\end{center} 
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   283
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   284
\noindent Let us see how this pans out in practice. If we
355
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   285
pre-process the factorial program shown earlier, we obtain the 
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   286
following map:
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   287
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   288
\begin{center}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   289
\small
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   290
\lstset{numbers=none,
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   291
        language={},
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   292
        xleftmargin=0mm,
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   293
        aboveskip=0.5mm,
355
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   294
        belowskip=0.5mm,
354
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   295
        frame=single,
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   296
        framerule=0mm,
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   297
        framesep=0mm}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   298
\begin{tikzpicture}[node distance=0mm]
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   299
\node (A1) [draw]{\pcode{""}};
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   300
\node (B1) [right=of A1] {$\mapsto$};
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   301
\node (C1) [right=of B1,anchor=north west] {
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   302
\begin{minipage}{3.5cm}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   303
\begin{lstlisting}[language={},xleftmargin=0mm]
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   304
  a := 1
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   305
  n := 5 
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   306
top:  
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   307
  jmp? n = 0 done 
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   308
  a := a * n 
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   309
  n := n + -1 
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   310
  goto top 
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   311
done:
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   312
\end{lstlisting}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   313
\end{minipage}};
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   314
\node (A2) [right=of C1.north east,draw] {\pcode{top}};
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   315
\node (B2) [right=of A2] {$\mapsto$};
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   316
\node (C2) [right=of B2, anchor=north west] {
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   317
\begin{minipage}{3.5cm}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   318
\begin{lstlisting}[language={},xleftmargin=0mm]
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   319
  jmp? n = 0 done 
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   320
  a := a * n 
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   321
  n := n + -1 
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   322
  goto top 
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   323
done:
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   324
\end{lstlisting} 
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   325
\end{minipage}}; 
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   326
\node (A3) [right=of C2.north east,draw] {\pcode{done}};
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   327
\node (B3) [right=of A3] {$\mapsto$};
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   328
\node (C3) [right=of B3] {$[]$};
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   329
\end{tikzpicture}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   330
\end{center}
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   331
8e5e84b14041 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 352
diff changeset
   332
\noindent I highlighted the \emph{keys} in this map. Since
356
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   333
there are three labels in the factorial program (remember we
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   334
added \pcode{""}), there are three keys. When running the
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   335
factorial program and encountering a jump, then we only have
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   336
to consult this snippets-map in order to find out what the
359
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   337
next statements should be.
355
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   338
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   339
We should now be in the position to define how a program
356
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   340
should be run. In the context of interpreters, this
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   341
``running'' of programs is often called \emph{evaluation}. Let
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   342
us start with the definition of how expressions are evaluated.
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   343
A first attempt might be the following recursive function:
355
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   344
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   345
\begin{center}
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   346
\begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l}
356
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   347
$\textit{eval\_exp}(\texttt{n})$ & $\dn$ & $n$
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   348
  \qquad\text{if}\; \texttt{n}\; \text{is a number like} 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   349
  \ldots \pcode{-2}, \pcode{-1}, \pcode{0}, \pcode{1},
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   350
\pcode{2}\ldots{}\\
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   351
$\textit{eval\_exp}(\texttt{e}_\texttt{1} \,\texttt{+}\, 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   352
                    \texttt{e}_\texttt{2})$ & 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   353
  $\dn$ & $\textit{eval\_exp}(\texttt{e}_\texttt{1}) + 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   354
           \textit{eval\_exp}(\texttt{e}_\texttt{2})$\\
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   355
$\textit{eval\_exp}(\texttt{e}_\texttt{1} \,\texttt{*}\, 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   356
                    \texttt{e}_\texttt{2})$ & 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   357
  $\dn$ & $\textit{eval\_exp}(\texttt{e}_\texttt{1}) * 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   358
           \textit{eval\_exp}(\texttt{e}_\texttt{2})$\\
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   359
$\textit{eval\_exp}(\texttt{e}_\texttt{1} \,\texttt{=}\, 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   360
                    \texttt{e}_\texttt{2})$ & 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   361
  $\dn$ & $\begin{cases}
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   362
  1 & \text{if}\;\textit{eval\_exp}(\texttt{e}_\texttt{1}) =
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   363
                 \textit{eval\_exp}(\texttt{e}_\texttt{2})\\
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   364
  0 & \text{otherwise}
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   365
  \end{cases}$          
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   366
\end{tabular}
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   367
\end{center}
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   368
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   369
\noindent While this should look all relatively
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   370
straightforward, still be very careful. There is a subtlety
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   371
which can be easily overlooked: The function
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   372
\textit{eval\_exp} takes an expression of our programming
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   373
language as input and returns a number as output. Therefore
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   374
whenever we have a number in our program, we just return this
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   375
number---this is defined in the first clause above. Whenever
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   376
we encounter an addition, well then we first evaluate the
357
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   377
left-hand side $\texttt{e}_\texttt{1}$ of the addition (this
356
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   378
will give a number), then evaluate the right-hand side
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   379
$\texttt{e}_\texttt{2}$ (this gives another number), and
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   380
finally add both numbers together. Here is the subtlety: on
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   381
the left-hand side of the $\dn$ we have a \texttt{+} (in the
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   382
teletype font) which is the symbol for addition in our
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   383
programming language. On the right-hand side we have $+$ which
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   384
stands for the arithmetic operation from ``mathematics'' of
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   385
adding two numbers. These are rather different concepts---one
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   386
is a symbol (which we made up), and the other a mathematical
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   387
operation. When we will have a look at an actual
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   388
implementation of our interpreter, the mathematical operation
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   389
will be the function for addition from the programming
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   390
language in which we \underline{\smash{implement}} our
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   391
interpreter. While the \texttt{+} is just a symbol that is
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   392
used in our programming language. Clearly we have to use a
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   393
symbol that is a good mnemonic for addition otherwise we will
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   394
confuse the programmers working with our language. Therefore
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   395
we use $\texttt{+}$. A similar choice is made for times in the
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   396
third clause and equality in the fourth clause. Remember I
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   397
wrote at the beginning of this section about being god when
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   398
designing a programming language. You can see this here: we
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   399
need to give meaning to symbols. 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   400
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   401
At the moment however, we are a poor fallible god. Look again
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   402
at the grammar of our programming language and our definition.
357
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   403
Clearly, an expression can contain variables. So far we have
356
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   404
ignored them. What should our interpreter do with variables?
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   405
They might change during the evaluation of a program. For
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   406
example the variable \pcode{n} in the factorial program counts
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   407
down from 5 up to 0. How can we improve our definition above
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   408
to give also an answer whenever our interpreter encounters a
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   409
variable in an expression? The solution is to add an
357
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   410
\emph{environment}, written $env$, as an additional input
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   411
argument to our \textit{eval\_exp} function.
356
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   412
 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   413
\begin{center}
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   414
\begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l}
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   415
$\textit{eval\_exp}(\texttt{n}, env)$ & $\dn$ & $n$
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   416
  \qquad\text{if}\; \texttt{n}\; \text{is a number like} 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   417
  \ldots \pcode{-2}, \pcode{-1}, \pcode{0}, \pcode{1},
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   418
\pcode{2}\ldots{}\\
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   419
$\textit{eval\_exp}(\texttt{e}_\texttt{1} \,\texttt{+}\, 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   420
                    \texttt{e}_\texttt{2}, env)$ & 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   421
  $\dn$ & $\textit{eval\_exp}(\texttt{e}_\texttt{1}, env) + 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   422
           \textit{eval\_exp}(\texttt{e}_\texttt{2}, env)$\\
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   423
$\textit{eval\_exp}(\texttt{e}_\texttt{1} \,\texttt{*}\, 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   424
                    \texttt{e}_\texttt{2}, env)$ & 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   425
  $\dn$ & $\textit{eval\_exp}(\texttt{e}_\texttt{1}, env) * 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   426
           \textit{eval\_exp}(\texttt{e}_\texttt{2}, env)$\\
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   427
$\textit{eval\_exp}(\texttt{e}_\texttt{1} \,\texttt{=}\, 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   428
                    \texttt{e}_\texttt{2}, env)$ & 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   429
  $\dn$ & $\begin{cases}
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   430
  1 & \text{if}\;\textit{eval\_exp}(\texttt{e}_\texttt{1}, env) =
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   431
                 \textit{eval\_exp}(\texttt{e}_\texttt{2}, env)\\
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   432
  0 & \text{otherwise}
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   433
  \end{cases}$\\
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   434
$\textit{eval\_exp}(\texttt{x}, env)$ & $\dn$ & $env(x)$      
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   435
\end{tabular}
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   436
\end{center}
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   437
 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   438
\noindent This environment $env$ also acts like a map: it
357
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   439
associates variable with their current values. For example
359
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   440
߆after evaluating the first two lines in our factorial
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   441
program, such an environment might look as follows
357
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   442
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   443
\begin{center}
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   444
\begin{tabular}{ll}
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   445
$\fbox{\texttt{a}} \mapsto \texttt{1}$ &
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   446
$\fbox{\texttt{n}} \mapsto \texttt{5}$
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   447
\end{tabular}
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   448
\end{center}
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   449
359
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   450
\noindent Again I highlighted the keys. In the clause for
357
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   451
variables, we can therefore consult this environment and
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   452
return whatever value is currently stored for this variable.
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   453
This is written $env(x)$. If we query this map with $x$ we
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   454
obtain the corresponding number. You might ask what happens if
5b91f5ad2772 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 356
diff changeset
   455
an environment does not contain any value for, say, the
359
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   456
variable $x$? Well, then our interpreter just ``crashes'', or
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   457
more precisely will raise an exception. In this case we have a
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   458
``bad'' program that tried to use a variable before it was
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   459
initialised. The programmer should not have done this. In a
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   460
real programming language we would of course try a bit harder
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   461
and for example give an error at compile time, or design our
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   462
language in such a way that this can never happen. With the
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   463
second version of \textit{eval\_exp} we completed our
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   464
definition for evaluating expressions.
356
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   465
 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   466
Next comes the evaluation function for statements. We define
359
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   467
this function in such a way that we recursively evaluate a
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   468
whole sequence of statements. Assume a program $p$ (you want
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   469
to evaluate) and its pre-processed snippets $sn$. Then we can
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   470
define:
356
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   471
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   472
\begin{center}
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   473
\begin{tabular}{@{}l@{\hspace{1mm}}c@{\hspace{1mm}}l@{}}
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   474
$\textit{eval\_stmts}([], env)$ & $\dn$ & $env$\\
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   475
$\textit{eval\_stmts}(\texttt{label:}\;rest, env)$ &
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   476
  $\dn$ & $\textit{eval\_stmts}(rest, env)$ \\ 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   477
$\textit{eval\_stmts}(\texttt{x\,:=\,e}\;rest, env)$ & 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   478
  $\dn$ & $\textit{eval\_stmts}(rest, 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   479
           env[x := \textit{eval\_exp}(\texttt{e}, env)])$\\  
359
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   480
$\textit{eval\_stmts}(\texttt{goto\,lbl}\;rest, env)$ 
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   481
 & $\dn$ & $\textit{eval\_stmts}(sn(\texttt{lbl}), env)$\\
356
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   482
$\textit{eval\_stmts}(\texttt{jmp?\,e\,lbl}\;rest, env)$ 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   483
 & $\dn$ & $\begin{cases}\begin{array}{@{}l@{\hspace{-12mm}}r@{}}
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   484
 \textit{eval\_stmts}(sn(\texttt{lbl}), env)\\ 
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   485
 & \text{if}\;\textit{eval\_exp}(\texttt{e}, env) = 1\\
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   486
 \textit{eval\_stmts}(rest, env) & \text{otherwise}\\
359
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   487
 \end{array}\end{cases}$  
355
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   488
\end{tabular}
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   489
\end{center}
619073c37649 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 354
diff changeset
   490
359
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   491
\noindent The first clause is for the empty program, or when
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   492
we arrived at the end of the program. In this case we just
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   493
return the environment. The second clause is for when the next
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   494
statement is a label. That means the program is of the form
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   495
$\texttt{label:}\;rest$ where the label is some string and
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   496
$rest$ stands for all following statement. This case is easy,
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   497
because our evaluation function just discards the label and
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   498
evaluates the rest of the statements (we already extracted all
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   499
important information about labels when we pre-processed our
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   500
programs and generated the snippets). The third clause is for
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   501
variable assignments. Again we just evaluate the rest for the
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   502
statements, but with a modified environment---since the
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   503
variable assignment is supposed to introduce a new variable or
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   504
change the current value of a variable. For this modification
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   505
of the environment we first evaluate the expression
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   506
$\texttt{e}$ using our evaluation function for expressions.
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   507
This gives us a number. Then we assign this number to the
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   508
variable $x$ in the environment. This modified environment
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   509
will be used to evaluate the rest of the program. The fourth
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   510
clause is for the unconditional jump to a label. That means we
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   511
have to look up in our snippets map $sn$ what are the next
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   512
statements for this label are. Therefore we will continue with
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   513
evaluating, not with the rest of the program, but with the
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   514
statements stored in the snippets-map under the label
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   515
$\texttt{lbl}$. The fifth clause for conditional jumps is
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   516
similar, but in order to make the jump we first need to
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   517
evaluate the expression $\texttt{e}$ in order to find out
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   518
whether it is $1$. If yes, we jump, otherwise we just continue
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   519
with evaluating the rest of the program.
356
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   520
359
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   521
Our interpreter works in two stages: First we pre-process our
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   522
program generating the snippets map $sn$, say. Second we call
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   523
the evaluation function with the default entry point and the
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   524
empty environment:
356
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   525
359
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   526
\begin{center}
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   527
$\textit{eval\_stmts}(sn(\pcode{""}), \varnothing)$
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   528
\end{center}
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   529
 
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   530
\noindent It is interesting to not that our interpreter when
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   531
it comes to the end of the program returns an environment. Our
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   532
programming language does not contain any constructs for input
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   533
and output. Therefore this environment is the only effect we
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   534
can observe when running the program (apart from that our
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   535
interpreter might need some time before finishing the
c90f803dc7ea updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 357
diff changeset
   536
evaluation of the program)
356
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   537
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   538
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   539
e1e0f78baa70 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 355
diff changeset
   540
 
276
d7109c6e721d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   541
\end{document}
d7109c6e721d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   542
d7109c6e721d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   543
%%% Local Variables: 
d7109c6e721d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   544
%%% mode: latex
d7109c6e721d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   545
%%% TeX-master: t
d7109c6e721d updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   546
%%% End: