% handouts/ho10.tex
% author: Christian Urban <urbanc@in.tum.de>
% date: Sun, 24 Nov 2019 01:03:38 +0000
% changeset 699 b2dc9198687d
% !TEX program = xelatex
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}
\usepackage{../graphics}
\usepackage{../grammar}
\usepackage{multicol}

%%\newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}

\begin{document}
\fnote{\copyright{} Christian Urban, King's College London, 2014}

%% why are shuttle flights so good with software
%%http://www.fastcompany.com/28121/they-write-right-stuff

\section*{Handout 10 (Static Analysis)}

If we want to improve the safety and security of our programs,
we need a more principled approach to programming. Testing is
good, but as Edsger Dijkstra famously wrote:

\begin{quote}\it
``Program testing can be a very effective way to show the
\underline{\smash{presence}} of bugs, but it is hopelessly
inadequate for showing their \underline{\smash{absence}}.''
\end{quote}

\noindent While such a more principled approach has been the
subject of intense study for a long, long time, only in the
past few years have some impressive results been achieved. One
is the complete formalisation and (mathematical) verification
of a microkernel operating system called seL4.

\begin{center}
\url{http://sel4.systems}
\end{center}

\noindent In 2011 the MIT Technology Review included this work
in its annual list of the world’s ten most important emerging
technologies.\footnote{\url{http://www2.technologyreview.com/tr10/?year=2011}}
While this work is impressive, its technical details are too
extensive to explain here. Therefore let us look at
something much simpler, namely finding out properties of
programs using \emph{static analysis}.

Static analysis is a technique that checks properties of a
program without actually running the program. This should
raise alarm bells with you---because almost all interesting
properties of programs are, like the halting problem,
undecidable. For example, estimating the memory consumption
of programs is in general undecidable. Static analysis
circumvents this undecidability problem by allowing not only
the answers \emph{yes} and \emph{no}, but also \emph{don't know}.
With this ``trick'' even the halting problem becomes
decidable\ldots{}for example we could always say \emph{don't
know}. Of course this would be silly. The point is that we
should be striving for a method that answers as often as
possible either \emph{yes} or \emph{no}---only in cases where
it is too difficult do we fall back on the
\emph{don't-know} answer. This might all sound like abstract
nonsense. Therefore let us look at a concrete example.


\subsubsection*{A Simple, Idealised Programming Language}

Our starting point is a small, idealised programming language.
It is idealised because we cut several corners in comparison
with real programming languages. The language we will study
contains, amongst other things, variables holding integers.
Using static analysis, we want to find out what the sign of
these integers (positive or negative) will be when the program
runs. This sign-analysis seems like a very simple problem. But
even such simple problems, if approached naively, are in
general undecidable, just like Turing's halting problem. I will
let you think about why.


Is sign-analysis of variables an interesting problem? Well,
yes---if a compiler can find out that for example a variable
will never be negative and this variable is used as an index
for an array, then the compiler does not need to generate code
for an underflow-check. Remember that some languages are immune
to buffer-overflow attacks, but they need to add underflow and
overflow checks everywhere. According to John Regehr, an
expert in the field of compilers, overflow checks can cause
a 5--10\% slowdown, in some languages even 100\% for tight
loops.\footnote{\url{http://blog.regehr.org/archives/1154}} If
the compiler can omit the underflow check, for example, then
this can potentially speed up the generated code drastically.

What do programs in our simple programming language look like?
The following grammar gives a first specification:

\begin{multicols}{2}
\begin{plstx}[rhs style=,one per line,left margin=9mm]
: \meta{Stmt} ::= \meta{label} \texttt{:}
                | \meta{var} \texttt{:=} \meta{Exp}
                | \texttt{jmp?} \meta{Exp} \meta{label}
                | \texttt{goto} \meta{label}\\
: \meta{Prog} ::= \meta{Stmt} \ldots{} \meta{Stmt}\\
\end{plstx}
\columnbreak
\begin{plstx}[rhs style=,one per line]
: \meta{Exp} ::= \meta{Exp} \texttt{+} \meta{Exp}
               | \meta{Exp} \texttt{*} \meta{Exp}
               | \meta{Exp} \texttt{=} \meta{Exp}
               | \meta{num}
               | \meta{var}\\
\end{plstx}
\end{multicols}

\noindent I assume you are familiar with such
grammars.\footnote{\url{http://en.wikipedia.org/wiki/Backus–Naur_Form}}
There are three main syntactic categories: \emph{statements}
and \emph{expressions} as well as \emph{programs}, which are
sequences of statements. Statements are either labels,
variable assignments, conditional jumps (\pcode{jmp?}) or
unconditional jumps (\pcode{goto}). Labels are just strings,
which can be used as the target of a jump. We assume that in
every program the labels are unique---if there is a clash,
then we do not know where to jump to. The conditional jumps
and variable assignments involve (arithmetic) expressions.
Expressions are either numbers, variables or compound
expressions built up from \pcode{+}, \pcode{*} and \pcode{=}
(for simplicity we do not consider any other
operations). We assume we have negative and positive numbers,
\ldots{}, \pcode{-2}, \pcode{-1}, \pcode{0}, \pcode{1},
\pcode{2}, \ldots{} An example program that calculates the
factorial of 5 looks as follows in our programming language:

\begin{lstlisting}[language={},xleftmargin=10mm]
      a := 1
      n := 5
top:
      jmp? n = 0 done
      a := a * n
      n := n + -1
      goto top
done:
\end{lstlisting}

\noindent As can be seen, each line of the program contains a
statement. In the first two lines we assign values to the
variables \pcode{a} and \pcode{n}. In line 4 we test whether
\pcode{n} is zero, in which case we jump to the end of the
program marked with the label \pcode{done}. If \pcode{n} is
not zero, we multiply the content of \pcode{a} by \pcode{n},
decrease \pcode{n} by one and jump back to the beginning of
the loop, marked with the label \pcode{top}. Another program
in our language is shown in Figure~\ref{fib}. I will let you
think about what it calculates.
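% Worked example: execution trace of the factorial program

To see the factorial program at work, the following table traces
the values of \pcode{a} and \pcode{n} each time execution reaches
the conditional jump in line 4. This is only a hand calculation; it
assumes the informal reading given above, namely that the jump is
taken exactly when \pcode{n} is zero.

\begin{center}
\begin{tabular}{ccl}
\texttt{a} & \texttt{n} & effect of the conditional jump\\\hline
1   & 5 & no jump, continue with the loop body\\
5   & 4 & no jump\\
20  & 3 & no jump\\
60  & 2 & no jump\\
120 & 1 & no jump\\
120 & 0 & jump to \texttt{done}\\
\end{tabular}
\end{center}

\noindent When the program reaches \pcode{done}, the variable
\pcode{a} holds $120 = 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1$. Note
that both variables stay non-negative throughout this run, which is
the kind of fact a sign-analysis tries to establish without running
the program.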
 
\begin{figure}[t]
\begin{lstlisting}[numbers=none,
                   language={},xleftmargin=10mm]
      n := 6
      m1 := 0
      m2 := 1
loop:
      jmp? n = 0 done
      tmp := m2
      m2 := m1 + m2
      m1 := tmp
      n := n + -1
      goto loop
done:
\end{lstlisting}
\caption{A mystery program in our idealised programming language.
Try to find out what it calculates! \label{fib}}
\end{figure}

Even if our language is rather small, it is still Turing
complete---meaning quite powerful. However, discussing this
fact in more detail would lead us too far astray. Clearly, our
programming language is rather low-level and not very
comfortable for writing programs. It is inspired by real
machine code, which is the code that is executed by a CPU. So
a more interesting question is what is missing in comparison
with real machine code? Well, not much\ldots{}in principle.
Real machine code, of course, contains many more arithmetic
instructions (not just addition and multiplication) and many
more conditional jumps. We could add these to our language if
we wanted, but complexity is really beside the point here.
Furthermore, real machine code has many instructions for
manipulating memory. We do not have this at all. This is
actually a more serious simplification because we assume
numbers to be arbitrarily small or large, which is not the
case with real machine code. In real machine code, basic
number formats have a range and might overflow or underflow
this range. Also the number of variables in our programs is
potentially unlimited, while memory in an actual computer, of
course, is always limited. To sum up, our language might look
ridiculously simple, but it is not too far removed from
practically relevant issues.
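% Worked example: fixed-width numbers versus our idealised integers

To make the point about number ranges concrete, here is a small
hand calculation. It assumes a 32-bit two's-complement number format
with wrap-around addition, which is a common choice in real machines
but is not something our language prescribes. The largest
representable number is then $2^{31} - 1 = 2147483647$, and adding
one to it gives

\begin{center}
\begin{tabular}{lcl}
in our language:   & $2147483647 + 1$ & $= 2147483648$\\
with 32-bit words: & $2147483647 + 1$ & $= -2147483648$\\
\end{tabular}
\end{center}

\noindent In the second case the sum of two positive numbers is
negative. A sign-analysis for such machine code would have to take
overflows into account, while in our idealised language the sum of
two positive numbers is always positive.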


\subsubsection*{An Interpreter}

Designing a language is like playing god: you can say which
names for variables you allow; what programs should look like;
most importantly you can decide what each part of the program
should mean and do. While our language is quite simple and the
meaning of statements, for example, is rather straightforward,
there are still places where we need to make real choices. For
example, consider the conditional jumps, say the one in the
factorial program:

\begin{center}
\code{jmp? n = 0 done}
\end{center}

\noindent How should they work? We could introduce Booleans
(\pcode{true} and \pcode{false}) and then jump only when the
condition is \pcode{true}. However, since we have numbers in
our language anyway, why not just encode \pcode{true} as
one, and \pcode{false} as zero? In this way we can dispense
with the additional concept of Booleans.
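% Worked example: the encoding of Booleans as numbers

Under this encoding, a comparison is just another expression that
produces a number. As a small illustration, and assuming that the
conditional jump is taken exactly when its expression evaluates to
one (the precise rule is a design choice for the interpreter), the
jump in the factorial program would behave as follows:

\begin{center}
\begin{tabular}{ccl}
value of \texttt{n} & value of \texttt{n = 0} & effect of \texttt{jmp? n = 0 done}\\\hline
5 & 0 & continue with the next statement\\
1 & 0 & continue with the next statement\\
0 & 1 & jump to \texttt{done}\\
\end{tabular}
\end{center}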

I hope the above discussion already makes it clear that we need
to be a bit more careful with our programs. Below we shall
describe an interpreter for our programming language, which
specifies exactly how programs are supposed to be
run\ldots{}at least we will specify this for all \emph{good}
programs. By good programs I mean, for example, programs where
all variables are initialised. Our interpreter will just crash
if it cannot find the value of a variable because it is not
initialised. Also, we will assume that labels in good programs
are unique, otherwise our programs will calculate ``garbage''.

First we will pre-process our programs. This will simplify the
definition of our interpreter later on. By pre-processing our
programs we will transform programs into \emph{snippets}. A
snippet is a label and all the code that comes after the
label. This essentially means the snippets form a \emph{map} from
labels to code.\footnote{Be sure you know what maps are. In a
programming context they are often represented as association
lists where some data is associated with a key.}
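% Worked example: snippets of the factorial program

As an illustration, consider the factorial program from above. It
contains the two labels \pcode{top} and \pcode{done}. Reading ``all
the code that comes after the label'' literally, and writing
$\mapsto$ for the association of a label with its code (a notation
used only for this example), the snippets would be:

\begin{center}
\begin{tabular}{lcl}
\texttt{top}  & $\mapsto$ & \texttt{jmp? n = 0 done}\\
              &           & \texttt{a := a * n}\\
              &           & \texttt{n := n + -1}\\
              &           & \texttt{goto top}\\
              &           & \texttt{done:}\\
\texttt{done} & $\mapsto$ & (no statements)\\
\end{tabular}
\end{center}

\noindent The intention, presumably, is that a jump to \pcode{top}
can then be executed by looking up \pcode{top} in this map and
continuing with the code found there.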
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   239
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   240
Given that programs are sequences (or lists) of statements, we
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   241
can easily calculate the snippets by just traversing this
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   242
sequence and recursively generating the map. Suppose a program
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   243
is of the general form
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   244
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   245
\begin{center}
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   246
$stmt_1\;stmt_2\; \ldots\; stmt_n$
8a12889f8c8a updated
Christian Urban <urbanc@in.tum.de>
parents:
diff changeset
   247
\end{center}
\noindent The idea is to go through this sequence of
statements one by one and check whether each of them is a label. If
yes, we add the label and the remaining statements to our map.
If no, we just continue with the next statement. To come up
with a recursive definition for generating snippets, let us
write $[]$ for the program that does not contain any
statement. Consider the following definition:

\begin{center}
\begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l}
$\textit{snippets}([])$ & $\dn$ & $\varnothing$\\
$\textit{snippets}(stmt\;\; rest)$ & $\dn$ &
$\begin{cases}
   \textit{snippets}(rest)[label := rest] & \text{if}\;stmt = \textit{label:}\\
   \textit{snippets}(rest)                & \text{otherwise}
\end{cases}$
\end{tabular}
\end{center}
\noindent In the first clause we just return the empty map for
the program that does not contain any statement. In the second
clause, we have to distinguish whether the first
statement is a label or not. As said before, if not, then we
just ``throw away'' the statement and recursively calculate the
snippets for the rest of the program (the otherwise clause).
If yes, then we do the same, but also update the map so that
$label$ now points to the rest of the statements (the if
clause). This all looks relatively straightforward, but there
is one small problem we need to overcome: our two programs
shown so far have no label as \emph{entry point}---that is
where the execution is supposed to start. We usually assume
that the first statement will be run first. To make this the
default, it is convenient if we add to all our programs a
default label, say \pcode{""} (the empty string). With this we
can define our pre-processing of programs as follows

\begin{center}
$\textit{preproc}(prog) \dn \textit{snippets}(\pcode{"":}\;\; prog)$
\end{center}
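\noindent To see how the recursion unfolds, here is a small worked
example on a made-up program (not one of the programs shown earlier).
Assume the program $s_1\;\,\texttt{l:}\;\,s_2$, where $s_1$ and $s_2$
are statements that are not labels and \texttt{l} is a hypothetical
label. Unfolding the two definitions step by step should give a map
with two keys: \pcode{""} points to the whole program and \texttt{l}
points to $s_2$.

\begin{center}
\begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l}
$\textit{preproc}(s_1\;\,\texttt{l:}\;\,s_2)$
  & $=$ & $\textit{snippets}(\pcode{"":}\;\,s_1\;\,\texttt{l:}\;\,s_2)$\\
  & $=$ & $\textit{snippets}(s_1\;\,\texttt{l:}\;\,s_2)[\pcode{""} := s_1\;\,\texttt{l:}\;\,s_2]$\\
  & $=$ & $\textit{snippets}(\texttt{l:}\;\,s_2)[\pcode{""} := s_1\;\,\texttt{l:}\;\,s_2]$\\
  & $=$ & $\textit{snippets}(s_2)[\texttt{l} := s_2][\pcode{""} := s_1\;\,\texttt{l:}\;\,s_2]$\\
  & $=$ & $\textit{snippets}([])[\texttt{l} := s_2][\pcode{""} := s_1\;\,\texttt{l:}\;\,s_2]$\\
  & $=$ & $\varnothing[\texttt{l} := s_2][\pcode{""} := s_1\;\,\texttt{l:}\;\,s_2]$
\end{tabular}
\end{center}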
\noindent Let us see how this pans out in practice. If we
pre-process the factorial program shown earlier, we obtain the
following map:
\begin{center}
\small
\lstset{numbers=none,
        language={},
        xleftmargin=0mm,
        aboveskip=0.5mm,
        belowskip=0.5mm,
        frame=single,
        framerule=0mm,
        framesep=0mm}
\begin{tikzpicture}[node distance=0mm]
\node (A1) [draw]{\pcode{""}};
\node (B1) [right=of A1] {$\mapsto$};
\node (C1) [right=of B1,anchor=north west] {
\begin{minipage}{3.5cm}
\begin{lstlisting}[language={},xleftmargin=0mm]
  a := 1
  n := 5 
top:  
  jmp? n = 0 done 
  a := a * n 
  n := n + -1 
  goto top 
done:
\end{lstlisting}
\end{minipage}};
\node (A2) [right=of C1.north east,draw] {\pcode{top}};
\node (B2) [right=of A2] {$\mapsto$};
\node (C2) [right=of B2, anchor=north west] {
\begin{minipage}{3.5cm}
\begin{lstlisting}[language={},xleftmargin=0mm]
  jmp? n = 0 done 
  a := a * n 
  n := n + -1 
  goto top 
done:
\end{lstlisting} 
\end{minipage}}; 
\node (A3) [right=of C2.north east,draw] {\pcode{done}};
\node (B3) [right=of A3] {$\mapsto$};
\node (C3) [right=of B3] {$[]$};
\end{tikzpicture}
\end{center}
\noindent I highlighted the \emph{keys} in this map. Since
there are three labels in the factorial program (remember we
added \pcode{""}), there are three keys. When running the
factorial program and encountering a jump, we only have
to consult this snippets-map in order to find out what the
next statements should be.

We should now be in a position to define how a program
should be run. In the context of interpreters, this
``running'' of programs is often called \emph{evaluation}. Let
us start with the definition of how expressions are evaluated.
A first attempt might be the following recursive function:
\begin{center}
\begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l}
$\textit{eval\_exp}(\texttt{n})$ & $\dn$ & $n$
  \qquad\text{if}\; \texttt{n}\; \text{is a number like} 
  \ldots \pcode{-2}, \pcode{-1}, \pcode{0}, \pcode{1},
\pcode{2}\ldots{}\\
$\textit{eval\_exp}(\texttt{e}_\texttt{1} \,\texttt{+}\, 
                    \texttt{e}_\texttt{2})$ & 
  $\dn$ & $\textit{eval\_exp}(\texttt{e}_\texttt{1}) + 
           \textit{eval\_exp}(\texttt{e}_\texttt{2})$\\
$\textit{eval\_exp}(\texttt{e}_\texttt{1} \,\texttt{*}\, 
                    \texttt{e}_\texttt{2})$ & 
  $\dn$ & $\textit{eval\_exp}(\texttt{e}_\texttt{1}) * 
           \textit{eval\_exp}(\texttt{e}_\texttt{2})$\\
$\textit{eval\_exp}(\texttt{e}_\texttt{1} \,\texttt{=}\, 
                    \texttt{e}_\texttt{2})$ & 
  $\dn$ & $\begin{cases}
  1 & \text{if}\;\textit{eval\_exp}(\texttt{e}_\texttt{1}) =
                 \textit{eval\_exp}(\texttt{e}_\texttt{2})\\
  0 & \text{otherwise}
  \end{cases}$
\end{tabular}
\end{center}
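\noindent As a quick sanity check, consider the made-up expression
\texttt{2\,+\,3}. Unfolding the clauses above should give the
following, where the \texttt{+} in the first line belongs to our
programming language and the $+$ in the later lines is ordinary
addition on numbers:

\begin{center}
\begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l}
$\textit{eval\_exp}(\texttt{2} \,\texttt{+}\, \texttt{3})$
  & $=$ & $\textit{eval\_exp}(\texttt{2}) + \textit{eval\_exp}(\texttt{3})$\\
  & $=$ & $2 + 3$\\
  & $=$ & $5$
\end{tabular}
\end{center}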
\noindent While this should all look rather intuitive, you still
need to be very careful. There is a subtlety which can be easily
overlooked: The function \textit{eval\_exp} takes an
expression of our programming language as input and returns a
number as output. Therefore whenever we encounter a number in
our program, we just return this number---this is defined in
the first clause above. Whenever we encounter an addition,
we first evaluate the left-hand side
$\texttt{e}_\texttt{1}$ of the addition (this will give a
number), then evaluate the right-hand side
$\texttt{e}_\texttt{2}$ (this gives another number), and
finally add both numbers together. Here is the subtlety: on
the left-hand side of the $\dn$ we have a \texttt{+} (in the
teletype font) which is the symbol for addition in our
programming language. On the right-hand side we have $+$ which
stands for the arithmetic operation from ``mathematics'' of
adding two numbers. These are rather different concepts---one
is a symbol (which we made up), and the other a mathematical
operation. When we have a look at an actual
implementation of our interpreter, the mathematical operation
will be the function for addition from the programming
language in which we \underline{\smash{implement}} our
interpreter, while the \texttt{+} is just a symbol that is
used in our programming language. Clearly we have to use a
symbol that is a good mnemonic for addition, otherwise we will
confuse the programmers working with our language. Therefore
we use $\texttt{+}$. A similar choice is made for times in the
third clause and equality in the fourth clause.

Remember I wrote at the beginning of this section about being
god when designing a programming language. You can see this
here: we need to give meaning to symbols. At the moment
however, we are a poor fallible god. Look again at the grammar
of our programming language and our definition. Clearly, an
expression can contain variables. So far we have ignored them.
What should our interpreter do with variables? They might
change during the evaluation of a program. For example the
variable \pcode{n} in the factorial program counts down from 5
to 0. How can we improve our definition above to also give
an answer whenever our interpreter encounters a variable in an
expression? The solution is to add an \emph{environment},
written $env$, as an additional input argument to our
\textit{eval\_exp} function.
 
\begin{center}
\begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l}
$\textit{eval\_exp}(\texttt{n}, env)$ & $\dn$ & $n$
  \qquad\text{if}\; \texttt{n}\; \text{is a number like} 
  \ldots \pcode{-2}, \pcode{-1}, \pcode{0}, \pcode{1},
\pcode{2}\ldots{}\\
$\textit{eval\_exp}(\texttt{e}_\texttt{1} \,\texttt{+}\, 
                    \texttt{e}_\texttt{2}, env)$ & 
  $\dn$ & $\textit{eval\_exp}(\texttt{e}_\texttt{1}, env) + 
           \textit{eval\_exp}(\texttt{e}_\texttt{2}, env)$\\
$\textit{eval\_exp}(\texttt{e}_\texttt{1} \,\texttt{*}\, 
                    \texttt{e}_\texttt{2}, env)$ & 
  $\dn$ & $\textit{eval\_exp}(\texttt{e}_\texttt{1}, env) * 
           \textit{eval\_exp}(\texttt{e}_\texttt{2}, env)$\\
$\textit{eval\_exp}(\texttt{e}_\texttt{1} \,\texttt{=}\, 
                    \texttt{e}_\texttt{2}, env)$ & 
  $\dn$ & $\begin{cases}
  1 & \text{if}\;\textit{eval\_exp}(\texttt{e}_\texttt{1}, env) =
                 \textit{eval\_exp}(\texttt{e}_\texttt{2}, env)\\
  0 & \text{otherwise}
  \end{cases}$\\
$\textit{eval\_exp}(\texttt{x}, env)$ & $\dn$ & $env(x)$
\end{tabular}
\end{center}
 
\noindent This environment $env$ also acts like a map: it
associates variables with their current values. For example
after evaluating the first two lines in our factorial
program, such an environment might look as follows

\begin{center}
\begin{tabular}{ll}
$\fbox{\texttt{a}} \mapsto \texttt{1}$ &
$\fbox{\texttt{n}} \mapsto \texttt{5}$
\end{tabular}
\end{center}
\noindent Again I highlighted the keys. In the clause for
variables, we can therefore consult this environment and
return whatever value is currently stored for this variable.
This is written $env(x)$, meaning that if we query this map with $x$
we obtain the corresponding number. You might ask what happens
if an environment does not contain any value for, say, the
variable $x$? Well, then our interpreter just ``crashes'', or
more precisely will raise an exception. In this case we have a
``bad'' program that tried to use a variable before it was
initialised. The programmer should not have done this. In a
real programming language, we would of course try a bit harder
and for example give an error at compile time, or design our
language in such a way that this can never happen. With the
second version of \textit{eval\_exp} we completed our
definition for evaluating expressions.
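As a small worked example, assume the environment from the factorial
program where \texttt{a} is mapped to 1 and \texttt{n} to 5. Under
this assumption the expression \texttt{a\,*\,n} should evaluate as
follows, with the variable clause used in the second step:

\begin{center}
\begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l}
$\textit{eval\_exp}(\texttt{a} \,\texttt{*}\, \texttt{n}, env)$
  & $=$ & $\textit{eval\_exp}(\texttt{a}, env) * \textit{eval\_exp}(\texttt{n}, env)$\\
  & $=$ & $env(a) * env(n)$\\
  & $=$ & $1 * 5$\\
  & $=$ & $5$
\end{tabular}
\end{center}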
 
Next comes the evaluation function for statements. We define
this function in such a way that we recursively evaluate a
whole sequence of statements. Assume a program $p$ (you want
to evaluate) and its pre-processed snippets $sn$. Then we can
define:

\begin{center}
\begin{tabular}{@{}l@{\hspace{1mm}}c@{\hspace{1mm}}l@{}}
$\textit{eval\_stmts}([], env)$ & $\dn$ & $env$\\
$\textit{eval\_stmts}(\texttt{label:}\;rest, env)$ &
  $\dn$ & $\textit{eval\_stmts}(rest, env)$ \\ 
$\textit{eval\_stmts}(\texttt{x\,:=\,e}\;rest, env)$ & 
  $\dn$ & $\textit{eval\_stmts}(rest, 
           env[x := \textit{eval\_exp}(\texttt{e}, env)])$\\  
$\textit{eval\_stmts}(\texttt{goto\,lbl}\;rest, env)$ 
 & $\dn$ & $\textit{eval\_stmts}(sn(\texttt{lbl}), env)$\\
$\textit{eval\_stmts}(\texttt{jmp?\,e\,lbl}\;rest, env)$ 
 & $\dn$ & $\begin{cases}\begin{array}{@{}l@{\hspace{-12mm}}r@{}}
 \textit{eval\_stmts}(sn(\texttt{lbl}), env)\\ 
 & \text{if}\;\textit{eval\_exp}(\texttt{e}, env) = 1\\
 \textit{eval\_stmts}(rest, env) & \text{otherwise}\\
 \end{array}\end{cases}$
\end{tabular}
\end{center}

\noindent The first clause is for the empty program, or for when
we have arrived at the end of the program. In this case we just
return the environment. The second clause is for when the next
statement is a label. That means the program is of the form
$\texttt{label:}\;rest$, where the label is some string and
$rest$ stands for all following statements. This case is easy,
because our evaluation function just discards the label and
evaluates the rest of the statements (we already extracted all
important information about labels when we pre-processed our
programs and generated the snippets). The third clause is for
variable assignments. Again we just evaluate the rest of the
statements, but with a modified environment, since the
variable assignment is supposed to introduce a new variable or
change the current value of a variable. For this modification
of the environment we first evaluate the expression
$\texttt{e}$ using our evaluation function for expressions.
This gives us a number. Then we assign this number to the
variable $x$ in the environment. This modified environment
will be used to evaluate the rest of the program. The fourth
clause is for the unconditional jump to a label, called
\texttt{lbl}. That means we have to look up in our snippets
map $sn$ which statements come next for this label.
We therefore continue evaluating, not with the rest
of the program, but with the statements stored in the
snippets map under the label $\texttt{lbl}$. The fifth clause,
for conditional jumps, is similar, but to decide whether to
make the jump we first need to evaluate the expression
$\texttt{e}$ in order to find out whether it is $1$. If yes,
we jump; otherwise we just continue with evaluating the rest
of the program.
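
As a small illustration of how these clauses interact, consider a
hypothetical four-statement program (made up for this example, it is
not one of the programs of this handout) consisting of
\texttt{x := 1}, \texttt{goto end}, \texttt{x := 2} and the label
\texttt{end:}. We assume here that the snippets map sends
\texttt{end} to the statements following that label, which is the
empty list $[]$ in this program, and that
$\textit{eval\_exp}(\texttt{1}, env) = 1$. Under these assumptions
the evaluation, started in the empty environment $\varnothing$,
should unfold as shown below. The assignment \texttt{x := 2} is
never reached, so the resulting environment maps $x$ to $1$.

\begin{center}
\begin{tabular}{@{}l@{\hspace{2mm}}l@{\hspace{4mm}}l@{}}
 & $\textit{eval\_stmts}(\texttt{x\,:=\,1}\;\texttt{goto\,end}\;\texttt{x\,:=\,2}\;\texttt{end:},\; \varnothing)$ & \\
$=$ & $\textit{eval\_stmts}(\texttt{goto\,end}\;\texttt{x\,:=\,2}\;\texttt{end:},\; \varnothing[x := 1])$ & (third clause)\\
$=$ & $\textit{eval\_stmts}(sn(\texttt{end}),\; \varnothing[x := 1])$ & (fourth clause)\\
$=$ & $\textit{eval\_stmts}([],\; \varnothing[x := 1])$ & (assumption on $sn$)\\
$=$ & $\varnothing[x := 1]$ & (first clause)\\
\end{tabular}
\end{center}
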

Our interpreter works in two stages: first we pre-process our
program, generating the snippets map, say $sn$. Second, we call
the evaluation function with the default entry point and the
empty environment:

\begin{center}
$\textit{eval\_stmts}(sn(\pcode{""}), \varnothing)$
\end{center}
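
\noindent The two stages and the five clauses can be sketched in a
few lines of Scala. This is only a minimal sketch under stated
assumptions and not necessarily the code of the actual interpreter:
the datatypes \texttt{Num}, \texttt{Var}, \texttt{Label},
\texttt{Assign}, \texttt{Goto} and \texttt{CondJmp}, the functions
\texttt{preproc}, \texttt{evalExp}, \texttt{evalStmts} and
\texttt{run}, and the restriction of expressions to numbers and
variables are all made up for this illustration. We also assume that
the snippets map sends the empty string to the whole program and each
label to the statements following it. Under these assumptions,
running the sketch on the statements \texttt{x := 1},
\texttt{goto end}, \texttt{x := 2}, \texttt{end:} should print
\texttt{Map(x -> 1)}.

\begin{lstlisting}[language=Scala]
import scala.annotation.tailrec

// hypothetical abstract syntax; expressions are restricted to
// numbers and variables to keep the sketch short
sealed trait Exp
case class Num(n: Int) extends Exp
case class Var(x: String) extends Exp

sealed trait Stmt
case class Label(lbl: String) extends Stmt
case class Assign(x: String, e: Exp) extends Stmt
case class Goto(lbl: String) extends Stmt
case class CondJmp(e: Exp, lbl: String) extends Stmt  // jmp? e lbl

object Interp {
  type Stmts = List[Stmt]
  type Env   = Map[String, Int]
  type Snips = Map[String, Stmts]

  // stage 1 (assumed form of the pre-processing): "" is the default
  // entry point, each label maps to the statements following it
  def preproc(sts: Stmts): Snips = {
    def aux(sts: Stmts): Snips = sts match {
      case Nil                => Map()
      case Label(lbl) :: rest => aux(rest) + (lbl -> rest)
      case _ :: rest          => aux(rest)
    }
    aux(sts) + ("" -> sts)
  }

  def evalExp(e: Exp, env: Env): Int = e match {
    case Num(n) => n
    case Var(x) => env(x)   // fails for unassigned variables
  }

  // stage 2: one case per clause of eval_stmts
  @tailrec
  def evalStmts(sts: Stmts, env: Env, sn: Snips): Env = sts match {
    case Nil => env
    case Label(_) :: rest => evalStmts(rest, env, sn)
    case Assign(x, e) :: rest =>
      evalStmts(rest, env + (x -> evalExp(e, env)), sn)
    case Goto(lbl) :: _ => evalStmts(sn(lbl), env, sn)
    case CondJmp(e, lbl) :: rest =>
      if (evalExp(e, env) == 1) evalStmts(sn(lbl), env, sn)
      else evalStmts(rest, env, sn)
  }

  // default entry point and empty environment
  def run(prog: Stmts): Env = {
    val sn = preproc(prog)
    evalStmts(sn(""), Map.empty, sn)
  }

  def main(args: Array[String]): Unit = {
    val prog = List(Assign("x", Num(1)), Goto("end"),
                    Assign("x", Num(2)), Label("end"))
    println(run(prog))
  }
}
\end{lstlisting}
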

\noindent It is interesting to note that our interpreter, when
it comes to the end of the program, returns an environment. Our
programming language does not contain any constructs for input
and output. Therefore this environment is the only effect we
can observe when running the program (apart from the time our
interpreter might need before finishing the
evaluation of the program, and the CPU getting hot). Evaluating 
the factorial program with our interpreter, we receive as
``answer'' the environment

\begin{center}
\begin{tabular}{ll}
$\fbox{\texttt{a}} \mapsto \texttt{120}$ &
$\fbox{\texttt{n}} \mapsto \texttt{0}$
\end{tabular}
\end{center}

\noindent While the discussion above should have illustrated
the ideas, in order to do some serious calculations, we clearly
need to implement the interpreter.
\subsubsection*{Scala Code for the Interpreter}

Functional programming languages are very convenient for
implementing interpreters. A good choice of
functional programming language is Scala, which combines
functional and object-oriented
programming styles. It has received quite a bit of attention
in the last five years or so. One reason for this attention is
that, like the Java programming language, Scala compiles to
the Java Virtual Machine (JVM) and therefore Scala programs
can run under MacOSX, Linux and Windows.\footnote{There are
also experimental backends for Android and JavaScript.} Unlike
Java, however, Scala often allows programmers to write very
concise and elegant code. Some therefore say Scala is the much
better Java. A number of companies (The Guardian, Twitter,
Coursera, FourSquare and LinkedIn, to name a few) either use Scala
exclusively in production code, or at least to some
substantial degree. If you want to try out Scala yourself, the
Scala compiler can be downloaded from

\begin{quote}
\url{http://www.scala-lang.org}
\end{quote}

Let us have a look at the Scala code shown in
Figure~\ref{code}. It shows the entire code for the
interpreter, though the implementation is admittedly
no-frills.

\begin{figure}[t]
\small
\lstinputlisting[language=Scala]{../progs/inter.scala}
\caption{The entire code of the interpreter for our
idealised programming language.\label{code}}
\end{figure}
\subsubsection*{Static Analysis}

Finally, we can come back to our original problem, namely 
finding out what the signs of variables are.

\begin{center}


\end{center}
 
\end{document}

%% list of static analysers for C
http://spinroot.com/static/index.html

%% NASA coding rules for C
http://pixelscommander.com/wp-content/uploads/2014/12/P10.pdf

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: t
%%% End: