handouts/ho07.tex
author Christian Urban <christian dot urban at kcl dot ac dot uk>
Tue, 17 Nov 2015 01:58:50 +0000
changeset 372 d6af4b1239de
parent 370 a65767fe5d71
child 373 b018234c9126
permissions -rw-r--r--
updated
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}
\usepackage{../grammar}
\usepackage{../graphics}

\begin{document}

\section*{Handout 7 (Compilation)}

The purpose of a compiler is to transform a program that a human
can write into code that the machine can run as fast as possible.
The fastest code would be machine code that the CPU can run
directly, but it is often enough to improve the speed of a
program by just targeting a virtual machine. This does not produce
the fastest possible code, but code that is fast enough, and it
has the advantage that the virtual machine takes care of things a
compiler would normally need to take care of (like explicit
memory management).

As an example we will implement a compiler for the very simple
While-language. We will be generating code for the Java
Virtual Machine. This is a stack-based virtual machine, a fact
which will make it easy to generate code for arithmetic
expressions. For example, to generate code for the
expression $1 + 2$ we need to emit the following three
instructions

\begin{lstlisting}[numbers=none]
ldc 1
ldc 2
iadd
\end{lstlisting}

\noindent The first instruction loads the constant $1$ onto
the stack, the next one loads $2$, and the third instruction adds
both numbers together, replacing the top two elements of the stack
with the result $3$. For simplicity, we will consider only integer
numbers and results throughout. Therefore we can use the
instructions \code{iadd}, \code{isub}, \code{imul},
\code{idiv} and so on. The \code{i} stands for integer
instructions in the JVM (alternatives are \code{d} for doubles,
\code{l} for longs and \code{f} for floats).

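\noindent To make the stack discipline concrete, here is a minimal
sketch of a stack machine in Python. It is only an illustration and
not how the JVM is implemented: the function \pcode{run} and the
representation of instructions as tuples are assumptions made for
this example, and the wrap-around behaviour of 32-bit JVM integers is
ignored. Running the three instructions from above leaves $3$ as the
only element on the stack.

\begin{lstlisting}[language=Python,numbers=none]
def run(instructions):
    """Execute a list of (opcode, operand...) tuples on an operand stack."""
    stack = []
    for op, *args in instructions:
        if op == "ldc":
            stack.append(args[0])        # push a constant
        else:
            b = stack.pop()              # the right operand is on top
            a = stack.pop()
            if op == "iadd":
                stack.append(a + b)
            elif op == "isub":
                stack.append(a - b)
            elif op == "imul":
                stack.append(a * b)
            elif op == "idiv":
                # integer division on the JVM truncates towards zero
                q = abs(a) // abs(b)
                stack.append(q if (a < 0) == (b < 0) else -q)
            else:
                raise ValueError(f"unknown instruction {op}")
    return stack

print(run([("ldc", 1), ("ldc", 2), ("iadd",)]))   # prints [3]
\end{lstlisting}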

Recall our grammar for arithmetic expressions ($E$ is the
starting symbol):

\begin{plstx}[rhs style=, margin=3cm]
: \meta{E} ::= \meta{T} $+$ \meta{E}
         | \meta{T} $-$ \meta{E}
         | \meta{T}\\
: \meta{T} ::= \meta{F} $*$ \meta{T}
          | \meta{F} $\backslash$ \meta{T}
          | \meta{F}\\
: \meta{F} ::= ( \meta{E} )
          | \meta{Id}
          | \meta{Num}\\
\end{plstx}

\noindent where \meta{Id} stands for variables and
\meta{Num} for numbers. For the moment let us omit variables from
arithmetic expressions. Our parser will take this grammar and
produce abstract syntax trees. For
example, for the expression $1 + ((2 * 3) + (4 - 3))$ it will
produce the following tree.

\begin{center}
\begin{tikzpicture}
\Tree [.$+$ [.$1$ ] [.$+$ [.$*$ $2$ $3$ ] [.$-$ $4$ $3$ ]]]
\end{tikzpicture}
\end{center}

\noindent To generate code for this expression, we need to
traverse this tree in post-order fashion and emit code for
each node. Traversing the tree in post-order produces code
suitable for a stack-machine (which is what the JVM is). Doing so
for the tree above generates the instructions

\begin{lstlisting}[numbers=none]
ldc 1
ldc 2
ldc 3
imul
ldc 4
ldc 3
isub
iadd
iadd
\end{lstlisting}

\noindent If we ``run'' these instructions, the result $8$
will be on top of the stack (I leave this to you to verify;
the meaning of each instruction should be clear). The result
being on the top of the stack will be a convention we always
observe in our compiler, that is, the results of arithmetic
expressions will always be on top of the stack. Note that a
different bracketing of the expression, for example $(1 + (2 *
3)) + (4 - 3)$, produces a different abstract syntax tree and
thus potentially also a different list of instructions.
Generating code in this fashion is rather easy to implement:
it can be done with the following \textit{compile}-function,
which takes the abstract syntax tree as argument:

\begin{center}
\begin{tabular}{lcl}
$\textit{compile}(n)$ & $\dn$ & $\pcode{ldc}\; n$\\
$\textit{compile}(a_1 + a_2)$ & $\dn$ &
$\textit{compile}(a_1) \;@\;\textit{compile}(a_2)\;@\; \pcode{iadd}$\\
$\textit{compile}(a_1 - a_2)$ & $\dn$ &
$\textit{compile}(a_1) \;@\; \textit{compile}(a_2)\;@\; \pcode{isub}$\\
$\textit{compile}(a_1 * a_2)$ & $\dn$ &
$\textit{compile}(a_1) \;@\; \textit{compile}(a_2)\;@\; \pcode{imul}$\\
$\textit{compile}(a_1 \backslash a_2)$ & $\dn$ &
$\textit{compile}(a_1) \;@\; \textit{compile}(a_2)\;@\; \pcode{idiv}$\\
\end{tabular}
\end{center}
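
\noindent The clauses above can be sketched in Python as follows.
This is only one possible rendering under some assumptions of ours:
abstract syntax trees are represented as nested tuples
\pcode{(op, a1, a2)} with plain integers as leaves, division is
written \pcode{"/"} instead of $\backslash$, list concatenation
\pcode{+} plays the role of $@$, and the names \pcode{compile_aexp}
and \pcode{OPCODES} are made up for this example. The two trees
correspond to the two bracketings discussed above; the first call
reproduces the nine instructions shown earlier, while the second
emits the first \pcode{iadd} already after \pcode{imul}.

\begin{lstlisting}[language=Python,numbers=none]
OPCODES = {"+": "iadd", "-": "isub", "*": "imul", "/": "idiv"}

def compile_aexp(a):
    """Post-order traversal: code for both operands, then the operator."""
    if isinstance(a, int):
        return [f"ldc {a}"]
    op, a1, a2 = a
    return compile_aexp(a1) + compile_aexp(a2) + [OPCODES[op]]

tree1 = ("+", 1, ("+", ("*", 2, 3), ("-", 4, 3)))   # 1 + ((2 * 3) + (4 - 3))
tree2 = ("+", ("+", 1, ("*", 2, 3)), ("-", 4, 3))   # (1 + (2 * 3)) + (4 - 3)

print(compile_aexp(tree1))
print(compile_aexp(tree2))
\end{lstlisting}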

However, our arithmetic expressions can also contain
variables. We will represent them as \emph{local variables} in
the JVM. Essentially, local variables are an array of pointers
to memory cells, containing in our case only integers. Looking
up a variable can be done with the instruction

\begin{lstlisting}[mathescape,numbers=none]
iload $index$
\end{lstlisting}

\noindent which places the content of the local variable $index$ onto
the stack. Storing the top of the stack into a local variable
can be done by the instruction

\begin{lstlisting}[mathescape,numbers=none]
istore $index$
\end{lstlisting}

\noindent Note that this also pops off the top of the stack.
One problem we have to overcome, however, is that local
variables are addressed, not by identifiers, but by numbers
(starting from $0$). Therefore our compiler needs to maintain
a kind of environment where variables are associated with
numbers. This association needs to be unique: if we muddle up
the numbers, then we essentially confuse variables and the
consequence will usually be an erroneous result. Our extended
\textit{compile}-function for arithmetic expressions will
therefore take two arguments: the abstract syntax tree and the
environment, $E$, that maps identifiers to index-numbers.

\begin{center}
\begin{tabular}{lcl}
$\textit{compile}(n, E)$ & $\dn$ & $\pcode{ldc}\;n$\\
$\textit{compile}(a_1 + a_2, E)$ & $\dn$ &
$\textit{compile}(a_1, E) \;@\;\textit{compile}(a_2, E)\;@\; \pcode{iadd}$\\
$\textit{compile}(a_1 - a_2, E)$ & $\dn$ &
$\textit{compile}(a_1, E) \;@\; \textit{compile}(a_2, E)\;@\; \pcode{isub}$\\
$\textit{compile}(a_1 * a_2, E)$ & $\dn$ &
$\textit{compile}(a_1, E) \;@\; \textit{compile}(a_2, E)\;@\; \pcode{imul}$\\
$\textit{compile}(a_1 \backslash a_2, E)$ & $\dn$ &
$\textit{compile}(a_1, E) \;@\; \textit{compile}(a_2, E)\;@\; \pcode{idiv}$\\
$\textit{compile}(x, E)$ & $\dn$ & $\pcode{iload}\;E(x)$\\
\end{tabular}
\end{center}

\noindent In the last line we generate the code for variables,
where $E(x)$ stands for looking up in the environment the
index to which the variable $x$ maps.
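
\noindent As a sketch of how the environment enters the picture, the
following Python fragment models $E$ as a dictionary from identifiers
to indices. The tuple representation of trees, the use of strings for
variables, \pcode{"/"} for division and the name \pcode{compile_aexp}
are again assumptions made for this illustration. Looking up a
variable that is not in the dictionary raises a \pcode{KeyError},
which corresponds to using a variable before it has been assigned.

\begin{lstlisting}[language=Python,numbers=none]
OPCODES = {"+": "iadd", "-": "isub", "*": "imul", "/": "idiv"}

def compile_aexp(a, env):
    """env maps variable names to local-variable indices."""
    if isinstance(a, int):
        return [f"ldc {a}"]
    if isinstance(a, str):
        return [f"iload {env[a]}"]       # the clause for variables
    op, a1, a2 = a
    return compile_aexp(a1, env) + compile_aexp(a2, env) + [OPCODES[op]]

env = {"x": 0, "y": 1}
print(compile_aexp(("+", "x", ("*", 2, "y")), env))
# prints ['iload 0', 'ldc 2', 'iload 1', 'imul', 'iadd']
\end{lstlisting}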

There is a similar \textit{compile}-function for boolean
expressions, but it includes a ``trick'' to do with
\pcode{if}- and \pcode{while}-statements. To explain the issue,
let us first look at the compilation of statements of the
While-language. The clause for \pcode{skip} is trivial, since
we do not have to generate any instruction:

\begin{center}
\begin{tabular}{lcl}
$\textit{compile}(\pcode{skip}, E)$ & $\dn$ & $([], E)$\\
\end{tabular}
\end{center}

\noindent Note that the \textit{compile}-function for
statements returns a pair, a list of instructions (in this
case the empty list) and an environment for variables. The
reason for the environment is that assignments in the
While-language might change the environment: if a
variable is used for the first time, we need to allocate a new
index, and if it has been used before, we need to be able to
retrieve the associated index. This is reflected in
the clause for compiling assignments:

\begin{center}
\begin{tabular}{lcl}
$\textit{compile}(x := a, E)$ & $\dn$ &
$(\textit{compile}(a, E) \;@\;\pcode{istore}\;index, E')$
\end{tabular}
\end{center}

\noindent We first generate code for the right-hand side of
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   198
the assignment and then add an \pcode{istore}-instruction at
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   199
the end. By convention the result of the arithmetic expression
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   200
$a$ will be on top of the stack. After the \pcode{istore}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   201
instruction, the result will be stored in the index
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   202
corresponding to the variable $x$. If the variable $x$ has
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   203
been used before in the program, we just need to look up what
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   204
the index is and return the environment unchanged (that is in
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   205
this case $E' = E$). However, if this is the first encounter 
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   206
of the variable $x$ in the program, then we have to augment 
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   207
the environment and assign $x$ with the largest index in $E$
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   208
plus one (that is $E' = E(x \mapsto largest\_index + 1)$). 
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   209
That means for the assignment $x := x + 1$ we generate the
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   210
following code
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   211
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   212
\begin{lstlisting}[mathescape,numbers=none]
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   213
iload $n_x$
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   214
ldc 1
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   215
iadd
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   216
istore $n_x$
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   217
\end{lstlisting}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   218
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   219
\noindent 
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   220
where $n_x$ is the index for the variable $x$.
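
The assignment clause can be sketched in Python as follows. This is
only an illustration under assumptions that are not part of the
handout: expressions are nested tuples, the environment is a
dictionary from variable names to indices, indices start at 0, and the
names \texttt{compile\_aexp} and \texttt{compile\_assign} are
hypothetical. For $x := x + 1$ with $n_x = 0$ the sketch prints the
four instructions shown above.

\begin{lstlisting}[language=Python,numbers=none]
def compile_aexp(a, env):
    """Return instructions that leave the value of a on the stack."""
    kind = a[0]
    if kind == "num":                      # ("num", 1)
        return [f"ldc {a[1]}"]
    if kind == "var":                      # ("var", "x")
        return [f"iload {env[a[1]]}"]
    if kind == "add":                      # ("add", a1, a2)
        return compile_aexp(a[1], env) + compile_aexp(a[2], env) + ["iadd"]
    raise ValueError(f"unknown expression: {a!r}")

def compile_assign(x, a, env):
    """Compile x := a and return (instructions, new environment)."""
    code = compile_aexp(a, env)            # right-hand side first
    if x in env:
        new_env = env                      # known variable: E' = E
    else:
        # first encounter: largest index plus one (assumed 0 if E is empty)
        index = max(env.values(), default=-1) + 1
        new_env = {**env, x: index}
    return code + [f"istore {new_env[x]}"], new_env

code, env = compile_assign("x", ("add", ("var", "x"), ("num", 1)), {"x": 0})
print("\n".join(code))
\end{lstlisting}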

More complicated is the code for \pcode{if}-statements, say

\begin{lstlisting}[mathescape,language={},numbers=none]
if $b$ then $cs_1$ else $cs_2$
\end{lstlisting}

\noindent where $b$ is a boolean expression and the $cs_i$
are the instructions for each \pcode{if}-branch. Let us assume
we have already generated code for $b$ and $cs_{1/2}$. Then in the
true-case the control-flow of the program needs to be

\begin{center}
\begin{tikzpicture}[node distance=2mm and 4mm,
 block/.style={rectangle, minimum size=1cm, draw=black, line width=1mm},
 point/.style={rectangle, inner sep=0mm, minimum size=0mm, fill=red},
 skip loop/.style={black, line width=1mm, to path={-- ++(0,-10mm) -| (\tikztotarget)}}]
\node (A1) [point] {};
\node (b) [block, right=of A1] {code of $b$};
\node (A2) [point, right=of b] {};
\node (cs1) [block, right=of A2] {code of $cs_1$};
\node (A3) [point, right=of cs1] {};
\node (cs2) [block, right=of A3] {code of $cs_2$};
\node (A4) [point, right=of cs2] {};

\draw (A1) edge [->, black, line width=1mm] (b);
\draw (b) edge [->, black, line width=1mm] (cs1);
\draw (cs1) edge [->, black, line width=1mm] (A3);
\draw (A3) edge [->, black, skip loop] (A4);
\node [below=of cs2] {\raisebox{-5mm}{\small{}jump}};
\end{tikzpicture}
\end{center}

\noindent where we start with running the code for $b$; since
we are in the true case we continue with running the code for
$cs_1$. After this, however, we must not run the code for
$cs_2$, but always jump to just after the last instruction of $cs_2$
(the code for the \pcode{else}-branch). Note that this jump is
unconditional, meaning we always have to jump to the end of
$cs_2$. The corresponding instruction of the JVM is
\pcode{goto}. In case $b$ turns out to be false we need the
control-flow

\begin{center}
\begin{tikzpicture}[node distance=2mm and 4mm,
 block/.style={rectangle, minimum size=1cm, draw=black, line width=1mm},
 point/.style={rectangle, inner sep=0mm, minimum size=0mm, fill=red},
 skip loop/.style={black, line width=1mm, to path={-- ++(0,-10mm) -| (\tikztotarget)}}]
\node (A1) [point] {};
\node (b) [block, right=of A1] {code of $b$};
\node (A2) [point, right=of b] {};
\node (cs1) [block, right=of A2] {code of $cs_1$};
\node (A3) [point, right=of cs1] {};
\node (cs2) [block, right=of A3] {code of $cs_2$};
\node (A4) [point, right=of cs2] {};

\draw (A1) edge [->, black, line width=1mm] (b);
\draw (b) edge [->, black, line width=1mm] (A2);
\draw (A2) edge [skip loop] (A3);
\draw (A3) edge [->, black, line width=1mm] (cs2);
\draw (cs2) edge [->,black, line width=1mm] (A4);
\node [below=of cs1] {\raisebox{-5mm}{\small{}conditional jump}};
\end{tikzpicture}
\end{center}

\noindent where we now need a conditional jump (taken if the
if-condition is false) from the end of the code for the
boolean to the beginning of the instructions $cs_2$. Once we
are finished with running $cs_2$ we can continue with whatever
code comes after the if-statement.

The \pcode{goto} and conditional jumps need addresses to where
the jump should go. Since we are generating assembly code for
the JVM, we do not actually have to give addresses, but need
to attach labels to our code. These labels specify a target
for a jump. Therefore the labels need to be unique, as
otherwise it would be ambiguous where a jump should go.
A label, say \pcode{L}, is attached to code like

\begin{lstlisting}[mathescape,numbers=none]
L:
  $instr_1$
  $instr_2$
    $\vdots$
\end{lstlisting}
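
One simple way to obtain unique labels is a global counter. The
following Python sketch is an assumption about how this might be done,
not something the handout prescribes; the helpers \texttt{fresh} and
\texttt{attach} are hypothetical names.

\begin{lstlisting}[language=Python,numbers=none]
import itertools

_counter = itertools.count()               # 0, 1, 2, ...

def fresh(prefix):
    """Return a label name that has not been handed out before."""
    return f"{prefix}_{next(_counter)}"

def attach(label, instrs):
    """Put the label definition in front of a list of instructions."""
    return [f"{label}:"] + instrs

l1 = fresh("L")
l2 = fresh("L")
print("\n".join(attach(l1, ["iload 0", "ldc 1", "iadd"])))
\end{lstlisting}

\noindent Because every call to \texttt{fresh} advances the counter,
two calls never return the same name, even with the same prefix.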
 
Recall the ``trick'' with compiling boolean expressions: the
\textit{compile}-function for boolean expressions takes three
arguments: an abstract syntax tree, an environment for
variable indices and also the label, $lab$, to where a conditional
jump needs to go. The clause for the expression $a_1 = a_2$,
for example, is as follows:

\begin{center}
\begin{tabular}{lcl}
$\textit{compile}(a_1 = a_2, E, lab)$ & $\dn$\\
\multicolumn{3}{l}{$\qquad\textit{compile}(a_1, E) \;@\;\textit{compile}(a_2, E)\;@\; \pcode{if_icmpne}\;lab$}
\end{tabular}
\end{center}

\noindent
We generate code for the subexpressions $a_1$ and $a_2$.
This means that after running the corresponding code there will
be two integers on top of the stack. If they are equal, we do
not have to do anything and just continue with the next
instructions (see control-flow of ifs above). However, if they
are \emph{not} equal, then we need to (conditionally) jump to
the label $lab$. This can be done with the instruction

\begin{lstlisting}[mathescape,numbers=none]
if_icmpne $lab$
\end{lstlisting}

\noindent Other jump instructions for boolean operators are

\begin{center}
\begin{tabular}{l@{\hspace{10mm}}c@{\hspace{10mm}}l}
$=$ & $\Rightarrow$ & \pcode{if_icmpne}\\
$\not=$ & $\Rightarrow$ & \pcode{if_icmpeq}\\
$<$ & $\Rightarrow$ & \pcode{if_icmpge}\\
$\le$ & $\Rightarrow$ & \pcode{if_icmpgt}\\
\end{tabular}
\end{center}

\noindent and so on. I leave it to you to extend the
\textit{compile}-function for the other boolean expressions.
Note that we need to jump whenever the boolean is \emph{not}
true, which means we have to ``negate'' the jump: equals
becomes not-equal, less becomes greater-or-equal. If you do
not like this design (it can be the source of some nasty,
hard-to-detect errors), you can also change the layout of the
code and first give the code for the else-branch and then for
the if-branch.
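
The table of negated jumps translates directly into a lookup. Below is
a minimal Python sketch, assuming that comparisons are tuples of the
form \texttt{(op, a1, a2)} and that arithmetic expressions are only
numbers and variables; the names \texttt{NEGATED\_JUMP},
\texttt{compile\_aexp} and \texttt{compile\_bexp} are hypothetical and
not taken from the handout.

\begin{lstlisting}[language=Python,numbers=none]
# Each comparison maps to the jump that is taken when it is *false*.
NEGATED_JUMP = {
    "=":  "if_icmpne",
    "!=": "if_icmpeq",
    "<":  "if_icmpge",
    "<=": "if_icmpgt",
}

def compile_aexp(a, env):
    """Return instructions that leave the value of a on the stack."""
    if a[0] == "num":                      # ("num", 10)
        return [f"ldc {a[1]}"]
    if a[0] == "var":                      # ("var", "x")
        return [f"iload {env[a[1]]}"]
    raise ValueError(f"unknown expression: {a!r}")

def compile_bexp(b, env, lab):
    """Compile b so that control jumps to lab whenever b is false."""
    op, a1, a2 = b
    return (compile_aexp(a1, env)
            + compile_aexp(a2, env)
            + [f"{NEGATED_JUMP[op]} {lab}"])

code = compile_bexp(("<", ("var", "x"), ("num", 10)), {"x": 0}, "L_else")
print("\n".join(code))
\end{lstlisting}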

We are now ready to give the compile function for
if-statements. Remember that for statements this function returns a
pair consisting of the code and an environment:

\begin{center}
\begin{tabular}{lcl}
$\textit{compile}(\pcode{if}\;b\;\pcode{then}\; cs_1\;\pcode{else}\; cs_2, E)$ & $\dn$\\
\multicolumn{3}{l}{$\qquad l_\textit{ifelse}\;$ (fresh label)}\\
\multicolumn{3}{l}{$\qquad l_\textit{ifend}\;$ (fresh label)}\\
\multicolumn{3}{l}{$\qquad (is_1, E') = \textit{compile}(cs_1, E)$}\\
\multicolumn{3}{l}{$\qquad (is_2, E'') = \textit{compile}(cs_2, E')$}\\
\multicolumn{3}{l}{$\qquad(\textit{compile}(b, E, l_\textit{ifelse})$}\\
\multicolumn{3}{l}{$\qquad\phantom{(}@\;is_1$}\\
\multicolumn{3}{l}{$\qquad\phantom{(}@\; \pcode{goto}\;l_\textit{ifend}$}\\
\multicolumn{3}{l}{$\qquad\phantom{(}@\;l_\textit{ifelse}:$}\\
\multicolumn{3}{l}{$\qquad\phantom{(}@\;is_2$}\\
\multicolumn{3}{l}{$\qquad\phantom{(}@\;l_\textit{ifend}:, E'')$}\\
\end{tabular}
\end{center}
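
The clause above can be sketched in Python as follows. This is a rough
illustration, not the handout's implementation: statements and
expressions are assumed to be tuples, blocks are lists of statements,
labels come from a global counter, and all function names are
hypothetical. Only numbers, variables, assignments and if-statements
are covered.

\begin{lstlisting}[language=Python,numbers=none]
import itertools

NEGATED_JUMP = {"=": "if_icmpne", "!=": "if_icmpeq",
                "<": "if_icmpge", "<=": "if_icmpgt"}
_counter = itertools.count()

def fresh(prefix):
    """Return a label name that has not been handed out before."""
    return f"{prefix}_{next(_counter)}"

def compile_aexp(a, env):
    if a[0] == "num":                      # ("num", 1)
        return [f"ldc {a[1]}"]
    if a[0] == "var":                      # ("var", "x")
        return [f"iload {env[a[1]]}"]
    raise ValueError(f"unknown expression: {a!r}")

def compile_bexp(b, env, lab):
    op, a1, a2 = b                         # jump to lab if b is false
    return (compile_aexp(a1, env) + compile_aexp(a2, env)
            + [f"{NEGATED_JUMP[op]} {lab}"])

def compile_stmt(s, env):
    """Return (instructions, new environment) for one statement."""
    if s[0] == "assign":                   # ("assign", x, a)
        _, x, a = s
        code = compile_aexp(a, env)
        if x not in env:                   # first encounter of x
            env = {**env, x: max(env.values(), default=-1) + 1}
        return code + [f"istore {env[x]}"], env
    if s[0] == "if":                       # ("if", b, cs1, cs2)
        _, b, cs1, cs2 = s
        l_ifelse, l_ifend = fresh("IfElse"), fresh("IfEnd")
        is1, env1 = compile_block(cs1, env)
        is2, env2 = compile_block(cs2, env1)
        code = (compile_bexp(b, env, l_ifelse)
                + is1
                + [f"goto {l_ifend}", f"{l_ifelse}:"]
                + is2
                + [f"{l_ifend}:"])
        return code, env2
    raise ValueError(f"unknown statement: {s!r}")

def compile_block(stmts, env):
    """Compile statements in order, threading the environment through."""
    code = []
    for s in stmts:
        instrs, env = compile_stmt(s, env)
        code += instrs
    return code, env

# if x = 0 then y := 1 else y := 2
prog = ("if", ("=", ("var", "x"), ("num", 0)),
        [("assign", "y", ("num", 1))],
        [("assign", "y", ("num", 2))])
code, env = compile_stmt(prog, {"x": 0})
print("\n".join(code))
\end{lstlisting}

\noindent In the printed code the true-branch ends with a
\pcode{goto} to the end label, and the else label sits directly in
front of the code for $cs_2$, matching the two control-flow pictures
above.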
\end{document}

%%% Local Variables: 
%%% mode: latex  
%%% TeX-master: t
%%% End: