handouts/ho07.tex
author Christian Urban <christian.urban@kcl.ac.uk>
Tue, 21 Oct 2025 17:09:56 +0200
changeset 1015 e8ba0237f005
parent 960 c7009356ddd8
permissions -rw-r--r--
updated
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
601
208b0f67a3d0 updated
Christian Urban <urbanc@in.tum.de>
parents: 600
diff changeset
     1
% !TEX program = xelatex
327
9470cd124667 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     2
\documentclass{article}
9470cd124667 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     3
\usepackage{../style}
9470cd124667 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
     4
\usepackage{../langs}
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
     5
\usepackage{../grammar}
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
     6
\usepackage{../graphics}
714
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
     7
\usetikzlibrary{calc,shapes,arrows}
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
     8
\usepackage{framed}
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
     9
\usepackage[belowskip=7pt,aboveskip=0pt]{caption}
705
bfc8703b1527 updated
Christian Urban <urbanc@in.tum.de>
parents: 692
diff changeset
    10
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    11
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    12
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    13
327
9470cd124667 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    14
\begin{document}
941
66adcae6c762 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 714
diff changeset
    15
\fnote{\copyright{} Christian Urban, King's College London, 2017, 2018, 2019, 2020, 2023}
327
9470cd124667 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    16
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
    17
\section*{Handout 7 (Compilation)}
327
9470cd124667 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
    18
668
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
    19
The purpose of a compiler is to transform a program a human can read and
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
    20
write into code the machine can run as fast as possible. The fastest
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
    21
code would be machine code the CPU can run directly, but it is often
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
    22
good enough for improving the speed of a program to target a virtual
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
    23
machine instead. This produces not the fastest possible code, but code
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
    24
that is often pretty fast. This way of producing code has also the
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
    25
advantage that the virtual machine takes care of things a compiler would
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
    26
normally need to take care of (hairy things like explicit memory
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
    27
management).
452
b93f4d2aeee1 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 394
diff changeset
    28
b93f4d2aeee1 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 394
diff changeset
    29
As a first example in this module we will implement a compiler for the
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    30
very simple WHILE-language that we parsed in the last lecture. The
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    31
compiler will target the Java Virtual Machine (JVM), but not directly.
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    32
Pictorially the compiler will work as follows:
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
    33
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    34
\begin{center}
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    35
  \begin{tikzpicture}[scale=1,font=\bf,
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    36
                      node/.style={
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    37
                      rectangle,rounded corners=3mm,
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
    38
                      ultra thick,draw=black!50,minimum height=18mm,
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    39
                      minimum width=20mm,
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    40
                      top color=white,bottom color=black!20}]
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    41
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
    42
  \node (0) at (-3,0) {};
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    43
  \node (A) at (0,0) [node,text width=1.6cm,text centered] {our compiler};
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    44
  \node (B) at (3.5,0) [node,text width=1.6cm,text centered] {Jasmin / Krakatau};
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    45
  \node (C) at (7.5,0) [node] {JVM};
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
    46
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
    47
  \draw [->,line width=2.5mm] (0) -- node [above,pos=0.35] {*.while} (A);
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
    48
  \draw [->,line width=2.5mm] (A) -- node [above,pos=0.35] {*.j} (B);
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
    49
  \draw [->,line width=2.5mm] (B) -- node [above,pos=0.35] {*.class} (C);
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    50
  \end{tikzpicture}
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    51
  \end{center}
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    52
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    53
\noindent
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
    54
The input will be WHILE-programs; the output will be assembly files
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
    55
(with the file extension *.j). Assembly files essentially contain
941
66adcae6c762 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 714
diff changeset
    56
human-readable low-level code, meaning they are not just bits and
66adcae6c762 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 714
diff changeset
    57
bytes, but rather something you can read and understand---with a bit
66adcae6c762 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 714
diff changeset
    58
of practice of course. An \emph{assembler} will then translate the
66adcae6c762 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 714
diff changeset
    59
assembly files into unreadable class- or binary-files the JVM or CPU
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
    60
can run, i.e.~bits and bytes.  Unfortunately, the Java ecosystem does not come with an
941
66adcae6c762 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 714
diff changeset
    61
assembler which would be handy for our compiler-endeavour (unlike
66adcae6c762 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 714
diff changeset
    62
Microsoft's Common Language Infrastructure for the .Net platform which
66adcae6c762 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 714
diff changeset
    63
has an assembler out-of-the-box). As a substitute we shall use the
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
    64
3rd-party program Jasmin, or alternatively Krakatau (Jasmin is the preferred
941
66adcae6c762 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 714
diff changeset
    65
option---a \texttt{jasmin.jar}-file is available on KEATS):
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
    66
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
    67
\begin{itemize}
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
    68
  \item \url{http://jasmin.sourceforge.net}
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
    69
  \item \url{https://github.com/Storyyeller/Krakatau}
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
    70
\end{itemize}
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
    71
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
    72
\noindent
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
    73
The first is a Java program and the second a program written in Python.
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
    74
Each of them allow us to generate \emph{assembly} files that are still
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
    75
readable by humans, as opposed to class-files which are pretty much just
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
    76
(horrible) zeros and ones. Jasmin (respectively Krakatau) will then take
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
    77
our assembly files as input and generate the corresponding class-files for
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
    78
us.
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
    79
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
    80
What is good about the JVM is that it is a stack-based virtual machine,
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
    81
a fact which will make it easy to generate code for arithmetic
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
    82
expressions. For example when compiling the expression $1 + 2$ we need
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
    83
to generate the following three instructions
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
    84
668
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
    85
\begin{lstlisting}[language=JVMIS,numbers=none]
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
    86
ldc 1
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
    87
ldc 2
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
    88
iadd
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
    89
\end{lstlisting}
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
    90
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
    91
\noindent The first instruction loads the constant $1$ onto the stack,
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
    92
the next one loads $2$, the third instruction adds both numbers together
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
    93
replacing the top two elements of the stack with the result $3$. For
943
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
    94
simplicity, we will consider only arithmetic operations involving
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
    95
integer numbers. This means our main JVM instructions for arithmetic
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
    96
will be \instr{iadd}, \instr{isub}, \instr{imul}, \instr{idiv} and so on.
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
    97
The \code{i} stands for integer instructions in the JVM (alternatives
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
    98
are \code{d} for doubles, \code{l} for longs and \code{f} for floats
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
    99
etc).
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   100
600
d488a3e7b0ec updated
Christian Urban <urbanc@in.tum.de>
parents: 452
diff changeset
   101
Recall our grammar for arithmetic expressions (\meta{E} is the
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   102
starting symbol):
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   103
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   104
601
208b0f67a3d0 updated
Christian Urban <urbanc@in.tum.de>
parents: 600
diff changeset
   105
\begin{plstx}[rhs style=, margin=3cm]
208b0f67a3d0 updated
Christian Urban <urbanc@in.tum.de>
parents: 600
diff changeset
   106
: \meta{E} ::= \meta{T} $+$ \meta{E}
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   107
         | \meta{T} $-$ \meta{E}
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   108
         | \meta{T}\\
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   109
: \meta{T} ::= \meta{F} $*$ \meta{T}
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   110
          | \meta{F} $\backslash$ \meta{T}
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   111
          | \meta{F}\\
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   112
: \meta{F} ::= ( \meta{E} )
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   113
          | \meta{Id}
601
208b0f67a3d0 updated
Christian Urban <urbanc@in.tum.de>
parents: 600
diff changeset
   114
          | \meta{Num}\\
208b0f67a3d0 updated
Christian Urban <urbanc@in.tum.de>
parents: 600
diff changeset
   115
\end{plstx}
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   116
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   117
376
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   118
\noindent where \meta{Id} stands for variables and \meta{Num}
668
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   119
for numbers. For the moment let us omit variables from arithmetic
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   120
expressions. Our parser will take this grammar and given an input
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   121
program produce an abstract syntax tree. For example we obtain for
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   122
the expression $1 + ((2 * 3) + (4 - 3))$ the following tree.
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   123
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   124
\begin{center}
601
208b0f67a3d0 updated
Christian Urban <urbanc@in.tum.de>
parents: 600
diff changeset
   125
\begin{tikzpicture}
208b0f67a3d0 updated
Christian Urban <urbanc@in.tum.de>
parents: 600
diff changeset
   126
\Tree [.$+$ [.$1$ ] [.$+$ [.$*$ $2$ $3$ ] [.$-$ $4$ $3$ ]]]
208b0f67a3d0 updated
Christian Urban <urbanc@in.tum.de>
parents: 600
diff changeset
   127
\end{tikzpicture}
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   128
\end{center}
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   129
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   130
\noindent To generate JVM code for this expression, we need to traverse
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   131
this tree in \emph{post-order} fashion and emit code for each
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   132
node---this traversal in \emph{post-order} fashion will produce code for
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   133
a stack-machine (which is what the JVM is). Doing so for the tree above
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   134
generates the instructions
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   135
668
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   136
\begin{lstlisting}[language=JVMIS,numbers=none]
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   137
ldc 1
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   138
ldc 2
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   139
ldc 3
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   140
imul
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   141
ldc 4
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   142
ldc 3
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   143
isub
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   144
iadd
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   145
iadd
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   146
\end{lstlisting}
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   147
668
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   148
\noindent If we ``run'' these instructions, the result $8$ will be on
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   149
top of the stack (I leave this to you to verify; the meaning of each
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   150
instruction should be clear). The result being on the top of the stack
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   151
will be an important convention we always observe in our compiler. Note,
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   152
that a different bracketing of the expression, for example $(1 + (2 *
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   153
3)) + (4 - 3)$, produces a different abstract syntax tree and thus also
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   154
a different list of instructions.
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   155
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   156
Generating code in this post-order-traversal fashion is rather easy to
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   157
implement: it can be done with the following recursive
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   158
\textit{compile}-function, which takes the abstract syntax tree as an
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   159
argument:
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   160
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   161
\begin{center}
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   162
\begin{tabular}{lcl}
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   163
$\textit{compile}(n)$ & $\dn$ & $\instr{ldc}\; n$\\
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   164
$\textit{compile}(a_1 + a_2)$ & $\dn$ &
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   165
$\textit{compile}(a_1) \;@\;\textit{compile}(a_2)\;@\; \instr{iadd}$\\
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   166
$\textit{compile}(a_1 - a_2)$ & $\dn$ &
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   167
$\textit{compile}(a_1) \;@\; \textit{compile}(a_2)\;@\; \instr{isub}$\\
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   168
$\textit{compile}(a_1 * a_2)$ & $\dn$ &
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   169
$\textit{compile}(a_1) \;@\; \textit{compile}(a_2)\;@\; \instr{imul}$\\
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   170
$\textit{compile}(a_1 \backslash a_2)$ & $\dn$ &
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   171
$\textit{compile}(a_1) \;@\; \textit{compile}(a_2)\;@\; \instr{idiv}$\\
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   172
\end{tabular}
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   173
\end{center}
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   174
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   175
\noindent
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   176
This is all fine, but our arithmetic expressions can contain variables
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   177
and we have not considered them yet. To fix this we will represent our
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   178
variables as \emph{local variables} of the JVM. Essentially, local
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   179
variables are an array or pointers to memory cells, containing in our
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   180
case only integers. Looking up a variable can be done with the
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   181
instruction
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   182
668
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   183
\begin{lstlisting}[language=JVMIS,mathescape,numbers=none]
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   184
iload $index$
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   185
\end{lstlisting}
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   186
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   187
\noindent
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   188
which places the content of the local variable $index$ onto
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   189
the stack. Storing the top of the stack into a local variable
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   190
can be done by the instruction
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   191
668
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   192
\begin{lstlisting}[language=JVMIS,mathescape,numbers=none]
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   193
istore $index$
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   194
\end{lstlisting}
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   195
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   196
\noindent Note that this also pops off the top of the stack. One problem
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   197
we have to overcome, however, is that local variables are addressed, not
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   198
by identifiers (like \texttt{x}, \texttt{foo} and so on), but by numbers
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   199
(starting from $0$). Therefore our compiler needs to maintain a kind of
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   200
environment where variables are associated to numbers. This association
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   201
needs to be unique: if we muddle up the numbers, then we essentially
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   202
confuse variables and the consequence will usually be an erroneous
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   203
result. Our extended \textit{compile}-function for arithmetic
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   204
expressions will therefore take two arguments: the abstract syntax tree
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   205
and an environment, $E$, that maps identifiers to index-numbers.
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   206
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   207
\begin{center}
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   208
\begin{tabular}{lcl}
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   209
$\textit{compile}(n, E)$ & $\dn$ & $\instr{ldc}\;n$\\
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   210
$\textit{compile}(a_1 + a_2, E)$ & $\dn$ &
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   211
$\textit{compile}(a_1, E) \;@\;\textit{compile}(a_2, E)\;@\; \instr{iadd}$\\
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   212
$\textit{compile}(a_1 - a_2, E)$ & $\dn$ &
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   213
$\textit{compile}(a_1, E) \;@\; \textit{compile}(a_2, E)\;@\; \instr{isub}$\\
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   214
$\textit{compile}(a_1 * a_2, E)$ & $\dn$ &
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   215
$\textit{compile}(a_1, E) \;@\; \textit{compile}(a_2, E)\;@\; \instr{imul}$\\
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   216
$\textit{compile}(a_1 \backslash a_2, E)$ & $\dn$ &
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   217
$\textit{compile}(a_1, E) \;@\; \textit{compile}(a_2, E)\;@\; \instr{idiv}$\\
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   218
$\textit{compile}(x, E)$ & $\dn$ & $\instr{iload}\;E(x)$\\
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   219
\end{tabular}
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   220
\end{center}
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   221
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   222
\noindent In the last line we generate the code for variables where
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   223
$E(x)$ stands for looking up the environment to which index the variable
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   224
$x$ maps to. This is similar to the interpreter we saw earlier in the
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   225
module, which also needs an environment: the difference is that the
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   226
interpreter maintains a mapping from variables to current values (what
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   227
is the currently the value of a variable?), while compilers need a
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   228
mapping from variables to memory locations (where can I find the current
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   229
value for the variable in memory?).
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   230
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   231
There is a similar \textit{compile}-function for boolean
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   232
expressions, but it includes a ``trick'' to do with
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   233
\pcode{if}- and \pcode{while}-statements. To explain the issue
376
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   234
let us first describe the compilation of statements of the
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   235
WHILE-language. The clause for \pcode{skip} is trivial, since
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   236
we do not have to generate any instruction
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   237
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   238
\begin{center}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   239
\begin{tabular}{lcl}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   240
$\textit{compile}(\pcode{skip}, E)$ & $\dn$ & $([], E)$\\
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   241
\end{tabular}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   242
\end{center}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   243
668
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   244
\noindent whereby $[]$ is the empty list of instructions. Note that
376
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   245
the \textit{compile}-function for statements returns a pair, a
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   246
list of instructions (in this case the empty list) and an
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   247
environment for variables. The reason for the environment is
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   248
that assignments in the WHILE-language might change the
376
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   249
environment---clearly if a variable is used for the first
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   250
time, we need to allocate a new index and if it has been used
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   251
before, then we need to be able to retrieve the associated index.
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   252
This is reflected in the clause for compiling assignments, say
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   253
$x := a$:
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   254
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   255
\begin{center}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   256
\begin{tabular}{lcl}
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   257
$\textit{compile}(x := a, E)$ & $\dn$ &
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   258
$(\textit{compile}(a, E) \;@\;\instr{istore}\;index, E')$
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   259
\end{tabular}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   260
\end{center}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   261
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   262
\noindent We first generate code for the right-hand side of the
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   263
assignment (that is the arithmetic expression $a$) and then add an
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   264
\instr{istore}-instruction at the end. By convention running the code
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   265
for the arithmetic expression $a$ will leave the result on top of the
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   266
stack. After that the \instr{istore}-instruction, the result will be
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   267
stored in the index corresponding to the variable $x$. If the variable
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   268
$x$ has been used before in the program, we just need to look up what
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   269
the index is and return the environment unchanged (that is in this case
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   270
$E' = E$). However, if this is the first encounter of the variable $x$
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   271
in the program, then we have to augment the environment and assign $x$
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   272
with the largest index in $E$ plus one (that is $E' = E(x \mapsto
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   273
largest\_index + 1)$). To sum up, for the assignment $x := x + 1$ we
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   274
generate the following code snippet
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   275
668
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   276
\begin{lstlisting}[language=JVMIS,mathescape,numbers=none]
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   277
iload $n_x$
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   278
ldc 1
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   279
iadd
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   280
istore $n_x$
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   281
\end{lstlisting}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   282
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   283
\noindent
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   284
where $n_x$ is the index (or pointer to the memory) for the variable
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   285
$x$. The Scala code for looking-up the index for the variable is as follow:
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   286
668
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   287
\begin{center}
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   288
\begin{tabular}{lcl}
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   289
$index \;=\; E\textit{.getOrElse}(x, |E|)$
668
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   290
\end{tabular}
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   291
\end{center}
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   292
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   293
\noindent
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   294
This implements the idea that in case the environment $E$ contains an
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   295
index for $x$, we return it. Otherwise we ``create'' a new index by
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   296
returning the size $|E|$ of the environment (that will be an index that
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   297
is guaranteed not to be used yet). In all this we take advantage of the
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   298
JVM which provides us with a potentially limitless supply of places
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   299
where we can store values of variables.
668
9ce78065f68d updated
Christian Urban <urbanc@in.tum.de>
parents: 601
diff changeset
   300
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   301
A bit more complicated is the generation of code for
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   302
\pcode{if}-statements, say
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   303
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   304
\begin{lstlisting}[mathescape,language={WHILE},numbers=none]
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   305
if $b$ then $cs_1$ else $cs_2$
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   306
\end{lstlisting}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   307
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   308
\noindent where $b$ is a boolean expression and where both $cs_{1/2}$
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   309
are the statements for each of the \pcode{if}-branches. Let us assume we
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   310
already generated code for $b$ and and the two if-branches $cs_{1/2}$.
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   311
Then in the true-case the control-flow of the program needs to behave as
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   312
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   313
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   314
\begin{center}
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   315
\begin{tikzpicture}[node distance=2mm and 4mm,line cap=round,
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   316
 block/.style={rectangle, minimum size=1cm, draw=black, line width=1mm,
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   317
               top color=white,bottom color=black!20},
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   318
 point/.style={rectangle, inner sep=0mm, minimum size=0mm, fill=red},
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   319
 skip loop/.style={black, line width=1mm, to path={-- ++(0,-10mm) -| (\tikztotarget)}}]
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   320
\node (A1) [point] {};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   321
\node (b) [block, right=of A1] {code of $b$};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   322
\node (A2) [point, right=of b] {};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   323
\node (cs1) [block, right=of A2] {code of $cs_1$};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   324
\node (A3) [point, right=of cs1] {};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   325
\node (cs2) [block, right=of A3] {code of $cs_2$};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   326
\node (A4) [point, right=of cs2] {};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   327
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   328
\draw (A1) edge [->, black, line width=1mm] (b);
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   329
\draw (b) edge [->, black, line width=1mm] (cs1);
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   330
\draw (cs1) edge [->, black, line width=1mm,shorten >= -0.5mm] (A3);
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   331
\draw (A3) edge [->, black, skip loop] (A4);
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   332
\node [below=of cs2] {\raisebox{-5mm}{\small{}jump}};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   333
\end{tikzpicture}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   334
\end{center}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   335
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   336
\noindent where we start with running the code for $b$; since
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   337
we are in the true case we continue with running the code for
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   338
$cs_1$. After this however, we must not run the code for
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   339
$cs_2$, but always jump to after the last instruction of $cs_2$
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   340
(the code for the \pcode{else}-branch). Note that this jump is
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   341
unconditional, meaning we always have to jump to the end of
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   342
$cs_2$. The corresponding instruction of the JVM is
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   343
\instr{goto}. In case $b$ turns out to be false we need the
943
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   344
control-flow to be:
370
a65767fe5d71 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 369
diff changeset
   345
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   346
\begin{center}
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   347
\begin{tikzpicture}[node distance=2mm and 4mm,line cap=round,
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   348
 block/.style={rectangle, minimum size=1cm, draw=black, line width=1mm,
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   349
               top color=white,bottom color=black!20},
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   350
 point/.style={rectangle, inner sep=0mm, minimum size=0mm, fill=red},
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   351
 skip loop/.style={black, line width=1mm, to path={-- ++(0,-10mm) -| (\tikztotarget)}}]
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   352
\node (A1) [point] {};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   353
\node (b) [block, right=of A1] {code of $b$};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   354
\node (A2) [point, right=of b] {};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   355
\node (cs1) [block, right=of A2] {code of $cs_1$};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   356
\node (A3) [point, right=of cs1] {};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   357
\node (cs2) [block, right=of A3] {code of $cs_2$};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   358
\node (A4) [point, right=of cs2] {};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   359
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   360
\draw (A1) edge [->, black, line width=1mm] (b);
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   361
\draw (b) edge [->, black, line width=1mm,shorten >= -0.5mm] (A2);
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   362
\draw (A2) edge [skip loop] (A3);
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   363
\draw (A3) edge [->, black, line width=1mm] (cs2);
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   364
\draw (cs2) edge [->,black, line width=1mm] (A4);
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   365
\node [below=of cs1] {\raisebox{-5mm}{\small{}conditional jump}};
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   366
\end{tikzpicture}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   367
\end{center}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   368
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   369
\noindent where we now need a conditional jump (if the
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   370
if-condition is false) from the end of the code for the
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   371
boolean to the beginning of the instructions $cs_2$. Once we
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   372
are finished with running $cs_2$ we can continue with whatever
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   373
code comes after the if-statement.
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   374
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   375
The \instr{goto} and the conditional jumps need addresses to
376
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   376
where the jump should go. Since we are generating assembly
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   377
code for the JVM, we do not actually have to give (numeric)
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   378
addresses, but can just attach (symbolic) labels to our code.
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   379
These labels specify a target for a jump---essentially they mark
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   380
a location in our code. Therefore the labels
376
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   381
need to be unique, as otherwise it would be ambiguous where a
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   382
jump should go to. A label, say \pcode{L}, is attached to assembly
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   383
code like
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   384
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   385
\begin{lstlisting}[mathescape,numbers=none]
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   386
  $\textit{instr\_1}$
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   387
L:
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   388
  $\textit{instr\_2}$
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   389
  $\textit{instr\_3}$
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   390
    $\vdots$
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   391
\end{lstlisting}
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   392
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   393
\noindent where the label needs to be followed by a colon. The task of
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   394
the assembler (in our case Jasmin or Krakatau) is to resolve the labels
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   395
to actual (numeric) addresses, for example jump 10 instructions forward,
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   396
or 20 instructions backwards, or jump to this particular address.
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   397
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   398
Recall the ``trick'' with compiling boolean expressions: the
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   399
\textit{compile}-function for boolean expressions takes three
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   400
arguments: an abstract syntax tree, an environment for
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   401
variable indices and also the label, which I called $lab$, to where an conditional
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   402
jump needs to go. The clause for the expression $a_1 = a_2$,
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   403
for example, is as follows:
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   404
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   405
\begin{center}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   406
\begin{tabular}{lcl}
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   407
$\textit{compile}(a_1 = a_2, E, lab)$ & $\dn$\\
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   408
\multicolumn{3}{l}{$\qquad\textit{compile}(a_1, E) \;@\;\textit{compile}(a_2, E)\;@\; \instr{if_icmpne}\;lab$}
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   409
\end{tabular}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   410
\end{center}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   411
376
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   412
\noindent where we are first generating code for the
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   413
subexpressions $a_1$ and $a_2$. This will mean after running
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   414
the corresponding code there will be two integers on top of
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   415
the stack. If they are equal, we do not have to do anything
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   416
(except for popping them off from the stack) and just continue
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   417
with the next instructions (see control-flow of ifs above).
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   418
However if they are \emph{not} equal, then we need to
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   419
(conditionally) jump to the label $lab$. This can be done with
af65ffff9cdd updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 375
diff changeset
   420
the instruction
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   421
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   422
\begin{lstlisting}[mathescape,numbers=none,language=JVMIS]
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   423
if_icmpne $lab$
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   424
\end{lstlisting}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   425
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   426
To sum up, the third argument in the compile function for booleans
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   427
specifies where to jump, in case the condition is \emph{not} true. I
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   428
leave it to you to extend the \textit{compile}-function for the other
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   429
boolean expressions. Note that we need to jump whenever the boolean is
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   430
\emph{not} true, which means we have to ``negate'' the jump
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   431
condition---equals becomes not-equal, less becomes greater-or-equal.
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   432
Other jump instructions for boolean operators are
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   433
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   434
\begin{center}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   435
\begin{tabular}{l@{\hspace{10mm}}c@{\hspace{10mm}}l}
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   436
$\not=$ & $\Rightarrow$ & \instr{if_icmpeq}\\
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   437
$<$ & $\Rightarrow$ & \instr{if_icmpge}\\
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   438
$\le$ & $\Rightarrow$ & \instr{if_icmpgt}\\
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   439
\end{tabular}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   440
\end{center}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   441
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   442
\noindent and so on.   If you do not like this design (it can be the
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   443
source of some nasty, hard-to-detect errors), you can also change the
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   444
layout of the code and first give the code for the else-branch and then
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   445
for the if-branch. However in the case of while-loops this
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   446
``upside-down-inside-out'' way of generating code still seems the most
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   447
convenient.
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   448
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   449
We are now ready to give the compile function for
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   450
if-statements---remember this function returns for statements a
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   451
pair consisting of the code and an environment:
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   452
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   453
\begin{center}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   454
\begin{tabular}{lcl}
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   455
$\textit{compile}(\pcode{if}\;b\;\pcode{then}\; cs_1\;\pcode{else}\; cs_2, E)$ & $\dn$\\
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   456
\multicolumn{3}{l}{$\qquad L_\textit{ifelse}\;$ (fresh label)}\\
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   457
\multicolumn{3}{l}{$\qquad L_\textit{ifend}\;$ (fresh label)}\\
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   458
\multicolumn{3}{l}{$\qquad (is_1, E') = \textit{compile}(cs_1, E)$}\\
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   459
\multicolumn{3}{l}{$\qquad (is_2, E'') = \textit{compile}(cs_2, E')$}\\
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   460
\multicolumn{3}{l}{$\qquad(\textit{compile}(b, E, L_\textit{ifelse})$}\\
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   461
\multicolumn{3}{l}{$\qquad\phantom{(}@\;is_1$}\\
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   462
\multicolumn{3}{l}{$\qquad\phantom{(}@\; \pcode{goto}\;L_\textit{ifend}$}\\
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   463
\multicolumn{3}{l}{$\qquad\phantom{(}@\;L_\textit{ifelse}:$}\\
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   464
\multicolumn{3}{l}{$\qquad\phantom{(}@\;is_2$}\\
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   465
\multicolumn{3}{l}{$\qquad\phantom{(}@\;L_\textit{ifend}:, E'')$}\\
372
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   466
\end{tabular}
d6af4b1239de updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 370
diff changeset
   467
\end{center}
327
9470cd124667 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
   468
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   469
\noindent In the first two lines we generate two fresh labels
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   470
for the jump addresses (just before the else-branch and just
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   471
after). In the next two lines we generate the instructions for
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   472
the two branches, $is_1$ and $is_2$. The final code will
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   473
be first the code for $b$ (including the label
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   474
just-before-the-else-branch), then the \pcode{goto} for after
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   475
the else-branch, the label $L_\textit{--ifelse}$, followed by
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   476
the instructions for the else-branch, followed by the
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   477
after-the-else-branch label. Consider for example the
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   478
if-statement:
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   479
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   480
\begin{lstlisting}[mathescape,numbers=none,language=While]
943
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   481
if 1 == 1 then x := 2 else y := 3
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   482
\end{lstlisting}
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   483
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   484
\noindent
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   485
The generated code is as follows:
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   486
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   487
\begin{lstlisting}[language=JVMIS,mathescape,numbers=left]
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   488
   ldc 1
377
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   489
   ldc 1
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   490
   if_icmpne L_ifelse $\quad\tikz[remember picture] \node (C) {\mbox{}};$
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   491
   ldc 2
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   492
   istore 0
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   493
   goto L_ifend $\quad\tikz[remember picture] \node (A) {\mbox{}};$
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   494
L_ifelse: $\quad\tikz[remember picture] \node[] (D) {\mbox{}};$
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   495
   ldc 3
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   496
   istore 1
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   497
L_ifend: $\quad\tikz[remember picture] \node[] (B) {\mbox{}};$
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   498
\end{lstlisting}
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   499
601
208b0f67a3d0 updated
Christian Urban <urbanc@in.tum.de>
parents: 600
diff changeset
   500
\begin{tikzpicture}[remember picture,overlay]
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   501
  \draw[->,very thick] (A) edge [->,to path={-- ++(10mm,0mm)
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   502
           -- ++(0mm,-17.3mm) |- (\tikztotarget)},line width=1mm] (B.east);
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   503
  \draw[->,very thick] (C) edge [->,to path={-- ++(10mm,0mm)
601
208b0f67a3d0 updated
Christian Urban <urbanc@in.tum.de>
parents: 600
diff changeset
   504
           -- ++(0mm,-17.3mm) |- (\tikztotarget)},line width=1mm] (D.east);
208b0f67a3d0 updated
Christian Urban <urbanc@in.tum.de>
parents: 600
diff changeset
   505
\end{tikzpicture}
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   506
377
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   507
\noindent The first three lines correspond to the the boolean
943
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   508
expression \texttt{1 == 1}. The jump for when this boolean expression
377
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   509
is false is in Line~3. Lines 4-6 corresponds to the if-branch;
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   510
the else-branch is in Lines 8 and 9.
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   511
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   512
Note carefully how the environment $E$ is threaded through the recursive
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   513
calls of \textit{compile}. The function receives an environment $E$, but
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   514
it might extend it when compiling the if-branch, yielding $E'$. This
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   515
happens for example in the if-statement above whenever the variable
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   516
\code{x} has not been used before. Similarly with the environment $E''$
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   517
for the second call to \textit{compile}. $E''$ is also the environment
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   518
that needs to be returned as part of the answer.
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   519
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   520
The compilation of the while-loops of the form
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   521
\pcode{while} $b$ \pcode{do} $cs$ is very similar. In case
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   522
the condition is true and we need to do another iteration,
377
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   523
and the control-flow needs to be as follows
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   524
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   525
\begin{center}
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   526
\begin{tikzpicture}[node distance=2mm and 4mm,line cap=round,
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   527
 block/.style={rectangle, minimum size=1cm, draw=black, line width=1mm,
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   528
               top color=white,bottom color=black!20},
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   529
 point/.style={rectangle, inner sep=0mm, minimum size=0mm, fill=red},
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   530
 skip loop/.style={black, line width=1mm, to path={-- ++(0,-10mm) -| (\tikztotarget)}}]
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   531
\node (A0) [point, left=of A1] {};
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   532
\node (A1) [point] {};
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   533
\node (b) [block, right=of A1] {code of $b$};
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   534
\node (A2) [point, right=of b] {};
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   535
\node (cs1) [block, right=of A2] {code of $cs$};
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   536
\node (A3) [point, right=of cs1] {};
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   537
\node (A4) [point, right=of A3] {};
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   538
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   539
\draw (A0) edge [->, black, line width=1mm] (b);
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   540
\draw (b) edge [->, black, line width=1mm] (cs1);
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   541
\draw (cs1) edge [->, black, line width=1mm,shorten >= -0.5mm] (A3);
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   542
\draw (A3) edge [->,skip loop] (A1);
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   543
\end{tikzpicture}
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   544
\end{center}
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   545
377
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   546
\noindent Whereas if the condition is \emph{not} true, we
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   547
need to jump out of the loop, which gives the following
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   548
control flow.
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   549
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   550
\begin{center}
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   551
\begin{tikzpicture}[node distance=2mm and 4mm,line cap=round,
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   552
 block/.style={rectangle, minimum size=1cm, draw=black, line width=1mm,
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   553
               top color=white,bottom color=black!20},
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   554
 point/.style={rectangle, inner sep=0mm, minimum size=0mm, fill=red},
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   555
 skip loop/.style={black, line width=1mm, to path={-- ++(0,-10mm) -| (\tikztotarget)}}]
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   556
\node (A0) [point, left=of A1] {};
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   557
\node (A1) [point] {};
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   558
\node (b) [block, right=of A1] {code of $b$};
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   559
\node (A2) [point, right=of b] {};
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   560
\node (cs1) [block, right=of A2] {code of $cs$};
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   561
\node (A3) [point, right=of cs1] {};
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   562
\node (A4) [point, right=of A3] {};
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   563
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   564
\draw (A0) edge [->, black, line width=1mm] (b);
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   565
\draw (b) edge [->, black, line width=1mm,shorten >= -0.5mm] (A2);
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   566
\draw (A2) edge [skip loop] (A3);
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   567
\draw (A3) edge [->, black, line width=1mm] (A4);
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   568
\end{tikzpicture}
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   569
\end{center}
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   570
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   571
\noindent Again we can use the \textit{compile}-function for
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   572
boolean expressions to insert the appropriate jump to the
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   573
end of the loop (label $L_\textit{--wend}$ below).
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   574
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   575
\begin{center}
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   576
\begin{tabular}{lcl}
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   577
$\textit{compile}(\pcode{while}\; b\; \pcode{do} \;cs, E)$ & $\dn$\\
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   578
\multicolumn{3}{l}{$\qquad L_\textit{--wbegin}\;$ (fresh label)}\\
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   579
\multicolumn{3}{l}{$\qquad L_\textit{--wend}\;$ (fresh label)}\\
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   580
\multicolumn{3}{l}{$\qquad (is, E') = \textit{compile}(cs_1, E)$}\\
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   581
\multicolumn{3}{l}{$\qquad(L_\textit{--wbegin}$}\\
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   582
\multicolumn{3}{l}{$\qquad\phantom{(}@\;\textit{compile}(b, E, L_\textit{--wend})$}\\
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   583
\multicolumn{3}{l}{$\qquad\phantom{(}@\;is$}\\
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   584
\multicolumn{3}{l}{$\qquad\phantom{(}@\; \text{goto}\;L_\textit{--wbegin}$}\\
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   585
\multicolumn{3}{l}{$\qquad\phantom{(}@\;L_\textit{--wend}, E')$}\\
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   586
\end{tabular}
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   587
\end{center}
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   588
377
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   589
\noindent I let you go through how this clause works. As an example
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   590
you can consider the while-loop
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   591
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   592
\begin{lstlisting}[mathescape,numbers=none,language=While]
377
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   593
while x <= 10 do x := x + 1
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   594
\end{lstlisting}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   595
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   596
\noindent yielding the following code
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   597
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   598
\begin{lstlisting}[language=JVMIS2,mathescape,numbers=left]
377
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   599
L_wbegin: $\quad\tikz[remember picture] \node[] (LB) {\mbox{}};$
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   600
   iload 0
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   601
   ldc 10
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   602
   if_icmpgt L_wend $\quad\tikz[remember picture] \node (LC) {\mbox{}};$
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   603
   iload 0
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   604
   ldc 1
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   605
   iadd
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   606
   istore 0
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   607
   goto L_wbegin $\quad\tikz[remember picture] \node (LA) {\mbox{}};$
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   608
L_wend: $\quad\tikz[remember picture] \node[] (LD) {\mbox{}};$
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   609
\end{lstlisting}
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   610
601
208b0f67a3d0 updated
Christian Urban <urbanc@in.tum.de>
parents: 600
diff changeset
   611
\begin{tikzpicture}[remember picture,overlay]
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   612
  \draw[->,very thick] (LA) edge [->,to path={-- ++(10mm,0mm)
377
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   613
           -- ++(0mm,17.3mm) |- (\tikztotarget)},line width=1mm] (LB.east);
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   614
  \draw[->,very thick] (LC) edge [->,to path={-- ++(10mm,0mm)
601
208b0f67a3d0 updated
Christian Urban <urbanc@in.tum.de>
parents: 600
diff changeset
   615
           -- ++(0mm,-17.3mm) |- (\tikztotarget)},line width=1mm] (LD.east);
208b0f67a3d0 updated
Christian Urban <urbanc@in.tum.de>
parents: 600
diff changeset
   616
\end{tikzpicture}
377
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   617
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   618
\noindent
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   619
As said, I leave it to you to check that the code really implements
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   620
the usual controlflow of while-loops.
377
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   621
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   622
Next we need to consider the WHILE-statement \pcode{write x}, which can
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   623
be used to print out the content of a variable. For this we shall use a
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   624
Java library function. In order to avoid having to generate a lot of
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   625
code for each \pcode{write}-command, we use a separate helper-method and
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   626
just call this method with an appropriate argument (which of course
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   627
needs to be placed onto the stack). The code of the helper-method is as
377
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   628
follows.
374
0e25fb72d339 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 373
diff changeset
   629
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   630
\begin{lstlisting}[language=JVMIS,numbers=left,basicstyle=\ttfamily\small]
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   631
.method public static write(I)V
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   632
    .limit locals 1
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   633
    .limit stack 2
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   634
    getstatic java/lang/System/out Ljava/io/PrintStream;
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   635
    iload 0
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   636
    invokevirtual java/io/PrintStream/println(I)V
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   637
    return
373
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   638
.end method
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   639
\end{lstlisting}
b018234c9126 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 372
diff changeset
   640
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   641
\noindent The first line marks the beginning of the method, called
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   642
\pcode{write}. It takes a single integer argument indicated by the
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   643
\pcode{(I)} and returns no result, indicated by the \pcode{V} (for
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   644
void). Since the method has only one argument, we only need a single
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   645
local variable (Line~2) and a stack with two cells will be sufficient
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   646
(Line 3). Line 4 instructs the JVM to get the value of the member
943
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   647
\pcode{out} from the class \pcode{java/lang/System}. It expects the
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   648
value to be of type \pcode{java/io/PrintStream}. A reference to this
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   649
value will be placed on the stack.\footnote{Note the syntax
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   650
  \texttt{Ljava\ldots{};} for the \texttt{PrintStream} type is not an
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   651
  typo. Somehow the designers of Jasmin decided that this syntax is
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   652
  pleasing to the eye. So if you wanted to have strings in your Jasmin
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   653
  code, you would need to write \texttt{Ljava/lang/String;}\;. If you
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   654
  want arrays of one dimension, then use \texttt{[\ldots}; two
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   655
  dimensions, use \texttt{[[\ldots} and so on. Looks all very ugly to
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   656
  my eyes.} Line~5 copies the integer we want to print out onto the
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   657
stack. In the line after that we call the method \pcode{println} (from
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   658
the class \pcode{java/io/PrintStream}). We want to print out an
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   659
integer and do not expect anything back (that is why the type
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   660
annotation is \pcode{(I)V}).  The \pcode{return}-instruction in the
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   661
next line changes the control-flow back to the place from where
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   662
\pcode{write} was called. This method needs to be part of a header
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   663
that is included in any code we generate. The helper-method
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   664
\pcode{write} can be invoked with the two instructions
374
0e25fb72d339 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 373
diff changeset
   665
0e25fb72d339 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 373
diff changeset
   666
\begin{lstlisting}[mathescape,language=JVMIS]
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   667
iload $E(x)$
374
0e25fb72d339 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 373
diff changeset
   668
invokestatic XXX/XXX/write(I)V
0e25fb72d339 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 373
diff changeset
   669
\end{lstlisting}
0e25fb72d339 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 373
diff changeset
   670
0e25fb72d339 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 373
diff changeset
   671
\noindent where we first place the variable to be printed on
377
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   672
top of the stack and then call \pcode{write}. The \pcode{XXX}
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   673
need to be replaced by an appropriate class name (this will be
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 376
diff changeset
   674
explained shortly).
374
0e25fb72d339 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 373
diff changeset
   675
0e25fb72d339 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 373
diff changeset
   676
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   677
By generating code for a WHILE-program, we end up with a list of (JVM
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   678
assembly) instructions. Unfortunately, there is a bit more boilerplate
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   679
code needed before these instructions can be run. Essentially we have to
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   680
enclose them inside a Java \texttt{main}-method. The corresponding code
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   681
is shown in Figure~\ref{boiler}. This boilerplate code is very specific
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   682
to the JVM. If we target any other virtual machine or a machine
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   683
language, then we would need to change this code.  Interesting are the
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   684
Lines 5 and 6 where we hardwire that the stack of our programs will
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   685
never be larger than 200 and that the maximum number of variables is
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   686
also 200. This seem to be conservative default values that allow is to
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   687
run some simple WHILE-programs. In a real compiler, we would of course
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   688
need to work harder and find out appropriate values for the stack and
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   689
local variables.
374
0e25fb72d339 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 373
diff changeset
   690
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   691
\begin{figure}[t]
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   692
\begin{framed}
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   693
\begin{lstlisting}[mathescape,language=JVMIS,numbers=left]
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   694
.class public XXX.XXX
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   695
.super java/lang/Object
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   696
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   697
.method public static main([Ljava/lang/String;)V
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   698
    .limit locals 200
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   699
    .limit stack 200
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   700
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   701
      $\textit{\ldots{}here comes the compiled code\ldots}$
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   702
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   703
    return
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   704
.end method
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   705
\end{lstlisting}
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   706
\end{framed}
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   707
\caption{The boilerplate code needed for running generated code. It
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   708
  hardwires limits for stack space and for the number of local
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   709
  variables.\label{boiler}}
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   710
\end{figure}
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   711
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   712
375
bf36664a3196 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 374
diff changeset
   713
To sum up, in Figure~\ref{test} is the complete code generated
601
208b0f67a3d0 updated
Christian Urban <urbanc@in.tum.de>
parents: 600
diff changeset
   714
for the slightly nonsensical program
375
bf36664a3196 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 374
diff changeset
   715
bf36664a3196 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 374
diff changeset
   716
\begin{lstlisting}[mathescape,language=While]
bf36664a3196 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 374
diff changeset
   717
x := 1 + 2;
bf36664a3196 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 374
diff changeset
   718
write x
bf36664a3196 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 374
diff changeset
   719
\end{lstlisting}
bf36664a3196 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 374
diff changeset
   720
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   721
\noindent I let you read the code and make sure the code behaves as
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   722
expected. Having this code at our disposal, we need the assembler to
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   723
translate the generated code into JVM bytecode (a class file). This
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   724
bytecode is then understood by the JVM and can be run by just invoking
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   725
the \pcode{java}-program. Again I let you do the work.
375
bf36664a3196 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 374
diff changeset
   726
bf36664a3196 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 374
diff changeset
   727
bf36664a3196 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 374
diff changeset
   728
\begin{figure}[p]
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   729
\begin{framed}
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
   730
\lstinputlisting[language=JVMIS,mathescape,basicstyle=\ttfamily\small]{../progs/test-small.j}
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   731
\begin{tikzpicture}[remember picture,overlay]
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   732
  \draw[|<->|,very thick] (LA.north) -- (LB.south)
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   733
     node[left=-0.5mm,midway] {\footnotesize\texttt{x\,:=\,1\,+\,2}};
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   734
  \draw[|<->|,very thick] (LC.north) -- (LD.south)
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   735
     node[left=-0.5mm,midway] {\footnotesize\texttt{write x}};
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   736
\end{tikzpicture}
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   737
\end{framed}
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   738
\caption{The generated code for the test program \texttt{x := 1 + 2; write
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   739
x}. This code can be processed by a Java assembler producing a
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   740
class-file, which can then be run by the {\tt{}java}-program.\label{test}}
375
bf36664a3196 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 374
diff changeset
   741
\end{figure}
374
0e25fb72d339 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents: 373
diff changeset
   742
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   743
\subsection*{Arrays}
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   744
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   745
Maybe a useful addition to the WHILE-language would be arrays. This
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   746
would allow us to generate more interesting WHILE-programs by
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   747
translating BF*** programs into equivalent WHILE-code. Yeah! Therefore in this
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   748
section let us have a look at how we can support the following three
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   749
constructions
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   750
943
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   751
\begin{itemize}
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   752
\item \lstinline|new(arr[15000])|
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   753
\item \lstinline|x := 3 + arr[3 + y]|
5365ef60707e updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 941
diff changeset
   754
\item \lstinline|arr[42 * n] := ...|
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   755
\end{itemize}
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   756
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   757
\noindent
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   758
The first construct is for creating new arrays. In this instance the
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   759
name of the array is \pcode{arr} and it can hold 15000 integers. We do
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   760
not support ``dynamic'' arrays, that is the size of our arrays will
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   761
always be fixed. The second construct is for referencing an array cell
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   762
inside an arithmetic expression---we need to be able to look up the
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   763
contents of an array at an index determined by an arithmetic expression.
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   764
Similarly in the line below, we need to be able to update the content of
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   765
an array at a calculated index.
691
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   766
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   767
For creating a new array we can generate the following three JVM
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   768
instructions:
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   769
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   770
\begin{lstlisting}[mathescape,language=JVMIS]
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   771
ldc number
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   772
newarray int
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   773
astore loc_var
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   774
\end{lstlisting}
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   775
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   776
\noindent
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   777
First we need to put the size of the array onto the stack. The next
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   778
instruction creates the array. In this case the array contains
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   779
\texttt{int}s. With the last instruction we can store the array as a
691
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   780
local variable (like the ``simple'' variables from the previous
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   781
section). The use of a local variable for each array allows us to have
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   782
multiple arrays in a WHILE-program. For looking up an element in an
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   783
array we can use the following JVM code
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   784
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   785
\begin{lstlisting}[mathescape,language=JVMIS]
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   786
aload loc_var
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   787
$\textit{index\_aexp}$
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   788
iaload
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   789
\end{lstlisting}
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   790
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   791
\noindent
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   792
The first instruction loads the ``pointer'', or local variable, to the
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   793
array onto the stack. Then we have some instructions calculating the
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   794
index where we want to look up the array. The idea is that these
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   795
instructions will leave a concrete number on the top of the stack, which
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   796
will be the index into the array we need. Finally we need to tell the
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   797
JVM to load the corresponding element onto the stack. Updating an array
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   798
at an index with a value is as follows.
691
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   799
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   800
\begin{lstlisting}[mathescape,language=JVMIS]
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   801
aload loc_var
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   802
$\textit{index\_aexp}$
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   803
$\textit{value\_aexp}$
691
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   804
iastore
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   805
\end{lstlisting}
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   806
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   807
\noindent
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   808
Again the first instruction loads the local variable of
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   809
the array onto the stack. Then we have some instructions calculating
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   810
the index where we want to update the array. After that come the
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   811
instructions for with which value we want to update the array. The last
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   812
line contains the instruction for updating the array.
691
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   813
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   814
Next we need to modify our grammar rules for our WHILE-language: it
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   815
seems best to extend the rule for factors in arithmetic expressions with
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   816
a rule for looking up an array.
691
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   817
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   818
\begin{plstx}[rhs style=, margin=3cm]
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   819
: \meta{E} ::= \meta{T} $+$ \meta{E}
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   820
         | \meta{T} $-$ \meta{E}
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   821
         | \meta{T}\\
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   822
: \meta{T} ::= \meta{F} $*$ \meta{T}
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   823
          | \meta{F} $\backslash$ \meta{T}
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   824
          | \meta{F}\\
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   825
: \meta{F} ::= ( \meta{E} )
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   826
          | $\underbrace{\meta{Id}\,[\,\meta{E}\,]}_{new}$
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   827
          | \meta{Id}
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   828
          | \meta{Num}\\
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   829
\end{plstx}
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   830
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   831
\noindent
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   832
There is no problem with left-recursion as the \meta{E} is ``protected''
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   833
by an identifier and the brackets. There are two new rules for statements,
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   834
one for creating an array and one for array assignment:
691
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   835
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   836
\begin{plstx}[rhs style=, margin=2cm, one per line]
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   837
: \meta{Stmt} ::=  \ldots
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   838
              | \texttt{new}(\meta{Id}\,[\,\meta{Num}\,])
691
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   839
              | \meta{Id}\,[\,\meta{E}\,]\,:=\,\meta{E}\\
991849dfbcb1 updated
Christian Urban <urbanc@in.tum.de>
parents: 690
diff changeset
   840
\end{plstx}
690
8d57433c7b5e updated
Christian Urban <urbanc@in.tum.de>
parents: 668
diff changeset
   841
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   842
With this in place we can turn back to the idea of creating
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   843
WHILE-programs by translating BF-programs. This is a relatively easy
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   844
task because BF has only eight instructions (we will actually implement
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   845
seven because we can omit the read-in instruction from BF). What makes
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   846
this translation easy is that BF-loops can be straightforwardly
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   847
represented as while-loops. The Scala code for the translation is as
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   848
follows:
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   849
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   850
\begin{lstlisting}[language=Scala,numbers=left]
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   851
def instr(c: Char) : String = c match {
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   852
  case '>' => "ptr := ptr + 1;"
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   853
  case '<' => "ptr := ptr - 1;"
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   854
  case '+' => "mem[ptr] := mem [ptr] + 1;"
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   855
  case '-' => "mem [ptr] := mem [ptr] - 1;"
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   856
  case '.' => "x := mem [ptr]; write x;"
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   857
  case '['  => "while (mem [ptr] != 0) do {"
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   858
  case ']'  => "skip};"
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   859
  case _ => ""
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   860
}
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   861
\end{lstlisting}
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   862
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   863
\noindent
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   864
The idea behind the translation is that BF-programs operate on an array,
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   865
called here \texttt{mem}. The BF-memory pointer into this array is
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   866
represented as the variable \texttt{ptr}. As usual the BF-instructions
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   867
\code{>} and \code{<} increase, respectively decrease, \texttt{ptr}. The
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   868
instructions \code{+} and \code{-} update a cell in \texttt{mem}. In
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   869
Line 6 we need to first assign a \texttt{mem}-cell to an auxiliary
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   870
variable since we have not changed our write functions in order to cope
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   871
with writing out any array-content directly. Lines 7 and 8 are for
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   872
translating BF-loops. Line 8 is interesting in the sense that we need to
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   873
generate a \code{skip} instruction just before finishing with the
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
   874
closing \code{"\}"}. The reason is that we are rather pedantic about
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   875
semicolons in our WHILE-grammar: the last command cannot have a
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   876
semicolon---adding a \code{skip} works around this snag.
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   877
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   878
Putting this all together and we can generate WHILE-programs with more
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   879
than 15K JVM-instructions; run the compiled JVM code for such
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   880
programs and marvel at the output\ldots\medskip
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   881
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   882
\noindent
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   883
\ldots{}Hooooray, after a few more tweaks we can finally run the
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   884
BF-mandelbrot program on the JVM (after nearly 10 minutes of parsing the
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   885
corresponding WHILE-program; the size of the resulting class file is
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   886
around 32K---not too bad). The generation of the picture completes
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   887
within 20 or so seconds. Try replicating this with an interpreter! The
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   888
good point is that we now have a sufficiently complicated program in our
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   889
WHILE-language in order to do some benchmarking (your task!). Having done
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   890
this, we now face the question about what to do next in our compiler\ldots?
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   891
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   892
\subsection*{Optimisations \& Co}
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   893
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   894
Every compiler that deserves its name has to perform some optimisations
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   895
on the code: if we put in the extra effort of writing a compiler for a
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   896
language, then obviously we want to have our code to run as fast as
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   897
possible. So we should look into this in more detail.
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   898
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   899
There is actually one aspect in our generated code where we can make
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   900
easily efficiency gains. This has to do with some of the quirks of the
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   901
JVM. Whenever we push a constant onto the stack, we used the JVM
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   902
instruction \instr{ldc some_const}. This is a rather generic instruction
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   903
in the sense that it works not just for integers but also for strings,
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   904
objects and so on. What this instruction does is putting the constant
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   905
into a \emph{constant pool} and then uses an index into this constant
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   906
pool. This means \instr{ldc} will be represented by at least two bytes
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   907
in the class file. While this is a sensible strategy for ``large''
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   908
constants like strings, it is a bit of overkill for small integers
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   909
(which many integers will be when compiling a BF-program). To counter
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   910
this ``waste'', the JVM has specific instructions for small integers,
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   911
for example
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   912
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   913
\begin{itemize}
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   914
\item \instr{iconst_0},\ldots, \instr{iconst_5}
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   915
\item \instr{bipush n}
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   916
\end{itemize}
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
   917
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   918
\noindent
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   919
where the \code{n} is \instr{bipush} is between -128 and 128.   By
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   920
having dedicated instructions such as \instr{iconst_0} to
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   921
\instr{iconst_5} (and \instr{iconst_m1}), we can make the generated code
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   922
size smaller as these instructions only require 1 byte (as opposed the
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   923
generic \instr{ldc} which needs 1 byte plus another for the index into
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   924
the constant pool). While in theory the use of such special instructions
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   925
should make the code only smaller, it actually makes the code also run
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   926
faster. Probably because the JVM has to process less code and uses a
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   927
specific instruction for the underlying CPU.  The story with
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   928
\instr{bipush} is slightly different, because it also uses two
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   929
bytes---so it does not necessarily result in a reduction of code size.
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   930
Instead, it probably uses a specific instruction in the underlying CPU
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   931
that makes the JVM code run faster.\footnote{This is all ``probable''
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   932
because I have not read the 700 pages of JVM documentation by Oracle and
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   933
also have no clue how the JVM is implemented.} This means when
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   934
generating code for pushing constants onto the stack, we can use the
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   935
following Scala helper-function
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   936
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   937
\begin{lstlisting}[language=Scala]
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   938
def compile_num(i: Int) =
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   939
  if (0 <= i && i <= 5) i"iconst_$i" else
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   940
  if (-128 <= i && i <= 127) i"bipush $i"
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   941
  else i"ldc $i"
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   942
\end{lstlisting}
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   943
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   944
\noindent
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   945
It generates the more efficient instructions when pushing a small integer
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   946
constant onto the stack. The default is \instr{ldc} for any other constants.
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   947
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   948
The JVM also has such special instructions for
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   949
loading and storing the first three local variables. The assumption is
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   950
that most operations and arguments in a method will only use very few
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   951
local variables. So we can use the following instructions:
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   952
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   953
\begin{itemize}
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   954
\item \instr{iload_0},\ldots, \instr{iload_3}
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   955
\item \instr{istore_0},\ldots, \instr{istore_3}
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   956
\item \instr{aload_0},\ldots, \instr{aload_3}
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   957
\item \instr{astore_0},\ldots, \instr{astore_3}
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   958
\end{itemize}
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   959
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   960
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   961
\noindent Having implemented these optimisations, the code size of the
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   962
BF-Mandelbrot program reduces and also the class-file runs faster (the
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   963
parsing part is still very slow). According to my very rough
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   964
experiments:
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   965
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   966
\begin{center}
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   967
\begin{tabular}{lll}
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   968
& class-size & runtime\\\hline
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   969
Mandelbrot:\\
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   970
\hspace{5mm}unoptimised: & 33296 & 21 secs\\
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   971
\hspace{5mm}optimised:   & 21787 & 16 secs\\
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   972
\end{tabular}
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   973
\end{center}
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   974
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   975
\noindent
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   976
Quite good! Such optimisations are called \emph{peephole optimisations},
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   977
because they involve changing one or a small set of instructions into an
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   978
equivalent set that has better performance.
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   979
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   980
If you look careful at our generated code you will quickly find another
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   981
source of inefficiency in programs like
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   982
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   983
\begin{lstlisting}[mathescape,language=While]
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   984
x := ...;
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   985
write x
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   986
\end{lstlisting}
710
183663740fb7 updated
Christian Urban <urbanc@in.tum.de>
parents: 709
diff changeset
   987
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   988
\noindent
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   989
where our code first calculates the new result the for \texttt{x} on the
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   990
stack, then pops off the result into a local variable, and after that
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
   991
loads the local variable back onto the stack for writing out a number.
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   992
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   993
\begin{lstlisting}[mathescape,language=JVMIS]
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
   994
...
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   995
istore 0
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   996
iload 0
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   997
...
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   998
\end{lstlisting}
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
   999
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1000
\noindent
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
  1001
If we can detect such situations, then we can leave the value of
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
  1002
\texttt{x} on the stack with for example the much cheaper instruction
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
  1003
\instr{dup}. Now the problem with this optimisation is that it is quite
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
  1004
easy for the snippet above, but what about instances where there is
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
  1005
further WHILE-code in \emph{between} these two statements? Sometimes we
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
  1006
will be able to optimise, sometimes we will not. The compiler needs to
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1007
find out which situation applies. This can quickly become  much more
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
  1008
complicated. So we leave this kind of optimisations here and look at
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
  1009
something more interesting and possibly surprising.
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
  1010
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1011
As you might have seen, the compiler writer has a lot of freedom about
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1012
how to generate code from what the programmer wrote as program. The only
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1013
condition is that generated code should behave as expected by the
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1014
programmer. Then all is fine with the code above\ldots mission
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1015
accomplished! But sometimes the compiler writer is expected to go an
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1016
extra mile, or even miles and change(!) the meaning of a program.
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1017
Suppose we are given the following WHILE-program:
692
8c7ccdebcb89 updated
Christian Urban <urbanc@in.tum.de>
parents: 691
diff changeset
  1018
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
  1019
\begin{lstlisting}[mathescape,language=While]
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
  1020
new(arr[10]);
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
  1021
arr[14] := 3 + arr[13]
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
  1022
\end{lstlisting}
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
  1023
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1024
\noindent
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
  1025
Admittedly this is a contrived program, and probably not meant to be
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
  1026
like this by any sane programmer, but it is supposed to make the
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1027
following point: The program generates an array of size 10, and then
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1028
tries to access the non-existing element at index 13 and even updating
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1029
the element with index 14. Obviously this is baloney. Still, our
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1030
compiler generates code for this program without any questions asked. We
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1031
can even run this code on the JVM\ldots of course the result is an
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1032
exception trace where the JVM yells at us for doing naughty
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1033
things.\footnote{Still this is much better than C, for example, where
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1034
such errors are not prevented and as a result insidious attacks can be
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1035
mounted against such kind C-programs. I assume everyone has heard about
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1036
\emph{Buffer Overflow Attacks}.} Now what should we do in such
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1037
situations? Over- and underflows of indices are notoriously difficult to
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1038
detect statically (at compiletime). So it might seem raising an
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1039
exception at run-time like the JVM is the best compromise.
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
  1040
711
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
  1041
Well, imagine we do not want to rely in our compiler on the JVM for
6f3f3dd01786 updated
Christian Urban <urbanc@in.tum.de>
parents: 710
diff changeset
  1042
producing an annoying, but safe exception trace, rather we want to
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1043
handle such situations ourselves according to what we think should
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1044
happen in such cases. Let us assume we want to handle them in the
708
4980f421b3b0 updated
Christian Urban <urbanc@in.tum.de>
parents: 705
diff changeset
  1045
following way: if the programmer access a field out-of-bounds, we just
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1046
return a default 0, and if a programmer wants to update an out-of-bounds
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1047
field, we want to ``quietly'' ignore this update. One way to achieve
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1048
this would be to rewrite the WHILE-programs and insert the necessary
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1049
if-conditions for safely reading and writing arrays. Another way
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1050
is to modify the code we generate.
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
  1051
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1052
\begin{lstlisting}[mathescape,language=JVMIS2]
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1053
  $\textit{index\_aexp}$
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1054
  aload loc_var
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1055
  dup2
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1056
  arraylength
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1057
  if_icmple L1
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1058
  pop2
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1059
  iconst_0
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1060
  goto L2
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1061
L1:
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1062
  swap
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1063
  iaload
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1064
L2:
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1065
\end{lstlisting}
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
  1066
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1067
\textbf{TBD}
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1068
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1069
 \begin{lstlisting}[mathescape,language=JVMIS2]
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1070
  $\textit{index\_aexp}$
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1071
  aload loc_var
712
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1072
  dup2
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1073
  arraylength
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1074
  if_icmple L1
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1075
  pop2
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1076
  goto L2
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1077
L1:
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1078
  swap
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1079
  $\textit{value\_aexp}$
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1080
  iastore
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1081
L2:
e71eb9ce2373 updated
Christian Urban <urbanc@in.tum.de>
parents: 711
diff changeset
  1082
\end{lstlisting}
709
c112a6cb5e52 updated
Christian Urban <urbanc@in.tum.de>
parents: 708
diff changeset
  1083
714
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1084
\begin{figure}[p]
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1085
\begin{center}
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1086
\begin{tikzpicture}[every text node part/.style={align=left},
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1087
                    stack/.style={rectangle split,rectangle split parts = 5,
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1088
                                  fill=black!20,draw,text width=1.6cm,line width=0.5mm}]
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1089
\node (A)  {};
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1090
\node[stack,right = 80pt] (0) at (A.east) {$\textit{index}$\nodepart{two} \ldots\phantom{l}};
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1091
\node[stack,right = 60pt] (1) at (0.east)
714
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1092
   {array\nodepart{two}
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1093
    $\textit{index}$\nodepart{three} \ldots\phantom{l}};
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1094
\node[stack,below = 40pt] (2) at (1.south)
714
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1095
   {array\nodepart{two}
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1096
    $\textit{index}$ \nodepart{three}
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1097
    array \nodepart{four}
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1098
    $\textit{index}$\nodepart{five} \ldots\phantom{l}};
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1099
\node[stack,left = 90pt] (3) at (2.west)
714
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1100
   {array\_len\nodepart{two}
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1101
    $\textit{index}$ \nodepart{three}
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1102
    array \nodepart{four}
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1103
    $\textit{index}$\nodepart{five} \ldots\phantom{l}};
714
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1104
\node[stack,below right of = 3,node distance = 130pt,rectangle split parts = 3] (4b) at (3.south)
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1105
   {array\nodepart{two}
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1106
    $\textit{index}$\nodepart{three} \ldots\phantom{l}};
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1107
\node[stack,below left of = 3,node distance = 130pt,rectangle split parts = 3] (4a) at (3.south)
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1108
   {array\nodepart{two}
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1109
    $\textit{index}$\nodepart{three} \ldots\phantom{l}};
714
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1110
\node[stack,below of = 4a,node distance = 70pt,rectangle split parts = 3] (5a) at (4a.south)
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1111
   {$\textit{index}$\nodepart{two}
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1112
    array\nodepart{three} \ldots\phantom{l}};
714
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1113
\node[stack,below of = 5a,node distance = 60pt,rectangle split parts = 2] (6a) at (5a.south)
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1114
   {$\textit{array\_elem}$\nodepart{two} \ldots\phantom{l}};
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1115
\node[stack,below of = 4b,node distance = 65pt,rectangle split parts = 2] (5b) at (4b.south)
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1116
   {\ldots\phantom{l}};
714
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1117
\node[stack,below of = 5b,node distance = 60pt,rectangle split parts = 2] (6b) at (5b.south)
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1118
   {0\nodepart{two} \ldots\phantom{l}};
714
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1119
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1120
\draw [|->,line width=2.5mm] (A) -- node [above,pos=0.45] {$\textit{index\_aexp}$} (0);
714
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1121
\draw [->,line width=2.5mm] (0) -- node [above,pos=0.35] {\instr{aload}} (1);
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1122
\draw [->,line width=2.5mm] (1) -- node [right,pos=0.35] {\instr{dup2}} (2);
714
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1123
\draw [->,line width=2.5mm] (2) -- node [above,pos=0.40] {\instr{arraylength}} (3);
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1124
\path[->,draw,line width=2.5mm]
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1125
  let \p1=(3.south), \p2=(4a.north) in (3.south) -- +(0,0.5*\y2-0.5*\y1) node [right,pos=0.50] {\instr{if_icmple}} -| (4a.north);
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1126
\path[->,draw,line width=2.5mm]
714
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1127
  let \p1=(3.south), \p2=(4b.north) in (3.south) -- +(0,0.5*\y2-0.5*\y1) -| (4b.north);
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1128
\draw [->,line width=2.5mm] (4a) -- node [right,pos=0.35] {\instr{swap}} (5a);
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1129
\draw [->,line width=2.5mm] (4b) -- node [right,pos=0.35] {\instr{pop2}} (5b);
714
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1130
\draw [->,line width=2.5mm] (5a) -- node [right,pos=0.35] {\instr{iaload}} (6a);
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1131
\draw [->,line width=2.5mm] (5b) -- node [right,pos=0.35] {\instr{iconst_0}} (6b);
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1132
\end{tikzpicture}
714
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1133
\end{center}
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1134
\end{figure}
8a50ccea59e8 updated
Christian Urban <urbanc@in.tum.de>
parents: 713
diff changeset
  1135
713
0ea14d84efe3 updated
Christian Urban <urbanc@in.tum.de>
parents: 712
diff changeset
  1136
goto\_w problem solved for too large jumps
327
9470cd124667 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
  1137
\end{document}
9470cd124667 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
  1138
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1139
%%% Local Variables:
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1140
%%% mode: latex
327
9470cd124667 updated
Christian Urban <christian dot urban at kcl dot ac dot uk>
parents:
diff changeset
  1141
%%% TeX-master: t
960
c7009356ddd8 updated
Christian Urban <christian.urban@kcl.ac.uk>
parents: 943
diff changeset
  1142
%%% End: