\section*{Handout 9 (LLVM, SSA and CPS)}

Reflecting on our tiny compiler targeting the JVM, the code generation
part was actually not so hard, no? Pretty much just some post-traversal
of the abstract syntax tree, yes? One of the main reasons for this ease
is that the JVM is a stack-based virtual machine and it is therefore not
hard to translate arithmetic expressions into a sequence of instructions
manipulating the stack. The problem is that ``real'' CPUs, although
supporting stack operations, are not really designed to be \emph{stack
machines}. The design of CPUs is more like: here is a chunk of
memory---compiler, or better compiler writers, do something with it.
Consequently, modern compilers need to go the extra mile in order to
generate code that is much easier and faster to process by CPUs.

Another reason why it makes sense to go the extra mile is that stack
instructions are very difficult to optimise---you cannot just re-arrange
instructions without messing about with what is calculated on the stack.
Also it is hard to find out whether all the calculations on the stack are
actually necessary and not by chance dead code. The JVM has for all this
sophisticated machinery to make such ``high-level'' code still run fast,
but let's say that for the sake of argument we do not want to rely on
it. We want to generate fast code ourselves. This means we have to work
around the intricacies of what instructions CPUs can actually process.
To make this all tractable for this module, we target the LLVM
Intermediate Language. In this way we can take advantage of the tools
coming with LLVM. For example we do not have to worry about things like
register allocation.\bigskip

\noindent LLVM\footnote{\url{http://llvm.org}} is a beautiful example
that projects from Academia can make a difference in the world. LLVM
started in 2000 as a project by two researchers at the University of
Illinois at Urbana-Champaign. At the time the behemoth of compilers was
result of type \texttt{i32}. Each arithmetic operation, like addition or
multiplication, is also prefixed with the type it operates on.
Obviously these types need to match up\ldots{} but since we have in our
programs only integers, \texttt{i32} everywhere will do.

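\noindent As a small illustration (my own sketch, not taken from the
handout's figures), a function squaring its argument could be written as
follows in the LLVM-IR; note how every operand and every result carries
the \texttt{i32} annotation:

\begin{lstlisting}[language=LLVM,numbers=none]
define i32 @sqr(i32 %x) {
   %tmp = mul i32 %x, %x   ; operands and result are all of type i32
   ret i32 %tmp            ; the returned value must also be an i32
}
\end{lstlisting}
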
Conveniently, you can use the program \texttt{lli}, which comes with
LLVM, to interpret programs written in the LLVM-IR. So you can easily
check whether the code you produced actually works. To get a running
program that does something interesting you need to add some boilerplate
about printing out numbers and a main-function that is the entrypoint
for the program (see Figure~\ref{lli}). You can generate a binary for
this program by using the \texttt{llc} compiler to generate an object
file and then using gcc (or clang) to produce an executable:

\begin{lstlisting}[language=bash,numbers=none]
llc -filetype=obj sqr.ll
gcc sqr.o -o a.out
./a.out
\end{lstlisting}

\begin{figure}[t]\small
\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
\caption{An LLVM-IR program for calculating the square function. The
  interesting function is \texttt{sqr} in Lines 13 -- 16. The main
  function calls \texttt{sqr} and prints out the result. The other
  code is boilerplate for printing out integers.\label{lli}}
\end{figure}

\subsection*{Our Own Intermediate Language}

Remember compilers have to solve the problem of bridging the gap between
``high-level'' programs and ``low-level'' hardware. If the gap is too
wide then a good strategy is to lay a stepping stone somewhere in
between. The LLVM-IR itself is such a stepping stone to make the task of
generating code easier. Like a real compiler we will use another
stepping stone which I call \emph{K-language}. For this remember
expressions (and boolean expressions) in the Fun language are given by
the code on top of Figure~\ref{absfun}.

\begin{figure}[p]\small
\begin{lstlisting}[language=Scala,numbers=none]
// Fun-language (expressions)
abstract class Exp
abstract class BExp

case class Call(name: String, args: List[Exp]) extends Exp
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
case class Write(e: Exp) extends Exp
case class Var(s: String) extends Exp
case class Num(i: Int) extends Exp
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
case class Sequence(e1: Exp, e2: Exp) extends Exp
case class Bop(o: String, a1: Exp, a2: Exp) extends BExp


// K-language (K-expressions, K-values)
abstract class KExp
abstract class KVal

case class KVar(s: String) extends KVal
case class KNum(i: Int) extends KVal
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
case class KCall(o: String, vrs: List[KVal]) extends KVal
case class KWrite(v: KVal) extends KVal

case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
case class KLet(x: String, e1: KVal, e2: KExp) extends KExp
case class KReturn(v: KVal) extends KExp
\end{lstlisting}
\caption{Abstract syntax trees for the Fun language and the
  K-language.\label{absfun}}
\end{figure}
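\noindent The point of the K-language is that, as in the LLVM-IR, every
intermediate result receives a name via \texttt{KLet}. As a hedged
illustration (my own example, with hypothetical names \texttt{t1} and
\texttt{t2}), the Fun expression \texttt{1 + 2 * 3} could correspond to
the following K-expression:

\begin{lstlisting}[language=Scala,numbers=none]
// 1 + 2 * 3 as a K-expression: first the product 2 * 3 is bound
// to t1, then the sum 1 + t1 is bound to t2, which is returned.
// This mirrors the three-address style of the LLVM-IR.
val e: KExp =
  KLet("t1", Kop("*", KNum(2), KNum(3)),
    KLet("t2", Kop("+", KNum(1), KVar("t1")),
      KReturn(KVar("t2"))))
\end{lstlisting}
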

\subsection*{CPS-Translations}

163 |
190 |
|
191 |
|
192 |
|
193 |
|
194 |
|
195 Another reason why it makes sense to go the extra mile is that stack |
|
196 instructions are very difficult to optimise---you cannot just re-arrange |
|
197 instructions without messing about with what is calculated on the stack. |
|
198 Also it is hard to find out if all the calculations on the stack are |
|
199 actually necessary and not by chance dead code. The JVM has for all this |
|
200 sophisticated machinery to make such ``high-level'' code still run fast, |
|
201 but let's say that for the sake of argument we do not want to rely on |
|
202 it. We want to generate fast code ourselves. This means we have to work |
|
203 around the intricacies of what instructions CPUs can actually process. |
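\noindent To give a first idea of what such a translation can look like,
here is a sketch covering only the cases for variables, numbers and
arithmetic operations (the other \texttt{Exp} cases are omitted, and the
\texttt{Fresh} helper for generating new names is an assumption of mine,
not something defined in this excerpt). The continuation \texttt{k}
receives the K-value holding the result of the translated expression:

\begin{lstlisting}[language=Scala,numbers=none]
// Sketch: a CPS-style translation from Exp to KExp.
var counter = -1
def Fresh(x: String): String = {
  counter += 1
  x ++ "_" ++ counter.toString
}

def CPS(e: Exp)(k: KVal => KExp): KExp = e match {
  case Var(s) => k(KVar(s))
  case Num(i) => k(KNum(i))
  case Aop(o, e1, e2) => {
    val z = Fresh("tmp")
    // translate e1, then e2, then bind the operation to a fresh name
    CPS(e1)(y1 =>
      CPS(e2)(y2 =>
        KLet(z, Kop(o, y1, y2), k(KVar(z)))))
  }
}
\end{lstlisting}

\noindent For example, \texttt{CPS(Aop("+", Num(1), Num(2)))(KReturn)}
produces the K-expression \texttt{KLet("tmp\_0", Kop("+", KNum(1),
KNum(2)), KReturn(KVar("tmp\_0")))}.
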

\end{document}


%%% Local Variables: