136 \texttt{i32} (the result type is in front of the function name, like in |
136 \texttt{i32} (the result type is in front of the function name, like in |
137 C). Each arithmetic operation, for example addition and multiplication, |
137 C). Each arithmetic operation, for example addition and multiplication, |
138 are also prefixed with the type they operate on. Obviously these types |
138 are also prefixed with the type they operate on. Obviously these types |
139 need to match up\ldots{} but since we have in our programs only |
139 need to match up\ldots{} but since we have in our programs only |
140 integers, \texttt{i32} everywhere will do. We do not have to generate |
140 integers, \texttt{i32} everywhere will do. We do not have to generate |
141 any other types, but obviously this is a limitation in our Fun-language. |
141 any other types, but obviously this is a limitation in our Fun language. |
142 |
142 |
143 There are a few interesting instructions in the LLVM-IR which are quite |
143 There are a few interesting instructions in the LLVM-IR which are quite |
144 different than in the JVM. Can you remember the kerfuffle we had to go |
144 different than what we have seen in the JVM. Can you remember the |
145 through with boolean expressions and negating the condition? In the |
145 kerfuffle we had to go through with boolean expressions and negating the |
146 LLVM-IR, branching if-conditions is implemented differently: there |
146 condition? In the LLVM-IR, branching if-conditions is implemented |
147 is a separate \texttt{br}-instruction as follows: |
147 differently: there is a separate \texttt{br}-instruction as follows: |
148 |
148 |
149 \begin{lstlisting}[language=LLVM] |
149 \begin{lstlisting}[language=LLVM] |
150 br i1 %var, label %if_br, label %else_br |
150 br i1 %var, label %if_br, label %else_br |
151 \end{lstlisting} |
151 \end{lstlisting} |
152 |
152 |
153 \noindent |
153 \noindent |
154 The type \texttt{i1} stands for booleans. If the variable is true, then |
154 The type \texttt{i1} stands for booleans. If the variable is true, then |
155 this instruction jumps to the if-branch, which needs an explicit label; |
155 this instruction jumps to the if-branch, which needs an explicit label; |
156 otherwise to the else-branch, again with its own label. This allows us |
156 otherwise to the else-branch, again with its own label. This allows us |
157 to keep the meaning of the boolean expression as is. A value of type |
157 to keep the meaning of the boolean expression as is when compiling if's. |
158 boolean is generated in the LLVM-IR by the \texttt{icmp}-instruction. |
158 A value of type boolean is generated in the LLVM-IR by the |
159 This instruction is for integers (hence the \texttt{i}) and takes the |
159 \texttt{icmp}-instruction. This instruction is for integers (hence the |
160 comparison operation as argument. For example |
160 \texttt{i}) and takes the comparison operation as argument. For example |
161 |
161 |
162 \begin{lstlisting}[language=LLVM] |
162 \begin{lstlisting}[language=LLVM] |
163 icmp eq i32 %x, %y ; for equal |
163 icmp eq i32 %x, %y ; for equal |
164 icmp sle i32 %x, %y ; signed less or equal |
164 icmp sle i32 %x, %y ; signed less or equal |
165 icmp slt i32 %x, %y ; signed less than |
165 icmp slt i32 %x, %y ; signed less than |
168 |
168 |
169 \noindent |
169 \noindent |
170 In some operations, the LLVM-IR distinguishes between signed and |
170 In some operations, the LLVM-IR distinguishes between signed and |
171 unsigned representations of integers. |
171 unsigned representations of integers. |
172 |
172 |
|
173 It is also easy to call another function in LLVM-IR: as can be |
|
174 seen from Figure~\ref{lli} we can just call a function with the |
|
175 instruction \texttt{call} and can also assign the result to |
|
176 a variable. The syntax is as follows |
|
177 |
|
178 \begin{lstlisting}[language=LLVM] |
|
179 %var = call i32 @foo(...args...) |
|
180 \end{lstlisting} |
|
181 |
|
182 \noindent |
|
183 where the arguments can only be simple variables, not compound |
|
184 expressions. |
|
185 |
173 Conveniently, you can use the program \texttt{lli}, which comes with |
186 Conveniently, you can use the program \texttt{lli}, which comes with |
174 LLVM, to interpret programs written in the LLVM-IR. So you can easily |
187 LLVM, to interpret programs written in the LLVM-IR. So you can easily |
175 check whether the code you produced actually works. To get a running |
188 check whether the code you produced actually works. To get a running |
176 program that does something interesting you need to add some boilerplate |
189 program that does something interesting you need to add some boilerplate |
177 about printing out numbers and a main-function that is the entrypoint |
190 about printing out numbers and a main-function that is the entry point |
178 for the program (see Figure~\ref{lli} for a complete listing). Again |
191 for the program (see Figure~\ref{lli} for a complete listing). Again |
179 this is very similar to the boilerplate we needed to add in our JVM |
192 this is very similar to the boilerplate we needed to add in our JVM |
180 compiler. |
193 compiler. |
181 |
194 |
182 You can generate a binary for the program in Figure~\ref{lli} by using |
195 You can generate a binary for the program in Figure~\ref{lli} by using |
212 compiler we will use our own stepping stone which I call the |
225 compiler we will use our own stepping stone which I call the |
213 \emph{K-language}. For what follows recall the various kinds of |
226 \emph{K-language}. For what follows recall the various kinds of |
214 expressions in the Fun language. For convenience the Scala code of the |
227 expressions in the Fun language. For convenience the Scala code of the |
215 corresponding abstract syntax trees is shown on top of |
228 corresponding abstract syntax trees is shown on top of |
216 Figure~\ref{absfun}. Below is the code for the abstract syntax trees in |
229 Figure~\ref{absfun}. Below is the code for the abstract syntax trees in |
217 the K-language. There are two kinds of syntactic entities, namely |
230 the K-language. In K, here are two kinds of syntactic entities, namely |
218 \emph{K-values} and \emph{K-expressions}. The central constructor of the |
231 \emph{K-values} and \emph{K-expressions}. The central constructor of the |
219 K-language is \texttt{KLet}. For this recall that arithmetic expressions |
232 K-language is \texttt{KLet}. For this recall in SSA that arithmetic |
220 such as $((1 + a) + (3 + (b * 5)))$ need to be broken up into smaller |
233 expressions such as $((1 + a) + (3 + (b * 5)))$ need to be broken up |
221 ``atomic'' steps, like so |
234 into smaller ``atomic'' steps, like so |
222 |
235 |
223 \begin{lstlisting}[language=LLVMIR,numbers=none] |
236 \begin{lstlisting}[language=LLVMIR,numbers=none] |
224 let tmp0 = add 1 a in |
237 let tmp0 = add 1 a in |
225 let tmp1 = mul b 5 in |
238 let tmp1 = mul b 5 in |
226 let tmp2 = add 3 tmp1 in |
239 let tmp2 = add 3 tmp1 in |
286 The main difficulty of generating instructions in SSA format is that |
299 The main difficulty of generating instructions in SSA format is that |
287 large compound expressions need to be broken up into smaller pieces and |
300 large compound expressions need to be broken up into smaller pieces and |
288 intermediate results need to be chained into later instructions. To do |
301 intermediate results need to be chained into later instructions. To do |
289 this conveniently, CPS-translations have been developed. They use |
302 this conveniently, CPS-translations have been developed. They use |
290 functions (``continuations'') to represent what is coming next in a |
303 functions (``continuations'') to represent what is coming next in a |
291 sequence of instructions. |
304 sequence of instructions. Continuations are functions of type |
292 |
305 \code{KVal} to \code{KExp}. They can be seen as a sequence of \code{KLet}s |
293 |
306 where there is a ``hole'' that needs to be filled. Consider for example |
294 |
307 |
295 |
308 \begin{lstlisting}[language=LLVMIR,numbers=left,escapeinside={(*@}{@*)}] |
296 |
309 let tmp0 = add 1 a in |
|
310 let tmp1 = mul (*@$\Box$@*) 5 in |
|
311 let tmp2 = add 3 tmp1 in |
|
312 let tmp3 = add tmp0 tmp2 in |
|
313 tmp3 |
|
314 \end{lstlisting} |
|
315 |
|
316 \noindent |
|
317 where in the second line is a $\Box$ which still expects a \code{KVal} |
|
318 to be filled in before it becomes a ``proper'' \code{KExp}. When we |
|
319 apply and argument to the continuation (remember they are functions) |
|
320 we essentially fill something into the corresponding hole. The code |
|
321 of the CPS-translation is |
|
322 |
|
323 \begin{lstlisting}[language=Scala,numbers=none] |
|
324 def CPS(e: Exp)(k: KVal => KExp) : KExp = |
|
325 e match { ... } |
|
326 \end{lstlisting} |
|
327 |
|
328 \noindent |
|
329 where \code{k} is the continuation and \code{e} is the expression |
|
330 to be compiled. In case we have numbers or variables, we can just |
|
331 apply the continuation like |
|
332 |
|
333 \begin{center} |
|
334 \code{k(KNum(n))} \qquad \code{k(KVar(x))} |
|
335 \end{center} |
|
336 |
|
337 \noindent This would just fill in the $\Box$ in a \code{KLet}-expression. |
|
338 More interesting is the case for an arithmetic expression. |
|
339 |
|
340 \begin{lstlisting}[language=Scala,numbers=none] |
|
341 case Aop(o, e1, e2) => { |
|
342 val z = Fresh("tmp") |
|
343 CPS(e1)(y1 => |
|
344 CPS(e2)(y2 => KLet(z, Kop(o, y1, y2), k(KVar(z))))) |
|
345 } |
|
346 \end{lstlisting} |
|
347 |
|
348 \noindent |
297 \end{document} |
349 \end{document} |
298 |
350 |
299 |
351 |
300 %%% Local Variables: |
352 %%% Local Variables: |
301 %%% mode: latex |
353 %%% mode: latex |