handouts/ho09.tex
changeset 680 eecc4d5a2172
parent 679 8fc109f36b78
child 700 52263ffd17b9
equal deleted inserted replaced
679:8fc109f36b78 680:eecc4d5a2172

\section*{Handout 9 (LLVM, SSA and CPS)}

Reflecting on our tiny compiler targeting the JVM, the code generation
part was actually not so hard, no? Pretty much just some post-order
traversal of the abstract syntax tree, yes? One of the reasons for this
ease is that the JVM is a stack-based virtual machine and it is
therefore not hard to translate deeply-nested arithmetic expressions
into a sequence of instructions manipulating the stack. The problem is
that ``real'' CPUs, although supporting stack operations, are not really
designed to be \emph{stack machines}.  The design of CPUs is more like,
here is a chunk of memory---compiler, or better compiler writers, do
something with it. Consequently, modern compilers need to go the extra
mile in order to generate code that is much easier and faster for CPUs
to process. To make this all tractable for this module, we target the
LLVM Intermediate Language. In this way we can take advantage of the
tools coming with LLVM. For example we do not have to worry about things
like register allocation.\bigskip

\noindent LLVM\footnote{\url{http://llvm.org}} is a beautiful example
that projects from academia can make a difference in the world. LLVM
started in 2000 as a project by two researchers at the University of
Illinois at Urbana-Champaign. At the time the behemoth of compilers was
gcc with its myriad of front-ends for other languages (C++, Fortran,
Ada, Go, Objective-C, Pascal etc.). The problem was that gcc morphed
over time into a monolithic gigantic piece of m\ldots ehm software,
which you could not mess about in an afternoon. In contrast, LLVM is
designed to be a modular suite of tools with which you could play around
easily and try out something new. LLVM became a big player once Apple
hired one of its main developers (maybe because Apple did not want to
use gcc, but maybe they were also just disgusted by its big monolithic
codebase). Anyway, LLVM is now the big player and gcc is more or less
legacy. This does not mean that programming languages like C and C++
are dying out any time soon---they are nicely supported by LLVM.

We will target the LLVM Intermediate Language, or Intermediate
Representation (short LLVM-IR). As a result we can benefit from the
modular structure of the LLVM compiler and, for example, let the
compiler generate code for different CPUs, like X86 or ARM. That means
we can be agnostic about where our code actually runs. We can also be
ignorant about optimising code and allocating memory efficiently.
However, what we have to do is to generate code in \emph{Static
Single-Assignment} format (short SSA), because that is what the LLVM-IR
expects from us.

The idea behind the SSA format is to use very simple variable
assignments where every variable is assigned only once. The assignments
also need to be primitive in the sense that they can be just simple
operations like addition, multiplication, jumps, comparisons and so on.
Say we have the expression $((1 + a) + (3 + (b * 5)))$, then the
corresponding SSA is

\begin{lstlisting}[language=LLVMIR,numbers=left]
let tmp0 = add 1 a in
let tmp1 = mul b 5 in
let tmp2 = add 3 tmp1 in
let tmp3 = add tmp0 tmp2 in
  tmp3
\end{lstlisting}

\noindent where every variable is assigned only once (we could not
write \texttt{tmp1 = add 3 tmp1} in Line 3 for example).  There are
sophisticated algorithms for imperative languages, like C, that
efficiently transform a high-level program into SSA format. But we can
ignore them here. We want to compile a functional language and there
things get much more interesting than just sophisticated. We will need
to have a look at CPS translations, where the CPS stands for
\emph{Continuation-Passing-Style}---basically a form of abracadabra
programming. So sit tight.
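
The flattening of a nested arithmetic expression into such a sequence of
let-bindings can be sketched in a few lines of Scala. The snippet below
is only a sketch, not the module's actual code: the mini-AST
(\texttt{Num}, \texttt{Var}, \texttt{Aop}) and the helper
\texttt{fresh} are hypothetical names chosen for illustration.

\begin{lstlisting}[language=Scala,numbers=none]
// hypothetical mini-AST for arithmetic expressions
sealed trait Expr
case class Num(n: Int) extends Expr
case class Var(s: String) extends Expr
case class Aop(o: String, a1: Expr, a2: Expr) extends Expr

// generates fresh names tmp0, tmp1, tmp2, ...
var counter = -1
def fresh() : String = { counter += 1 ; s"tmp$counter" }

// returns the list of atomic let-bindings together with the
// name holding the overall result; every temporary is
// assigned exactly once
def flatten(e: Expr) : (List[String], String) = e match {
  case Num(n) => (Nil, n.toString)
  case Var(s) => (Nil, s)
  case Aop(o, a1, a2) =>
    val (bs1, v1) = flatten(a1)
    val (bs2, v2) = flatten(a2)
    val t = fresh()
    (bs1 ::: bs2 ::: List(s"let $t = $o $v1 $v2 in"), t)
}

// ((1 + a) + (3 + (b * 5)))
val e = Aop("add", Aop("add", Num(1), Var("a")),
                   Aop("add", Num(3), Aop("mul", Var("b"), Num(5))))
\end{lstlisting}

\noindent Calling \texttt{flatten(e)} produces exactly the four
let-bindings shown above, with \texttt{tmp3} naming the final result.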

\subsection*{LLVM-IR}

Before we start, let's first have a look at the \emph{LLVM Intermediate
Representation} in more detail. What is good about our toy Fun language
is that it basically only contains expressions (be they arithmetic
expressions or boolean expressions or if-expressions). The exceptions
are function definitions. Luckily, for them we can use the mechanism of
defining functions in LLVM-IR. For example the simple Fun program

\begin{lstlisting}[language=Scala,numbers=none]
def sqr(x) = x * x
\end{lstlisting}

\noindent can be compiled into the following LLVM-IR function:

\begin{lstlisting}[language=LLVM,numbers=none]
define i32 @sqr(i32 %x) {
   %tmp = mul i32 %x, %x
   ret i32 %tmp
}
\end{lstlisting}

\noindent First notice that all variable names in the LLVM-IR are
prefixed by \texttt{\%}; function names need to be prefixed with the
@-symbol. Also, the LLVM-IR is a fully typed language. The \texttt{i32}
type stands for 32-bit integers. There are also types for 64-bit
integers (\texttt{i64}), chars (\texttt{i8}), floats, arrays and even
pointer types. In the code above, \texttt{sqr} takes an argument of type
\texttt{i32} and produces a result of type \texttt{i32} (the result type
is given before the function name, like in C). Each arithmetic
operation, like addition or multiplication, is also prefixed with the
type it operates on. Obviously these types need to match up\ldots{} but
since we have in our programs only integers, \texttt{i32} everywhere
will do.

Conveniently, you can use the program \texttt{lli}, which comes with
LLVM, to interpret programs written in the LLVM-IR. So you can easily
check whether the code you produced actually works. To get a running
program that does something interesting you need to add some boilerplate
about printing out numbers and a main-function that is the entrypoint
for the program (see Figure~\ref{lli} for a complete listing). You can
generate a binary for this program using the \texttt{llc}-compiler and
\texttt{gcc}---\texttt{llc} generates an object file, which you can then
turn into an executable binary with \texttt{gcc} (that is, clang):

\begin{lstlisting}[language=bash,numbers=none]
llc -filetype=obj sqr.ll
gcc sqr.o -o a.out
./a.out
> 25
\end{lstlisting}

\begin{figure}[t]\small
\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
\caption{An LLVM-IR program for calculating the square function. The
  main-function prints out the result of calling
  \texttt{sqr}.\label{lli}}
\end{figure}

\subsection*{Our Own Intermediate Language}

Remember compilers have to solve the problem of bridging the gap between
``high-level'' programs and ``low-level'' hardware. If the gap is too
wide for one step, then a good strategy is to lay a stepping stone
somewhere in between. The LLVM-IR itself is such a stepping stone to
make the task of generating and optimising code easier. Like a real
compiler we will use our own stepping stone which I call the
\emph{K-language}. For this, recall the expressions (and boolean
expressions) of the Fun language; for convenience the Scala code is
shown on top of Figure~\ref{absfun}. Below it is the code for the
K-language. There are two kinds of syntactic entities, namely
\emph{K-values} and \emph{K-expressions}. The central point of the
K-language is the \texttt{KLet}-constructor. For this recall that
arithmetic expressions such as $((1 + a) + (3 + (b * 5)))$ need to be
broken up into smaller ``atomic'' steps, like so

\begin{lstlisting}[language=LLVMIR,numbers=none]
let tmp0 = add 1 a in
let tmp1 = mul b 5 in
let tmp2 = add 3 tmp1 in
let tmp3 = add tmp0 tmp2 in
  tmp3
\end{lstlisting}

\noindent
Here \texttt{tmp3} will contain the result of what the whole expression
stands for. In each step we can only perform an ``atomic'' operation,
like addition or multiplication. We could not, for example, have an
if-condition on the right-hand side of an equals. These constraints are
enforced upon us because of how the SSA format works in the LLVM-IR. By
having in \texttt{KLet} first a string (standing for an intermediate
result) and second a value, we can fulfil this constraint---there is no
way we could write anything other than a value. To sum up, K-values are
the atomic operations that can be on the right-hand side of equal-signs.
The K-language is restricted such that it is easy to generate the SSA
format for the LLVM-IR.
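
The nesting of \texttt{KLet}s mirrors this sequence of atomic steps
directly. Below is a small hand-written sketch: \texttt{KVar} and
\texttt{KNum} stand for variables and numbers (these constructors are
assumed here and are not part of the excerpt in Figure~\ref{absfun}),
and the printer functions \texttt{pv} and \texttt{pe} are hypothetical
helpers for illustration.

\begin{lstlisting}[language=Scala,numbers=none]
sealed trait KVal
case class KVar(s: String) extends KVal
case class KNum(i: Int) extends KVal
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal

sealed trait KExp
case class KLet(x: String, v: KVal, e: KExp) extends KExp
case class KReturn(v: KVal) extends KExp

// print K-values and K-expressions in the let-syntax used above
def pv(v: KVal) : String = v match {
  case KVar(s) => s
  case KNum(i) => i.toString
  case Kop(o, v1, v2) => s"$o ${pv(v1)} ${pv(v2)}"
}

def pe(e: KExp) : String = e match {
  case KLet(x, v, rest) => s"let $x = ${pv(v)} in\n${pe(rest)}"
  case KReturn(v) => s"  ${pv(v)}"
}

// ((1 + a) + (3 + (b * 5))) written out as nested KLets
val example =
  KLet("tmp0", Kop("add", KNum(1), KVar("a")),
  KLet("tmp1", Kop("mul", KVar("b"), KNum(5)),
  KLet("tmp2", Kop("add", KNum(3), KVar("tmp1")),
  KLet("tmp3", Kop("add", KVar("tmp0"), KVar("tmp2")),
  KReturn(KVar("tmp3"))))))
\end{lstlisting}

\noindent Printing \texttt{example} with \texttt{pe} reproduces the
five lines of the listing above, one per \texttt{KLet}, ending in
\texttt{tmp3}.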
       

\begin{figure}[p]\small
\begin{lstlisting}[language=Scala,numbers=none]
// Fun-language (expressions)

// K-language (K-values, K-expressions)
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
case class KCall(o: String, vrs: List[KVal]) extends KVal
case class KWrite(v: KVal) extends KVal

case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
case class KLet(x: String, v: KVal, e: KExp) extends KExp
case class KReturn(v: KVal) extends KExp
\end{lstlisting}
\caption{Abstract syntax trees for the Fun language.\label{absfun}}
\end{figure}