| author | Christian Urban <christian.urban@kcl.ac.uk> | 
| Sat, 19 Dec 2020 00:13:58 +0000 | |
| changeset 820 | 7fd1f611c21d | 
| parent 752 | c0bdd4ad69ca | 
| child 821 | f914b9476dc7 | 
| permissions | -rw-r--r-- | 
| 630 | 1 | % !TEX program = xelatex | 
| 200 
7415871b1ef5
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 2 | \documentclass{article}
 | 
| 299 
6322922aa990
update
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
298diff
changeset | 3 | \usepackage{../style}
 | 
| 820 | 4 | \usepackage{../graphics}
 | 
| 216 
f5ec7c597c5b
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
214diff
changeset | 5 | \usepackage{../langs}
 | 
| 200 
7415871b1ef5
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 6 | |
| 
7415871b1ef5
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 7 | \begin{document}
 | 
| 
7415871b1ef5
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 8 | |
| 820 | 9 | \section*{Coursework 5\footnote{\today}}
 | 
| 200 
7415871b1ef5
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 10 | |
| 722 | 11 | |
| 12 | ||
| 751 | 13 | \noindent This coursework is worth 12\% and is due on \cwFIVE{} at
 | 
| 820 | 14 | 18:00. You are asked to implement a compiler targeting the LLVM-IR. | 
| 15 | Be careful that this CW needs some material about the LLVM-IR | |
| 16 | that has not been shown in the lectures and your own experiments | |
| 17 | might be required. You can find information about the LLVM-IR at | |
| 18 | ||
| 19 | \begin{itemize}
 | |
| 20 | \item \url{https://bit.ly/3rheZYr}
 | |
| 21 | \item \url{https://llvm.org/docs/LangRef.html}  
 | |
| 22 | \end{itemize}  
 | |
| 23 | ||
| 24 | \noindent | |
| 25 | You can do the implementation of your compiler in any programming | |
| 748 | 26 | language you like, but you need to submit the source code with which | 
| 820 | 27 | you generated the LLVM-IR files, otherwise a mark of 0\% will be | 
| 28 | awarded. You should use the lexer and parser from the previous | |
| 29 | courseworks, but you need to make some modifications to them for the | |
| 30 | `typed' fun-language. I will award up to 4\% if a lexer and parser are | |
| 31 | implemented. At the end, please package everything(!) in a zip-file | |
| 32 | that creates a directory with the name \texttt{YournameYourFamilyname}
 | |
| 33 | on my end. | |
| 200 
7415871b1ef5
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 34 | |
| 750 | 35 | \subsection*{Disclaimer\alert}
 | 
| 358 
b3129cff41e9
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
333diff
changeset | 36 | |
| 750 | 37 | It should be understood that the work you submit represents your own | 
| 38 | effort. You have not copied from anyone else. An exception is the | |
| 39 | Scala code I showed during the lectures or uploaded to KEATS, which | |
| 751 | 40 | you can both use. You can also use your own code from the CW~1 -- | 
| 41 | CW~4. | |
| 200 
7415871b1ef5
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 42 | |
| 299 
6322922aa990
update
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
298diff
changeset | 43 | |
| 820 | 44 | \subsection*{Task}
 | 
| 45 | ||
| 46 | The goal is to lex and parse the Mandelbrot program shown in | |
| 47 | Figure~\ref{mand} and generate corresponding code for the
 | |
| 48 | LLVM-IR. Unfortunately the calculations for the Mandelbrot set require | |
| 49 | floating point arithmetic and therefore we cannot be as simple-minded | |
| 50 | about types as we have been so far (remember the LLVM-IR is a | |
| 51 | fully-typed language and needs to know the exact types of each | |
| 52 | expression). The idea is to deal appropriately with three types, | |
| 53 | namely \texttt{Int}, \texttt{Double} and \texttt{Void} (they are
 | |
| 54 | represented in the LLVM-IR as \texttt{i32}, \texttt{double} and
 | |
| 55 | \texttt{void}). You need to extend the lexer and parser accordingly in
 | |
| 56 | order to deal with type annotations. The Fun-language includes global | |
| 57 | constants, such as | |
| 58 | ||
| 59 | \begin{lstlisting}[numbers=none]
 | |
| 60 | val Ymin: Double = -1.3; | |
| 61 | val Maxiters: Int = 1000; | |
| 62 | \end{lstlisting}
 | |
| 63 | ||
| 64 | \noindent | |
| 65 | where you want to assume that they are `normal' identifiers, just | |
| 66 | starting with a capital letter---all other identifiers should have | |
| 67 | lower-case letters. Function definitions can take arguments of | |
| 68 | type \texttt{Int} or \texttt{Double}, and need to specify a return
 | |
| 69 | type, which can be \texttt{Void}, for example
 | |
| 70 | ||
| 71 | \begin{lstlisting}[numbers=none]
 | |
| 72 | def foo(n: Int, x: Double) : Double = ... | |
| 73 | def bar() : Void = ... | |
| 74 | \end{lstlisting}
 | |
| 75 | ||
| 76 | \noindent | |
| 77 | The idea is to record all typing information that is given | |
| 78 | in the program, but then delay any further typing inference to | |
| 79 | after the CPS-translation. That means the parser should | |
| 80 | generate ASTs given by the Scala dataypes: | |
| 81 | ||
| 82 | \begin{lstlisting}[numbers=none,language=Scala]
 | |
| 83 | abstract class Exp | |
| 84 | abstract class BExp | |
| 85 | abstract class Decl | |
| 86 | ||
| 87 | case class Def(name: String, args: List[(String, String)], | |
| 88 | ty: String, body: Exp) extends Decl | |
| 89 | case class Main(e: Exp) extends Decl | |
| 90 | case class Const(name: String, v: Int) extends Decl | |
| 91 | case class FConst(name: String, x: Float) extends Decl | |
| 92 | ||
| 93 | case class Call(name: String, args: List[Exp]) extends Exp | |
| 94 | case class If(a: BExp, e1: Exp, e2: Exp) extends Exp | |
| 95 | case class Var(s: String) extends Exp | |
| 96 | case class Num(i: Int) extends Exp // integer numbers | |
| 97 | case class FNum(i: Float) extends Exp // floating numbers | |
| 98 | case class Aop(o: String, a1: Exp, a2: Exp) extends Exp | |
| 99 | case class Sequence(e1: Exp, e2: Exp) extends Exp | |
| 100 | case class Bop(o: String, a1: Exp, a2: Exp) extends BExp | |
| 101 | \end{lstlisting}
 | |
| 102 | ||
| 103 | \noindent | |
| 104 | This datatype distinguishes whether the global constant is an integer | |
| 105 | constant or floating constant. Also a function definition needs to | |
| 106 | record the return type of the function, namely the argument | |
| 107 | \texttt{ty} in \texttt{Def}, and the arguments consist of an pairs of
 | |
| 108 | identifier names and types (\texttt{Int} or \texttt{Double}). The hard
 | |
| 109 | part of the CW is to design the K-intermediate language and infer all | |
| 110 | necessary types in order to generate LLVM-IR code. You can check | |
| 111 | your LLVM-IR code by running it with the interpreter \texttt{lli}.
 | |
| 112 | ||
| 113 | \begin{figure}[t]
 | |
| 114 | \lstinputlisting[language=Scala]{../progs/fun2/mand.fun}
 | |
| 115 | \caption{The Mandelbrot program in the `typed' Fun-language.\label{mand}}
 | |
| 116 | \end{figure}
 | |
| 117 | ||
| 118 | \begin{figure}[t]
 | |
| 119 | \includegraphics[scale=0.35]{../progs/fun2/out.png}
 | |
| 120 | \caption{Ascii output of the Mandelbrot program.\label{mand}}
 | |
| 121 | \end{figure}
 | |
| 122 | ||
| 123 | \subsection*{LLVM-IR}
 | |
| 124 | ||
| 125 | There are some subtleties in the LLVM-IR you need to be aware of: | |
| 126 | ||
| 127 | \begin{itemize}
 | |
| 128 | \item \textbf{Global constants}: While global constants such as
 | |
| 129 | ||
| 130 | \begin{lstlisting}[numbers=none]  
 | |
| 131 | val Max : Int = 10; | |
| 132 | \end{lstlisting}
 | |
| 200 
7415871b1ef5
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 133 | |
| 820 | 134 | \noindent | 
| 135 | can be easily defined in the LLVM-IR as follows | |
| 136 | ||
| 137 | \begin{lstlisting}[numbers=none]  
 | |
| 138 | @Max = global i32 10 | |
| 139 | \end{lstlisting}
 | |
| 140 | ||
| 141 | \noindent | |
| 142 | they cannot easily be referenced. If you want to use | |
| 143 | this constant then you need to generate code such as | |
| 144 | ||
| 145 | \begin{lstlisting}[numbers=none]  
 | |
| 146 | %tmp_22 = load i32, i32* @Max | |
| 147 | \end{lstlisting}
 | |
| 148 | ||
| 149 | \noindent | |
| 150 | first, which treats \texttt{@Max} as an Integer-pointer (type
 | |
| 151 | \texttt{i32*}) that needs to be loaded into a local variable,
 | |
| 152 | here \texttt{\%tmp\_22}.
 | |
| 153 | ||
| 154 | \item \textbf{Void-Functions}: While integer and double functions
 | |
| 155 | can easily be called and their results can be allocated to a | |
| 156 | temporary variable: | |
| 157 | ||
| 158 |   \begin{lstlisting}[numbers=none]  
 | |
| 159 | %tmp_23 = call i32 @sqr (i32 %n) | |
| 160 |   \end{lstlisting}
 | |
| 161 | ||
| 162 | void-functions cannot be allocated to a variable. They need to be | |
| 163 | called just as | |
| 164 | ||
| 165 |   \begin{lstlisting}[numbers=none]  
 | |
| 166 | call void @print_int (i32 %tmp_23) | |
| 167 | \end{lstlisting}
 | |
| 168 | ||
| 169 | \item \textbf{Floating-Point Operations}: While integer operations
 | |
| 170 | are specified in the LLVM-IR as | |
| 201 
c813506e0ee8
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
200diff
changeset | 171 | |
| 820 | 172 |   \begin{lstlisting}[numbers=none,language=Scala]
 | 
| 173 |   def compile_op(op: String) = op match {
 | |
| 174 | case "+" => "add i32 " | |
| 175 | case "*" => "mul i32 " | |
| 176 | case "-" => "sub i32 " | |
| 177 | case "==" => "icmp eq i32 " | |
| 178 | case "<=" => "icmp sle i32 " // signed less or equal | |
| 179 | case "<" => "icmp slt i32 " // signed less than | |
| 180 |   }\end{lstlisting}
 | |
| 181 | ||
| 182 | the corresponding operations on doubles are | |
| 183 | ||
| 184 |   \begin{lstlisting}[numbers=none,language=Scala]
 | |
| 185 |   def compile_dop(op: String) = op match {
 | |
| 186 | case "+" => "fadd double " | |
| 187 | case "*" => "fmul double " | |
| 188 | case "-" => "fsub double " | |
| 189 | case "==" => "fcmp oeq double " | |
| 190 | case "<=" => "fcmp ole double " | |
| 191 | case "<" => "fcmp olt double " | |
| 192 |   }\end{lstlisting}
 | |
| 193 | ||
| 194 | \item \textbf{Typing}: In order to leave the CPS-translations
 | |
| 195 | as is, it makes sense to defer the full type-inference to the | |
| 196 | K-intermediate-language. For this it is good to define | |
| 197 |   the \texttt{KVar} constructor as
 | |
| 198 | ||
| 199 | \begin{lstlisting}[numbers=none,language=Scala]  
 | |
| 200 | case class KVar(s: String, ty: Ty = "UNDEF") extends KVal\end{lstlisting}
 | |
| 201 | ||
| 202 |   where first a default type, for example \texttt{UNDEF}, is
 | |
| 203 | given. Then you need to define two typing functions | |
| 204 | ||
| 205 |   \begin{lstlisting}[numbers=none,language=Scala]  
 | |
| 206 | def typ_val(v: KVal, ts: TyEnv) = ??? | |
| 207 | def typ_exp(a: KExp, ts: TyEnv) = ??? | |
| 208 |   \end{lstlisting}
 | |
| 209 | ||
| 210 | Both functions require a typing-environment that updates | |
| 211 | the information about what type each variable, operation | |
| 212 | and so on receives. Once the types are inferred, the | |
| 213 | LLVM-IR code can be generated. Since we are dealing only | |
| 214 | with simple first-order functions, nothing on the scale | |
| 215 | as the `Hindley-Milner' typing-algorithm is needed. I suggest | |
| 216 | to just look at what data is avaliable and generate all | |
| 217 | missing information by simple means. | |
| 218 | ||
| 219 | \item \textbf{Build-In Functions}: The `prelude' comes
 | |
| 220 |   with several build-in functions: \texttt{new\_line()},
 | |
| 221 |   \texttt{skip}, \texttt{print\_int(n)}, \texttt{print\_space()}
 | |
| 222 |   and \texttt{print\_star()}.
 | |
| 223 | \end{itemize}  
 | |
| 205 
0b59588d28d2
updated
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: 
204diff
changeset | 224 | |
| 200 
7415871b1ef5
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 225 | \end{document}
 | 
| 
7415871b1ef5
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 226 | |
| 
7415871b1ef5
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 227 | %%% Local Variables: | 
| 
7415871b1ef5
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 228 | %%% mode: latex | 
| 
7415871b1ef5
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 229 | %%% TeX-master: t | 
| 
7415871b1ef5
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 230 | %%% End: |