| author | Christian Urban <christian.urban@kcl.ac.uk> | 
| Sat, 11 Oct 2025 08:33:35 +0100 | |
| changeset 1007 | fe2edf2cbd74 | 
| parent 992 | c3dd3a98f919 | 
| permissions | -rw-r--r-- | 
% !TEX program = xelatex
\documentclass{article}
\usepackage{../style}
\usepackage{../graphicss}
\usepackage{../langs}
\definecolor{navyblue}{rgb}{0.0, 0.0, 0.5}
\definecolor{pansypurple}{rgb}{0.47, 0.09, 0.29}

\begin{document}


%\color{pansypurple}
%\section*{RESIT / REPLACEMENT}

%{\bf
%The resit / replacement task is essentially CW5 (listed below) with
%the exception that the lexer and parser is already provided. The
%parser will generate an AST (see file \texttt{fun\_llvm.sc}). Your task
%is to generate an AST for the K-intermediate language and supply
%sufficient type annotations such that you can generate valid code for
%the LLVM-IR. The submission deadline is 4th August at 16:00. At the
%deadline, please send me an email containing a zip-file with your
%files.
%Feel free to reuse the files I have uploaded on KEATS (especially
%the files generating simple LLVM-IR code). Of help might also be the
%videos of Week~10.\bigskip
%
%\noindent
%Good Luck!}\smallskip\\
%\noindent
%Christian
%\color{black}


\section*{Coursework 5}



\noindent This coursework is worth 20\% and is due on \cwFIVE{} at
16:00. You are asked to implement a compiler targeting the LLVM-IR.
Be aware that this CW needs some material about the LLVM-IR
that has not been shown in the lectures, so your own experiments
and research might be required. You can find information about the LLVM-IR at

\begin{itemize}
\item \url{https://bit.ly/3rheZYr}
\item \url{https://llvm.org/docs/LangRef.html}
\end{itemize}

\noindent
You can do the implementation of your compiler in any programming
language you like, but you need to submit the source code with which
you generated the LLVM-IR files, otherwise a mark of 0\% will be
awarded. You are asked to submit the code of your compiler as well as
the generated \texttt{.ll} files. No PDF is needed for this
coursework. You should use the lexer and parser from the previous
courseworks, but you need to make some modifications to them for the
`typed' version of the Fun-language. I will award up to 5\% if a lexer
and a parser are correctly implemented.

%At the end, please package
%everything(!) in a zip-file that creates a directory with the name
%
%\begin{center}
%\texttt{YournameYourFamilyname}
%\end{center}
%
%\noindent
%on my end.
You will be marked according to the input files

\begin{itemize}
\item\href{https://cflmark.nms.kcl.ac.uk/hg/afl-material/raw-file/tip/progs/sqr.fun}{sqr.fun}
\item\href{https://cflmark.nms.kcl.ac.uk/hg/afl-material/raw-file/tip/progs/fact.fun}{fact.fun}
\item\href{https://cflmark.nms.kcl.ac.uk/hg/afl-material/raw-file/tip/progs/mand.fun}{mand.fun}
\item\href{https://cflmark.nms.kcl.ac.uk/hg/afl-material/raw-file/tip/progs/mand2.fun}{mand2.fun}
\item\href{https://cflmark.nms.kcl.ac.uk/hg/afl-material/raw-file/tip/progs/hanoi.fun}{hanoi.fun}
\end{itemize}

\noindent
which are uploaded to KEATS and GitHub.

\subsection*{Disclaimer\alert}

It should be understood that the work you submit represents your own
effort. You have not copied from anyone else. An exception is the
Scala code I showed during the lectures or uploaded to KEATS, both of
which you can use. You can also use your own code from CW~1 --
CW~4. %But do not
%be tempted to ask Github Copilot for help or do any other
%shenanigans like this!


\subsection*{Task}

The goal is to lex and parse five Fun-programs, including the
Mandelbrot program shown in Figure~\ref{mand}, and generate
corresponding code for the LLVM-IR. Unfortunately the calculations for
the Mandelbrot Set require floating point arithmetic and therefore we
cannot be as simple-minded about types as we have been so far
(remember the LLVM-IR is a fully-typed language and needs to know the
exact type of each expression). The idea is to deal appropriately
with three types, namely \texttt{Int}, \texttt{Double} and
\texttt{Void} (they are represented in the LLVM-IR as \texttt{i32},
\texttt{double} and \texttt{void}). You need to extend the lexer and
parser accordingly in order to deal with type annotations. The
Fun-language includes global constants, such as

\begin{lstlisting}[numbers=none]
val Ymin: Double = -1.3;
val Maxiters: Int = 1000;
\end{lstlisting}

\noindent
where you can assume that they are `normal' identifiers, just
starting with a capital letter---all other identifiers should have
lower-case letters. Function definitions can take arguments of
type \texttt{Int} or \texttt{Double}, and need to specify a return
type, which can be \texttt{Void}, for example

\begin{lstlisting}[numbers=none]
def foo(n: Int, x: Double) : Double = ...
def id(n: Int) : Int = ...
def bar() : Void = ...
\end{lstlisting}

\noindent
The idea is to record all typing information that is given
in the Fun-program, but then delay any further type inference until
after the CPS-translation. That means the parser should
generate ASTs given by the Scala datatypes:

\begin{lstlisting}[numbers=none,language=Scala]
abstract class Exp
abstract class BExp
abstract class Decl

case class Def(name: String, args: List[(String, String)],
               ty: String, body: Exp) extends Decl
case class Main(e: Exp) extends Decl
case class Const(name: String, v: Int) extends Decl
case class FConst(name: String, x: Double) extends Decl

case class Call(name: String, args: List[Exp]) extends Exp
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
case class Var(s: String) extends Exp
case class Num(i: Int) extends Exp     // integer numbers
case class FNum(i: Double) extends Exp // floating numbers
case class ChConst(c: Int) extends Exp // char constants
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
case class Sequence(e1: Exp, e2: Exp) extends Exp
case class Bop(o: String, a1: Exp, a2: Exp) extends BExp
\end{lstlisting}

\noindent
This datatype distinguishes whether a global constant is an integer
constant or a floating constant. Also a function definition needs to
record the return type of the function, namely the argument
\texttt{ty} in \texttt{Def}, and the arguments consist of pairs of
identifier names and types (\texttt{Int} or \texttt{Double}). The hard
part of the CW is to design the K-intermediate language and infer all
necessary types in order to generate LLVM-IR code. You can check
your LLVM-IR code by running it with the interpreter \texttt{lli}.
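
\noindent
As a rough illustration of how these datatypes might be used, the
following sketch writes down by hand the AST for a tiny three-line
program (a made-up example, not one of the marked input files). Only
the constructors needed for the example are repeated. The helper
\texttt{declared} is hypothetical: it merely collects the type
information that is already explicit in the declarations, which could
serve as a starting point for the later type inference. That constants
are referenced via \texttt{Var} is also an assumption of this sketch.

\begin{lstlisting}[numbers=none,language=Scala]
// subset of the AST datatypes, repeated to keep the sketch self-contained
abstract class Exp
abstract class Decl

case class Def(name: String, args: List[(String, String)],
               ty: String, body: Exp) extends Decl
case class Main(e: Exp) extends Decl
case class Const(name: String, v: Int) extends Decl
case class FConst(name: String, x: Double) extends Decl

case class Call(name: String, args: List[Exp]) extends Exp
case class Var(s: String) extends Exp
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp

// hand-written AST for the (made-up) program
//   val Max : Int = 10;
//   def sqr(x: Int) : Int = x * x;
//   sqr(Max)
val prog: List[Decl] = List(
  Const("Max", 10),
  Def("sqr", List(("x", "Int")), "Int", Aop("*", Var("x"), Var("x"))),
  Main(Call("sqr", List(Var("Max")))))

// hypothetical helper: the types that are explicit in the declarations
// (return types of functions and types of global constants)
def declared(ds: List[Decl]): Map[String, String] = ds.collect {
  case Def(name, _, ty, _) => name -> ty
  case Const(name, _)      => name -> "Int"
  case FConst(name, _)     => name -> "Double"
}.toMap
\end{lstlisting}

\noindent
For the program above, \texttt{declared(prog)} maps \texttt{Max} and
\texttt{sqr} both to \texttt{Int}; the \texttt{Main} declaration
contributes nothing because it carries no annotation.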

\begin{figure}[t]
\lstinputlisting[language=Scala]{../cwtests/cw05/mand.fun}
\caption{The Mandelbrot program in the `typed' Fun-language.\label{mand}}
\end{figure}

\begin{figure}[t]
\includegraphics[scale=0.35]{../solutions/cw5/out.png}
\caption{ASCII output of the Mandelbrot program.\label{mand2}}
\end{figure}

Also note that the second version of the Mandelbrot program and also
the Tower of Hanoi program use character constants, like \texttt{'a'},
\texttt{'1'}, \texttt{'$\backslash$n'} and so on. When they are tokenised,
such characters should be interpreted as the corresponding ASCII code (an
integer), such that we can use them in calculations like \texttt{'a' + 10}
where the result should be 107. As usual, the character \texttt{'$\backslash$n'} is the
ASCII code 10.
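
\noindent
One way to obtain the ASCII code is sketched below. It assumes that
the lexer hands over the raw token text including the two quotes; the
function name \texttt{charCode} is made up for this illustration, and
only the escape sequence for newline is handled.

\begin{lstlisting}[numbers=none,language=Scala]
// Assumption: tok is the raw token text including the quotes, that is
// the three characters 'a' or the four characters '\n'
def charCode(tok: String): Int =
  tok.stripPrefix("'").stripSuffix("'") match {
    case "\\n" => 10                       // escaped newline
    case s if s.length == 1 => s.head.toInt // ordinary character
    case s => sys.error("unsupported character constant: " + s)
  }
\end{lstlisting}

\noindent
With this function \texttt{charCode("'a'") + 10} evaluates to 107, in
line with the calculation mentioned above.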


\subsection*{LLVM-IR}

There are some subtleties in the LLVM-IR you need to be aware of:

\begin{itemize}
\item \textbf{Global constants}: While global constants such as

\begin{lstlisting}[numbers=none]
val Max : Int = 10;
\end{lstlisting}

\noindent
can be easily defined in the LLVM-IR as follows

\begin{lstlisting}[numbers=none]
@Max = global i32 10
\end{lstlisting}

\noindent
they cannot easily be referenced. If you want to use
this constant then you need to generate code such as

\begin{lstlisting}[numbers=none]
%tmp_22 = load i32, i32* @Max
\end{lstlisting}

\noindent
first, which treats \texttt{@Max} as an integer pointer (type
\texttt{i32*}) that needs to be loaded into a local variable,
here \texttt{\%tmp\_22}.

\item \textbf{Void-Functions}: While integer and double functions
  can easily be called and their results can be assigned to a
  temporary variable:

  \begin{lstlisting}[numbers=none]
  %tmp_23 = call i32 @sqr (i32 %n)
  \end{lstlisting}

  the result of a void-function cannot be assigned to a variable.
  Void-functions need to be called just as

  \begin{lstlisting}[numbers=none]
  call void @print_int (i32 %tmp_23)
  \end{lstlisting}

\item \textbf{Floating-Point Operations}: While integer operations
  are specified in the LLVM-IR as

  \begin{lstlisting}[numbers=none,language=Scala]
  def compile_op(op: String) = op match {
    case "+" => "add i32 "
    case "*" => "mul i32 "
    case "-" => "sub i32 "
    case "==" => "icmp eq i32 "
    case "!=" => "icmp ne i32 "
    case "<=" => "icmp sle i32 " // signed less or equal
    case "<"  => "icmp slt i32 " // signed less than
  }\end{lstlisting}

  the corresponding operations on doubles are

  \begin{lstlisting}[numbers=none,language=Scala]
  def compile_dop(op: String) = op match {
    case "+" => "fadd double "
    case "*" => "fmul double "
    case "-" => "fsub double "
    case "==" => "fcmp oeq double "
    case "!=" => "fcmp one double "
    case "<=" => "fcmp ole double "
    case "<"  => "fcmp olt double "
  }\end{lstlisting}

\item \textbf{Typing}: In order to leave the CPS-translations
  as is, it makes sense to defer the full type-inference to the
  K-intermediate-language. For this it is good to define
  the \texttt{KVar} constructor as

\begin{lstlisting}[numbers=none,language=Scala]
case class KVar(s: String, ty: Ty = "UNDEF") extends KVal\end{lstlisting}

  where first a default type, for example \texttt{UNDEF}, is
  given. Then you need to define two typing functions

  \begin{lstlisting}[numbers=none,language=Scala]
  def typ_val(v: KVal, ts: TyEnv) = ???
  def typ_exp(a: KExp, ts: TyEnv) = ???
  \end{lstlisting}

  Both functions require a typing-environment that keeps track
  of what type each variable, operation
  and so on receives. Once the types are inferred, the
  LLVM-IR code can be generated. Since we are dealing only
  with simple first-order functions, nothing on the scale
  of the `Hindley-Milner' typing-algorithm is needed. I suggest
  just looking at what data is available and generating all
  missing information by ``simple means''\ldots rather than
  looking at the literature, which solves the problem
  with much heavier machinery.

\item \textbf{Built-In Functions}: The `prelude' comes
  with several built-in functions: \texttt{new\_line()},
  \texttt{skip}, \texttt{print\_int(n)}, \texttt{print\_space()},
  \texttt{print\_star()} as well as \texttt{print\_char(n)}. You
  can find the `prelude' for
  example in the file \texttt{sqr.ll}. When printing strings, you
  can assume programs only contain string \emph{constants} (see
  for example \texttt{sqr.fun} and \texttt{hanoi.fun}).
\end{itemize}
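
\noindent
The points about global constants and void-functions can be sketched
in Scala as follows. This is only one possible way of producing the
instruction strings shown above; the names \texttt{fresh},
\texttt{llvmType}, \texttt{loadGlobal} and \texttt{call} are made up
for this illustration, and the arguments of a call are assumed to be
given as pairs of a Fun-type and an already compiled operand.

\begin{lstlisting}[numbers=none,language=Scala]
// fresh names for temporaries: tmp_0, tmp_1, ...
var counter = -1
def fresh(x: String): String = { counter += 1; x + "_" + counter }

// Fun-types and their LLVM-IR counterparts
def llvmType(ty: String): String = ty match {
  case "Int"    => "i32"
  case "Double" => "double"
  case "Void"   => "void"
}

// load a global constant into a fresh local variable; returns the
// name of the local together with the generated instruction
def loadGlobal(name: String, ty: String): (String, String) = {
  val t = llvmType(ty)
  val tmp = fresh("tmp")
  (tmp, "%" + tmp + " = load " + t + ", " + t + "* @" + name)
}

// call a function; only a non-void result is bound to the variable res
def call(res: String, retTy: String, fn: String,
         args: List[(String, String)]): String = {
  val as = args.map { case (ty, v) => llvmType(ty) + " " + v }
  val c = "call " + llvmType(retTy) + " @" + fn +
          " (" + as.mkString(", ") + ")"
  if (retTy == "Void") c else "%" + res + " = " + c
}
\end{lstlisting}

\noindent
For instance, \texttt{call("tmp\_23", "Int", "sqr", List(("Int", "\%n")))}
yields the \texttt{sqr} instruction from above, while a call with
return type \texttt{Void} yields an instruction without a variable on
the left-hand side.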
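
\noindent
To give an idea of what ``simple means'' could look like for the
typing functions, here is a minimal sketch of \texttt{typ\_val}. It
rests on several assumptions that are not fixed by this coursework:
\texttt{Ty} is taken to be an alias for \texttt{String},
\texttt{TyEnv} a map from names to types, the function is read as
returning the type of a value, and all constructors except
\texttt{KVar} are hypothetical. Only arithmetic operators are
considered, so the type of an operation is taken from its first
operand.

\begin{lstlisting}[numbers=none,language=Scala]
type Ty = String                 // assumption: types are plain strings
type TyEnv = Map[String, Ty]     // assumption: names mapped to types

// Hypothetical fragment of a K-intermediate language; only KVar is
// taken from the text above, the other constructors are assumptions
abstract class KVal
case class KVar(s: String, ty: Ty = "UNDEF") extends KVal
case class KNum(i: Int) extends KVal
case class KFNum(x: Double) extends KVal
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
case class KCall(name: String, args: List[KVal]) extends KVal

def typ_val(v: KVal, ts: TyEnv): Ty = v match {
  case KVar(s, "UNDEF") => ts.getOrElse(s, "UNDEF") // look up in env
  case KVar(_, ty)      => ty                       // already annotated
  case KNum(_)          => "Int"
  case KFNum(_)         => "Double"
  case Kop(_, v1, _)    => typ_val(v1, ts) // assumes operands agree
  case KCall(name, _)   => ts.getOrElse(name, "UNDEF") // return type
}
\end{lstlisting}

\noindent
The environment would initially contain the types of the global
constants, the function arguments and the return types of all
functions, all of which are given explicitly in the Fun-program.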
\end{document}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End:
added
 Christian Urban <christian dot urban at kcl dot ac dot uk> parents: diff
changeset | 299 | %%% End: |