% !TEX program = xelatex
\documentclass{article}
\usepackage{../style}
\usepackage{../graphicss}
\usepackage{../langs}
\definecolor{navyblue}{rgb}{0.0, 0.0, 0.5}
\definecolor{pansypurple}{rgb}{0.47, 0.09, 0.29}

\begin{document}

\color{pansypurple}
\section*{RESIT / REPLACEMENT}

{\bf
The resit / replacement task is essentially CW5 (listed below) with
the exception that the lexer and parser are already provided. The
parser will generate an AST (see file \texttt{fun\_llvm.sc}). Your task
is to generate an AST for the K-intermediate language and supply
sufficient type annotations such that you can generate valid code for
the LLVM-IR. The submission deadline is 4th August at 16:00. At the
deadline, please send me an email containing a zip-file with your
files.

Feel free to reuse the files I have uploaded on KEATS (especially
the files generating simple LLVM-IR code). The videos of Week~10
might also be of help.\bigskip

\noindent
Good Luck!}\smallskip\\
\noindent
Christian
\color{black}

\section*{Coursework 5}

\noindent This coursework is worth 25\% and is due on \cwFIVE{} at
16:00. You are asked to implement a compiler targeting the LLVM-IR.
Be aware that this CW needs some material about the LLVM-IR
that has not been shown in the lectures, so your own experiments
might be required. You can find information about the LLVM-IR at

\begin{itemize}
\item \url{https://bit.ly/3rheZYr}
\item \url{https://llvm.org/docs/LangRef.html}
\end{itemize}

\noindent
You can do the implementation of your compiler in any programming
language you like, but you need to submit the source code with which
you generated the LLVM-IR files, otherwise a mark of 0\% will be
awarded. You are asked to submit the code of your compiler as well as
the generated \texttt{.ll} files. No PDF is needed for this
coursework. You should use the lexer and parser from the previous
courseworks, but you need to make some modifications to them for the
`typed' version of the Fun-language. I will award up to 5\% if a lexer
and a parser are correctly implemented.
At the end, please package
everything(!) in a zip-file that creates a directory with the name

\begin{center}
\texttt{YournameYourFamilyname}
\end{center}

\noindent
on my end. You will be marked according to the input files

\begin{itemize}
\item \href{https://nms.kcl.ac.uk/christian.urban/cfl/progs/sqr.fun}{sqr.fun}
\item \href{https://nms.kcl.ac.uk/christian.urban/cfl/progs/fact.fun}{fact.fun}
\item \href{https://nms.kcl.ac.uk/christian.urban/cfl/progs/mand.fun}{mand.fun}
\item \href{https://nms.kcl.ac.uk/christian.urban/cfl/progs/mand2.fun}{mand2.fun}
\item \href{https://nms.kcl.ac.uk/christian.urban/cfl/progs/hanoi.fun}{hanoi.fun}
\end{itemize}

\noindent
which are uploaded to KEATS.

\subsection*{Disclaimer\alert}

It should be understood that the work you submit represents your own
effort. You have not copied from anyone else. An exception is the
Scala code I showed during the lectures or uploaded to KEATS, both of
which you can use. You can also use your own code from CW~1 --
CW~4. But do not be tempted to ask GitHub Copilot for help or do any
other shenanigans like this!

\subsection*{Task}

The goal is to lex and parse 5 Fun-programs, including the
Mandelbrot program shown in Figure~\ref{mand}, and generate
corresponding code for the LLVM-IR. Unfortunately, the calculations for
the Mandelbrot Set require floating-point arithmetic and therefore we
cannot be as simple-minded about types as we have been so far
(remember the LLVM-IR is a fully-typed language and needs to know the
exact type of each expression). The idea is to deal appropriately
with three types, namely \texttt{Int}, \texttt{Double} and
\texttt{Void} (they are represented in the LLVM-IR as \texttt{i32},
\texttt{double} and \texttt{void}). You need to extend the lexer and
parser accordingly in order to deal with type annotations.
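
\noindent
The correspondence between the three types can be sketched as a small
Scala function. This is only a minimal sketch, assuming that Fun-types
are kept as plain strings; the name \texttt{llvm\_type} is hypothetical
and not part of the provided files.

\begin{lstlisting}[numbers=none,language=Scala]
// hypothetical helper: maps a Fun-type to its LLVM-IR counterpart
def llvm_type(ty: String): String = ty match {
  case "Int"    => "i32"
  case "Double" => "double"
  case "Void"   => "void"
}
\end{lstlisting}
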
The
Fun-language includes global constants, such as

\begin{lstlisting}[numbers=none]
  val Ymin: Double = -1.3;
  val Maxiters: Int = 1000;
\end{lstlisting}

\noindent
where you can assume that they are `normal' identifiers, just
starting with a capital letter; all other identifiers should have
lower-case letters. Function definitions can take arguments of
type \texttt{Int} or \texttt{Double}, and need to specify a return
type, which can be \texttt{Void}, for example

\begin{lstlisting}[numbers=none]
  def foo(n: Int, x: Double) : Double = ...
  def id(n: Int) : Int = ...
  def bar() : Void = ...
\end{lstlisting}

\noindent
The idea is to record all typing information that is given
in the Fun-program, but then delay any further type inference until
after the CPS-translation. That means the parser should
generate ASTs given by the Scala datatypes:

\begin{lstlisting}[numbers=none,language=Scala]
abstract class Exp
abstract class BExp
abstract class Decl

case class Def(name: String, args: List[(String, String)],
               ty: String, body: Exp) extends Decl
case class Main(e: Exp) extends Decl
case class Const(name: String, v: Int) extends Decl
case class FConst(name: String, x: Double) extends Decl

case class Call(name: String, args: List[Exp]) extends Exp
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
case class Var(s: String) extends Exp
case class Num(i: Int) extends Exp      // integer numbers
case class FNum(i: Double) extends Exp  // floating-point numbers
case class ChConst(c: Int) extends Exp  // char constants
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
case class Sequence(e1: Exp, e2: Exp) extends Exp
case class Bop(o: String, a1: Exp, a2: Exp) extends BExp
\end{lstlisting}

\noindent
This datatype distinguishes whether the global constant is an integer
constant or a floating-point constant. Also, a function definition needs to
record the return type of the function, namely the argument
\texttt{ty} in \texttt{Def}, and the arguments consist of pairs of
identifier names and types (\texttt{Int} or \texttt{Double}).
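
\noindent
To illustrate, the following sketch shows the declarations a parser
might produce for the two constants \texttt{Ymin} and
\texttt{Maxiters} and for a small identity function. It repeats only
the constructors it needs. The function body \texttt{n}, the name
\texttt{decls} and the choice of collecting declarations in a list
are assumptions made for illustration.

\begin{lstlisting}[numbers=none,language=Scala]
// subset of the AST datatypes, repeated to keep the sketch self-contained
abstract class Exp
abstract class Decl
case class Def(name: String, args: List[(String, String)],
               ty: String, body: Exp) extends Decl
case class Const(name: String, v: Int) extends Decl
case class FConst(name: String, x: Double) extends Decl
case class Var(s: String) extends Exp

// val Ymin: Double = -1.3;
// val Maxiters: Int = 1000;
// def id(n: Int) : Int = n     (body assumed for illustration)
val decls: List[Decl] = List(
  FConst("Ymin", -1.3),
  Const("Maxiters", 1000),
  Def("id", List(("n", "Int")), "Int", Var("n")))
\end{lstlisting}
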
The hard
part of the CW is to design the K-intermediate language and infer all
necessary types in order to generate LLVM-IR code. You can check
your LLVM-IR code by running it with the interpreter \texttt{lli}.

\begin{figure}[t]
\lstinputlisting[language=Scala]{../cwtests/cw05/mand.fun}
\caption{The Mandelbrot program in the `typed' Fun-language.\label{mand}}
\end{figure}

\begin{figure}[t]
\includegraphics[scale=0.35]{../solution/cw5/out.png}
\caption{ASCII output of the Mandelbrot program.\label{mand2}}
\end{figure}

Also note that the second version of the Mandelbrot program and also
the Tower of Hanoi program use character constants, like \texttt{'a'},
\texttt{'1'}, \texttt{'$\backslash$n'} and so on. When they are tokenised,
such characters should be interpreted as the corresponding ASCII code (an
integer), such that we can use them in calculations like \texttt{'a' + 10}
where the result should be 107. As usual, the character \texttt{'$\backslash$n'} is the
ASCII code 10.

\subsection*{LLVM-IR}

There are some subtleties in the LLVM-IR you need to be aware of:

\begin{itemize}
\item \textbf{Global constants}: While global constants such as

\begin{lstlisting}[numbers=none]
  val Max : Int = 10;
\end{lstlisting}

\noindent
can be easily defined in the LLVM-IR as follows

\begin{lstlisting}[numbers=none]
  @Max = global i32 10
\end{lstlisting}

\noindent
they cannot easily be referenced. If you want to use
this constant then you need to generate code such as

\begin{lstlisting}[numbers=none]
  %tmp_22 = load i32, i32* @Max
\end{lstlisting}

\noindent
first, which treats \texttt{@Max} as an Integer-pointer (type
\texttt{i32*}) that needs to be loaded into a local variable,
here \texttt{\%tmp\_22}.

\item \textbf{Void-Functions}: While integer and double functions
  can easily be called and their results assigned to a
  temporary variable, as in

  \begin{lstlisting}[numbers=none]
  %tmp_23 = call i32 @sqr (i32 %n)
  \end{lstlisting}

  void-functions cannot be assigned to a variable.
  They need to be called just as

  \begin{lstlisting}[numbers=none]
  call void @print_int (i32 %tmp_23)
  \end{lstlisting}

\item \textbf{Floating-Point Operations}: While integer operations
  are specified in the LLVM-IR as

  \begin{lstlisting}[numbers=none,language=Scala]
  def compile_op(op: String) = op match {
    case "+" => "add i32 "
    case "*" => "mul i32 "
    case "-" => "sub i32 "
    case "==" => "icmp eq i32 "
    case "!=" => "icmp ne i32 "
    case "<=" => "icmp sle i32 " // signed less or equal
    case "<" => "icmp slt i32 "  // signed less than
  }
  \end{lstlisting}

  the corresponding operations on doubles are

  \begin{lstlisting}[numbers=none,language=Scala]
  def compile_dop(op: String) = op match {
    case "+" => "fadd double "
    case "*" => "fmul double "
    case "-" => "fsub double "
    case "==" => "fcmp oeq double " // ordered and equal
    case "!=" => "fcmp one double " // ordered and not equal
    case "<=" => "fcmp ole double " // ordered and less or equal
    case "<" => "fcmp olt double "  // ordered and less than
  }
  \end{lstlisting}

\item \textbf{Typing}: In order to leave the CPS-translations
  as is, it makes sense to defer the full type-inference to the
  K-intermediate-language. For this it is good to define
  the \texttt{KVar} constructor as

  \begin{lstlisting}[numbers=none,language=Scala]
  case class KVar(s: String, ty: Ty = "UNDEF") extends KVal
  \end{lstlisting}

  where initially a default type, for example \texttt{UNDEF}, is
  given. Then you need to define two typing functions

  \begin{lstlisting}[numbers=none,language=Scala]
  def typ_val(v: KVal, ts: TyEnv) = ???
  def typ_exp(a: KExp, ts: TyEnv) = ???
  \end{lstlisting}

  Both functions require a typing-environment, which is updated with the
  information about what type each variable, operation and so on
  receives. Once the types are inferred, the LLVM-IR code can
  be generated. Since we are dealing only with simple first-order
  functions, nothing on the scale of the `Hindley-Milner'
  typing-algorithm is needed.
  I suggest just looking at what
  data is available and generating all missing information by
  ``simple means''\ldots rather than looking at the literature,
  which solves the problem with much heavier machinery.

\item \textbf{Built-In Functions}: The `prelude' comes with
  several built-in functions: \texttt{new\_line()},
  \texttt{skip}, \texttt{print\_int(n)}, \texttt{print\_space()},
  \texttt{print\_star()} and \texttt{print\_char(n)}. You can find the `prelude' for
  example in the file \texttt{sqr.ll}.
\end{itemize}

\end{document}

%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: