cws/cw05.tex
changeset 820 7fd1f611c21d
parent 752 c0bdd4ad69ca
child 821 f914b9476dc7
--- a/cws/cw05.tex	Thu Dec 17 17:43:40 2020 +0000
+++ b/cws/cw05.tex	Sat Dec 19 00:13:58 2020 +0000
@@ -1,22 +1,36 @@
 % !TEX program = xelatex
 \documentclass{article}
 \usepackage{../style}
+\usepackage{../graphics}
 \usepackage{../langs}
 
 \begin{document}
 
-\section*{Coursework 5}
+\section*{Coursework 5\footnote{\today}}
 
 
 
 \noindent This coursework is worth 12\% and is due on \cwFIVE{} at
-18:00. You are asked to implement a compiler targetting the LLVM-IR.
-You can do the implementation in any programming
+18:00. You are asked to implement a compiler targeting the LLVM-IR.
+Be careful that this CW needs some material about the LLVM-IR
+that has not been shown in the lectures and your own experiments
+might be required. You can find information about the LLVM-IR at
+
+\begin{itemize}
+\item \url{https://bit.ly/3rheZYr}
+\item \url{https://llvm.org/docs/LangRef.html}  
+\end{itemize}  
+
+\noindent
+You can do the implementation of your compiler in any programming
 language you like, but you need to submit the source code with which
-you answered the questions, otherwise a mark of 0\% will be
-awarded. You should use the lexer from the previous coursework for the
-parser.  Please package everything(!) in a zip-file that creates a
-directory with the name \texttt{YournameYourFamilyname} on my end.
+you generated the LLVM-IR files, otherwise a mark of 0\% will be
+awarded. You should use the lexer and parser from the previous
+courseworks, but you need to make some modifications to them for the
+`typed' fun-language. I will award up to 4\% if a lexer and parser are
+implemented. At the end, please package everything(!) in a zip-file
+that creates a directory with the name \texttt{YournameYourFamilyname}
+on my end.
 
 \subsection*{Disclaimer\alert}
 
@@ -27,11 +41,186 @@
 CW~4.
 
 
-\subsection*{Question 1}
+\subsection*{Task}
+
+The goal is to lex and parse the Mandelbrot program shown in
+Figure~\ref{mand} and generate corresponding code for the
+LLVM-IR. Unfortunately the calculations for the Mandelbrot set require
+floating point arithmetic and therefore we cannot be as simple-minded
+about types as we have been so far (remember the LLVM-IR is a
+fully-typed language and needs to know the exact types of each
+expression). The idea is to deal appropriately with three types,
+namely \texttt{Int}, \texttt{Double} and \texttt{Void} (they are
+represented in the LLVM-IR as \texttt{i32}, \texttt{double} and
+\texttt{void}). You need to extend the lexer and parser accordingly in
+order to deal with type annotations. The Fun-language includes global
+constants, such as
+
+\begin{lstlisting}[numbers=none]
+  val Ymin: Double = -1.3;
+  val Maxiters: Int = 1000;
+\end{lstlisting}
+
+\noindent
+where you want to assume that they are `normal' identifiers, just
+starting with a capital letter---all other identifiers should have
+lower-case letters. Function definitions can take arguments of
+type \texttt{Int} or \texttt{Double}, and need to specify a return
+type, which can be \texttt{Void}, for example
+
+\begin{lstlisting}[numbers=none]
+  def foo(n: Int, x: Double) : Double = ...
+  def bar() : Void = ...
+\end{lstlisting}
+
+\noindent
+The idea is to record all typing information that is given
+in the program, but then delay any further typing inference to
+after the CPS-translation. That means the parser should
+generate ASTs given by the Scala dataypes:
+
+\begin{lstlisting}[numbers=none,language=Scala]
+abstract class Exp 
+abstract class BExp  
+abstract class Decl 
+
+case class Def(name: String, args: List[(String, String)],
+               ty: String, body: Exp) extends Decl
+case class Main(e: Exp) extends Decl
+case class Const(name: String, v: Int) extends Decl
+case class FConst(name: String, x: Float) extends Decl
+
+case class Call(name: String, args: List[Exp]) extends Exp
+case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
+case class Var(s: String) extends Exp
+case class Num(i: Int) extends Exp    // integer numbers
+case class FNum(i: Float) extends Exp // floating numbers
+case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
+case class Sequence(e1: Exp, e2: Exp) extends Exp
+case class Bop(o: String, a1: Exp, a2: Exp) extends BExp
+\end{lstlisting}
+
+\noindent
+This datatype distinguishes whether the global constant is an integer
+constant or floating constant. Also a function definition needs to
+record the return type of the function, namely the argument
+\texttt{ty} in \texttt{Def}, and the arguments consist of an pairs of
+identifier names and types (\texttt{Int} or \texttt{Double}). The hard
+part of the CW is to design the K-intermediate language and infer all
+necessary types in order to generate LLVM-IR code. You can check
+your LLVM-IR code by running it with the interpreter \texttt{lli}.
+
+\begin{figure}[t]
+\lstinputlisting[language=Scala]{../progs/fun2/mand.fun}
+\caption{The Mandelbrot program in the `typed' Fun-language.\label{mand}}
+\end{figure}
+
+\begin{figure}[t]
+\includegraphics[scale=0.35]{../progs/fun2/out.png}
+\caption{Ascii output of the Mandelbrot program.\label{mand}}
+\end{figure}
+
+\subsection*{LLVM-IR}
+
+There are some subtleties in the LLVM-IR you need to be aware of:
+
+\begin{itemize}
+\item \textbf{Global constants}: While global constants such as
+
+\begin{lstlisting}[numbers=none]  
+val Max : Int = 10;
+\end{lstlisting}
 
-\subsection*{Question 2}
+\noindent
+can be easily defined in the LLVM-IR as follows
+
+\begin{lstlisting}[numbers=none]  
+@Max = global i32 10
+\end{lstlisting}
+
+\noindent
+they cannot easily be referenced. If you want to use
+this constant then you need to generate code such as
+
+\begin{lstlisting}[numbers=none]  
+%tmp_22 = load i32, i32* @Max
+\end{lstlisting}
+
+\noindent
+first, which treats \texttt{@Max} as an Integer-pointer (type
+\texttt{i32*}) that needs to be loaded into a local variable,
+here \texttt{\%tmp\_22}.
+
+\item \textbf{Void-Functions}: While integer and double functions
+  can easily be called and their results can be allocated to a
+  temporary variable:
+
+  \begin{lstlisting}[numbers=none]  
+   %tmp_23 = call i32 @sqr (i32 %n)
+  \end{lstlisting}
+
+  void-functions cannot be allocated to a variable. They need to be
+  called just as
+
+  \begin{lstlisting}[numbers=none]  
+  call void @print_int (i32 %tmp_23)
+\end{lstlisting}
+
+\item \textbf{Floating-Point Operations}: While integer operations
+  are specified in the LLVM-IR as
 
-\subsection*{Question 3}
+  \begin{lstlisting}[numbers=none,language=Scala]
+  def compile_op(op: String) = op match {
+    case "+" => "add i32 "
+    case "*" => "mul i32 "
+    case "-" => "sub i32 "
+    case "==" => "icmp eq i32 "
+    case "<=" => "icmp sle i32 " // signed less or equal
+    case "<"  => "icmp slt i32 " // signed less than
+  }\end{lstlisting}
+
+  the corresponding operations on doubles are
+
+  \begin{lstlisting}[numbers=none,language=Scala]
+  def compile_dop(op: String) = op match {
+    case "+" => "fadd double "
+    case "*" => "fmul double "
+    case "-" => "fsub double "
+    case "==" => "fcmp oeq double "
+    case "<=" => "fcmp ole double "   
+    case "<"  => "fcmp olt double "   
+  }\end{lstlisting}
+
+\item \textbf{Typing}: In order to leave the CPS-translations
+  as is, it makes sense to defer the full type-inference to the
+  K-intermediate-language. For this it is good to define
+  the \texttt{KVar} constructor as
+
+\begin{lstlisting}[numbers=none,language=Scala]  
+case class KVar(s: String, ty: Ty = "UNDEF") extends KVal\end{lstlisting}
+
+  where first a default type, for example \texttt{UNDEF}, is
+  given. Then you need to define two typing functions
+
+  \begin{lstlisting}[numbers=none,language=Scala]  
+    def typ_val(v: KVal, ts: TyEnv) = ???
+    def typ_exp(a: KExp, ts: TyEnv) = ???
+  \end{lstlisting}
+
+  Both functions require a typing-environment that updates
+  the information about what type each variable, operation
+  and so on receives. Once the types are inferred, the
+  LLVM-IR code can be generated. Since we are dealing only
+  with simple first-order functions, nothing on the scale
+  as the `Hindley-Milner' typing-algorithm is needed. I suggest
+  to just look at what data is avaliable and generate all
+  missing information by simple means.
+
+\item \textbf{Build-In Functions}: The `prelude' comes
+  with several build-in functions: \texttt{new\_line()},
+  \texttt{skip}, \texttt{print\_int(n)}, \texttt{print\_space()}
+  and \texttt{print\_star()}.
+\end{itemize}  
 
 \end{document}