cws/cw05.tex
changeset 820 7fd1f611c21d
parent 752 c0bdd4ad69ca
child 821 f914b9476dc7
equal deleted inserted replaced
819:fd88a0656164 820:7fd1f611c21d
     1 % !TEX program = xelatex
     1 % !TEX program = xelatex
     2 \documentclass{article}
     2 \documentclass{article}
     3 \usepackage{../style}
     3 \usepackage{../style}
       
     4 \usepackage{../graphics}
     4 \usepackage{../langs}
     5 \usepackage{../langs}
     5 
     6 
     6 \begin{document}
     7 \begin{document}
     7 
     8 
     8 \section*{Coursework 5}
     9 \section*{Coursework 5\footnote{\today}}
     9 
    10 
    10 
    11 
    11 
    12 
    12 \noindent This coursework is worth 12\% and is due on \cwFIVE{} at
    13 \noindent This coursework is worth 12\% and is due on \cwFIVE{} at
    13 18:00. You are asked to implement a compiler targetting the LLVM-IR.
    14 18:00. You are asked to implement a compiler targeting the LLVM-IR.
    14 You can do the implementation in any programming
    15 Be careful that this CW needs some material about the LLVM-IR
       
    16 that has not been shown in the lectures and your own experiments
       
    17 might be required. You can find information about the LLVM-IR at
       
    18 
       
    19 \begin{itemize}
       
    20 \item \url{https://bit.ly/3rheZYr}
       
    21 \item \url{https://llvm.org/docs/LangRef.html}  
       
    22 \end{itemize}  
       
    23 
       
    24 \noindent
       
    25 You can do the implementation of your compiler in any programming
    15 language you like, but you need to submit the source code with which
    26 language you like, but you need to submit the source code with which
    16 you answered the questions, otherwise a mark of 0\% will be
    27 you generated the LLVM-IR files, otherwise a mark of 0\% will be
    17 awarded. You should use the lexer from the previous coursework for the
    28 awarded. You should use the lexer and parser from the previous
    18 parser.  Please package everything(!) in a zip-file that creates a
    29 courseworks, but you need to make some modifications to them for the
    19 directory with the name \texttt{YournameYourFamilyname} on my end.
    30 `typed' fun-language. I will award up to 4\% if a lexer and parser are
       
    31 implemented. At the end, please package everything(!) in a zip-file
       
    32 that creates a directory with the name \texttt{YournameYourFamilyname}
       
    33 on my end.
    20 
    34 
    21 \subsection*{Disclaimer\alert}
    35 \subsection*{Disclaimer\alert}
    22 
    36 
    23 It should be understood that the work you submit represents your own
    37 It should be understood that the work you submit represents your own
    24 effort. You have not copied from anyone else. An exception is the
    38 effort. You have not copied from anyone else. An exception is the
    25 Scala code I showed during the lectures or uploaded to KEATS, which
    39 Scala code I showed during the lectures or uploaded to KEATS, which
    26 you can both use. You can also use your own code from the CW~1 --
    40 you can both use. You can also use your own code from the CW~1 --
    27 CW~4.
    41 CW~4.
    28 
    42 
    29 
    43 
    30 \subsection*{Question 1}
    44 \subsection*{Task}
    31 
    45 
    32 \subsection*{Question 2}
    46 The goal is to lex and parse the Mandelbrot program shown in
    33 
    47 Figure~\ref{mand} and generate corresponding code for the
    34 \subsection*{Question 3}
    48 LLVM-IR. Unfortunately the calculations for the Mandelbrot set require
       
    49 floating point arithmetic and therefore we cannot be as simple-minded
       
    50 about types as we have been so far (remember the LLVM-IR is a
       
    51 fully-typed language and needs to know the exact types of each
       
    52 expression). The idea is to deal appropriately with three types,
       
    53 namely \texttt{Int}, \texttt{Double} and \texttt{Void} (they are
       
    54 represented in the LLVM-IR as \texttt{i32}, \texttt{double} and
       
    55 \texttt{void}). You need to extend the lexer and parser accordingly in
       
    56 order to deal with type annotations. The Fun-language includes global
       
    57 constants, such as
       
    58 
       
    59 \begin{lstlisting}[numbers=none]
       
    60   val Ymin: Double = -1.3;
       
    61   val Maxiters: Int = 1000;
       
    62 \end{lstlisting}
       
    63 
       
    64 \noindent
       
    65 where you want to assume that they are `normal' identifiers, just
       
    66 starting with a capital letter---all other identifiers should have
       
    67 lower-case letters. Function definitions can take arguments of
       
    68 type \texttt{Int} or \texttt{Double}, and need to specify a return
       
    69 type, which can be \texttt{Void}, for example
       
    70 
       
    71 \begin{lstlisting}[numbers=none]
       
    72   def foo(n: Int, x: Double) : Double = ...
       
    73   def bar() : Void = ...
       
    74 \end{lstlisting}
       
    75 
       
    76 \noindent
       
    77 The idea is to record all typing information that is given
       
    78 in the program, but then delay any further typing inference to
       
    79 after the CPS-translation. That means the parser should
       
    80 generate ASTs given by the Scala dataypes:
       
    81 
       
    82 \begin{lstlisting}[numbers=none,language=Scala]
       
    83 abstract class Exp 
       
    84 abstract class BExp  
       
    85 abstract class Decl 
       
    86 
       
    87 case class Def(name: String, args: List[(String, String)],
       
    88                ty: String, body: Exp) extends Decl
       
    89 case class Main(e: Exp) extends Decl
       
    90 case class Const(name: String, v: Int) extends Decl
       
    91 case class FConst(name: String, x: Float) extends Decl
       
    92 
       
    93 case class Call(name: String, args: List[Exp]) extends Exp
       
    94 case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
       
    95 case class Var(s: String) extends Exp
       
    96 case class Num(i: Int) extends Exp    // integer numbers
       
    97 case class FNum(i: Float) extends Exp // floating numbers
       
    98 case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
       
    99 case class Sequence(e1: Exp, e2: Exp) extends Exp
       
   100 case class Bop(o: String, a1: Exp, a2: Exp) extends BExp
       
   101 \end{lstlisting}
       
   102 
       
   103 \noindent
       
   104 This datatype distinguishes whether the global constant is an integer
       
   105 constant or floating constant. Also a function definition needs to
       
   106 record the return type of the function, namely the argument
       
   107 \texttt{ty} in \texttt{Def}, and the arguments consist of an pairs of
       
   108 identifier names and types (\texttt{Int} or \texttt{Double}). The hard
       
   109 part of the CW is to design the K-intermediate language and infer all
       
   110 necessary types in order to generate LLVM-IR code. You can check
       
   111 your LLVM-IR code by running it with the interpreter \texttt{lli}.
       
   112 
       
   113 \begin{figure}[t]
       
   114 \lstinputlisting[language=Scala]{../progs/fun2/mand.fun}
       
   115 \caption{The Mandelbrot program in the `typed' Fun-language.\label{mand}}
       
   116 \end{figure}
       
   117 
       
   118 \begin{figure}[t]
       
   119 \includegraphics[scale=0.35]{../progs/fun2/out.png}
       
   120 \caption{Ascii output of the Mandelbrot program.\label{mand}}
       
   121 \end{figure}
       
   122 
       
   123 \subsection*{LLVM-IR}
       
   124 
       
   125 There are some subtleties in the LLVM-IR you need to be aware of:
       
   126 
       
   127 \begin{itemize}
       
   128 \item \textbf{Global constants}: While global constants such as
       
   129 
       
   130 \begin{lstlisting}[numbers=none]  
       
   131 val Max : Int = 10;
       
   132 \end{lstlisting}
       
   133 
       
   134 \noindent
       
   135 can be easily defined in the LLVM-IR as follows
       
   136 
       
   137 \begin{lstlisting}[numbers=none]  
       
   138 @Max = global i32 10
       
   139 \end{lstlisting}
       
   140 
       
   141 \noindent
       
   142 they cannot easily be referenced. If you want to use
       
   143 this constant then you need to generate code such as
       
   144 
       
   145 \begin{lstlisting}[numbers=none]  
       
   146 %tmp_22 = load i32, i32* @Max
       
   147 \end{lstlisting}
       
   148 
       
   149 \noindent
       
   150 first, which treats \texttt{@Max} as an Integer-pointer (type
       
   151 \texttt{i32*}) that needs to be loaded into a local variable,
       
   152 here \texttt{\%tmp\_22}.
       
   153 
       
   154 \item \textbf{Void-Functions}: While integer and double functions
       
   155   can easily be called and their results can be allocated to a
       
   156   temporary variable:
       
   157 
       
   158   \begin{lstlisting}[numbers=none]  
       
   159    %tmp_23 = call i32 @sqr (i32 %n)
       
   160   \end{lstlisting}
       
   161 
       
   162   void-functions cannot be allocated to a variable. They need to be
       
   163   called just as
       
   164 
       
   165   \begin{lstlisting}[numbers=none]  
       
   166   call void @print_int (i32 %tmp_23)
       
   167 \end{lstlisting}
       
   168 
       
   169 \item \textbf{Floating-Point Operations}: While integer operations
       
   170   are specified in the LLVM-IR as
       
   171 
       
   172   \begin{lstlisting}[numbers=none,language=Scala]
       
   173   def compile_op(op: String) = op match {
       
   174     case "+" => "add i32 "
       
   175     case "*" => "mul i32 "
       
   176     case "-" => "sub i32 "
       
   177     case "==" => "icmp eq i32 "
       
   178     case "<=" => "icmp sle i32 " // signed less or equal
       
   179     case "<"  => "icmp slt i32 " // signed less than
       
   180   }\end{lstlisting}
       
   181 
       
   182   the corresponding operations on doubles are
       
   183 
       
   184   \begin{lstlisting}[numbers=none,language=Scala]
       
   185   def compile_dop(op: String) = op match {
       
   186     case "+" => "fadd double "
       
   187     case "*" => "fmul double "
       
   188     case "-" => "fsub double "
       
   189     case "==" => "fcmp oeq double "
       
   190     case "<=" => "fcmp ole double "   
       
   191     case "<"  => "fcmp olt double "   
       
   192   }\end{lstlisting}
       
   193 
       
   194 \item \textbf{Typing}: In order to leave the CPS-translations
       
   195   as is, it makes sense to defer the full type-inference to the
       
   196   K-intermediate-language. For this it is good to define
       
   197   the \texttt{KVar} constructor as
       
   198 
       
   199 \begin{lstlisting}[numbers=none,language=Scala]  
       
   200 case class KVar(s: String, ty: Ty = "UNDEF") extends KVal\end{lstlisting}
       
   201 
       
   202   where first a default type, for example \texttt{UNDEF}, is
       
   203   given. Then you need to define two typing functions
       
   204 
       
   205   \begin{lstlisting}[numbers=none,language=Scala]  
       
   206     def typ_val(v: KVal, ts: TyEnv) = ???
       
   207     def typ_exp(a: KExp, ts: TyEnv) = ???
       
   208   \end{lstlisting}
       
   209 
       
   210   Both functions require a typing-environment that updates
       
   211   the information about what type each variable, operation
       
   212   and so on receives. Once the types are inferred, the
       
   213   LLVM-IR code can be generated. Since we are dealing only
       
   214   with simple first-order functions, nothing on the scale
       
   215   as the `Hindley-Milner' typing-algorithm is needed. I suggest
       
   216   to just look at what data is avaliable and generate all
       
   217   missing information by simple means.
       
   218 
       
   219 \item \textbf{Build-In Functions}: The `prelude' comes
       
   220   with several build-in functions: \texttt{new\_line()},
       
   221   \texttt{skip}, \texttt{print\_int(n)}, \texttt{print\_space()}
       
   222   and \texttt{print\_star()}.
       
   223 \end{itemize}  
    35 
   224 
    36 \end{document}
   225 \end{document}
    37 
   226 
    38 %%% Local Variables: 
   227 %%% Local Variables: 
    39 %%% mode: latex
   228 %%% mode: latex