diff -r 8706b846a3e0 -r 568671822d52 cws/cw05.tex --- a/cws/cw05.tex Tue Nov 30 10:16:47 2021 +0000 +++ b/cws/cw05.tex Fri Dec 03 17:45:11 2021 +0000 @@ -25,11 +25,19 @@ You can do the implementation of your compiler in any programming language you like, but you need to submit the source code with which you generated the LLVM-IR files, otherwise a mark of 0\% will be -awarded. You should use the lexer and parser from the previous -courseworks, but you need to make some modifications to them for the -`typed' fun-language. I will award up to 5\% if a lexer and a parser are -correctly implemented. At the end, please package everything(!) in a zip-file -that creates a directory with the name \texttt{YournameYourFamilyname} +awarded. You are asked to submit the code of your compiler, but also +the generated \texttt{.ll} files. You should use the lexer and parser +from the previous courseworks, but you need to make some modifications +to them for the `typed' fun-language. I will award up to 5\% if a +lexer and a parser are correctly implemented. At the end, please +package everything(!) in a zip-file that creates a directory with the +name + +\begin{center} +\texttt{YournameYourFamilyname} +\end{center} + +\noindent on my end. \subsection*{Disclaimer\alert} @@ -43,18 +51,18 @@ \subsection*{Task} -The goal is to lex and parse the Mandelbrot program shown in -Figure~\ref{mand} and generate corresponding code for the -LLVM-IR. Unfortunately the calculations for the Mandelbrot set require -floating point arithmetic and therefore we cannot be as simple-minded -about types as we have been so far (remember the LLVM-IR is a -fully-typed language and needs to know the exact types of each -expression). The idea is to deal appropriately with three types, -namely \texttt{Int}, \texttt{Double} and \texttt{Void} (they are -represented in the LLVM-IR as \texttt{i32}, \texttt{double} and -\texttt{void}). You need to extend the lexer and parser accordingly in -order to deal with type annotations. The Fun-language includes global -constants, such as +The main goal is to lex and parse 4 Fun-programs, including the +Mandelbrot program shown in Figure~\ref{mand}, and generate +corresponding code for the LLVM-IR. Unfortunately the calculations for +the Mandelbrot Set require floating point arithmetic and therefore we +cannot be as simple-minded about types as we have been so far +(remember the LLVM-IR is a fully-typed language and needs to know the +exact types of each expression). The idea is to deal appropriately +with three types, namely \texttt{Int}, \texttt{Double} and +\texttt{Void} (they are represented in the LLVM-IR as \texttt{i32}, +\texttt{double} and \texttt{void}). You need to extend the lexer and +parser accordingly in order to deal with type annotations. The +Fun-language includes global constants, such as \begin{lstlisting}[numbers=none] val Ymin: Double = -1.3; @@ -70,12 +78,13 @@ \begin{lstlisting}[numbers=none] def foo(n: Int, x: Double) : Double = ... + def id(n: Int) : Int = ... def bar() : Void = ... \end{lstlisting} \noindent The idea is to record all typing information that is given -in the program, but then delay any further typing inference to +in the Fun-program, but then delay any further typing inference to after the CPS-translation. That means the parser should generate ASTs given by the Scala dataypes: @@ -93,8 +102,9 @@ case class Call(name: String, args: List[Exp]) extends Exp case class If(a: BExp, e1: Exp, e2: Exp) extends Exp case class Var(s: String) extends Exp -case class Num(i: Int) extends Exp // integer numbers -case class FNum(i: Float) extends Exp // floating numbers +case class Num(i: Int) extends Exp // integer numbers +case class FNum(i: Float) extends Exp // floating numbers +case class ChConst(c: Int) extends Exp // char constant case class Aop(o: String, a1: Exp, a2: Exp) extends Exp case class Sequence(e1: Exp, e2: Exp) extends Exp case class Bop(o: String, a1: Exp, a2: Exp) extends BExp @@ -120,6 +130,15 @@ \caption{Ascii output of the Mandelbrot program.\label{mand}} \end{figure} +Also note that the second version of the Mandelbrot program and also +the Tower of Hanoi program uses character constants, like \texttt{'a'}, +\texttt{'1'}, \texttt{'$\backslash$n'} and so on. When they are tokenised, +such characters should be interpreted as the corresponding ASCII code (an +integer), such that we can use them in calculations like \texttt{'a' + 10} +where the result should be 107. As usual, the character \texttt{'$\backslash$n'} is the +ASCII code 10. + + \subsection*{LLVM-IR} There are some subtleties in the LLVM-IR you need to be aware of: @@ -175,6 +194,7 @@ case "*" => "mul i32 " case "-" => "sub i32 " case "==" => "icmp eq i32 " + case "!=" => "icmp ne i32 " case "<=" => "icmp sle i32 " // signed less or equal case "<" => "icmp slt i32 " // signed less than }\end{lstlisting} @@ -187,6 +207,7 @@ case "*" => "fmul double " case "-" => "fsub double " case "==" => "fcmp oeq double " + case "!=" => "fcmp one double " case "<=" => "fcmp ole double " case "<" => "fcmp olt double " }\end{lstlisting} @@ -220,8 +241,8 @@ \item \textbf{Build-In Functions}: The `prelude' comes with several build-in functions: \texttt{new\_line()}, - \texttt{skip}, \texttt{print\_int(n)}, \texttt{print\_space()} - and \texttt{print\_star()}. You can find the `prelude' for + \texttt{skip}, \texttt{print\_int(n)}, \texttt{print\_space()}, + \texttt{print\_star()} and \texttt{print\_char(n)}. You can find the `prelude' for example in the file \texttt{sqr.ll}. \end{itemize}