afl-material: comparison handouts/ho09.tex

equal deleted inserted replaced

-:904de68a27a4
+:45a48c47dcca
 time into a monolithic gigantic piece of m\ldots ehm software, which you
 could not mess about in an afternoon. In contrast, LLVM is designed to
 be a modular suite of tools with which you can play around easily and
 try out something new. LLVM became a big player once Apple hired one of
 the original developers (I cannot remember the reason why Apple did not
-want to use gcc, but maybe they were also just disgusted by its big
+want to use gcc, but maybe they were also just disgusted by gcc's big
 monolithic codebase). Anyway, LLVM is now the big player and gcc is more
 or less legacy. This does not mean that programming languages like C and
 C++ are dying out any time soon---they are nicely supported by LLVM.
 We will target the LLVM Intermediate Language, or LLVM Intermediate
 of what instructions CPUs can actually process fast. This is what the
 SSA format is designed for.
 The main idea behind the SSA format is to use very simple variable
-assignments where every variable is assigned only once. The assignments
+assignments where every tmp-variable is assigned only once. The assignments
 also need to be primitive in the sense that they can be just simple
 operations like addition, multiplication, jumps, comparisons and so on.
 Say, we have an expression $((1 + a) + (3 + (b * 5)))$, then the
 corresponding SSA format is
 let tmp2 = add 3 tmp1 in
 let tmp3 = add tmp0 tmp2 in tmp3
 \end{lstlisting}
 \noindent where every variable is used only once (we could not write
-\texttt{tmp1 = add 3 tmp1} in Line 3 for example).  There are
+\texttt{tmp1 = add 3 tmp1} in Line 3 for example).
-sophisticated algorithms for imperative languages, like C, that
-efficiently transform a high-level program into SSA format. But we can
+There are sophisticated algorithms for imperative languages, like C,
-ignore them here. We want to compile a functional language and there
+that efficiently transform a high-level program into SSA format. But
-things get much more interesting than just sophisticated. We will need
+we can ignore them here. We want to compile a functional language and
-to have a look at CPS translations, where the CPS stands for
+there things get much more interesting than just sophisticated. We
+will need to have a look at CPS translations, where the CPS stands for
 Continuation-Passing-Style---basically black programming art or
 abracadabra programming. So sit tight.
 \subsection*{LLVM-IR}
 \end{lstlisting}
 \noindent First notice that all variable names, in this case \texttt{x}
 and \texttt{tmp}, are prefixed with \texttt{\%} in the LLVM-IR.
 Temporary variables can be named with an identifier, such as
-\texttt{tmp}, or numbers. Function names, since they are ``global'',
+\texttt{tmp}, or numbers. In contrast, function names, since they are ``global'',
-need to be prefixed with @-symbol. Also, the LLVM-IR is a fully typed
+need to be prefixed with an @-symbol. Also, the LLVM-IR is a fully typed
 language. The \texttt{i32} type stands for 32-bit integers. There are
 also types for 64-bit integers (\texttt{i64}), chars (\texttt{i8}),
 floats, arrays and even pointer types. In the code above, \texttt{sqr}
 takes an argument of type \texttt{i32} and produces a result of type
 \texttt{i32} (the result type is in front of the function name, like in
 C). Each arithmetic operation, for example addition and multiplication,
 are also prefixed with the type they operate on. Obviously these types
 need to match up\ldots{} but since we have in our programs only
-integers, \texttt{i32} everywhere will do. We do not have to generate
+integers, for the moment \texttt{i32} everywhere will do. We do not have to generate
 any other types, but obviously this is a limitation in our Fun language.
 There are a few interesting instructions in the LLVM-IR which are quite
 different than what we have seen in the JVM. Can you remember the
 kerfuffle we had to go through with boolean expressions and negating the
 \end{lstlisting}
 \noindent
 The type \texttt{i1} stands for booleans. If the variable is true, then
 this instruction jumps to the if-branch, which needs an explicit label;
-otherwise to the else-branch, again with its own label. This allows us
+otherwise it jumps to the else-branch, again with its own label. This allows us
-to keep the meaning of the boolean expression as is when compiling if's.
+to keep the meaning of the boolean expression ``as is'' when compiling if's---thanks god no more negating the boolean.
 A value of type boolean is generated in the LLVM-IR by the
 \texttt{icmp}-instruction. This instruction is for integers (hence the
 \texttt{i}) and takes the comparison operation as argument. For example
 \begin{lstlisting}[language=LLVM]
-icmp eq i32  %x, %y     ; for equal
+icmp eq  i32 %x, %y     ; for equal
-icmp sle i32 %x, %y     ;   signed less or equal
+icmp sle i32 %x, %y     ; signed less or equal
-icmp slt i32 %x, %y     ;   signed less than
+icmp slt i32 %x, %y     ; signed less than
-icmp ult i32 %x, %y     ;   unsigned less than
+icmp ult i32 %x, %y     ; unsigned less than
 \end{lstlisting}
 \noindent
-In some operations, the LLVM-IR distinguishes between signed and
+Note that in some operations the LLVM-IR distinguishes between signed and
 unsigned representations of integers.
 It is also easy to call another function in LLVM-IR: as can be
 seen from Figure~\ref{lli} we can just call a function with the
 instruction \texttt{call} and can also assign the result to
 = 3 * (2 * (1 * 1))
 = 6
 \end{lstlisting}
 \noindent
-which is the expected result. If this looks somewhat familiar, then this
+which is the expected result. If this looks somewhat familiar to you,
-is not a 1000 miles off, because functions with continuations can be
+than this is because functions with continuations can be
 seen as a kind of generalisation of tail-recursive functions. Anyway
 notice how the continuations is ``stacked up'' during the recursion and
 then ``unrolled'' when we apply 1 to the continuation. Interestingly, we
 can do something similar to the Fibonacci function where in the traditional
 version we have two recursive calls. Consider the following function
 main difficulty of generating instructions in SSA format is that large
 compound expressions need to be broken up into smaller pieces and
 intermediate results need to be chained into later instructions. To do
 this conveniently, CPS-translations have been developed. They use
 functions (``continuations'') to represent what is coming next in a
-sequence of instructions. Continuations are functions of type
+sequence of instructions. In our case, continuations are functions of type
 \code{KVal} to \code{KExp}. They can be seen as a sequence of
 \code{KLet}s where there is a ``hole'' that needs to be filled. Consider
 for example
 \begin{lstlisting}[language=LLVMIR,numbers=left,escapeinside={(*@}{@*)}]
 \end{lstlisting}
 \noindent
 where in the second line is a $\Box$ which still expects a \code{KVal}
 to be filled in before it becomes a ``proper'' \code{KExp}. When we
-apply and argument to the continuation (remember they are functions)
+apply an argument to the continuation (remember they are functions)
 we essentially fill something into the corresponding hole. The code
 of the CPS-translation is
 \begin{lstlisting}[language=Scala,numbers=none]
 def CPS(e: Exp)(k: KVal => KExp) : KExp =
 CPS(e2)(y2 => KLet(z, Kop(o, y1, y2), k(KVar(z)))))
 }
 \end{lstlisting}
 \noindent
+For more such rules, have a look to the code of the fun-llvm
+compiler.
+\noindent
 \end{document}
 %%% Local Variables:
 %%% mode: latex

changeset 898	45a48c47dcca
parent 722	14914b57e207
child 908	0138618eff73