% !TEX program = xelatex
\documentclass{article}
\usepackage{../style}
\usepackage{../langs}
\usepackage{../graphics}
\usepackage{../grammar}
%%\usepackage{multicol}

%%\newcommand{\dn}{\stackrel{\mbox{\scriptsize def}}{=}}

\begin{document}
\fnote{\copyright{} Christian Urban, King's College London, 2019}


\section*{Handout 9 (LLVM, SSA and CPS)}

Reflecting on our tiny compiler targeting the JVM, the code generation
part was actually not so hard, no? Pretty much just some post-order
traversal of the abstract syntax tree, yes? One of the reasons for this
ease is that the JVM is a stack-based virtual machine and it is
therefore not hard to translate deeply-nested arithmetic expressions
into a sequence of instructions manipulating the stack. The problem is
that ``real'' CPUs, although supporting stack operations, are not
really designed to be \emph{stack machines}. The design of CPUs is more
like: here is a chunk of memory---compiler, or better compiler writers,
do something with it. Consequently, modern compilers need to go the
extra mile in order to generate code that is much easier and faster for
CPUs to process. To make this all tractable for this module, we target
the LLVM Intermediate Language. In this way we can take advantage of
the tools coming with LLVM. For example, we do not have to worry about
things like register allocation.\bigskip

\noindent LLVM\footnote{\url{http://llvm.org}} is a beautiful example
of how projects from academia can make a difference in the world. LLVM
started in 2000 as a project by two researchers at the University of
Illinois at Urbana-Champaign. At the time the behemoth of compilers was
gcc with its myriad of front-ends for other languages (C++, Fortran,
Ada, Go, Objective-C, Pascal etc.). The problem was that gcc morphed
over time into a monolithic gigantic piece of m\ldots ehm software,
which you could not mess about with in an afternoon. In contrast, LLVM
is designed to be a modular suite of tools with which you can play
around easily and try out something new. LLVM became a big player once
Apple hired one of the original developers (I cannot remember the
reason why Apple did not want to use gcc, but maybe they were also just
disgusted by its big monolithic codebase). Anyway, LLVM is now the big
player and gcc is more or less legacy. This does not mean that
programming languages like C and C++ are dying out any time
soon---they are nicely supported by LLVM.

We will target the LLVM Intermediate Language, or Intermediate
Representation (short LLVM-IR). As a result we can benefit from the
modular structure of the LLVM compiler and let for example the compiler
generate code for different CPUs, like X86 or ARM. That means we can be
agnostic about where our code actually runs. We can also be ignorant
about optimising code and allocating memory efficiently. However, what
we have to do is to generate code in \emph{Static Single-Assignment}
format (short SSA), because that is what the LLVM-IR expects from us.

The idea behind the SSA format is to use very simple variable
assignments where every variable is assigned only once. The assignments
also need to be primitive in the sense that they can only be simple
operations like addition, multiplication, jumps, comparisons and so on.
Say we have the expression $((1 + a) + (3 + (b * 5)))$; then its
SSA form is

\begin{lstlisting}[language=LLVMIR,numbers=left]
let tmp0 = add 1 a in
let tmp1 = mul b 5 in
let tmp2 = add 3 tmp1 in
let tmp3 = add tmp0 tmp2 in
  tmp3
\end{lstlisting}

\noindent where every variable is assigned only once (we could not
write \texttt{tmp1 = add 3 tmp1} in Line 3 for example). There are
sophisticated algorithms for imperative languages, like C, that
efficiently transform a high-level program into SSA format. But we can
ignore them here. We want to compile a functional language and there
things get much more interesting than just sophisticated. We will need
to have a look at CPS translations, where CPS stands for
Continuation-Passing-Style---basically black programming art or
abracadabra programming. So sit tight.

\subsection*{LLVM-IR}

Before we start, let's first have a look at the \emph{LLVM Intermediate
Representation} in more detail. What is good about our toy Fun language
is that it basically only contains expressions (be they arithmetic
expressions, boolean expressions or if-expressions). The exception is
function definitions. Luckily, for them we can use the mechanism for
defining functions in the LLVM-IR. For example the simple Fun program

\begin{lstlisting}[language=Scala,numbers=none]
def sqr(x) = x * x
\end{lstlisting}

\noindent
can be compiled into the following LLVM-IR function:

\begin{lstlisting}[language=LLVM]
define i32 @sqr(i32 %x) {
   %tmp = mul i32 %x, %x
   ret i32 %tmp
}
\end{lstlisting}

\noindent First notice that all variable names in the LLVM-IR are
prefixed by \texttt{\%}; function names need to be prefixed with the
@-symbol. Also, the LLVM-IR is a fully typed language. The \texttt{i32}
type stands for 32-bit integers. There are also types for 64-bit
integers (\texttt{i64}), chars (\texttt{i8}), floats, arrays and even
pointer types. In the code above, \texttt{sqr} takes an argument of type
\texttt{i32} and produces a result of type \texttt{i32} (the result type
is given before the function name, like in C). Each arithmetic
operation, like addition or multiplication, is also annotated with the
type it operates on. Obviously these types need to match up\ldots{} but
since we have only integers in our programs, \texttt{i32} everywhere
will do.

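Since our Fun programs contain only integers, a compiler for Fun can
hard-wire the \texttt{i32} annotation into every arithmetic instruction
it emits. The following Scala snippet is a minimal sketch of this idea
(the helper \texttt{compile\_op} is made up for illustration):

\begin{lstlisting}[language=Scala,numbers=none]
// sketch: mapping Fun operators to typed LLVM-IR instruction names;
// everything is annotated with i32 because Fun only has integers
def compile_op(op: String): String = op match {
  case "+" => "add i32"
  case "-" => "sub i32"
  case "*" => "mul i32"
  case "/" => "sdiv i32"
}

// for example, compile_op("*") + " %x, %x" gives "mul i32 %x, %x"
\end{lstlisting}
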
Conveniently, you can use the program \texttt{lli}, which comes with
LLVM, to interpret programs written in the LLVM-IR. So you can easily
check whether the code you produced actually works. To get a running
program that does something interesting you need to add some boilerplate
about printing out numbers and a main-function that is the entry point
for the program (see Figure~\ref{lli} for a complete listing). You can
generate a binary for this program using the \texttt{llc} compiler and
\texttt{gcc}---\texttt{llc} can generate an object file, and then you
can use \texttt{gcc} (that is, clang) for generating an executable
binary:

\begin{lstlisting}[language=bash,numbers=none]
llc -filetype=obj sqr.ll
gcc sqr.o -o a.out
./a.out
> 25
\end{lstlisting}

\begin{figure}[t]\small
\lstinputlisting[language=LLVM,numbers=left]{../progs/sqr.ll}
\caption{An LLVM-IR program for calculating the square function. The
interesting function is \texttt{sqr} in Lines 13 -- 16. The main
function calls \texttt{sqr} and prints out the result. The other
code is boilerplate for printing out integers.\label{lli}}
\end{figure}


\subsection*{Our Own Intermediate Language}

Remember that compilers have to solve the problem of bridging the gap
between ``high-level'' programs and ``low-level'' hardware. If the gap
is too wide for one step, then a good strategy is to lay a stepping
stone somewhere in between. The LLVM-IR itself is such a stepping stone
to make the task of generating and optimising code easier. Like a real
compiler we will use our own stepping stone, which I call the
\emph{K-language}. For this, recall the expressions (and boolean
expressions) of the Fun language. For convenience the Scala code is
shown at the top of Figure~\ref{absfun}. Below it is the code for the
K-language. There are two kinds of syntactic entities, namely
\emph{K-values} and \emph{K-expressions}. The central point of the
K-language is the \texttt{KLet}-constructor. For this recall that
arithmetic expressions such as $((1 + a) + (3 + (b * 5)))$ need
to be broken up into smaller ``atomic'' steps, like so

\begin{lstlisting}[language=LLVMIR,numbers=none]
let tmp0 = add 1 a in
let tmp1 = mul b 5 in
let tmp2 = add 3 tmp1 in
let tmp3 = add tmp0 tmp2 in
  tmp3
\end{lstlisting}

\noindent
Here \texttt{tmp3} will contain the result of what the whole expression
stands for. In each step we can only perform an ``atomic'' operation,
like addition or multiplication. We could not, for example, have an
if-condition on the right-hand side of an equal sign. These constraints
are enforced upon us because of how the SSA format works in the
LLVM-IR. By having in \texttt{KLet} first a string (standing for an
intermediate result) and second a K-value, we can fulfil this
constraint---there is no way we could write anything other than a
value. To sum up, K-values are the atomic operations that can appear on
the right-hand side of equal signs. The K-language is restricted such
that it is easy to generate the SSA format for the LLVM-IR.
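
Using the constructors from Figure~\ref{absfun}, the let-sequence above
can be written down directly as a K-expression. The following is only a
sketch, where the temporary names and the operator strings inside
\texttt{Kop} are chosen for illustration:

\begin{lstlisting}[language=Scala,numbers=none]
// sketch: ((1 + a) + (3 + (b * 5))) as a K-expression
KLet("tmp0", Kop("+", KNum(1), KVar("a")),
  KLet("tmp1", Kop("*", KVar("b"), KNum(5)),
    KLet("tmp2", Kop("+", KNum(3), KVar("tmp1")),
      KLet("tmp3", Kop("+", KVar("tmp0"), KVar("tmp2")),
        KReturn(KVar("tmp3"))))))
\end{lstlisting}

\noindent
Each \texttt{KLet} binds exactly one intermediate result to a single
K-value, which is what makes the later translation into the SSA format
straightforward.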

\begin{figure}[p]\small
\begin{lstlisting}[language=Scala,numbers=none]
// Fun-language (expressions)
abstract class Exp
abstract class BExp

case class Call(name: String, args: List[Exp]) extends Exp
case class If(a: BExp, e1: Exp, e2: Exp) extends Exp
case class Write(e: Exp) extends Exp
case class Var(s: String) extends Exp
case class Num(i: Int) extends Exp
case class Aop(o: String, a1: Exp, a2: Exp) extends Exp
case class Sequence(e1: Exp, e2: Exp) extends Exp
case class Bop(o: String, a1: Exp, a2: Exp) extends BExp


// K-language (K-expressions, K-values)
abstract class KExp
abstract class KVal

case class KVar(s: String) extends KVal
case class KNum(i: Int) extends KVal
case class Kop(o: String, v1: KVal, v2: KVal) extends KVal
case class KCall(o: String, vrs: List[KVal]) extends KVal
case class KWrite(v: KVal) extends KVal

case class KIf(x1: String, e1: KExp, e2: KExp) extends KExp
case class KLet(x: String, v: KVal, e: KExp) extends KExp
case class KReturn(v: KVal) extends KExp
\end{lstlisting}
\caption{Abstract syntax trees for the Fun language (top) and the
K-language (bottom).\label{absfun}}
\end{figure}


\subsection*{CPS-Translations}

Another reason why it makes sense to go the extra mile is that stack
instructions are very difficult to optimise---you cannot just re-arrange
instructions without messing about with what is calculated on the stack.
Also, it is hard to find out whether all the calculations on the stack
are actually necessary or just dead code. For all this the JVM has
sophisticated machinery to make such ``high-level'' code still run
fast, but let's say that for the sake of argument we do not want to
rely on it. We want to generate fast code ourselves. This means we have
to work around the intricacies of what instructions CPUs can actually
process.
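
To get a first idea of what continuation-passing style means, consider
the following minimal Scala sketch (the helpers \texttt{add\_cps} and
\texttt{mul\_cps} are made up for illustration and are not part of our
compiler). Instead of returning its result, every operation receives a
\emph{continuation}, that is a function which says what to do with the
result next.

\begin{lstlisting}[language=Scala,numbers=none]
// sketch: each operation takes a continuation k instead of
// returning its result directly
def add_cps(x: Int, y: Int)(k: Int => Int): Int = k(x + y)
def mul_cps(x: Int, y: Int)(k: Int => Int): Int = k(x * y)

// ((1 + a) + (3 + (b * 5))) written in CPS: the nesting of the
// continuations spells out the same atomic steps as the
// let-bindings in the SSA example above
def example(a: Int, b: Int): Int =
  add_cps(1, a)       { tmp0 =>
  mul_cps(b, 5)       { tmp1 =>
  add_cps(3, tmp1)    { tmp2 =>
  add_cps(tmp0, tmp2) { tmp3 => tmp3 }}}}
\end{lstlisting}

\noindent
The benefit of this style is that every intermediate result obtains an
explicit name, which is exactly what we need for generating
\texttt{KLet}s and, ultimately, SSA code for the LLVM-IR.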

\end{document}


%%% Local Variables:
%%% mode: latex
%%% TeX-master: t
%%% End: