Binary file handouts/ho01.pdf has changed
--- a/handouts/ho01.tex Thu Nov 14 13:50:39 2019 +0000
+++ b/handouts/ho01.tex Fri Nov 15 14:58:16 2019 +0000
@@ -43,8 +43,35 @@
\section*{Handout 1}
+The purpose of a compiler is to transform a program a human can read
+and write into code the machine can run as fast as
+possible. Developing a compiler is an old craft, going back to 1952
+with the first compiler written by Grace Hopper. Why study
+compilers nowadays? An interesting answer is given by John Regehr in
+his compiler
+blog:\footnote{\url{http://blog.regehr.org/archives/1419}}
+
+\begin{quote}\it{}``We can start off with a couple of observations
+ about the role of compilers. First, hardware is getting weirder
+ rather than getting clocked faster: almost all processors are
+ multicores and it looks like there is increasing asymmetry in
+ resources across cores. Processors come with vector units, crypto
+ accelerators, bit twiddling instructions, and lots of features to
+ make virtualization and concurrency work. We have DSPs, GPUs,
+ big.little, and Xeon Phi. This is only scratching the
+ surface. Second, we’re getting tired of low-level languages and
+ their associated security disasters, we want to write new code, to
+ whatever extent possible, in safer, higher-level
+ languages. Compilers are caught right in the middle of these
+ opposing trends: one of their main jobs is to help bridge the large
+ and growing gap between increasingly high-level languages and
+ increasingly wacky platforms. It’s effectively a perpetual
+ employment act for solid compiler hackers.''
+\end{quote}
+
+
The first part of this module is about text processing, be it for
-web-crawlers, compilers, dictionaries, DNA-data, spam-filters and so on.
+compilers, dictionaries, DNA-data, spam-filters and so on.
When looking for a particular string, say \pcode{"foobar"}, in a large
text we can use the Knuth-Morris-Pratt algorithm, which is currently the
most efficient general string search algorithm. But often we do
Binary file handouts/ho07.pdf has changed
--- a/handouts/ho07.tex Thu Nov 14 13:50:39 2019 +0000
+++ b/handouts/ho07.tex Fri Nov 15 14:58:16 2019 +0000
@@ -11,41 +11,42 @@
\section*{Handout 7 (Compilation)}
+
+
The purpose of a compiler is to transform a program a human can read and
write into code the machine can run as fast as possible. The fastest
code would be machine code the CPU can run directly, but it is often
-good enough for improving the speed of a program to just target a
+good enough, for improving the speed of a program, to target a
virtual machine. This produces not the fastest possible code, but code
-that is pretty enough. This way of producing code has the advantage that
+that is often pretty fast. This way of producing code has the advantage that
the virtual machine takes care of things a compiler would normally need
-to take care of (like explicit memory management). An interesting answer
-for why studying compilers is given by John Regher in his compiler
-blog:\footnote{\url{http://blog.regehr.org/archives/1419}}
-
-\begin{quote}\it{}``We can start off with a couple of observations
- about the role of compilers. First, hardware is getting weirder
- rather than getting clocked faster: almost all processors are
- multicores and it looks like there is increasing asymmetry in
- resources across cores. Processors come with vector units, crypto
- accelerators, bit twiddling instructions, and lots of features to
- make virtualization and concurrency work. We have DSPs, GPUs,
- big.little, and Xeon Phi. This is only scratching the
- surface. Second, we’re getting tired of low-level languages and
- their associated security disasters, we want to write new code, to
- whatever extent possible, in safer, higher-level
- languages. Compilers are caught right in the middle of these
- opposing trends: one of their main jobs is to help bridge the large
- and growing gap between increasingly high-level languages and
- increasingly wacky platforms. It’s effectively a perpetual
- employment act for solid compiler hackers.''
-\end{quote}
+to take care of (like explicit memory management).
As a first example in this module we will implement a compiler for the
very simple While-language. It will generate code for the Java Virtual
-Machine (JVM). This is a stack-based virtual machine, a fact which will
-make it easy to generate code for arithmetic expressions. For example
-when compiling $1 + 2$ we need to generate the following three
-instructions
+Machine (JVM). Unfortunately, the Java ecosystem does not come with an
+assembler, which would be handy for our compiler endeavour (unlike
+Microsoft's Common Language Infrastructure for the .Net platform, which
+comes with an assembler out-of-the-box). As a substitute we use in this
+module the third-party programs Jasmin and Krakatau:
+
+\begin{itemize}
+ \item \url{http://jasmin.sourceforge.net}
+ \item \url{https://github.com/Storyyeller/Krakatau}
+\end{itemize}
+
+\noindent
+The first is a Java program and the second a program written in Python.
+Each of them allows us to generate \emph{assembly} files that are still
+readable by humans, as opposed to class-files, which are pretty much just
+(horrible) zeros and ones. Jasmin (respectively Krakatau) will then take
+an assembly file as input and generate the corresponding class file for
+us.
+
+What is good about the JVM is that it is a stack-based virtual
+machine, a fact which will make it easy to generate code for
+arithmetic expressions. For
+example when compiling the expression $1 + 2$ we need to generate the
+following three instructions
\begin{lstlisting}[language=JVMIS,numbers=none]
ldc 1
@@ -113,11 +114,10 @@
\noindent If we ``run'' these instructions, the result $8$ will be on
top of the stack (I leave this to you to verify; the meaning of each
instruction should be clear). The result being on the top of the stack
-will be a convention we always observe in our compiler, that is the
-results of arithmetic expressions will always be on top of the stack.
-Note, that a different bracketing of the expression, for example $(1 +
-(2 * 3)) + (4 - 3)$, produces a different abstract syntax tree and thus
-also a different list of instructions. Generating code in this
+will be an important convention we always observe in our compiler. Note
+that a different bracketing of the expression, for example $(1 + (2 *
+3)) + (4 - 3)$, produces a different abstract syntax tree and thus also
+a different list of instructions. Generating code in this
post-order-traversal fashion is rather easy to implement: it can be done
with the following recursive \textit{compile}-function, which takes the
abstract syntax tree as argument:
@@ -184,7 +184,12 @@
\noindent In the last line we generate the code for variables
where $E(x)$ stands for looking up the environment to which
-index the variable $x$ maps to.
+index the variable $x$ maps. This is similar to an interpreter,
+which also needs an environment: the difference is that the
+interpreter maintains a mapping from variables to current values (what is
+currently the value of a variable), while a compiler needs a mapping
+from variables to memory locations (where can I find the current
+value for the variable in memory).
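+
+As a small illustration (the type names below are not part of the
+actual compiler; they only serve to make the difference explicit):
+
+\begin{lstlisting}[language=Scala,numbers=none]
+// a compiler environment: variable name -> index of a JVM local variable
+type CompilerEnv    = Map[String, Int]
+
+// an interpreter environment: variable name -> current value
+type InterpreterEnv = Map[String, Int]
+
+// for example, x is stored in JVM local 0 and y in JVM local 1
+val E : CompilerEnv = Map("x" -> 0, "y" -> 1)
+\end{lstlisting}
+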
There is a similar \textit{compile}-function for boolean
expressions, but it includes a ``trick'' to do with
@@ -206,8 +211,9 @@
that assignments in the While-language might change the
environment---clearly if a variable is used for the first
time, we need to allocate a new index and if it has been used
-before, we need to be able to retrieve the associated index.
-This is reflected in the clause for compiling assignments:
+before, then we need to be able to retrieve the associated index.
+This is reflected in the clause for compiling assignments, say
+$\textit{x := a}$:
\begin{center}
\begin{tabular}{lcl}
@@ -228,7 +234,7 @@
of the variable $x$ in the program, then we have to augment
the environment and map $x$ to the largest index in $E$
plus one (that is $E' = E(x \mapsto largest\_index + 1)$).
-That means for the assignment $x := x + 1$ we generate the
+This means for the assignment $x := x + 1$ we generate the
following code
\begin{lstlisting}[language=JVMIS,mathescape,numbers=none]
@@ -239,12 +245,12 @@
\end{lstlisting}
\noindent
-where $n_x$ is the index for the variable $x$. The code for
+where $n_x$ is the index (or pointer to the memory location) for the variable $x$. The code for
looking up the index for the variable is as follows:
\begin{center}
\begin{tabular}{lcl}
-$index \;=\; \textit{if}\;(E(x).\textit{isDefind})\;\textit{then}\;E(x)\;\textit{else}\;|E|$
+$index \;=\; E\textit{.getOrElse}(x, |E|)$
\end{tabular}
\end{center}
@@ -254,7 +260,7 @@
environment (that will be an index that is guaranteed to be not used
yet).
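+
+Putting these pieces together, the assignment clause might look roughly
+as follows in Scala, in the style of the accompanying
+\pcode{compile.scala} (there the \texttt{i}-string-interpolator
+produces a single instruction line; the helper \textit{compile\_aexp},
+which generates the code for the right-hand side, is assumed):
+
+\begin{lstlisting}[language=Scala,numbers=none]
+// a sketch of the clause for compiling  x := a
+case Assign(x, a) => {
+  val index = env.getOrElse(x, env.keys.size)   // |E| if x is new
+  (compile_aexp(a, env) ++ i"istore $index", env + (x -> index))
+}
+\end{lstlisting}
+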
-More complicated is the generation of code for \pcode{if}-statements, say
+A bit more complicated is the generation of code for \pcode{if}-statements, say
\begin{lstlisting}[mathescape,language={},numbers=none]
if $b$ then $cs_1$ else $cs_2$
@@ -423,14 +429,14 @@
after-the-else-branch label. Consider for example the
if-statement:
-\begin{lstlisting}[mathescape,numbers=none,language={}]
+\begin{lstlisting}[mathescape,numbers=none,language=While]
if 1 = 1 then x := 2 else y := 3
\end{lstlisting}
\noindent
The generated code is as follows:
-\begin{lstlisting}[language=JVMIS,mathescape,language={}]
+\begin{lstlisting}[language=JVMIS,mathescape,numbers=left]
ldc 1
ldc 1
if_icmpne L_ifelse $\quad\tikz[remember picture] \node (C) {\mbox{}};$
@@ -533,13 +539,13 @@
\noindent I let you go through how this clause works. As an example
you can consider the while-loop
-\begin{lstlisting}[mathescape,numbers=none,language={}]
+\begin{lstlisting}[mathescape,numbers=none,language=While]
while x <= 10 do x := x + 1
\end{lstlisting}
\noindent yielding the following code
-\begin{lstlisting}[language=JVMIS,mathescape,language={}]
+\begin{lstlisting}[language=JVMIS,mathescape,numbers=left]
L_wbegin: $\quad\tikz[remember picture] \node[] (LB) {\mbox{}};$
iload 0
ldc 10
@@ -559,6 +565,8 @@
-- ++(0mm,-17.3mm) |- (\tikztotarget)},line width=1mm] (LD.east);
\end{tikzpicture}
+\noindent
+I leave it to you to read the code and follow its control flow.
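+
+For comparison, the while-clause could be rendered in Scala roughly as
+below, again in the style of the accompanying \pcode{compile.scala}:
+the \texttt{i}- and \texttt{l}-string-interpolators produce an
+instruction and a label, while \textit{Fresh}, \textit{compile\_block}
+and \textit{compile\_bexp} are assumed helpers for generating fresh
+label names and for compiling the loop body and the test.
+
+\begin{lstlisting}[language=Scala,numbers=none]
+// a sketch of the clause for compiling  while b do bl
+case While(b, bl) => {
+  val loop_begin = Fresh("Loop_begin")
+  val loop_end   = Fresh("Loop_end")
+  val (instrs, env1) = compile_block(bl, env)
+  (l"$loop_begin" ++
+     compile_bexp(b, env, loop_end) ++   // jump to loop_end if b is false
+     instrs ++
+     i"goto $loop_begin" ++
+     l"$loop_end", env1)
+}
+\end{lstlisting}
+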
Next we need to consider the statement \pcode{write x}, which
can be used to print out the content of a variable. For this
@@ -570,7 +578,7 @@
follows.
-\begin{lstlisting}[language=JVMIS]
+\begin{lstlisting}[language=JVMIS,numbers=left]
.method public static write(I)V
.limit locals 1
.limit stack 2
@@ -613,7 +621,7 @@
\begin{figure}[t]
-\begin{lstlisting}[mathescape,language=JVMIS]
+\begin{lstlisting}[mathescape,language=JVMIS,numbers=left]
.class public XXX.XXX
.super java/lang/Object
@@ -674,6 +682,53 @@
can be run by the {\tt{}java}-program.\label{test}}
\end{figure}
+\subsection*{Arrays}
+
+A useful addition to the While-language might be arrays. This
+lets us generate more interesting While-programs by translating
+BF*** programs into equivalent While-programs. This means we
+want to support the following three constructions
+
+\begin{lstlisting}[mathescape,language=While]
+new arr[15000]
+x := 3 + arr[3 + y]
+arr[42 * n] := ...
+\end{lstlisting}
+
+\noindent
+The first one creates a new array with name \pcode{arr} that can
+hold a given number of integers. The second references an array
+inside an arithmetic expression. Essentially we have to be able to
+look up the contents of an array at an index. Similarly we need to be
+able to update the content of an array (third line). The latter two
+constructions state that the index into an array can be given by
+an arithmetic expression. For creating a new array we need to generate
+the following JVM instructions:
+
+\begin{lstlisting}[mathescape,language=JVMIS]
+ldc number
+newarray int
+astore loc_var
+\end{lstlisting}
+
+\noindent
+First we need to put the dimension of the array onto the
+stack, then the next instruction creates the array, and finally we
+need to store (a reference to) the array in a local variable, just
+like ``simple'' variables. For looking up an element in an array we
+can use the following code:
+
+\begin{lstlisting}[mathescape,language=JVMIS]
+aload loc_var
+index_aexp
+iaload
+\end{lstlisting}
+
+\noindent
+This first loads the ``pointer'' to the array onto the stack. Then come
+the instructions corresponding to the index at which we want to access
+the array; the idea is that they leave a concrete number, the index, on
+the stack. Finally, \pcode{iaload} loads the element at that index onto
+the stack.
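+
+For the update in the third construction above we can use the same
+pattern: first load the ``pointer'' to the array, then the instructions
+for the index, then the instructions for the new value, and finally an
+\pcode{iastore} instruction, which pops the array reference, the index
+and the value off the stack and stores the value at that index. Below
+is a rough Scala sketch of both array clauses, in the style of the
+accompanying \pcode{compile.scala}; the constructor names
+\textit{ArrayRef} and \textit{AssignArr}, as well as the assumption
+that the environment records the local variable holding the array, are
+only meant as illustrations.
+
+\begin{lstlisting}[language=Scala,numbers=none]
+// a clause of the compile-function for arithmetic expressions:
+// looking up  arr[index]
+case ArrayRef(arr, index) =>
+  i"aload ${env(arr)}" ++ compile_aexp(index, env) ++ i"iaload"
+
+// a clause of the compile-function for statements:
+// updating  arr[index] := value
+case AssignArr(arr, index, value) =>
+  (i"aload ${env(arr)}" ++ compile_aexp(index, env) ++
+     compile_aexp(value, env) ++ i"iastore", env)
+\end{lstlisting}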
+
\end{document}
%%% Local Variables:
--- a/langs.sty Thu Nov 14 13:50:39 2019 +0000
+++ b/langs.sty Fri Nov 15 14:58:16 2019 +0000
@@ -56,7 +56,7 @@
}[keywords,comments,strings]
\lstdefinelanguage{While}{
- morekeywords={if,then,else,while,do,true,false,write,upto,read,for,skip},
+ morekeywords={if,then,else,while,do,true,false,write,upto,read,for,skip,new},
morecomment=[l]{//},
morecomment=[n]{/*}{*/},
morestring=[b]",
--- a/progs/compile.scala Thu Nov 14 13:50:39 2019 +0000
+++ b/progs/compile.scala Fri Nov 15 14:58:16 2019 +0000
@@ -119,7 +119,7 @@
}
// this allows you to write things like
-// i"add" and l"Lable"
+// i"add" and l"Label"
// environments
@@ -205,7 +205,7 @@
// main compilation function for blocks
def compile(bl: Block, class_name: String) : String = {
val instructions = compile_block(bl, Map.empty)._1
- (beginning ++ instructions.mkString ++ ending).replaceAllLiterally("XXX", class_name)
+ (beginning ++ instructions ++ ending).replaceAllLiterally("XXX", class_name)
}
@@ -257,7 +257,7 @@
// Fibonacci numbers as a bare-bone test-case
val fib_test =
- List(Assign("n", Num(10)), // n := 10;
+ List(Assign("n", Num(9)), // n := 10;
Assign("minus1",Num(0)), // minus1 := 0;
Assign("minus2",Num(1)), // minus2 := 1;
Assign("temp",Num(0)), // temp := 0;