sen-material: comparison handouts/ho09.tex


-:cd8d18f7b7ac
+:da5713bcdbb0
 \subsubsection*{A Simple, Idealised Programming Language}
 Our starting point is a small, idealised programming language.
 It is idealised because we cut several corners in comparison
 with real programming languages. The language we will study
-contains, amongst other things, variables holding integers. We
+contains, amongst other things, variables holding integers.
-want to find out what the sign of these integers (positive or
+Using static analysis, we want to find out what the sign of
-negative) will be when the program runs. This sign-analysis
+these integers (positive or negative) will be when the program
-seems like a very simple problem, but it will turn out even
+runs. This sign-analysis seems like a very simple problem. But
-such simple problems, if approached naively, are in general
+it will turn out that even such simple problems, if approached
-undecidable, just like Turing's halting problem. I let you
+naively, are in general undecidable, just like Turing's
-think why?
+halting problem. I leave it to you to think about why.
 Is sign-analysis of variables an interesting problem? Well,
 yes---if a compiler can find out that for example a variable
 will never be negative and this variable is used as an index
 There are three main syntactic categories: \emph{statements}
 and \emph{expressions} as well as \emph{programs}, which are
 sequences of statements. Statements are either labels,
 variable assignments, conditional jumps (\pcode{jmp?}) or
 unconditional jumps (\pcode{goto}). Labels are just strings,
-which can be used as the target of a jump. The conditional
+which can be used as the target of a jump. We assume that in
-jumps and variable assignments involve (arithmetic)
+every program the labels are unique---otherwise if there is a
-expressions. Expressions are either numbers, variables or
+clash we do not know where to jump to. The conditional jumps
-compound expressions built up from \pcode{+}, \pcode{*} and
+and variable assignments involve (arithmetic) expressions.
-\emph{=} (for simplicity reasons we do not consider any other
+Expressions are either numbers, variables or compound
+expressions built up from \pcode{+}, \pcode{*} and \pcode{=}
+(for simplicity reasons we do not consider any other
 operations). We assume we have negative and positive numbers,
 \ldots \pcode{-2}, \pcode{-1}, \pcode{0}, \pcode{1},
 \pcode{2}\ldots{} An example program that calculates the
 factorial of 5 is as follows:
 just addition and multiplication) and many more conditional
 jumps. We could add these to our language if we wanted, but
 complexity is really beside the point here. Furthermore, real
 machine code has many instructions for manipulating memory. We
 do not have this at all. This is actually a more serious
-simplification because we assume numbers to be arbitrary
+simplification because we assume numbers to be arbitrarily small
-precision, which is not the case with real machine code. In
+or large, which is not the case with real machine code. In
 real code basic number formats have a range and might
-over-flow or under-flow from this range. Also the numbers of
+overflow or underflow from this range. Also the number of
-variables in our programs is unlimited, while memory, of
+variables in our programs is potentially unlimited, while
-course, is always limited somehow on any actual machine. To
+memory in an actual computer, of course, is always limited
-sum up, our language might look very simple, but it is not
+somehow. To sum up, our language might look very
-completely removed from practically relevant issues.
+simple, but it is not completely removed from practically
+relevant issues.
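The remark about number ranges matters for sign-analysis in particular. The following is only a rough sketch in Python, whose integers are arbitrarily large like the ones in our idealised language. The helper \pcode{wrap_int32} is a hypothetical name introduced here, and 32-bit two's-complement is an assumed machine format, not something fixed by this handout. It shows that adding two positive numbers stays positive in the idealised setting, but can turn negative once numbers have a limited range.

```python
def wrap_int32(n: int) -> int:
    """Reinterpret n as a 32-bit two's-complement integer.

    Hypothetical helper: it mimics what a machine with 32-bit
    registers (an assumption made for this sketch) would store.
    """
    n &= 0xFFFFFFFF                      # keep only the lowest 32 bits
    return n - 0x100000000 if n >= 0x80000000 else n


largest = 2**31 - 1                      # largest positive 32-bit value

ideal = largest + 1                      # idealised language: no range limit
machine = wrap_int32(largest + 1)        # limited range: wraps around

print(ideal > 0)                         # sign in the idealised language
print(machine < 0)                       # sign after wrapping to 32 bits
```

In the idealised language "positive plus positive is positive" always holds, whereas an analysis for real machine code would have to take the wrap-around into account.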
 \subsubsection*{An Interpreter}
-Designing a language is like being god: you can say what
+Designing a language is like playing god: you can say what
-each part of the program should mean.
+names for variables you allow; what programs should look like;
+most importantly you can decide what each part of the program
+should mean and do. While our language is rather simple and
+the meaning is rather straightforward, there are still places
+where we need to make a real choice. For example with
+conditional jumps, say the one in the factorial program:
+\begin{center}
+\code{jmp? n = 0 done}
+\end{center}
+\noindent How should they work? We could introduce Booleans
+(\pcode{true} and \pcode{false}) and then jump only when the
+condition is \pcode{true}. However, since we have numbers in
+our language anyway, why not just encode \pcode{true} as
+zero, and \pcode{false} as anything else? In this way we can
+dispense with the additional concept of Booleans, but also we
+could replace the jump above by
+\begin{center}
+\code{jmp? n done}
+\end{center}
+\noindent which behaves exactly the same. But what does it
+mean that two jumps behave the same?
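One informal way to read "behave the same" is that both jumps are taken for exactly the same values of \pcode{n}. The sketch below, in Python, checks this for a handful of sample values; it is not the interpreter of this handout. The names \pcode{eval_eq} and \pcode{should_jump} are hypothetical, and returning \pcode{1} for a failed comparison is an assumption, since the text only requires that \pcode{false} is something other than zero.

```python
def eval_eq(a: int, b: int) -> int:
    """Evaluate `a = b` with true encoded as zero.

    Assumption: a failed comparison yields 1; any non-zero
    number would serve equally well as `false`.
    """
    return 0 if a == b else 1


def should_jump(cond_value: int) -> bool:
    """A conditional jump is taken when its condition is zero (true)."""
    return cond_value == 0


# Compare `jmp? n = 0 done` with `jmp? n done` on some sample values.
samples = range(-3, 4)
agree = all(should_jump(eval_eq(n, 0)) == should_jump(n) for n in samples)
print(agree)
```

Testing sample values is of course no proof; it only illustrates what the two jumps agreeing would look like under this encoding.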
+I hope the above discussion already makes it clear that we need to
+be a bit more careful with our programs. Below we shall
+describe an interpreter for our programs, which specifies
+exactly how programs are supposed to be run\ldots{}at least we
+will specify this for all \emph{good} programs. By good
+programs we mean, for example, programs in which all variables
+are initialised. Our interpreter will just crash if it cannot
+find out the value of a variable because it is not initialised.
+First we will pre-process our programs. This will simplify
+our definition of our interpreter later on. We will transform
+programs into \emph{snippets}.
 \end{document}
 %%% Local Variables:
 %%% mode: latex

changeset 352	da5713bcdbb0
parent 351	cd8d18f7b7ac
child 354	8e5e84b14041