handouts/ho09.tex
changeset 352 da5713bcdbb0
parent 351 cd8d18f7b7ac
child 354 8e5e84b14041
--- a/handouts/ho09.tex	Thu Dec 11 23:51:08 2014 +0000
+++ b/handouts/ho09.tex	Fri Dec 12 15:41:37 2014 +0000
@@ -62,13 +62,13 @@
 Our starting point is a small, idealised programming language.
 It is idealised because we cut several corners in comparison
 with real programming languages. The language we will study
-contains, amongst other things, variables holding integers. We
-want to find out what the sign of these integers (positive or
-negative) will be when the program runs. This sign-analysis
-seems like a very simple problem, but it will turn out even
-such simple problems, if approached naively, are in general
-undecidable, just like Turing's halting problem. I let you
-think why?
+contains, amongst other things, variables holding integers.
+Using static analysis, we want to find out what the sign of
+these integers (positive or negative) will be when the program
+runs. This sign-analysis seems like a very simple problem. But
+it will turn out even such simple problems, if approached
+naively, are in general undecidable, just like Turing's
+halting problem. I let you think why?
 
 
 Is sign-analysis of variables an interesting problem? Well,
@@ -108,11 +108,13 @@
 sequences of statements. Statements are either labels,
 variable assignments, conditional jumps (\pcode{jmp?}) and
 unconditional jumps (\pcode{goto}). Labels are just strings,
-which can be used as the target of a jump. The conditional
-jumps and variable assignments involve (arithmetic)
-expressions. Expressions are either numbers, variables or
-compound expressions built up from \pcode{+}, \pcode{*} and
-\emph{=} (for simplicity reasons we do not consider any other
+which can be used as the target of a jump. We assume that in
+every program the labels are unique---otherwise if there is a
+clash we do not know where to jump to. The conditional jumps
+and variable assignments involve (arithmetic) expressions.
+Expressions are either numbers, variables or compound
+expressions built up from \pcode{+}, \pcode{*} and \emph{=}
+(for simplicity reasons we do not consider any other
 operations). We assume we have negative and positive numbers,
 \ldots \pcode{-2}, \pcode{-1}, \pcode{0}, \pcode{1},
 \pcode{2}\ldots{} An example program that calculates the
@@ -173,20 +175,58 @@
 complexity is really beside the point here. Furthermore, real
 machine code has many instructions for manipulating memory. We
 do not have this at all. This is actually a more serious
-simplification because we assume numbers to be arbitrary
-precision, which is not the case with real machine code. In
+simplification because we assume numbers to be arbitrary small
+or large, which is not the case with real machine code. In
 real code basic number formats have a range and might
-over-flow or under-flow from this range. Also the numbers of
-variables in our programs is unlimited, while memory, of
-course, is always limited somehow on any actual machine. To
-sum up, our language might look very simple, but it is not
-completely removed from practically relevant issues.
+over-flow or under-flow from this range. Also the number of
+variables in our programs is potentially unlimited, while
+memory in an actual computer, of course, is always limited
+somehow on any actual. To sum up, our language might look very
+simple, but it is not completely removed from practically
+relevant issues.
 
 
 \subsubsection*{An Interpreter}
 
-Designing a language is like being god: you can say what 
-each part of the program should mean. 
+Designing a language is like playing god: you can say what
+names for variables you allow; what programs should look like;
+most importantly you can decide what each part of the program
+should mean and do. While our language is rather simple and
+the meaning is rather straightforward, there are still places
+where we need to make a real choice. For example with
+conditional jumps, say the one in the factorial program: 
+
+\begin{center}
+\code{jmp? n = 0 done}
+\end{center}
+
+\noindent How should they work? We could introduce Booleans
+(\pcode{true} and \pcode{false}) and then jump only when the
+condition is \pcode{true}. However, since we have numbers in
+our language anyway, why not just encoding \emph{true} as
+zero, and \pcode{false} as anything else? In this way we can
+dispense with the additional concept of Booleans, but also we
+could replace the jump above by
+
+\begin{center}
+\code{jmp? n done}
+\end{center}
+
+\noindent which behaves exactly the same. But what does it
+mean that two jumps behave the same?
+
+I hope the above discussion makes it already clear we need to
+be a bit more careful with our programs. Below we shall
+describe an interpreter for our programs, which specifies
+exactly how programs are supposed to be run\ldots{}at least we
+will specify this for all \emph{good} programs. By good
+programs we mean where for example all variables are
+initialised. Our interpreter will just crash if it cannot find
+out the value for a variable, because it is not initialised.
+
+First we will pre-process our programs. This will simplify 
+our definition of our interpreter later on. We will transform
+programs into \emph{snippets}. 
 
 \end{document}