73 |
73 |
74 Is sign-analysis of variables an interesting problem? Well, |
74 Is sign-analysis of variables an interesting problem? Well, |
75 yes---if a compiler can find out that for example a variable |
75 yes---if a compiler can find out that for example a variable |
76 will never be negative and this variable is used as an index |
76 will never be negative and this variable is used as an index |
77 for an array, then the compiler does not need to generate code |
77 for an array, then the compiler does not need to generate code |
78 for an underflow-test. Remember some languages are immune to |
78 for an underflow-check. Remember some languages are immune to |
79 buffer-overflow attacks, but they need to add underflow and |
79 buffer-overflow attacks, but they need to add underflow and |
80 overflow checks everywhere. If the compiler can omit the |
80 overflow checks everywhere. According to Regehr, an expert in |
81 underflow test, for example, then this can potentially |
81 the field of compilers, overflow checks can cause 5-10\% |
82 drastically speed up the generated code. According to the John |
82 slowdown, and in some languages even 100\% for tight |
83 Regehr, an expert in the field of compilers, overflow checks |
83 loops.\footnote{\url{http://blog.regehr.org/archives/1154}} If |
84 can cause 5-10\% slowdown, and in some languages even 100\% |
84 the compiler can omit the underflow check, for example, then |
85 for tight loops.\footnote{\url{http://blog.regehr.org/archives/1154}} |
85 this can potentially drastically speed up the generated code. |
86 |
86 |
87 What do programs in our simple programming language look like? |
87 What do programs in our simple programming language look like? |
88 The following grammar gives a first specification: |
88 The following grammar gives a first specification: |
89 |
89 |
90 \begin{multicols}{2} |
90 \begin{multicols}{2} |
263 statement is a label or not. As said before, if not, then we |
263 statement is a label or not. As said before, if not, then we |
264 just ``throw away'' the label and recursively calculate the |
264 just ``throw away'' the label and recursively calculate the |
265 snippets for the rest of the program. If yes, then we do the |
265 snippets for the rest of the program. If yes, then we do the |
266 same, but also update the map so that $label$ now points to |
266 same, but also update the map so that $label$ now points to |
267 the rest of the statements. There is one small problem we need |
267 the rest of the statements. There is one small problem we need |
268 to overcome: our two programs have no label as \emph{entry |
268 to overcome: our two programs shown so far have no label as |
269 point}---that is where the execution starts. We usually assume |
269 \emph{entry point}---that is where the execution is supposed |
270 that the first statement will be run first. To make this the |
270 to start. We usually assume that the first statement will be |
271 default, it is convenient if we add to all our programs a |
271 run first. To make this the default, it is convenient if we |
272 default label, say \pcode{""} (the empty string). With this we |
272 add to all our programs a default label, say \pcode{""} (the |
273 can define our pre-processing of programs as follows |
273 empty string). With this we can define our pre-processing of |
|
274 programs as follows |
274 |
275 |
275 \begin{center} |
276 \begin{center} |
276 $\textit{preproc}(prog) \dn \textit{snippets}(\pcode{"":}\;\; prog)$ |
277 $\textit{preproc}(prog) \dn \textit{snippets}(\pcode{"":}\;\; prog)$ |
277 \end{center} |
278 \end{center} |
278 |
279 |
367 \textit{eval\_exp} takes an expression of our programming |
368 \textit{eval\_exp} takes an expression of our programming |
368 language as input and returns a number as output. Therefore |
369 language as input and returns a number as output. Therefore |
369 whenever we have a number in our program, we just return this |
370 whenever we have a number in our program, we just return this |
370 number---this is defined in the first clause above. Whenever |
371 number---this is defined in the first clause above. Whenever |
371 we encounter an addition, well then we first evaluate the |
372 we encounter an addition, well then we first evaluate the |
372 left-hand side, $\texttt{e}_\texttt{1}$ of the addition (this |
373 left-hand side $\texttt{e}_\texttt{1}$ of the addition (this |
373 will give a number), then evaluate the right-hand side |
374 will give a number), then evaluate the right-hand side |
374 $\texttt{e}_\texttt{2}$ (this gives another number), and |
375 $\texttt{e}_\texttt{2}$ (this gives another number), and |
375 finally add both numbers together. Here is the subtlety: on |
376 finally add both numbers together. Here is the subtlety: on |
376 the left-hand side of the $\dn$ we have a \texttt{+} (in the |
377 the left-hand side of the $\dn$ we have a \texttt{+} (in the |
377 teletype font) which is the symbol for addition in our |
378 teletype font) which is the symbol for addition in our |
393 designing a programming language. You can see this here: we |
394 designing a programming language. You can see this here: we |
394 need to give meaning to symbols. |
395 need to give meaning to symbols. |
395 |
396 |
396 At the moment however, we are a poor fallible god. Look again |
397 At the moment however, we are a poor fallible god. Look again |
397 at the grammar of our programming language and our definition. |
398 at the grammar of our programming language and our definition. |
398 Clearly, an expression can contain variables. So far we we |
399 Clearly, an expression can contain variables. So far we have |
399 ignored them. What should our interpreter do with variables? |
400 ignored them. What should our interpreter do with variables? |
400 They might change during the evaluation of a program. For |
401 They might change during the evaluation of a program. For |
401 example the variable \pcode{n} in the factorial program counts |
402 example the variable \pcode{n} in the factorial program counts |
402 down from 5 up to 0. How can we improve our definition above |
403 down from 5 up to 0. How can we improve our definition above |
403 to give also an answer whenever our interpreter encounters a |
404 to give also an answer whenever our interpreter encounters a |
404 variable in an expression? The solution is to add an |
405 variable in an expression? The solution is to add an |
405 \emph{environment}, $env$ as an additional input argument to |
406 \emph{environment}, written $env$, as an additional input |
406 our \textit{eval\_exp} function. |
407 argument to our \textit{eval\_exp} function. |
407 |
408 |
408 \begin{center} |
409 \begin{center} |
409 \begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l} |
410 \begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l} |
410 $\textit{eval\_exp}(\texttt{n}, env)$ & $\dn$ & $n$ |
411 $\textit{eval\_exp}(\texttt{n}, env)$ & $\dn$ & $n$ |
411 \qquad\text{if}\; \texttt{n}\; \text{is a number like} |
412 \qquad\text{if}\; \texttt{n}\; \text{is a number like} |
429 $\textit{eval\_exp}(\texttt{x}, env)$ & $\dn$ & $env(x)$ |
430 $\textit{eval\_exp}(\texttt{x}, env)$ & $\dn$ & $env(x)$ |
430 \end{tabular} |
431 \end{tabular} |
431 \end{center} |
432 \end{center} |
432 |
433 |
433 \noindent This environment $env$ also acts like a map: it |
434 \noindent This environment $env$ also acts like a map: it |
434 associates variable names with the current value. In the |
435 associates variable with their current values. For example |
435 clause for variables, we therefore consult this environment |
436 such an environment might look as follows |
436 and return whatever value is currently stored for this |
437 |
437 variable. This is written $env(x)$ indicating that the |
438 \begin{center} |
438 environment acts like a function. If we call the function with |
439 \begin{tabular}{ll} |
439 $x$ we obtain the corresponding number. What happens if an |
440 $\fbox{\texttt{a}} \mapsto \texttt{1}$ & |
440 environment does not contain any value for, say, the variable |
441 $\fbox{\texttt{n}} \mapsto \texttt{5}$ |
441 $x$. Well, then our interpreter just crashes, or will raise an |
442 \end{tabular} |
442 exception. In this case we had a ``bad'' program that tried to |
443 \end{center} |
443 use a variable before it was initialised. With the second |
444 |
444 version of \textit{eval\_exp} we completed our definition for |
445 \noindent Again I hilighted the keys. In the clause for |
|
446 variables, we can therefore consult this environment and |
|
447 return whatever value is currently stored for this variable. |
|
448 This is written $env(x)$. If we query this map with $x$ we |
|
449 obtain the corresponding number. You might ask what happens if |
|
450 an environment does not contain any value for, say, the |
|
451 variable $x$? Well, then our interpreter just crashes, or will |
|
452 raise an exception. In this case we have a ``bad'' program |
|
453 that tried to use a variable before it was initialised. The |
|
454 programmer should not have don this. With the second version |
|
455 of \textit{eval\_exp} we completed our definition for |
445 evaluating expressions. |
456 evaluating expressions. |
446 |
457 |
447 Next comes the evaluation function for statements. We define |
458 Next comes the evaluation function for statements. We define |
448 this function in such a way that we evaluate a whole sequence |
459 this function in such a way that we evaluate a whole sequence |
449 of statements. Assume a program $p$ and its preprocessed |
460 of statements. Assume a program $p$ and its preprocessed |