handouts/ho09.tex
changeset 368 b46f86d95967
parent 366 34a8f73b2c94
child 369 6c7996b6b471
equal deleted inserted replaced
367:3f0738fc8230 368:b46f86d95967
   180 complexity is really beside the point here. Furthermore, real
   180 complexity is really beside the point here. Furthermore, real
   181 machine code has many instructions for manipulating memory. We
   181 machine code has many instructions for manipulating memory. We
   182 do not have this at all. This is actually a more serious
   182 do not have this at all. This is actually a more serious
   183 simplification because we assume numbers to be arbitrary small
   183 simplification because we assume numbers to be arbitrary small
   184 or large, which is not the case with real machine code. In
   184 or large, which is not the case with real machine code. In
   185 real code basic number formats have a range and might
   185 real machine code, basic number formats have a range and might
   186 over-flow or under-flow from this range. Also the number of
   186 over-flow or under-flow this range. Also the number of
   187 variables in our programs is potentially unlimited, while
   187 variables in our programs is potentially unlimited, while
   188 memory in an actual computer, of course, is always limited
   188 memory in an actual computer, of course, is always limited. To
   189 somehow on any actual. To sum up, our language might look
   189 sum up, our language might look ridiculously simple, but it is
   190 ridiculously simple, but it is not far removed from
   190 not too far removed from practically relevant issues.
   191 practically relevant issues.
       
   192 
   191 
   193 
   192 
   194 \subsubsection*{An Interpreter}
   193 \subsubsection*{An Interpreter}
   195 
   194 
   196 Designing a language is like playing god: you can say what
   195 Designing a language is like playing god: you can say what
   264 \noindent In the first clause we just return the empty map for
   263 \noindent In the first clause we just return the empty map for
   265 the program that does not contain any statement. In the second
   264 the program that does not contain any statement. In the second
   266 clause, we have to distinguish the case where the first
   265 clause, we have to distinguish the case where the first
   267 statement is a label or not. As said before, if not, then we
   266 statement is a label or not. As said before, if not, then we
   268 just ``throw away'' the label and recursively calculate the
   267 just ``throw away'' the label and recursively calculate the
   269 snippets for the rest of the program. If yes, then we do the
   268 snippets for the rest of the program (the otherwise clause).
   270 same, but also update the map so that $label$ now points to
   269 If yes, then we do the same, but also update the map so that
   271 the rest of the statements. There is one small problem we need
   270 $label$ now points to the rest of the statements (the if
   272 to overcome: our two programs shown so far have no label as
   271 clause). This looks all realtively straightforward, but there
   273 \emph{entry point}---that is where the execution is supposed
   272 is one small problem we need to overcome: our two programs
   274 to start. We usually assume that the first statement will be
   273 shown so far have no label as \emph{entry point}---that is
   275 run first. To make this the default, it is convenient if we
   274 where the execution is supposed to start. We usually assume
   276 add to all our programs a default label, say \pcode{""} (the
   275 that the first statement will be run first. To make this the
   277 empty string). With this we can define our pre-processing of
   276 default, it is convenient if we add to all our programs a
   278 programs as follows
   277 default label, say \pcode{""} (the empty string). With this we
       
   278 can define our pre-processing of programs as follows
   279 
   279 
   280 \begin{center}
   280 \begin{center}
   281 $\textit{preproc}(prog) \dn \textit{snippets}(\pcode{"":}\;\; prog)$ 
   281 $\textit{preproc}(prog) \dn \textit{snippets}(\pcode{"":}\;\; prog)$ 
   282 \end{center} 
   282 \end{center} 
   283 
   283 
   364   0 & \text{otherwise}
   364   0 & \text{otherwise}
   365   \end{cases}$          
   365   \end{cases}$          
   366 \end{tabular}
   366 \end{tabular}
   367 \end{center}
   367 \end{center}
   368 
   368 
   369 \noindent While this should look all relatively
   369 \noindent While this should look all rather intuitive`, still
   370 straightforward, still be very careful. There is a subtlety
   370 be very careful. There is a subtlety which can be easily
   371 which can be easily overlooked: The function
   371 overlooked: The function \textit{eval\_exp} takes an
   372 \textit{eval\_exp} takes an expression of our programming
   372 expression of our programming language as input and returns a
   373 language as input and returns a number as output. Therefore
   373 number as output. Therefore whenever we encounter a number in
   374 whenever we have a number in our program, we just return this
   374 our program, we just return this number---this is defined in
   375 number---this is defined in the first clause above. Whenever
   375 the first clause above. Whenever we encounter an addition,
   376 we encounter an addition, well then we first evaluate the
   376 well then we first evaluate the left-hand side
   377 left-hand side $\texttt{e}_\texttt{1}$ of the addition (this
   377 $\texttt{e}_\texttt{1}$ of the addition (this will give a
   378 will give a number), then evaluate the right-hand side
   378 number), then evaluate the right-hand side
   379 $\texttt{e}_\texttt{2}$ (this gives another number), and
   379 $\texttt{e}_\texttt{2}$ (this gives another number), and
   380 finally add both numbers together. Here is the subtlety: on
   380 finally add both numbers together. Here is the subtlety: on
   381 the left-hand side of the $\dn$ we have a \texttt{+} (in the
   381 the left-hand side of the $\dn$ we have a \texttt{+} (in the
   382 teletype font) which is the symbol for addition in our
   382 teletype font) which is the symbol for addition in our
   383 programming language. On the right-hand side we have $+$ which
   383 programming language. On the right-hand side we have $+$ which
   388 implementation of our interpreter, the mathematical operation
   388 implementation of our interpreter, the mathematical operation
   389 will be the function for addition from the programming
   389 will be the function for addition from the programming
   390 language in which we \underline{\smash{implement}} our
   390 language in which we \underline{\smash{implement}} our
   391 interpreter. While the \texttt{+} is just a symbol that is
   391 interpreter. While the \texttt{+} is just a symbol that is
   392 used in our programming language. Clearly we have to use a
   392 used in our programming language. Clearly we have to use a
   393 symbol that is a good mnemonic for addition otherwise we will
   393 symbol that is a good mnemonic for addition, otherwise we will
   394 confuse the programmers working with our language. Therefore
   394 confuse the programmers working with our language. Therefore
   395 we use $\texttt{+}$. A similar choice is made for times in the
   395 we use $\texttt{+}$. A similar choice is made for times in the
   396 third clause and equality in the fourth clause. Remember I
   396 third clause and equality in the fourth clause. 
   397 wrote at the beginning of this section about being god when
   397 
   398 designing a programming language. You can see this here: we
   398 Remember I wrote at the beginning of this section about being
   399 need to give meaning to symbols. 
   399 god when designing a programming language. You can see this
   400 
   400 here: we need to give meaning to symbols. At the moment
   401 At the moment however, we are a poor fallible god. Look again
   401 however, we are a poor fallible god. Look again at the grammar
   402 at the grammar of our programming language and our definition.
   402 of our programming language and our definition. Clearly, an
   403 Clearly, an expression can contain variables. So far we have
   403 expression can contain variables. So far we have ignored them.
   404 ignored them. What should our interpreter do with variables?
   404 What should our interpreter do with variables? They might
   405 They might change during the evaluation of a program. For
   405 change during the evaluation of a program. For example the
   406 example the variable \pcode{n} in the factorial program counts
   406 variable \pcode{n} in the factorial program counts down from 5
   407 down from 5 up to 0. How can we improve our definition above
   407 down to 0. How can we improve our definition above to give also
   408 to give also an answer whenever our interpreter encounters a
   408 an answer whenever our interpreter encounters a variable in an
   409 variable in an expression? The solution is to add an
   409 expression? The solution is to add an \emph{environment},
   410 \emph{environment}, written $env$, as an additional input
   410 written $env$, as an additional input argument to our
   411 argument to our \textit{eval\_exp} function.
   411 \textit{eval\_exp} function.
   412  
   412  
   413 \begin{center}
   413 \begin{center}
   414 \begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l}
   414 \begin{tabular}{l@{\hspace{1mm}}c@{\hspace{1mm}}l}
   415 $\textit{eval\_exp}(\texttt{n}, env)$ & $\dn$ & $n$
   415 $\textit{eval\_exp}(\texttt{n}, env)$ & $\dn$ & $n$
   416   \qquad\text{if}\; \texttt{n}\; \text{is a number like} 
   416   \qquad\text{if}\; \texttt{n}\; \text{is a number like} 
   434 $\textit{eval\_exp}(\texttt{x}, env)$ & $\dn$ & $env(x)$      
   434 $\textit{eval\_exp}(\texttt{x}, env)$ & $\dn$ & $env(x)$      
   435 \end{tabular}
   435 \end{tabular}
   436 \end{center}
   436 \end{center}
   437  
   437  
   438 \noindent This environment $env$ also acts like a map: it
   438 \noindent This environment $env$ also acts like a map: it
   439 associates variable with their current values. For example
   439 associates variables with their current values. For example
   440 ߆after evaluating the first two lines in our factorial
   440 after evaluating the first two lines in our factorial
   441 program, such an environment might look as follows
   441 program, such an environment might look as follows
   442 
   442 
   443 \begin{center}
   443 \begin{center}
   444 \begin{tabular}{ll}
   444 \begin{tabular}{ll}
   445 $\fbox{\texttt{a}} \mapsto \texttt{1}$ &
   445 $\fbox{\texttt{a}} \mapsto \texttt{1}$ &
   448 \end{center}
   448 \end{center}
   449 
   449 
   450 \noindent Again I highlighted the keys. In the clause for
   450 \noindent Again I highlighted the keys. In the clause for
   451 variables, we can therefore consult this environment and
   451 variables, we can therefore consult this environment and
   452 return whatever value is currently stored for this variable.
   452 return whatever value is currently stored for this variable.
   453 This is written $env(x)$. If we query this map with $x$ we
   453 This is written $env(x)$---meaning we query this map with $x$
   454 obtain the corresponding number. You might ask what happens if
   454 we obtain the corresponding number. You might ask what happens
   455 an environment does not contain any value for, say, the
   455 if an environment does not contain any value for, say, the
   456 variable $x$? Well, then our interpreter just ``crashes'', or
   456 variable $x$? Well, then our interpreter just ``crashes'', or
   457 more precisely will raise an exception. In this case we have a
   457 more precisely will raise an exception. In this case we have a
   458 ``bad'' program that tried to use a variable before it was
   458 ``bad'' program that tried to use a variable before it was
   459 initialised. The programmer should not have done this. In a
   459 initialised. The programmer should not have done this. In a
   460 real programming language we would of course try a bit harder
   460 real programming language, we would of course try a bit harder
   461 and for example give an error at compile time, or design our
   461 and for example give an error at compile time, or design our
   462 language in such a way that this can never happen. With the
   462 language in such a way that this can never happen. With the
   463 second version of \textit{eval\_exp} we completed our
   463 second version of \textit{eval\_exp} we completed our
   464 definition for evaluating expressions.
   464 definition for evaluating expressions.
   465  
   465  
   491 \noindent The first clause is for the empty program, or when
   491 \noindent The first clause is for the empty program, or when
   492 we arrived at the end of the program. In this case we just
   492 we arrived at the end of the program. In this case we just
   493 return the environment. The second clause is for when the next
   493 return the environment. The second clause is for when the next
   494 statement is a label. That means the program is of the form
   494 statement is a label. That means the program is of the form
   495 $\texttt{label:}\;rest$ where the label is some string and
   495 $\texttt{label:}\;rest$ where the label is some string and
   496 $rest$ stands for all following statement. This case is easy,
   496 $rest$ stands for all following statements. This case is easy,
   497 because our evaluation function just discards the label and
   497 because our evaluation function just discards the label and
   498 evaluates the rest of the statements (we already extracted all
   498 evaluates the rest of the statements (we already extracted all
   499 important information about labels when we pre-processed our
   499 important information about labels when we pre-processed our
   500 programs and generated the snippets). The third clause is for
   500 programs and generated the snippets). The third clause is for
   501 variable assignments. Again we just evaluate the rest for the
   501 variable assignments. Again we just evaluate the rest for the
   505 of the environment we first evaluate the expression
   505 of the environment we first evaluate the expression
   506 $\texttt{e}$ using our evaluation function for expressions.
   506 $\texttt{e}$ using our evaluation function for expressions.
   507 This gives us a number. Then we assign this number to the
   507 This gives us a number. Then we assign this number to the
   508 variable $x$ in the environment. This modified environment
   508 variable $x$ in the environment. This modified environment
   509 will be used to evaluate the rest of the program. The fourth
   509 will be used to evaluate the rest of the program. The fourth
   510 clause is for the unconditional jump to a label. That means we
   510 clause is for the unconditional jump to a label, called
   511 have to look up in our snippets map $sn$ what are the next
   511 \texttt{lbl}. That means we have to look up in our snippets
   512 statements for this label are. Therefore we will continue with
   512 map $sn$ what are the next statements for this label.
   513 evaluating, not with the rest of the program, but with the
   513 Therefore we will continue with evaluating, not with the rest
   514 statements stored in the snippets-map under the label
   514 of the program, but with the statements stored in the
   515 $\texttt{lbl}$. The fifth clause for conditional jumps is
   515 snippets-map under the label $\texttt{lbl}$. The fifth clause
   516 similar, but in order to make the jump we first need to
   516 for conditional jumps is similar, but to decide whether to
   517 evaluate the expression $\texttt{e}$ in order to find out
   517 make the jump we first need to evaluate the expression
   518 whether it is $1$. If yes, we jump, otherwise we just continue
   518 $\texttt{e}$ in order to find out whether it is $1$. If yes,
   519 with evaluating the rest of the program.
   519 we jump, otherwise we just continue with evaluating the rest
       
   520 of the program.
   520 
   521 
   521 Our interpreter works in two stages: First we pre-process our
   522 Our interpreter works in two stages: First we pre-process our
   522 program generating the snippets map $sn$, say. Second we call
   523 program generating the snippets map $sn$, say. Second we call
   523 the evaluation function with the default entry point and the
   524 the evaluation function with the default entry point and the
   524 empty environment:
   525 empty environment:
   543 $\fbox{\texttt{n}} \mapsto \texttt{0}$
   544 $\fbox{\texttt{n}} \mapsto \texttt{0}$
   544 \end{tabular}
   545 \end{tabular}
   545 \end{center}
   546 \end{center}
   546 
   547 
   547 \noindent While the discussion above should have illustrated
   548 \noindent While the discussion above should have illustrated
   548 the ideas, in order to do some serious calculation we clearly
   549 the ideas, in order to do some serious calculations, we clearly
   549 need to implement the interpreter.
   550 need to implement the interpreter.
   550 
   551 
   551 
   552 
   552 \subsubsection*{Code of the Interpreter}
   553 \subsubsection*{Scala Code for the Interpreter}
   553 
   554 
   554 Functional programming languages are very convenient for
   555 Functional programming languages are very convenient for
   555 implementations of interpreters. A convenient choice for a
   556 implementations of interpreters. A good choice for a
   556 functional programming language is Scala. This is a
   557 functional programming language is Scala, a programming
   557 programming language that combines functional and
   558 language that combines functional and object-oriented
   558 object-oriented programming-styles. It has received in the
   559 programming-styles. It has received in the last five years or
   559 last five years or so quite a bit of attention. One reason for
   560 so quite a bit of attention. One reason for this attention is
   560 this attention is that, like the Java programming language,
   561 that, like the Java programming language, Scala compiles to
   561 Scala compiles to the Java Virtual Machine (JVM) and therefore
   562 the Java Virtual Machine (JVM) and therefore Scala programs
   562 Scala programs can run under MacOSX, Linux and
   563 can run under MacOSX, Linux and Windows.\footnote{There are
   563 Windows.\footnote{There are also experimental backends for
   564 also experimental backends for Android and JavaScript.} Unlike
   564 Android and JavaScript.} Unlike Java, however, Scala often
   565 Java, however, Scala often allows programmers to write very
   565 allows programmers to write very concise and elegant code.
   566 concise and elegant code. Some therefore say Scala is the much
   566 Some therefore say Scala is the much better Java. A number of
   567 better Java. A number of companies, The Guardian, Twitter,
   567 companies, The Guardian, Twitter, Coursera, FourSquare,
   568 Coursera, FourSquare, LinkedIn to name a few, either use Scala
   568 LinkedIn to name a few, either use Scala exclusively in
   569 exclusively in production code, or at least to some
   569 production code, or at least to some substantial degree. If
   570 substantial degree. If you want to try out Scala yourself, the
   570 you want to try out Scala yourself, the Scala compiler can be
   571 Scala compiler can be downloaded from
   571 downloaded from
       
   572 
   572 
   573 \begin{quote}
   573 \begin{quote}
   574 \url{http://www.scala-lang.org}
   574 \url{http://www.scala-lang.org}
   575 \end{quote}
   575 \end{quote}
   576 
   576