Paper/Paper.thy
changeset 50 816e84ca16d6
parent 49 b388dceee892
child 52 2cb1e4499983
equal deleted inserted replaced
49:b388dceee892 50:816e84ca16d6
     4 begin
     4 begin
     5 
     5 
     6 (*
     6 (*
     7 hide_const (open) s 
     7 hide_const (open) s 
     8 *)
     8 *)
       
     9 
       
    10 
       
    11 hide_const (open) Divides.adjust
     9 
    12 
    10 abbreviation
    13 abbreviation
    11   "update2 p a \<equiv> update a p"
    14   "update2 p a \<equiv> update a p"
    12 
    15 
    13 consts DUMMY::'a
    16 consts DUMMY::'a
    29 declare [[show_question_marks = false]]
    32 declare [[show_question_marks = false]]
    30 (*>*)
    33 (*>*)
    31 
    34 
    32 section {* Introduction *}
    35 section {* Introduction *}
    33 
    36 
       
    37 
       
    38 
    34 text {*
    39 text {*
    35 
    40 
    36 %\noindent
    41 %\noindent
    37 %We formalised in earlier work the correctness proofs for two
    42 %We formalised in earlier work the correctness proofs for two
    38 %algorithms in Isabelle/HOL---one about type-checking in
    43 %algorithms in Isabelle/HOL---one about type-checking in
    44 %formalise in Isabelle/HOL computability arguments about the
    49 %formalise in Isabelle/HOL computability arguments about the
    45 %algorithms. 
    50 %algorithms. 
    46 
    51 
    47 
    52 
    48 \noindent
    53 \noindent
    49 Suppose you want to mechanise a proof whether a predicate @{term P}, say, is
    54 Suppose you want to mechanise a proof about whether a predicate @{term P}, say, is
    50 decidable or not. Decidability of @{text P} usually amounts to showing
    55 decidable or not. Decidability of @{text P} usually amounts to showing
    51 whether \mbox{@{term "P \<or> \<not>P"}} holds. But this does \emph{not} work
    56 whether \mbox{@{term "P \<or> \<not>P"}} holds. But this does \emph{not} work
    52 in Isabelle/HOL and other HOL theorem provers, since they are based on classical logic
    57 in Isabelle/HOL and other HOL theorem provers, since they are based on classical logic
    53 where the law of excluded middle ensures that \mbox{@{term "P \<or> \<not>P"}}
    58 where the law of excluded middle ensures that \mbox{@{term "P \<or> \<not>P"}}
    54 is always provable no matter whether @{text P} is constructed by
    59 is always provable no matter whether @{text P} is constructed by
    59 %Isabelle/HOL, like in all HOL-based theorem provers, functions are
    64 %Isabelle/HOL, like in all HOL-based theorem provers, functions are
    60 %represented as inductively defined predicates too.
    65 %represented as inductively defined predicates too.
    61 
    66 
    62 The only satisfying way out of this problem in a theorem prover based
    67 The only satisfying way out of this problem in a theorem prover based
    63 on classical logic is to formalise a theory of computability. Norrish
    68 on classical logic is to formalise a theory of computability. Norrish
    64 provided such a formalisation for the HOL4 theorem prover. He choose
    69 provided such a formalisation for the HOL4. He choose
    65 the $\lambda$-calculus as the starting point for his formalisation of
    70 the $\lambda$-calculus as the starting point for his formalisation of
    66 computability theory, because of its ``simplicity'' \cite[Page
    71 computability theory, because of its ``simplicity'' \cite[Page
    67 297]{Norrish11}.  Part of his formalisation is a clever infrastructure
    72 297]{Norrish11}.  Part of his formalisation is a clever infrastructure
    68 for reducing $\lambda$-terms. He also established the computational
    73 for reducing $\lambda$-terms. He also established the computational
    69 equivalence between the $\lambda$-calculus and recursive functions.
    74 equivalence between the $\lambda$-calculus and recursive functions.
    70 Nevertheless he concluded that it would be ``appealing''
    75 Nevertheless he concluded that it would be ``appealing''
    71  to have formalisations for more operational models of
    76  to have formalisations for more operational models of
    72 computations, such as Turing machines or register machines.  One
    77 computations, such as Turing machines or register machines.  One
    73 reason is that many proofs in the literature use them.  He noted
    78 reason is that many proofs in the literature use them.  He noted
    74 however that in the context of theorem provers \cite[Page 310]{Norrish11}:
    79 however that \cite[Page 310]{Norrish11}:
    75 
    80 
    76 \begin{quote}
    81 \begin{quote}
    77 \it``If register machines are unappealing because of their 
    82 \it``If register machines are unappealing because of their 
    78 general fiddliness,\\ Turing machines are an even more 
    83 general fiddliness,\\ Turing machines are an even more 
    79 daunting prospect.''
    84 daunting prospect.''
    83 In this paper we take on this daunting prospect and provide a
    88 In this paper we take on this daunting prospect and provide a
    84 formalisation of Turing machines, as well as abacus machines (a kind
    89 formalisation of Turing machines, as well as abacus machines (a kind
    85 of register machines) and recursive functions. To see the difficulties
    90 of register machines) and recursive functions. To see the difficulties
    86 involved with this work, one has to understand that Turing machine
    91 involved with this work, one has to understand that Turing machine
    87 programs can be completely \emph{unstructured}, behaving 
    92 programs can be completely \emph{unstructured}, behaving 
    88 similar to Basic's infamous goto. This precludes in the
    93 similar to Basic's infamous goto \cite{Dijkstra68}. This precludes in the
    89 general case a compositional Hoare-style reasoning about Turing
    94 general case a compositional Hoare-style reasoning about Turing
    90 programs.  We provide such Hoare-rules for when it is possible to
    95 programs.  We provide such Hoare-rules for when it is possible to
    91 reason in a compositional manner (which is fortunately quite often), but also tackle 
    96 reason in a compositional manner (which is fortunately quite often), but also tackle 
    92 the more complicated case when we translate abacus programs into 
    97 the more complicated case when we translate abacus programs into 
    93 Turing programs.  This aspect of reasoning about computability theory 
    98 Turing programs.  This aspect of reasoning about computability theory 
   150 \end{quote}
   155 \end{quote}
   151 
   156 
   152 \noindent
   157 \noindent
   153 In this paper we follow the approach by Boolos et al \cite{Boolos87},
   158 In this paper we follow the approach by Boolos et al \cite{Boolos87},
   154 which goes back to Post \cite{Post36}, where all Turing machines
   159 which goes back to Post \cite{Post36}, where all Turing machines
   155 operate on tapes that contain only \emph{blank} or \emph{occupied} cells
   160 operate on tapes that contain only \emph{blank} or \emph{occupied} cells. 
   156 (represented by @{term Bk} and @{term Oc}, respectively, in our
   161 Traditionally the content of a cell can be any
   157 formalisation). Traditionally the content of a cell can be any
       
   158 character from a finite alphabet. Although computationally equivalent,
   162 character from a finite alphabet. Although computationally equivalent,
   159 the more restrictive notion of Turing machines in \cite{Boolos87} makes
   163 the more restrictive notion of Turing machines in \cite{Boolos87} makes
   160 the reasoning more uniform. In addition some proofs \emph{about} Turing
   164 the reasoning more uniform. In addition some proofs \emph{about} Turing
   161 machines are simpler.  The reason is that one often needs to encode
   165 machines are simpler.  The reason is that one often needs to encode
   162 Turing machines---consequently if the Turing machines are simpler, then the coding
   166 Turing machines---consequently if the Turing machines are simpler, then the coding
   184 *}
   188 *}
   185 
   189 
   186 section {* Turing Machines *}
   190 section {* Turing Machines *}
   187 
   191 
   188 text {* \noindent
   192 text {* \noindent
   189   Turing machines can be thought of as having a read-write-unit, also
   193   Turing machines can be thought of as having a \emph{head},
   190   referred to as \emph{head},
       
   191   ``gliding'' over a potentially infinite tape. Boolos et
   194   ``gliding'' over a potentially infinite tape. Boolos et
   192   al~\cite{Boolos87} only consider tapes with cells being either blank
   195   al~\cite{Boolos87} only consider tapes with cells being either blank
   193   or occupied, which we represent by a datatype having two
   196   or occupied, which we represent by a datatype having two
   194   constructors, namely @{text Bk} and @{text Oc}.  One way to
   197   constructors, namely @{text Bk} and @{text Oc}.  One way to
   195   represent such tapes is to use a pair of lists, written @{term "(l,
   198   represent such tapes is to use a pair of lists, written @{term "(l,
   196   r)"}, where @{term l} stands for the tape on the left-hand side of the
   199   r)"}, where @{term l} stands for the tape on the left-hand side of the
   197   head and @{term r} for the tape on the right-hand side. We have the
   200   head and @{term r} for the tape on the right-hand side. We have the
   198   convention that the head, abbreviated @{term hd}, of the right-list is
   201   convention that the head, abbreviated @{term hd}, of the right-list is
   199   the cell on which the head of the Turing machine currently operates. This can
   202   the cell on which the head of the Turing machine currently operates. This can
   200   be pictured as follows:
   203   be pictured as follows:
   201 
   204   %
   202   \begin{center}
   205   \begin{center}
   203   \begin{tikzpicture}
   206   \begin{tikzpicture}
   204   \draw[very thick] (-3.0,0)   -- ( 3.0,0);
   207   \draw[very thick] (-3.0,0)   -- ( 3.0,0);
   205   \draw[very thick] (-3.0,0.5) -- ( 3.0,0.5);
   208   \draw[very thick] (-3.0,0.5) -- ( 3.0,0.5);
   206   \draw[very thick] (-0.25,0)   -- (-0.25,0.5);
   209   \draw[very thick] (-0.25,0)   -- (-0.25,0.5);
   231   whenever the head goes over the ``edge'' of the tape. To 
   234   whenever the head goes over the ``edge'' of the tape. To 
   232   make this formal we define five possible \emph{actions} 
   235   make this formal we define five possible \emph{actions} 
   233   the Turing machine can perform:
   236   the Turing machine can perform:
   234 
   237 
   235   \begin{center}
   238   \begin{center}
   236   \begin{tabular}{rcl@ {\hspace{5mm}}l}
   239   \begin{tabular}[t]{@ {}rcl@ {\hspace{2mm}}l}
   237   @{text "a"} & $::=$  & @{term "W0"} & write blank (@{term Bk})\\
   240   @{text "a"} & $::=$  & @{term "W0"} & (write blank, @{term Bk})\\
   238   & $\mid$ & @{term "W1"} & write occupied (@{term Oc})\\
   241   & $\mid$ & @{term "W1"} & (write occupied, @{term Oc})\\
   239   & $\mid$ & @{term L} & move left\\
   242   \end{tabular}
   240   & $\mid$ & @{term R} & move right\\
   243   \begin{tabular}[t]{rcl@ {\hspace{2mm}}l}
   241   & $\mid$ & @{term Nop} & do-nothing operation\\
   244   & $\mid$ & @{term L} & (move left)\\
       
   245   & $\mid$ & @{term R} & (move right)\\
       
   246   \end{tabular}
       
   247   \begin{tabular}[t]{rcl@ {\hspace{2mm}}l@ {}}
       
   248   & $\mid$ & @{term Nop} & (do-nothing operation)\\
   242   \end{tabular}
   249   \end{tabular}
   243   \end{center}
   250   \end{center}
   244 
   251 
   245   \noindent
   252   \noindent
   246   We slightly deviate
   253   We slightly deviate
   261 
   268 
   262   \noindent
   269   \noindent
   263   The first two clauses replace the head of the right-list
   270   The first two clauses replace the head of the right-list
   264   with a new @{term Bk} or @{term Oc}, respectively. To see that
   271   with a new @{term Bk} or @{term Oc}, respectively. To see that
   265   these two clauses make sense in case where @{text r} is the empty
   272   these two clauses make sense in case where @{text r} is the empty
   266   list, one has to know that the tail function, @{term tl}, is defined in
   273   list, one has to know that the tail function, @{term tl}, is defined
   267   Isabelle/HOL
       
   268   such that @{term "tl [] == []"} holds. The third clause 
   274   such that @{term "tl [] == []"} holds. The third clause 
   269   implements the move of the head one step to the left: we need
   275   implements the move of the head one step to the left: we need
   270   to test if the left-list @{term l} is empty; if yes, then we just prepend a 
   276   to test if the left-list @{term l} is empty; if yes, then we just prepend a 
   271   blank cell to the right-list; otherwise we have to remove the
   277   blank cell to the right-list; otherwise we have to remove the
   272   head from the left-list and prepend it to the right-list. Similarly
   278   head from the left-list and prepend it to the right-list. Similarly
   273   in the fourth clause for a right move action. The @{term Nop} operation
   279   in the fourth clause for a right move action. The @{term Nop} operation
   274   leaves the the tape unchanged (last clause).
   280   leaves the the tape unchanged.
   275 
   281 
   276   %Note that our treatment of the tape is rather ``unsymmetric''---we
   282   %Note that our treatment of the tape is rather ``unsymmetric''---we
   277   %have the convention that the head of the right-list is where the
   283   %have the convention that the head of the right-list is where the
   278   %head is currently positioned. Asperti and Ricciotti
   284   %head is currently positioned. Asperti and Ricciotti
   279   %\cite{AspertiRicciotti12} also considered such a representation, but
   285   %\cite{AspertiRicciotti12} also considered such a representation, but
   289   %by a right-move being the identity on tapes. Since we are not using
   295   %by a right-move being the identity on tapes. Since we are not using
   290   %the notion of tape equality, we can get away with the unsymmetric
   296   %the notion of tape equality, we can get away with the unsymmetric
   291   %definition above, and by using the @{term update} function
   297   %definition above, and by using the @{term update} function
   292   %cover uniformly all cases including corner cases.
   298   %cover uniformly all cases including corner cases.
   293 
   299 
   294   Next we need to define the \emph{states} of a Turing machine.  Given
   300   Next we need to define the \emph{states} of a Turing machine.  
   295   how little is usually said about how to represent them in informal
   301   %Given
   296   presentations, it might be surprising that in a theorem prover we
   302   %how little is usually said about how to represent them in informal
   297   have to select carefully a representation. If we use the naive
   303   %presentations, it might be surprising that in a theorem prover we
   298   representation where a Turing machine consists of a finite set of
   304   %have to select carefully a representation. If we use the naive
   299   states, then we will have difficulties composing two Turing
   305   %representation where a Turing machine consists of a finite set of
   300   machines: we would need to combine two finite sets of states,
   306   %states, then we will have difficulties composing two Turing
   301   possibly renaming states apart whenever both machines share
   307   %machines: we would need to combine two finite sets of states,
   302   states.\footnote{The usual disjoint union operation in Isabelle/HOL
   308   %possibly renaming states apart whenever both machines share
   303   cannot be used as it does not preserve types.} This renaming can be
   309   %states.\footnote{The usual disjoint union operation in Isabelle/HOL
   304   quite cumbersome to reason about. Therefore we made the choice of
   310   %cannot be used as it does not preserve types.} This renaming can be
       
   311   %quite cumbersome to reason about. 
       
   312   We followed the choice made by \cite{AspertiRicciotti12} 
   305   representing a state by a natural number and the states of a Turing
   313   representing a state by a natural number and the states of a Turing
   306   machine will always consist of the initial segment of natural
   314   machine by the initial segment of natural numbers starting from @{text 0}.
   307   numbers starting from @{text 0} up to the number of states of the
   315   In doing so we can compose two Turing machine by
   308   machine. In doing so we can compose two Turing machine by
       
   309   shifting the states of one by an appropriate amount to a higher
   316   shifting the states of one by an appropriate amount to a higher
   310   segment and adjusting some ``next states'' in the other.
   317   segment and adjusting some ``next states'' in the other. {\it composition here?}
   311 
   318 
   312   An \emph{instruction} @{term i} of a Turing machine is a pair consisting of 
   319   An \emph{instruction} @{term i} of a Turing machine is a pair consisting of 
   313   an action and a natural number (the next state). A \emph{program} @{term p} of a Turing
   320   an action and a natural number (the next state). A \emph{program} @{term p} of a Turing
   314   machine is then a list of such pairs. Using as an example the following Turing machine
   321   machine is then a list of such pairs. Using as an example the following Turing machine
   315   program, which consists of four instructions
   322   program, which consists of four instructions
   331   segment determines what action should be taken and which next state
   338   segment determines what action should be taken and which next state
   332   should be transitioned to in case the head reads a @{term Bk};
   339   should be transitioned to in case the head reads a @{term Bk};
   333   similarly the second component determines what should be done in
   340   similarly the second component determines what should be done in
   334   case of reading @{term Oc}. We have the convention that the first
   341   case of reading @{term Oc}. We have the convention that the first
   335   state is always the \emph{starting state} of the Turing machine.
   342   state is always the \emph{starting state} of the Turing machine.
   336   The zeroth state is special in that it will be used as the
   343   The @{text 0}-state is special in that it will be used as the
   337   ``halting state''.  There are no instructions for the @{text
   344   ``halting state''.  There are no instructions for the @{text
   338   0}-state, but it will always perform a @{term Nop}-operation and
   345   0}-state, but it will always perform a @{term Nop}-operation and
   339   remain in the @{text 0}-state.  Unlike Asperti and Riccioti
   346   remain in the @{text 0}-state.  Unlike Asperti and Riccioti
   340   \cite{AspertiRicciotti12}, we have chosen a very concrete
   347   \cite{AspertiRicciotti12}, we have chosen a very concrete
   341   representation for programs, because when constructing a universal
   348   representation for programs, because when constructing a universal
   350  
   357  
   351   \begin{center}
   358   \begin{center}
   352   \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
   359   \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
   353   \multicolumn{3}{l}{@{thm fetch.simps(1)[where b=DUMMY]}}\\
   360   \multicolumn{3}{l}{@{thm fetch.simps(1)[where b=DUMMY]}}\\
   354   @{thm (lhs) fetch.simps(2)} & @{text "\<equiv>"} & @{text "case nth_of p (2 * s) of"}\\
   361   @{thm (lhs) fetch.simps(2)} & @{text "\<equiv>"} & @{text "case nth_of p (2 * s) of"}\\
   355   \multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "None \<Rightarrow> (Nop, 0) | Some i \<Rightarrow> i"}}\\
   362   \multicolumn{3}{@ {\hspace{4cm}}l}{@{text "None \<Rightarrow> (Nop, 0) | Some i \<Rightarrow> i"}}\\
   356   @{thm (lhs) fetch.simps(3)} & @{text "\<equiv>"} & @{text "case nth_of p (2 * s + 1) of"}\\
   363   @{thm (lhs) fetch.simps(3)} & @{text "\<equiv>"} & @{text "case nth_of p (2 * s + 1) of"}\\
   357   \multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "None \<Rightarrow> (Nop, 0) | Some i \<Rightarrow> i"}}
   364   \multicolumn{3}{@ {\hspace{4cm}}l}{@{text "None \<Rightarrow> (Nop, 0) | Some i \<Rightarrow> i"}}
   358   \end{tabular}
   365   \end{tabular}
   359   \end{center}
   366   \end{center}
   360 
   367 
   361   \noindent
   368   \noindent
   362   In this definition the function @{term nth_of} returns the @{text n}th element
   369   In this definition the function @{term nth_of} returns the @{text n}th element
   363   from a list, provided it exists (@{term Some}-case), or if it does not, it
   370   from a list, provided it exists (@{term Some}-case), or if it does not, it
   364   returns the default action @{term Nop} and the default state @{text 0} 
   371   returns the default action @{term Nop} and the default state @{text 0} 
   365   (@{term None}-case). In doing so we slightly deviate from the description
   372   (@{term None}-case). In doing so we slightly deviate from the description
   366   in \cite{Boolos87}: if their Turing machines transition to a non-existing
   373   in \cite{Boolos87}: if their Turing machines transition to a non-existing
   367   state, then the computation is halted. We will transition in such cases
   374   state, then the computation is halted. We will transition in such cases
   368   to the @{text 0}-state. However, with introducing the
   375   to the @{text 0}-state.\footnote{\it However, with introducing the
   369   notion of \emph{well-formed} Turing machine programs we will later exclude such
   376   notion of \emph{well-formed} Turing machine programs we will later exclude such
   370   cases and make the  @{text 0}-state the only ``halting state''. A program 
   377   cases and make the  @{text 0}-state the only ``halting state''. A program 
   371   @{term p} is said to be well-formed if it satisfies
   378   @{term p} is said to be well-formed if it satisfies
   372   the following three properties:
   379   the following three properties:
   373 
   380 
   382   \noindent
   389   \noindent
   383   The first says that @{text p} must have at least an instruction for the starting 
   390   The first says that @{text p} must have at least an instruction for the starting 
   384   state; the second that @{text p} has a @{term Bk} and @{term Oc} instruction for every 
   391   state; the second that @{text p} has a @{term Bk} and @{term Oc} instruction for every 
   385   state, and the third that every next-state is one of the states mentioned in
   392   state, and the third that every next-state is one of the states mentioned in
   386   the program or being the @{text 0}-state.
   393   the program or being the @{text 0}-state.
       
   394   }
   387 
   395 
   388   A \emph{configuration} @{term c} of a Turing machine is a state together with 
   396   A \emph{configuration} @{term c} of a Turing machine is a state together with 
   389   a tape. This is written as @{text "(s, (l, r))"}. If we have a 
   397   a tape. This is written as @{text "(s, (l, r))"}. If we have a 
   390   configuration and a program, we can calculate
   398   configuration and a program, we can calculate
   391   what the next configuration is by fetching the appropriate action and next state
   399   what the next configuration is by fetching the appropriate action and next state
   483   relatively straightforward, if slightly fiddly. We use the following two
   491   relatively straightforward, if slightly fiddly. We use the following two
   484   auxiliary functions:
   492   auxiliary functions:
   485 
   493 
   486   \begin{center}
   494   \begin{center}
   487   \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1mm}}l@ {}}
   495   \begin{tabular}{@ {}l@ {\hspace{1mm}}c@ {\hspace{1mm}}l@ {}}
   488   @{thm (lhs) shift.simps} @{text "\<equiv>"}\\
   496   @{thm (lhs) shift.simps} @{text "\<equiv>"} @{thm (rhs) shift.simps}\\
   489   \hspace{4mm}@{thm (rhs) shift.simps}\\
   497   @{thm (lhs) adjust.simps} @{text "\<equiv>"} @{thm (rhs) adjust.simps}\\
   490   @{thm (lhs) adjust.simps} @{text "\<equiv>"}\\
       
   491   \hspace{4mm}@{text "map (\<lambda> (a, s)."}\\
       
   492   \hspace{14mm}@{text "(a, if s = 0 then length p div 2 + 1 else s)) p"}\\
       
   493   \end{tabular}
   498   \end{tabular}
   494   \end{center}
   499   \end{center}
   495 
   500 
   496   \noindent
   501   \noindent
   497   The first adds @{text n} to all states, exept the @{text 0}-state,
   502   The first adds @{text n} to all states, exept the @{text 0}-state,
   501   to the first state after the program @{text p}.  With these two
   506   to the first state after the program @{text p}.  With these two
   502   functions in place, we can define the \emph{sequential composition}
   507   functions in place, we can define the \emph{sequential composition}
   503   of two Turing machine programs @{text "p\<^isub>1"} and @{text "p\<^isub>2"}
   508   of two Turing machine programs @{text "p\<^isub>1"} and @{text "p\<^isub>2"}
   504 
   509 
   505   \begin{center}
   510   \begin{center}
   506   @{thm tm_comp.simps[THEN eq_reflection]}
   511   @{thm tm_comp.simps[where ?p1.0="p\<^isub>1" and ?p2.0="p\<^isub>2", THEN eq_reflection]}
   507   \end{center}
   512   \end{center}
   508 
   513 
   509   \noindent
   514   \noindent
   510   This means @{text "p\<^isub>1"} is executed first. Whenever it originally
   515   This means @{text "p\<^isub>1"} is executed first. Whenever it originally
   511   transitioned to the @{text 0}-state, it will in the composed program transition to the starting
   516   transitioned to the @{text 0}-state, it will in the composed program transition to the starting