tm: comparison Paper/Paper.thy


-:559e5c6e5113
+:b388dceee892
 section {* Introduction *}
 text {*
+%\noindent
+%We formalised in earlier work the correctness proofs for two
+%algorithms in Isabelle/HOL---one about type-checking in
+%LF~\cite{UrbanCheneyBerghofer11} and another about deciding requests
+%in access control~\cite{WuZhangUrban12}.  The formalisations
+%uncovered a gap in the informal correctness proof of the former and
+%made us realise that important details were left out in the informal
+%model for the latter. However, in both cases we were unable to
+%formalise in Isabelle/HOL computability arguments about the
+%algorithms.
 \noindent
-We formalised in earlier work the correctness proofs for two
+Suppose you want to mechanise a proof of whether a predicate @{term P}, say, is
-algorithms in Isabelle/HOL---one about type-checking in
+decidable or not. Decidability of @{text P} usually amounts to showing
-LF~\cite{UrbanCheneyBerghofer11} and another about deciding requests
-in access control~\cite{WuZhangUrban12}.  The formalisations
-uncovered a gap in the informal correctness proof of the former and
-made us realise that important details were left out in the informal
-model for the latter. However, in both cases we were unable to
-formalise in Isabelle/HOL computability arguments about the
-algorithms. The reason is that both algorithms are formulated in terms
-of inductive predicates. Suppose @{text "P"} stands for one such
-predicate.  Decidability of @{text P} usually amounts to showing
 whether \mbox{@{term "P \<or> \<not>P"}} holds. But this does \emph{not} work
-in Isabelle/HOL, since it is a theorem prover based on classical logic
+in Isabelle/HOL and other HOL theorem provers, since they are based on classical logic
 where the law of excluded middle ensures that \mbox{@{term "P \<or> \<not>P"}}
 is always provable no matter whether @{text P} is constructed by
-computable means. The same problem would arise if we had formulated
+computable means.
-the algorithms as recursive functions, because internally in
-Isabelle/HOL, like in all HOL-based theorem provers, functions are
+%The same problem would arise if we had formulated
-represented as inductively defined predicates too.
+%the algorithms as recursive functions, because internally in
+%Isabelle/HOL, like in all HOL-based theorem provers, functions are
-The only satisfying way out of this problem in a theorem prover based on classical
+%represented as inductively defined predicates too.
-logic is to formalise a theory of computability. Norrish provided such
-a formalisation for the HOL4 theorem prover. He choose the
+The only satisfying way out of this problem in a theorem prover based
-$\lambda$-calculus as the starting point for his formalisation
+on classical logic is to formalise a theory of computability. Norrish
-of computability theory,
+provided such a formalisation for the HOL4 theorem prover. He chose
-because of its ``simplicity'' \cite[Page 297]{Norrish11}.  Part of his
+the $\lambda$-calculus as the starting point for his formalisation of
-formalisation is a clever infrastructure for reducing
+computability theory, because of its ``simplicity'' \cite[Page
-$\lambda$-terms. He also established the computational equivalence
+297]{Norrish11}.  Part of his formalisation is a clever infrastructure
-between the $\lambda$-calculus and recursive functions.  Nevertheless he
+for reducing $\lambda$-terms. He also established the computational
-concluded that it would be ``appealing'' to have formalisations for more
+equivalence between the $\lambda$-calculus and recursive functions.
-operational models of computations, such as Turing machines or register
+Nevertheless he concluded that it would be ``appealing''
-machines.  One reason is that many proofs in the literature use
+to have formalisations for more operational models of
-them.  He noted however that in the context of theorem provers
+computations, such as Turing machines or register machines.  One
-\cite[Page 310]{Norrish11}:
+reason is that many proofs in the literature use them.  He noted
+however that in the context of theorem provers \cite[Page 310]{Norrish11}:
 \begin{quote}
 \it``If register machines are unappealing because of their
-general fiddliness, Turing machines are an even more
+general fiddliness,\\ Turing machines are an even more
 daunting prospect.''
 \end{quote}
 \noindent
 In this paper we take on this daunting prospect and provide a
 formalisation of Turing machines, as well as abacus machines (a kind
 of register machines) and recursive functions. To see the difficulties
-involved with this work, one has to understand that interactive
+involved with this work, one has to understand that Turing machine
-theorem provers, like Isabelle/HOL, are at their best when the
+programs can be completely \emph{unstructured}, behaving
-data-structures at hand are ``structurally'' defined, like lists,
+similarly to Basic's infamous goto. This precludes in the
-natural numbers, regular expressions, etc. Such data-structures come
+general case a compositional Hoare-style reasoning about Turing
-with convenient reasoning infrastructures (for example induction
+programs.  We provide such Hoare-rules for when it is possible to
-principles, recursion combinators and so on).  But this is \emph{not}
+reason in a compositional manner (which is fortunately quite often), but also tackle
-the case with Turing machines (and also not with register machines):
+the more complicated case when we translate abacus programs into
-underlying their definitions are sets of states together with
+Turing programs.  This aspect of reasoning about computability theory
-transition functions, all of which are not structurally defined.  This
+is usually completely left out in the informal literature, e.g.~\cite{Boolos87}.
-means we have to implement our own reasoning infrastructure in order
-to prove properties about them. This leads to annoyingly fiddly
+%To see the difficulties
-formalisations.  We noticed first the difference between both,
+%involved with this work, one has to understand that interactive
-structural and non-structural, ``worlds'' when formalising the
+%theorem provers, like Isabelle/HOL, are at their best when the
-Myhill-Nerode theorem, where regular expressions fared much better
+%data-structures at hand are ``structurally'' defined, like lists,
-than automata \cite{WuZhangUrban11}.  However, with Turing machines
+%natural numbers, regular expressions, etc. Such data-structures come
-there seems to be no alternative if one wants to formalise the great
+%with convenient reasoning infrastructures (for example induction
-many proofs from the literature that use them.  We will analyse one
+%principles, recursion combinators and so on).  But this is \emph{not}
-example---undecidability of Wang's tiling problem---in Section~\ref{Wang}. The
+%the case with Turing machines (and also not with register machines):
-standard proof of this property uses the notion of universal
+%underlying their definitions are sets of states together with
-Turing machines.
+%transition functions, all of which are not structurally defined.  This
+%means we have to implement our own reasoning infrastructure in order
-We are not the first who formalised Turing machines in a theorem
+%to prove properties about them. This leads to annoyingly fiddly
-prover: we are aware of the preliminary work by Asperti and Ricciotti
+%formalisations.  We noticed first the difference between both,
+%structural and non-structural, ``worlds'' when formalising the
+%Myhill-Nerode theorem, where regular expressions fared much better
+%than automata \cite{WuZhangUrban11}.  However, with Turing machines
+%there seems to be no alternative if one wants to formalise the great
+%many proofs from the literature that use them.  We will analyse one
+%example---undecidability of Wang's tiling problem---in Section~\ref{Wang}. The
+%standard proof of this property uses the notion of universal
+%Turing machines.
+We are not the first to formalise Turing machines: we are aware
+of the preliminary work by Asperti and Ricciotti
 \cite{AspertiRicciotti12}. They describe a complete formalisation of
 Turing machines in the Matita theorem prover, including a universal
 Turing machine. They report that the informal proofs from which they
 started are \emph{not} ``sufficiently accurate to be directly usable as a
 guideline for formalization'' \cite[Page 2]{AspertiRicciotti12}. For
 machines computing unary functions. We had to figure out a way to
 generalise this result to $n$-ary functions. Similarly, when compiling
 recursive functions to abacus machines, the textbook again only shows
 how it can be done for 2- and 3-ary functions, but in the
 formalisation we need functions of arbitrary arity. But the general ideas for
-how to do this are clear enough in \cite{Boolos87}. However, one
+how to do this are clear enough in \cite{Boolos87}.
-aspect that is completely left out from the informal description in
+%However, one
-\cite{Boolos87}, and similar ones we are aware of, is arguments why certain Turing
+%aspect that is completely left out from the informal description in
-machines are correct. We will introduce Hoare-style proof rules
+%\cite{Boolos87}, and similar ones we are aware of, is arguments why certain Turing
-which help us with such correctness arguments of Turing machines.
+%machines are correct. We will introduce Hoare-style proof rules
+%which help us with such correctness arguments of Turing machines.
 The main difference between our formalisation and the one by Asperti
 and Ricciotti is that their universal Turing machine uses a different
 alphabet than the machines it simulates. They write \cite[Page
 23]{AspertiRicciotti12}:
 whenever the head goes over the ``edge'' of the tape. To
 make this formal we define five possible \emph{actions}
 the Turing machine can perform:
 \begin{center}
-\begin{tabular}{rcll}
+\begin{tabular}{rcl@ {\hspace{5mm}}l}
 @{text "a"} & $::=$  & @{term "W0"} & write blank (@{term Bk})\\
 & $\mid$ & @{term "W1"} & write occupied (@{term Oc})\\
 & $\mid$ & @{term L} & move left\\
 & $\mid$ & @{term R} & move right\\
 & $\mid$ & @{term Nop} & do-nothing operation\\
 \begin{center}
 \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
 @{thm (lhs) update.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) update.simps(1)}\\
 @{thm (lhs) update.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) update.simps(2)}\\
-@{thm (lhs) update.simps(3)} & @{text "\<equiv>"} & \\
+@{thm (lhs) update.simps(3)} & @{text "\<equiv>"} & @{thm (rhs) update.simps(3)}\\
-\multicolumn{3}{l}{\hspace{1cm}@{thm (rhs) update.simps(3)}}\\
+@{thm (lhs) update.simps(4)} & @{text "\<equiv>"} & @{thm (rhs) update.simps(4)}\\
-@{thm (lhs) update.simps(4)} & @{text "\<equiv>"} & \\
-\multicolumn{3}{l}{\hspace{1cm}@{thm (rhs) update.simps(4)}}\\
 @{thm (lhs) update.simps(5)} & @{text "\<equiv>"} & @{thm (rhs) update.simps(5)}\\
 \end{tabular}
 \end{center}
 \noindent
 blank cell to the right-list; otherwise we have to remove the
 head from the left-list and prepend it to the right-list. Similarly
 in the fourth clause for a right move action. The @{term Nop} operation
 leaves the tape unchanged (last clause).
-Note that our treatment of the tape is rather ``unsymmetric''---we
+%Note that our treatment of the tape is rather ``unsymmetric''---we
-have the convention that the head of the right-list is where the
+%have the convention that the head of the right-list is where the
-head is currently positioned. Asperti and Ricciotti
+%head is currently positioned. Asperti and Ricciotti
-\cite{AspertiRicciotti12} also considered such a representation, but
+%\cite{AspertiRicciotti12} also considered such a representation, but
-dismiss it as it complicates their definition for \emph{tape
+%dismiss it as it complicates their definition for \emph{tape
-equality}. The reason is that moving the head one step to
+%equality}. The reason is that moving the head one step to
-the left and then back to the right might change the tape (in case
+%the left and then back to the right might change the tape (in case
-of going over the ``edge''). Therefore they distinguish four types
+%of going over the ``edge''). Therefore they distinguish four types
-of tapes: one where the tape is empty; another where the head
+%of tapes: one where the tape is empty; another where the head
-is on the left edge, respectively right edge, and in the middle
+%is on the left edge, respectively right edge, and in the middle
-of the tape. The reading, writing and moving of the tape is then
+%of the tape. The reading, writing and moving of the tape is then
-defined in terms of these four cases.  In this way they can keep the
+%defined in terms of these four cases.  In this way they can keep the
-tape in a ``normalised'' form, and thus making a left-move followed
+%tape in a ``normalised'' form, and thus making a left-move followed
-by a right-move being the identity on tapes. Since we are not using
+%by a right-move being the identity on tapes. Since we are not using
-the notion of tape equality, we can get away with the unsymmetric
+%the notion of tape equality, we can get away with the unsymmetric
-definition above, and by using the @{term update} function
+%definition above, and by using the @{term update} function
-cover uniformly all cases including corner cases.
+%cover uniformly all cases including corner cases.
 Next we need to define the \emph{states} of a Turing machine.  Given
 how little is usually said about how to represent them in informal
 presentations, it might be surprising that in a theorem prover we
 have to select carefully a representation. If we use the naive
 the function @{term fetch}
 \begin{center}
 \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
 \multicolumn{3}{l}{@{thm fetch.simps(1)[where b=DUMMY]}}\\
-@{thm (lhs) fetch.simps(2)} & @{text "\<equiv>"} & \\
+@{thm (lhs) fetch.simps(2)} & @{text "\<equiv>"} & @{text "case nth_of p (2 * s) of"}\\
-\multicolumn{3}{@ {\hspace{1cm}}l}{@{text "case nth_of p (2 * s) of"}}\\
+\multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "None \<Rightarrow> (Nop, 0) | Some i \<Rightarrow> i"}}\\
-\multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "None \<Rightarrow> (Nop, 0) |"}}\\
+@{thm (lhs) fetch.simps(3)} & @{text "\<equiv>"} & @{text "case nth_of p (2 * s + 1) of"}\\
-\multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "Some i \<Rightarrow> i"}}\\
+\multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "None \<Rightarrow> (Nop, 0) | Some i \<Rightarrow> i"}}
-@{thm (lhs) fetch.simps(3)} & @{text "\<equiv>"} & \\
-\multicolumn{3}{@ {\hspace{1cm}}l}{@{text "case nth_of p (2 * s + 1) of"}}\\
-\multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "None \<Rightarrow> (Nop, 0) |"}}\\
-\multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "Some i \<Rightarrow> i"}}
 \end{tabular}
 \end{center}
 \noindent
 In this definition the function @{term nth_of} returns the @{text n}th element

changeset 49	b388dceee892
parent 48	559e5c6e5113
child 50	816e84ca16d6