shortening a bit the paper and updating various things
authorChristian Urban <christian dot urban at kcl dot ac dot uk>
Fri, 18 Jan 2013 13:03:09 +0000
changeset 49 b388dceee892
parent 48 559e5c6e5113
child 50 816e84ca16d6
shortening a bit the paper and updating various things
Paper/Paper.thy
Paper/document/root.tex
paper.pdf
--- a/Paper/Paper.thy	Fri Jan 18 11:40:01 2013 +0000
+++ b/Paper/Paper.thy	Fri Jan 18 13:03:09 2013 +0000
@@ -33,45 +33,49 @@
 
 text {*
 
+%\noindent
+%We formalised in earlier work the correctness proofs for two
+%algorithms in Isabelle/HOL---one about type-checking in
+%LF~\cite{UrbanCheneyBerghofer11} and another about deciding requests
+%in access control~\cite{WuZhangUrban12}.  The formalisations
+%uncovered a gap in the informal correctness proof of the former and
+%made us realise that important details were left out in the informal
+%model for the latter. However, in both cases we were unable to
+%formalise in Isabelle/HOL computability arguments about the
+%algorithms. 
+
+
 \noindent
-We formalised in earlier work the correctness proofs for two
-algorithms in Isabelle/HOL---one about type-checking in
-LF~\cite{UrbanCheneyBerghofer11} and another about deciding requests
-in access control~\cite{WuZhangUrban12}.  The formalisations
-uncovered a gap in the informal correctness proof of the former and
-made us realise that important details were left out in the informal
-model for the latter. However, in both cases we were unable to
-formalise in Isabelle/HOL computability arguments about the
-algorithms. The reason is that both algorithms are formulated in terms
-of inductive predicates. Suppose @{text "P"} stands for one such
-predicate.  Decidability of @{text P} usually amounts to showing
+Suppose you want to mechanise a proof whether a predicate @{term P}, say, is
+decidable or not. Decidability of @{text P} usually amounts to showing
 whether \mbox{@{term "P \<or> \<not>P"}} holds. But this does \emph{not} work
-in Isabelle/HOL, since it is a theorem prover based on classical logic
+in Isabelle/HOL and other HOL theorem provers, since they are based on classical logic
 where the law of excluded middle ensures that \mbox{@{term "P \<or> \<not>P"}}
 is always provable no matter whether @{text P} is constructed by
-computable means. The same problem would arise if we had formulated
-the algorithms as recursive functions, because internally in
-Isabelle/HOL, like in all HOL-based theorem provers, functions are
-represented as inductively defined predicates too.
+computable means. 
+
+%The same problem would arise if we had formulated
+%the algorithms as recursive functions, because internally in
+%Isabelle/HOL, like in all HOL-based theorem provers, functions are
+%represented as inductively defined predicates too.
 
-The only satisfying way out of this problem in a theorem prover based on classical
-logic is to formalise a theory of computability. Norrish provided such
-a formalisation for the HOL4 theorem prover. He choose the
-$\lambda$-calculus as the starting point for his formalisation
-of computability theory,
-because of its ``simplicity'' \cite[Page 297]{Norrish11}.  Part of his
-formalisation is a clever infrastructure for reducing
-$\lambda$-terms. He also established the computational equivalence
-between the $\lambda$-calculus and recursive functions.  Nevertheless he
-concluded that it would be ``appealing'' to have formalisations for more
-operational models of computations, such as Turing machines or register
-machines.  One reason is that many proofs in the literature use 
-them.  He noted however that in the context of theorem provers
-\cite[Page 310]{Norrish11}:
+The only satisfying way out of this problem in a theorem prover based
+on classical logic is to formalise a theory of computability. Norrish
+provided such a formalisation for the HOL4 theorem prover. He choose
+the $\lambda$-calculus as the starting point for his formalisation of
+computability theory, because of its ``simplicity'' \cite[Page
+297]{Norrish11}.  Part of his formalisation is a clever infrastructure
+for reducing $\lambda$-terms. He also established the computational
+equivalence between the $\lambda$-calculus and recursive functions.
+Nevertheless he concluded that it would be ``appealing''
+ to have formalisations for more operational models of
+computations, such as Turing machines or register machines.  One
+reason is that many proofs in the literature use them.  He noted
+however that in the context of theorem provers \cite[Page 310]{Norrish11}:
 
 \begin{quote}
 \it``If register machines are unappealing because of their 
-general fiddliness, Turing machines are an even more 
+general fiddliness,\\ Turing machines are an even more 
 daunting prospect.''
 \end{quote}
 
@@ -79,29 +83,40 @@
 In this paper we take on this daunting prospect and provide a
 formalisation of Turing machines, as well as abacus machines (a kind
 of register machines) and recursive functions. To see the difficulties
-involved with this work, one has to understand that interactive
-theorem provers, like Isabelle/HOL, are at their best when the
-data-structures at hand are ``structurally'' defined, like lists,
-natural numbers, regular expressions, etc. Such data-structures come
-with convenient reasoning infrastructures (for example induction
-principles, recursion combinators and so on).  But this is \emph{not}
-the case with Turing machines (and also not with register machines):
-underlying their definitions are sets of states together with 
-transition functions, all of which are not structurally defined.  This
-means we have to implement our own reasoning infrastructure in order
-to prove properties about them. This leads to annoyingly fiddly
-formalisations.  We noticed first the difference between both,
-structural and non-structural, ``worlds'' when formalising the
-Myhill-Nerode theorem, where regular expressions fared much better
-than automata \cite{WuZhangUrban11}.  However, with Turing machines
-there seems to be no alternative if one wants to formalise the great
-many proofs from the literature that use them.  We will analyse one
-example---undecidability of Wang's tiling problem---in Section~\ref{Wang}. The
-standard proof of this property uses the notion of universal
-Turing machines.
+involved with this work, one has to understand that Turing machine
+programs can be completely \emph{unstructured}, behaving 
+similar to Basic's infamous goto. This precludes in the
+general case a compositional Hoare-style reasoning about Turing
+programs.  We provide such Hoare-rules for when it is possible to
+reason in a compositional manner (which is fortunately quite often), but also tackle 
+the more complicated case when we translate abacus programs into 
+Turing programs.  This aspect of reasoning about computability theory 
+is usually completely left out in the informal literature, e.g.~\cite{Boolos87}.
 
-We are not the first who formalised Turing machines in a theorem
-prover: we are aware of the preliminary work by Asperti and Ricciotti
+%To see the difficulties
+%involved with this work, one has to understand that interactive
+%theorem provers, like Isabelle/HOL, are at their best when the
+%data-structures at hand are ``structurally'' defined, like lists,
+%natural numbers, regular expressions, etc. Such data-structures come
+%with convenient reasoning infrastructures (for example induction
+%principles, recursion combinators and so on).  But this is \emph{not}
+%the case with Turing machines (and also not with register machines):
+%underlying their definitions are sets of states together with 
+%transition functions, all of which are not structurally defined.  This
+%means we have to implement our own reasoning infrastructure in order
+%to prove properties about them. This leads to annoyingly fiddly
+%formalisations.  We noticed first the difference between both,
+%structural and non-structural, ``worlds'' when formalising the
+%Myhill-Nerode theorem, where regular expressions fared much better
+%than automata \cite{WuZhangUrban11}.  However, with Turing machines
+%there seems to be no alternative if one wants to formalise the great
+%many proofs from the literature that use them.  We will analyse one
+%example---undecidability of Wang's tiling problem---in Section~\ref{Wang}. The
+%standard proof of this property uses the notion of universal
+%Turing machines.
+
+We are not the first who formalised Turing machines: we are aware 
+of the preliminary work by Asperti and Ricciotti
 \cite{AspertiRicciotti12}. They describe a complete formalisation of
 Turing machines in the Matita theorem prover, including a universal
 Turing machine. They report that the informal proofs from which they
@@ -116,11 +131,12 @@
 recursive functions to abacus machines, the textbook again only shows
 how it can be done for 2- and 3-ary functions, but in the
 formalisation we need arbitrary functions. But the general ideas for
-how to do this are clear enough in \cite{Boolos87}. However, one
-aspect that is completely left out from the informal description in
-\cite{Boolos87}, and similar ones we are aware of, is arguments why certain Turing
-machines are correct. We will introduce Hoare-style proof rules
-which help us with such correctness arguments of Turing machines.
+how to do this are clear enough in \cite{Boolos87}. 
+%However, one
+%aspect that is completely left out from the informal description in
+%\cite{Boolos87}, and similar ones we are aware of, is arguments why certain Turing
+%machines are correct. We will introduce Hoare-style proof rules
+%which help us with such correctness arguments of Turing machines.
 
 The main difference between our formalisation and the one by Asperti
 and Ricciotti is that their universal Turing machine uses a different
@@ -217,7 +233,7 @@
   the Turing machine can perform:
 
   \begin{center}
-  \begin{tabular}{rcll}
+  \begin{tabular}{rcl@ {\hspace{5mm}}l}
   @{text "a"} & $::=$  & @{term "W0"} & write blank (@{term Bk})\\
   & $\mid$ & @{term "W1"} & write occupied (@{term Oc})\\
   & $\mid$ & @{term L} & move left\\
@@ -237,10 +253,8 @@
   \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
   @{thm (lhs) update.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) update.simps(1)}\\
   @{thm (lhs) update.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) update.simps(2)}\\
-  @{thm (lhs) update.simps(3)} & @{text "\<equiv>"} & \\
-  \multicolumn{3}{l}{\hspace{1cm}@{thm (rhs) update.simps(3)}}\\
-  @{thm (lhs) update.simps(4)} & @{text "\<equiv>"} & \\
-  \multicolumn{3}{l}{\hspace{1cm}@{thm (rhs) update.simps(4)}}\\
+  @{thm (lhs) update.simps(3)} & @{text "\<equiv>"} & @{thm (rhs) update.simps(3)}\\
+  @{thm (lhs) update.simps(4)} & @{text "\<equiv>"} & @{thm (rhs) update.simps(4)}\\
   @{thm (lhs) update.simps(5)} & @{text "\<equiv>"} & @{thm (rhs) update.simps(5)}\\
   \end{tabular}
   \end{center}
@@ -259,23 +273,23 @@
   in the fourth clause for a right move action. The @{term Nop} operation
   leaves the the tape unchanged (last clause).
 
-  Note that our treatment of the tape is rather ``unsymmetric''---we
-  have the convention that the head of the right-list is where the
-  head is currently positioned. Asperti and Ricciotti
-  \cite{AspertiRicciotti12} also considered such a representation, but
-  dismiss it as it complicates their definition for \emph{tape
-  equality}. The reason is that moving the head one step to
-  the left and then back to the right might change the tape (in case
-  of going over the ``edge''). Therefore they distinguish four types
-  of tapes: one where the tape is empty; another where the head
-  is on the left edge, respectively right edge, and in the middle
-  of the tape. The reading, writing and moving of the tape is then
-  defined in terms of these four cases.  In this way they can keep the
-  tape in a ``normalised'' form, and thus making a left-move followed
-  by a right-move being the identity on tapes. Since we are not using
-  the notion of tape equality, we can get away with the unsymmetric
-  definition above, and by using the @{term update} function
-  cover uniformly all cases including corner cases.
+  %Note that our treatment of the tape is rather ``unsymmetric''---we
+  %have the convention that the head of the right-list is where the
+  %head is currently positioned. Asperti and Ricciotti
+  %\cite{AspertiRicciotti12} also considered such a representation, but
+  %dismiss it as it complicates their definition for \emph{tape
+  %equality}. The reason is that moving the head one step to
+  %the left and then back to the right might change the tape (in case
+  %of going over the ``edge''). Therefore they distinguish four types
+  %of tapes: one where the tape is empty; another where the head
+  %is on the left edge, respectively right edge, and in the middle
+  %of the tape. The reading, writing and moving of the tape is then
+  %defined in terms of these four cases.  In this way they can keep the
+  %tape in a ``normalised'' form, and thus making a left-move followed
+  %by a right-move being the identity on tapes. Since we are not using
+  %the notion of tape equality, we can get away with the unsymmetric
+  %definition above, and by using the @{term update} function
+  %cover uniformly all cases including corner cases.
 
   Next we need to define the \emph{states} of a Turing machine.  Given
   how little is usually said about how to represent them in informal
@@ -337,14 +351,10 @@
   \begin{center}
   \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
   \multicolumn{3}{l}{@{thm fetch.simps(1)[where b=DUMMY]}}\\
-  @{thm (lhs) fetch.simps(2)} & @{text "\<equiv>"} & \\
-  \multicolumn{3}{@ {\hspace{1cm}}l}{@{text "case nth_of p (2 * s) of"}}\\
-  \multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "None \<Rightarrow> (Nop, 0) |"}}\\
-  \multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "Some i \<Rightarrow> i"}}\\
-  @{thm (lhs) fetch.simps(3)} & @{text "\<equiv>"} & \\
-  \multicolumn{3}{@ {\hspace{1cm}}l}{@{text "case nth_of p (2 * s + 1) of"}}\\
-  \multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "None \<Rightarrow> (Nop, 0) |"}}\\
-  \multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "Some i \<Rightarrow> i"}}
+  @{thm (lhs) fetch.simps(2)} & @{text "\<equiv>"} & @{text "case nth_of p (2 * s) of"}\\
+  \multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "None \<Rightarrow> (Nop, 0) | Some i \<Rightarrow> i"}}\\
+  @{thm (lhs) fetch.simps(3)} & @{text "\<equiv>"} & @{text "case nth_of p (2 * s + 1) of"}\\
+  \multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "None \<Rightarrow> (Nop, 0) | Some i \<Rightarrow> i"}}
   \end{tabular}
   \end{center}
 
--- a/Paper/document/root.tex	Fri Jan 18 11:40:01 2013 +0000
+++ b/Paper/document/root.tex	Fri Jan 18 13:03:09 2013 +0000
@@ -27,7 +27,7 @@
 \begin{document}
 
 
-\title{Formalising Computability Theory in Isabelle/HOL}
+\title{Mechanising Computability Theory in Isabelle/HOL}
 \author{Jian Xu\inst{1} \and Xingyuan Zhang\inst{1} \and Christian Urban\inst{2}}
 \institute{PLA University of Science and Technology, China \and King's College London, UK}
 
Binary file paper.pdf has changed