(*<*)
theory Paper
imports UTM
begin
hide_const (open) s
abbreviation
"update p a == new_tape a p"
lemma fetch_def2:
shows "fetch p 0 b = (Nop, 0)"
and "fetch p (Suc s) Bk =
(case nth_of p (2 * s) of
Some i \<Rightarrow> i
| None \<Rightarrow> (Nop, 0))"
and "fetch p (Suc s) Oc =
(case nth_of p ((2 * s) + 1) of
Some i \<Rightarrow> i
| None \<Rightarrow> (Nop, 0))"
by (auto split: block.splits simp add: fetch.simps)
lemma new_tape_def2:
shows "new_tape W0 (l, r) == (l, Bk#(tl r))"
and "new_tape W1 (l, r) == (l, Oc#(tl r))"
and "new_tape L (l, r) ==
(if l = [] then ([], Bk#r) else (tl l, (hd l) # r))"
and "new_tape R (l, r) ==
(if r = [] then (Bk#l,[]) else ((hd r)#l, tl r))"
and "new_tape Nop (l, r) == (l, r)"
apply -
apply(rule_tac [!] eq_reflection)
apply(auto split: taction.splits simp add: new_tape.simps)
done
lemma tstep_def2:
shows "tstep (s, l, []) p = (let (ac, ns) = fetch p s Bk in (ns, new_tape ac (l, [])))"
and "tstep (s, l, x#xs) p = (let (ac, ns) = fetch p s x in (ns, new_tape ac (l, x#xs)))"
by (auto split: prod.split list.split simp add: tstep.simps)
consts DUMMY::'a
notation (latex output)
Cons ("_::_" [78,77] 73) and
W0 ("W\<^bsub>\<^raw:\hspace{-2pt}>Bk\<^esub>") and
W1 ("W\<^bsub>\<^raw:\hspace{-2pt}>Oc\<^esub>") and
DUMMY ("\<^raw:\mbox{$\_$}>")
declare [[show_question_marks = false]]
(*>*)
section {* Introduction *}
text {*
\noindent
We formalised in earlier work the correctness proofs for two
algorithms in Isabelle/HOL---one about type-checking in
LF~\cite{UrbanCheneyBerghofer11} and another about deciding requests
in access control~\cite{WuZhangUrban12}. The formalisations
uncovered a gap in the informal correctness proof of the former and
made us realise that important details were left out in the informal
model for the latter. However, in both cases we were unable to
formalise in Isabelle/HOL computability arguments about the
algorithms. The reason is that both algorithms are formulated in terms
of inductive predicates. Suppose @{text "P"} stands for one such
predicate. Decidability of @{text P} usually amounts to showing
whether \mbox{@{term "P \<or> \<not>P"}} holds. But this does \emph{not} work
in Isabelle/HOL, since it is a theorem prover based on classical logic
where the law of excluded middle ensures that \mbox{@{term "P \<or> \<not>P"}}
is always provable no matter whether @{text P} is constructed by
computable means. The same problem would arise if we had formulated
the algorithms as recursive functions, because internally in
Isabelle/HOL, like in all HOL-based theorem provers, functions are
represented as inductively defined predicates too.
The only satisfying way out of this problem in a theorem prover based on classical
logic is to formalise a theory of computability. Norrish provided such
a formalisation for the HOL4 theorem prover. He choose the
$\lambda$-calculus as the starting point for his formalisation
of computability theory,
because of its ``simplicity'' \cite[Page 297]{Norrish11}. Part of his
formalisation is a clever infrastructure for reducing
$\lambda$-terms. He also established the computational equivalence
between the $\lambda$-calculus and recursive functions. Nevertheless he
concluded that it would be ``appealing'' to have formalisations for more
operational models of computations, such as Turing machines or register
machines. One reason is that many proofs in the literature use
them. He noted however that in the context of theorem provers
\cite[Page 310]{Norrish11}:
\begin{quote}
\it``If register machines are unappealing because of their
general fiddliness, Turing machines are an even more
daunting prospect.''
\end{quote}
\noindent
In this paper we took on this daunting prospect and provide a
formalisation of Turing machines, as well as abacus machines (a kind
of register machines) and recursive functions. To see the difficulties
involved with this work, one has to understand that interactive
theorem provers, like Isabelle/HOL, are at their best when the
data-structures at hand are ``structurally'' defined, like lists,
natural numbers, regular expressions, etc. Such data-structures come
with convenient reasoning infrastructures (for example induction
principles, recursion combinators and so on). But this is \emph{not}
the case with Turing machines (and also not with register machines):
underlying their definition is a set of states together with a
transition function, both of which are not structurally defined. This
means we have to implement our own reasoning infrastructure in order
to prove properties about them. This leads to annoyingly fiddly
formalisations. We noticed first the difference between both,
structural and non-structural, ``worlds'' when formalising the
Myhill-Nerode theorem, where regular expressions fared much better
than automata \cite{WuZhangUrban11}. However, with Turing machines
there seems to be no alternative if one wants to formalise the great
many proofs from the literature that use them. We will analyse one
example---undecidability of Wang's tiling problem---in Section~\ref{Wang}. The
standard proof of this property uses the notion of \emph{universal
Turing machines}.
We are not the first who formalised Turing machines in a theorem
prover: we are aware of the preliminary work by Asperti and Ricciotti
\cite{AspertiRicciotti12}. They describe a complete formalisation of
Turing machines in the Matita theorem prover, including a universal
Turing machine. They report that the informal proofs from which they
started are not ``sufficiently accurate to be directly useable as a
guideline for formalization'' \cite[Page 2]{AspertiRicciotti12}. For
our formalisation we followed mainly the proofs from the textbook
\cite{Boolos87} and found that the description there is quite
detailed. Some details are left out however: for example, it is only
shown how the universal Turing machine is constructed for Turing
machines computing unary functions. We had to figure out a way to
generalize this result to $n$-ary functions. Similarly, when compiling
recursive functions to abacus machines, the textbook again only shows
how it can be done for 2- and 3-ary functions, but in the
formalisation we need arbitrary functions. But the general ideas for
how to do this are clear enough in \cite{Boolos87}. However, one
aspect that is completely left out from the informal description in
\cite{Boolos87}, and similar ones we are aware of, are arguments why certain Turing
machines are correct. We will introduce Hoare-style proof rules
which help us with such correctness arguments of Turing machines.
The main difference between our formalisation and the one by Asperti
and Ricciotti is that their universal Turing machine uses a different
alphabet than the machines it simulates. They write \cite[Page
23]{AspertiRicciotti12}:
\begin{quote}\it
``In particular, the fact that the universal machine operates with a
different alphabet with respect to the machines it simulates is
annoying.''
\end{quote}
\noindent
In this paper we follow the approach by Boolos et al \cite{Boolos87},
which goes back to Post \cite{Post36}, where all Turing machines
operate on tapes that contain only \emph{blank} or \emph{occupied} cells
(represented by @{term Bk} and @{term Oc}, respectively, in our
formalisation). Traditionally the content of a cell can be any
character from a finite alphabet. Although computationally equivalent,
the more restrictive notion of Turing machines in \cite{Boolos87} makes
the reasoning more uniform. In addition some proofs \emph{about} Turing
machines will be simpler. The reason is that one often needs to encode
Turing machines---consequently if the Turing machines are simpler, then the coding
functions are simpler too. Unfortunately, the restrictiveness also makes
it harder to design programs for these Turing machines. Therefore in order
to construct a \emph{universal Turing machine} we follow the proof in
\cite{Boolos87} by relating abacus machines to Turing machines and in
turn recursive functions to abacus machines. The universal Turing
machine can then be constructed as a recursive function.
\smallskip
\noindent
{\bf Contributions:}
*}
section {* Turing Machines *}
text {* \noindent
Turing machines can be thought of as having a read-write-unit
``gliding'' over a potentially infinite tape. Boolos et
al~\cite{Boolos87} only consider tapes with cells being either blank
or occupied, which we represent by a datatype having two
constructors, namely @{text Bk} and @{text Oc}. One way to
represent such tapes is to use a pair of lists, written @{term "(l,
r)"}, where @{term l} stands for the tape on the left-hand side of the
read-write-unit and @{term r} for the tape on the right-hand side. We have the
convention that the head, written @{term hd}, of the right-list is
the cell on which the read-write-unit currently operates. This can
be pictured as follows:
\begin{center}
\begin{tikzpicture}
\draw[very thick] (-3.0,0) -- ( 3.0,0);
\draw[very thick] (-3.0,0.5) -- ( 3.0,0.5);
\draw[very thick] (-0.25,0) -- (-0.25,0.5);
\draw[very thick] ( 0.25,0) -- ( 0.25,0.5);
\draw[very thick] (-0.75,0) -- (-0.75,0.5);
\draw[very thick] ( 0.75,0) -- ( 0.75,0.5);
\draw[very thick] (-1.25,0) -- (-1.25,0.5);
\draw[very thick] ( 1.25,0) -- ( 1.25,0.5);
\draw[very thick] (-1.75,0) -- (-1.75,0.5);
\draw[very thick] ( 1.75,0) -- ( 1.75,0.5);
\draw[rounded corners=1mm] (-0.35,-0.1) rectangle (0.35,0.6);
\draw[fill] (1.35,0.1) rectangle (1.65,0.4);
\draw[fill] (0.85,0.1) rectangle (1.15,0.4);
\draw[fill] (-0.35,0.1) rectangle (-0.65,0.4);
\draw (-0.25,0.8) -- (-0.25,-0.8);
\draw[<->] (-1.25,-0.7) -- (0.75,-0.7);
\node [anchor=base] at (-0.8,-0.5) {\small left list};
\node [anchor=base] at (0.35,-0.5) {\small right list};
\node [anchor=base] at (0.1,0.7) {\small head};
\node [anchor=base] at (-2.2,0.2) {\ldots};
\node [anchor=base] at ( 2.3,0.2) {\ldots};
\end{tikzpicture}
\end{center}
\noindent
Note that by using lists each side of the tape is only finite. The
potential infinity is achieved by adding an appropriate blank cell
whenever the read-write unit goes over the ``edge'' of the tape. To
make this formal we define five possible \emph{actions}
the Turing machine can perform:
\begin{center}
\begin{tabular}{rcll}
@{text "a"} & $::=$ & @{term "W0"} & write blank (@{term Bk})\\
& $\mid$ & @{term "W1"} & write occupied (@{term Oc})\\
& $\mid$ & @{term L} & move left\\
& $\mid$ & @{term R} & move right\\
& $\mid$ & @{term Nop} & do-nothing operation\\
\end{tabular}
\end{center}
\noindent
We slightly deviate
from the presentation in \cite{Boolos87} by using the @{term Nop} operation; however its use
will become important when we formalise universal Turing
machines later. Given a tape and an action, we can define the
following updating function:
\begin{center}
\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
@{thm (lhs) new_tape_def2(1)} & @{text "\<equiv>"} & @{thm (rhs) new_tape_def2(1)}\\
@{thm (lhs) new_tape_def2(2)} & @{text "\<equiv>"} & @{thm (rhs) new_tape_def2(2)}\\
@{thm (lhs) new_tape_def2(3)} & @{text "\<equiv>"} & \\
\multicolumn{3}{l}{\hspace{1cm}@{thm (rhs) new_tape_def2(3)}}\\
@{thm (lhs) new_tape_def2(4)} & @{text "\<equiv>"} & \\
\multicolumn{3}{l}{\hspace{1cm}@{thm (rhs) new_tape_def2(4)}}\\
@{thm (lhs) new_tape_def2(5)} & @{text "\<equiv>"} & @{thm (rhs) new_tape_def2(5)}\\
\end{tabular}
\end{center}
\noindent
The first two clauses replace the head of the right-list
with new @{term Bk} or @{term Oc}, repsectively. To see that
these clauses make sense in case where @{text r} is the empty
list, one has to know that the tail function, @{term tl}, is defined
such that @{term "tl [] == []"} holds. The third clause
implements the move of the read-write unit to the left: we need
to test if the left-list is empty; if yes, then we just add a
blank cell to the right-list; otherwise we have to remove the
head from the left-list and add it to the right-list. Similarly
in the clause for the right move. The @{term Nop} operation
leaves the the tape unchanged.
Note that our treatment of the tape is rather ``unsymmetric''---we
have the convention that the head of the right-list is where
the read-write unit is currently possitioned. Asperti and
Ricciotti \cite{AspertiRicciotti12} also consider such a
representation, but dismiss it as it complicates their
definition for \emph{tape equality}. The reason is that
moving the read-write unit to the left and then back
to the right can change the tape (in case of going
over the ``edge''). Therefore they distinguish four
cases where the tape is empty as well as the read-write unit
on the left edge, respectively right edge, or in the
middle. The reading and moving of the tape is then
defined in terms of these four cases. Since we are not
going to use the notion of tape equality, we can
get away with the definition above and be done with
all corner cases.
Next we need to define the \emph{states} of a Turing machine. Given
how little is usually said about how to represent states in informal
presentaions, it might be surprising that in a theorem prover we have
to select carfully a representation. If we use the naive representation
as a Turing machine consiting of a finite set of states, then we
will have difficulties composing two Turing machines. We would need
to somehow combine the two finite sets of states, possibly renaming
states apart if both machines share states. This renaming can be
quite cumbersome to reason about. Therefore we made the choice
of representing a state by a natural number and the states of a
Turing machine will always consist of the initial segment
of natural numbers starting from @{text 0} up to number of states
of the machine minus @{text 1}. In doing so we can compose
two Turing machine by ``shifting'' the states of one by an appropriate
amount to a higher segment.
An \emph{instruction} of a Turing machine is a pair consisting of a
natural number (the next state) and an action. A \emph{program} of a Turing
machine is then a list of such pairs. Given a program @{term p}, a state
and a cell being read by the read-write unnit, we need to fetch
the corresponding instruction in the program. For this we define
the function @{term fetch}
\begin{center}
\begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l}
@{thm (lhs) fetch_def2(1)[where b=DUMMY]} & @{text "\<equiv>"} & @{thm (rhs) fetch_def2(1)}\\
@{thm (lhs) fetch_def2(2)} & @{text "\<equiv>"} & \\
\multicolumn{3}{l}{\hspace{1cm}@{thm (rhs) fetch_def2(2)}}\\
@{thm (lhs) fetch_def2(3)} & @{text "\<equiv>"} & \\
\multicolumn{3}{l}{\hspace{1cm}@{thm (rhs) fetch_def2(3)}}\\
\end{tabular}
\end{center}
For showing the undecidability of the halting problem, we need to consider
two specific Turing machines.
No evaluator in HOL, because of totality.
*}
section {* Abacus Machines *}
section {* Recursive Functions *}
section {* Wang Tiles\label{Wang} *}
text {*
Used in texture mapings - graphics
*}
section {* Related Work *}
text {*
The most closely related work is by Norrish \cite{Norrish11}, and Asperti and
Ricciotti \cite{AspertiRicciotti12}. Norrish bases his approach on
lambda-terms. For this he introduced a clever rewriting technology
based on combinators and de-Bruijn indices for
rewriting modulo $\beta$-equivalence (to keep it manageable)
*}
(*
Questions:
Can this be done: Ackerman function is not primitive
recursive (Nora Szasz)
Tape is represented as two lists (finite - usually infinite tape)?
*)
(*<*)
end
(*>*)