(*<*)theory Paperimports UTMbeginhide_const (open) s abbreviation "update p a == new_tape a p"lemma fetch_def2: shows "fetch p 0 b == (Nop, 0)" and "fetch p (Suc s) Bk == (case nth_of p (2 * s) of Some i \<Rightarrow> i | None \<Rightarrow> (Nop, 0))" and "fetch p (Suc s) Oc == (case nth_of p ((2 * s) + 1) of Some i \<Rightarrow> i | None \<Rightarrow> (Nop, 0))"apply -apply(rule_tac [!] eq_reflection)by (auto split: block.splits simp add: fetch.simps)lemma new_tape_def2: shows "new_tape W0 (l, r) == (l, Bk#(tl r))" and "new_tape W1 (l, r) == (l, Oc#(tl r))" and "new_tape L (l, r) == (if l = [] then ([], Bk#r) else (tl l, (hd l) # r))" and "new_tape R (l, r) == (if r = [] then (Bk#l,[]) else ((hd r)#l, tl r))" and "new_tape Nop (l, r) == (l, r)"apply -apply(rule_tac [!] eq_reflection)apply(auto split: taction.splits simp add: new_tape.simps)doneabbreviation "read r == if (r = []) then Bk else hd r"lemma tstep_def2: shows "tstep (s, l, r) p == (let (a, s') = fetch p s (read r) in (s', new_tape a (l, r)))"apply -apply(rule_tac [!] eq_reflection)by (auto split: if_splits prod.split list.split simp add: tstep.simps)consts DUMMY::'anotation (latex output) Cons ("_::_" [78,77] 73) and W0 ("W\<^bsub>\<^raw:\hspace{-2pt}>Bk\<^esub>") and W1 ("W\<^bsub>\<^raw:\hspace{-2pt}>Oc\<^esub>") and tstep ("step") and steps ("nsteps") and abc_lm_v ("lookup") and abc_lm_s ("set") and DUMMY ("\<^raw:\mbox{$\_$}>")declare [[show_question_marks = false]](*>*)section {* Introduction *}text {*\noindentWe formalised in earlier work the correctness proofs for twoalgorithms in Isabelle/HOL---one about type-checking inLF~\cite{UrbanCheneyBerghofer11} and another about deciding requestsin access control~\cite{WuZhangUrban12}. The formalisationsuncovered a gap in the informal correctness proof of the former andmade us realise that important details were left out in the informalmodel for the latter. However, in both cases we were unable toformalise in Isabelle/HOL computability arguments about thealgorithms. The reason is that both algorithms are formulated in termsof inductive predicates. Suppose @{text "P"} stands for one suchpredicate. Decidability of @{text P} usually amounts to showingwhether \mbox{@{term "P \<or> \<not>P"}} holds. But this does \emph{not} workin Isabelle/HOL, since it is a theorem prover based on classical logicwhere the law of excluded middle ensures that \mbox{@{term "P \<or> \<not>P"}}is always provable no matter whether @{text P} is constructed bycomputable means. The same problem would arise if we had formulatedthe algorithms as recursive functions, because internally inIsabelle/HOL, like in all HOL-based theorem provers, functions arerepresented as inductively defined predicates too.The only satisfying way out of this problem in a theorem prover based on classicallogic is to formalise a theory of computability. Norrish provided sucha formalisation for the HOL4 theorem prover. He choose the$\lambda$-calculus as the starting point for his formalisationof computability theory,because of its ``simplicity'' \cite[Page 297]{Norrish11}. Part of hisformalisation is a clever infrastructure for reducing$\lambda$-terms. He also established the computational equivalencebetween the $\lambda$-calculus and recursive functions. Nevertheless heconcluded that it would be ``appealing'' to have formalisations for moreoperational models of computations, such as Turing machines or registermachines. One reason is that many proofs in the literature use them. He noted however that in the context of theorem provers\cite[Page 310]{Norrish11}:\begin{quote}\it``If register machines are unappealing because of their general fiddliness, Turing machines are an even more daunting prospect.''\end{quote}\noindentIn this paper we took on this daunting prospect and provide aformalisation of Turing machines, as well as abacus machines (a kindof register machines) and recursive functions. To see the difficultiesinvolved with this work, one has to understand that interactivetheorem provers, like Isabelle/HOL, are at their best when thedata-structures at hand are ``structurally'' defined, like lists,natural numbers, regular expressions, etc. Such data-structures comewith convenient reasoning infrastructures (for example inductionprinciples, recursion combinators and so on). But this is \emph{not}the case with Turing machines (and also not with register machines):underlying their definition is a set of states together with atransition function, both of which are not structurally defined. Thismeans we have to implement our own reasoning infrastructure in orderto prove properties about them. This leads to annoyingly fiddlyformalisations. We noticed first the difference between both,structural and non-structural, ``worlds'' when formalising theMyhill-Nerode theorem, where regular expressions fared much betterthan automata \cite{WuZhangUrban11}. However, with Turing machinesthere seems to be no alternative if one wants to formalise the greatmany proofs from the literature that use them. We will analyse oneexample---undecidability of Wang's tiling problem---in Section~\ref{Wang}. Thestandard proof of this property uses the notion of \emph{universalTuring machines}.We are not the first who formalised Turing machines in a theoremprover: we are aware of the preliminary work by Asperti and Ricciotti\cite{AspertiRicciotti12}. They describe a complete formalisation ofTuring machines in the Matita theorem prover, including a universalTuring machine. They report that the informal proofs from which theystarted are not ``sufficiently accurate to be directly useable as aguideline for formalization'' \cite[Page 2]{AspertiRicciotti12}. Forour formalisation we followed mainly the proofs from the textbook\cite{Boolos87} and found that the description there is quitedetailed. Some details are left out however: for example, it is onlyshown how the universal Turing machine is constructed for Turingmachines computing unary functions. We had to figure out a way togeneralize this result to $n$-ary functions. Similarly, when compilingrecursive functions to abacus machines, the textbook again only showshow it can be done for 2- and 3-ary functions, but in theformalisation we need arbitrary functions. But the general ideas forhow to do this are clear enough in \cite{Boolos87}. However, oneaspect that is completely left out from the informal description in\cite{Boolos87}, and similar ones we are aware of, are arguments why certain Turingmachines are correct. We will introduce Hoare-style proof ruleswhich help us with such correctness arguments of Turing machines.The main difference between our formalisation and the one by Aspertiand Ricciotti is that their universal Turing machine uses a differentalphabet than the machines it simulates. They write \cite[Page23]{AspertiRicciotti12}:\begin{quote}\it``In particular, the fact that the universal machine operates with adifferent alphabet with respect to the machines it simulates isannoying.'' \end{quote}\noindentIn this paper we follow the approach by Boolos et al \cite{Boolos87},which goes back to Post \cite{Post36}, where all Turing machinesoperate on tapes that contain only \emph{blank} or \emph{occupied} cells(represented by @{term Bk} and @{term Oc}, respectively, in ourformalisation). Traditionally the content of a cell can be anycharacter from a finite alphabet. Although computationally equivalent,the more restrictive notion of Turing machines in \cite{Boolos87} makesthe reasoning more uniform. In addition some proofs \emph{about} Turingmachines will be simpler. The reason is that one often needs to encodeTuring machines---consequently if the Turing machines are simpler, then the codingfunctions are simpler too. Unfortunately, the restrictiveness also makesit harder to design programs for these Turing machines. Therefore in orderto construct a \emph{universal Turing machine} we follow the proof in\cite{Boolos87} by relating abacus machines to Turing machines and inturn recursive functions to abacus machines. The universal Turingmachine can then be constructed as a recursive function.\smallskip\noindent{\bf Contributions:} *}section {* Turing Machines *}text {* \noindent Turing machines can be thought of as having a read-write-unit, also referred to as \emph{head}, ``gliding'' over a potentially infinite tape. Boolos et al~\cite{Boolos87} only consider tapes with cells being either blank or occupied, which we represent by a datatype having two constructors, namely @{text Bk} and @{text Oc}. One way to represent such tapes is to use a pair of lists, written @{term "(l, r)"}, where @{term l} stands for the tape on the left-hand side of the head and @{term r} for the tape on the right-hand side. We have the convention that the head, written @{term hd}, of the right-list is the cell on which the head of the Turing machine currently operates. This can be pictured as follows: \begin{center} \begin{tikzpicture} \draw[very thick] (-3.0,0) -- ( 3.0,0); \draw[very thick] (-3.0,0.5) -- ( 3.0,0.5); \draw[very thick] (-0.25,0) -- (-0.25,0.5); \draw[very thick] ( 0.25,0) -- ( 0.25,0.5); \draw[very thick] (-0.75,0) -- (-0.75,0.5); \draw[very thick] ( 0.75,0) -- ( 0.75,0.5); \draw[very thick] (-1.25,0) -- (-1.25,0.5); \draw[very thick] ( 1.25,0) -- ( 1.25,0.5); \draw[very thick] (-1.75,0) -- (-1.75,0.5); \draw[very thick] ( 1.75,0) -- ( 1.75,0.5); \draw[rounded corners=1mm] (-0.35,-0.1) rectangle (0.35,0.6); \draw[fill] (1.35,0.1) rectangle (1.65,0.4); \draw[fill] (0.85,0.1) rectangle (1.15,0.4); \draw[fill] (-0.35,0.1) rectangle (-0.65,0.4); \draw (-0.25,0.8) -- (-0.25,-0.8); \draw[<->] (-1.25,-0.7) -- (0.75,-0.7); \node [anchor=base] at (-0.8,-0.5) {\small left list}; \node [anchor=base] at (0.35,-0.5) {\small right list}; \node [anchor=base] at (0.1,0.7) {\small head}; \node [anchor=base] at (-2.2,0.2) {\ldots}; \node [anchor=base] at ( 2.3,0.2) {\ldots}; \end{tikzpicture} \end{center} \noindent Note that by using lists each side of the tape is only finite. The potential infinity is achieved by adding an appropriate blank cell whenever the read-write unit goes over the ``edge'' of the tape. To make this formal we define five possible \emph{actions} the Turing machine can perform: \begin{center} \begin{tabular}{rcll} @{text "a"} & $::=$ & @{term "W0"} & write blank (@{term Bk})\\ & $\mid$ & @{term "W1"} & write occupied (@{term Oc})\\ & $\mid$ & @{term L} & move left\\ & $\mid$ & @{term R} & move right\\ & $\mid$ & @{term Nop} & do-nothing operation\\ \end{tabular} \end{center} \noindent We slightly deviate from the presentation in \cite{Boolos87} by using the @{term Nop} operation; however its use will become important when we formalise universal Turing machines later. Given a tape and an action, we can define the following tape updating function: \begin{center} \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l} @{thm (lhs) new_tape_def2(1)} & @{text "\<equiv>"} & @{thm (rhs) new_tape_def2(1)}\\ @{thm (lhs) new_tape_def2(2)} & @{text "\<equiv>"} & @{thm (rhs) new_tape_def2(2)}\\ @{thm (lhs) new_tape_def2(3)} & @{text "\<equiv>"} & \\ \multicolumn{3}{l}{\hspace{1cm}@{thm (rhs) new_tape_def2(3)}}\\ @{thm (lhs) new_tape_def2(4)} & @{text "\<equiv>"} & \\ \multicolumn{3}{l}{\hspace{1cm}@{thm (rhs) new_tape_def2(4)}}\\ @{thm (lhs) new_tape_def2(5)} & @{text "\<equiv>"} & @{thm (rhs) new_tape_def2(5)}\\ \end{tabular} \end{center} \noindent The first two clauses replace the head of the right-list with a new @{term Bk} or @{term Oc}, repsectively. To see that these two clauses make sense in case where @{text r} is the empty list, one has to know that the tail function, @{term tl}, is in Isabelle/HOL defined such that @{term "tl [] == []"} holds. The third clause implements the move of the head one step to the left: we need to test if the left-list @{term l} is empty; if yes, then we just prepend a blank cell to the right-list; otherwise we have to remove the head from the left-list and prepend it to the right-list. Similarly in the clause for a right move action. The @{term Nop} operation leaves the the tape unchanged. Note that our treatment of the tape is rather ``unsymmetric''---we have the convention that the head of the right-list is where the head is currently positioned. Asperti and Ricciotti \cite{AspertiRicciotti12} also consider such a representation, but dismiss it as it complicates their definition for \emph{tape equality}. The reason is that moving the head one step to the left and then back to the right might change the tape (in case of going over the ``edge''). Therefore they distinguish four types of tapes: one where the tape is empty; another where the head is on the left edge, respectively right edge, and in the middle of the tape. The reading, writing and moving of the tape is then defined in terms of these four cases. In this way they can keep the tape in a ``normalised'' form, and thus making a left-move followed by a right-move being the identity on tapes. Since we are not using the notion of tape equality, we can get away with the unsymmetric definition above, and by using the @{term update} function cover uniformely all cases including corner cases. Next we need to define the \emph{states} of a Turing machine. Given how little is usually said about how to represent them in informal presentations, it might be surprising that in a theorem prover we have to select carfully a representation. If we use the naive representation where a Turing machine consists of a finite set of states, then we will have difficulties composing two Turing machines. In this case we would need to combine two finite sets of states, possibly requiring renaming states apart whenever both machines share states. This renaming can be quite cumbersome to reason about. Therefore we made the choice of representing a state by a natural number and the states of a Turing machine will always consist of the initial segment of natural numbers starting from @{text 0} up to number of states of the machine minus @{text 1}. In doing so we can compose two Turing machine by ``shifting'' the states of one by an appropriate amount to a higher segment and adjust some ``next states''. An \emph{instruction} @{term i} of a Turing machine is a pair consisting of an action and a natural number (the next state). A \emph{program} @{term p} of a Turing machine is then a list of such pairs. Using the following Turing machine program (consisting of four instructions) as an example \begin{center} \begin{tikzpicture} \node [anchor=base] at (0,0) {@{thm dither_def}}; \node [anchor=west] at (-1.5,-0.42) {$\underbrace{\hspace{21mm}}_{\text{1st state}}$}; \node [anchor=west] at ( 1.1,-0.42) {$\underbrace{\hspace{17mm}}_{\text{2nd state}}$}; \node [anchor=west] at (-1.5,0.65) {$\overbrace{\hspace{10mm}}^{\text{@{term Bk}-case}}$}; \node [anchor=west] at (-0.1,0.65) {$\overbrace{\hspace{6mm}}^{\text{@{term Oc}-case}}$}; \end{tikzpicture} \end{center} \noindent the reader can see we have organised our Turing machine programs so that segments of two belong to a state. The first component of the segment determines what action should be taken and which next state should be transitioned to in case the head read a @{term Bk}; similarly the second component determines what should be done in case of reading @{term Oc}. We have the convention that the first state is always the \emph{starting state} of the Turing machine. The 0-state is special in that it will be used as \emph{halting state}. There are no instructions for the 0-state, but it will always perform a @{term Nop}-operation and remain in the 0-state. Given a program @{term p}, a state and the cell being read by the head, we need to fetch the corresponding instruction from the program. For this we define the function @{term fetch} \begin{center} \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l} \multicolumn{3}{l}{@{thm fetch_def2(1)[where b=DUMMY]}}\\ @{thm (lhs) fetch_def2(2)} & @{text "\<equiv>"} & \\ \multicolumn{3}{@ {\hspace{1cm}}l}{@{text "case nth_of p (2 * s) of"}}\\ \multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "None \<Rightarrow> (Nop, 0) |"}}\\ \multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "Some i \<Rightarrow> i"}}\\ @{thm (lhs) fetch_def2(3)} & @{text "\<equiv>"} & \\ \multicolumn{3}{@ {\hspace{1cm}}l}{@{text "case nth_of p (2 * s + 1) of"}}\\ \multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "None \<Rightarrow> (Nop, 0) |"}}\\ \multicolumn{3}{@ {\hspace{1.4cm}}l}{@{text "Some i \<Rightarrow> i"}} \end{tabular} \end{center} \noindent A \emph{configuration} @{term c} of a Turing machine is a state together with a tape. This is written as the triple @{term "(s, l, r)"}. If we have a configuration and a program, we can calculate what the next configuration is by fetching the appropriate next state and action from the program. Such a single step of execution can be defined as \begin{center} \begin{tabular}{l} @{thm (lhs) tstep_def2(1)} @{text "\<equiv>"}\\ \hspace{10mm}@{text "let (a, s) = fetch p s (read r)"}\\ \hspace{10mm}@{text "in (s', update (l, r) a)"} \end{tabular} \end{center} No evaluator in HOL, because of totality. \begin{center} \begin{tabular}{l@ {\hspace{1mm}}c@ {\hspace{1mm}}l} @{thm (lhs) steps.simps(1)} & @{text "\<equiv>"} & @{thm (rhs) steps.simps(1)}\\ @{thm (lhs) steps.simps(2)} & @{text "\<equiv>"} & @{thm (rhs) steps.simps(2)}\\ \end{tabular} \end{center} \emph{well-formedness} of a Turing program programs halts shift and change of a p composition of two ps assertion holds for all tapes Hoare rule for composition For showing the undecidability of the halting problem, we need to consider two specific Turing machines. copying TM and dithering TM correctness of the copying TM measure for the copying TM, which we however omit. standard configuration halting problem*}section {* Abacus Machines *}text {* \noindent Boolos et al \cite{Boolos87} use abacus machines as a stepping stone for making it less laborious to write programs for Turing machines. Abacus machines operate over an unlimited number of registers $R_0$, $R_1$, \ldots each being able to hold an arbitrary large natural number. We use natural numbers to refer to registers, but also to refer to \emph{opcodes} of abacus machines. Obcodes are given by the datatype \begin{center} \begin{tabular}{rcll} @{text "o"} & $::=$ & @{term "Inc R\<iota>"} & increment register $R$ by one\\ & $\mid$ & @{term "Dec R\<iota> o\<iota>"} & if content of $R$ is non-zero,\\ & & & then decrement it by one\\ & & & otherwise jump to opcode $o$\\ & $\mid$ & @{term "Goto o\<iota>"} & jump to opcode $o$ \end{tabular} \end{center} \noindent A \emph{program} of an abacus machine is a list of such obcodes. For example the program clearing the register $R$ (setting it to 0) can be defined as follows: \begin{center} @{thm clear.simps[where n="R\<iota>" and e="o\<iota>", THEN eq_reflection]} \end{center} \noindent The second opcode @{term "Goto 0"} in this programm means we jump back to the first opcode, namely @{text "Dec R o"}. The \emph{memory} $m$ of an abacus machine holding the values of the registers is represented as a list of natural numbers. We have a lookup function for this memory, written @{term "abc_lm_v m R\<iota>"}, which looks up the content of register $R$; if $R$ is not in this list, then we return 0. Similarly we have a setting function, written @{term "abc_lm_s m R\<iota> n"}, which sets the value of $R$ to $n$, and if $R$ was not yet in $m$ it pads it approriately with 0s. Abacus machine halts when it jumps out of range.*}section {* Recursive Functions *}section {* Wang Tiles\label{Wang} *}text {* Used in texture mapings - graphics*}section {* Related Work *}text {* The most closely related work is by Norrish \cite{Norrish11}, and Asperti and Ricciotti \cite{AspertiRicciotti12}. Norrish bases his approach on lambda-terms. For this he introduced a clever rewriting technology based on combinators and de-Bruijn indices for rewriting modulo $\beta$-equivalence (to keep it manageable)*}(*Questions:Can this be done: Ackerman function is not primitive recursive (Nora Szasz)Tape is represented as two lists (finite - usually infinite tape)?*)(*<*)end(*>*)