CookBook/FirstSteps.thy
author Christian Urban <urbanc@in.tum.de>
Mon, 12 Jan 2009 16:49:15 +0000
changeset 67 5fbeeac2901b
parent 66 d563f8ff6aa0
child 68 e7519207c2b7
permissions -rw-r--r--
tuned

theory FirstSteps
imports Base
begin


chapter {* First Steps *}

text {*

  Isabelle programming is done in ML.  Just like lemmas and proofs, ML-code
  in Isabelle is part of a theory. If you want to follow the code written in
  this chapter, we assume you are working inside the theory starting with

  \begin{center}
  \begin{tabular}{@ {}l}
  \isacommand{theory} FirstSteps\\
  \isacommand{imports} Main\\
  \isacommand{begin}\\
  \ldots
  \end{tabular}
  \end{center}
*}

section {* Including ML-Code *}



text {*
  The easiest and quickest way to include code in a theory is
  by using the \isacommand{ML}-command. For example\smallskip

\isa{\isacommand{ML}
\isacharverbatimopen\isanewline
\hspace{5mm}@{ML "3 + 4"}\isanewline
\isacharverbatimclose\isanewline
@{text "> 7"}\smallskip}

  The @{text [quotes] "> 7"} indicates the response Isabelle prints out when the 
  \isacommand{ML}-command is evaluated. Like ``normal'' Isabelle proof scripts, 
  \isacommand{ML}-commands can be evaluated by using the advance and undo buttons of 
  your Isabelle environment. The code inside the \isacommand{ML}-command 
  can also contain value and function bindings, and even those can be
  undone when the proof script is retracted. For better readability, we will in what 
  follows drop the \isacommand{ML} \isa{\isacharverbatimopen \ldots \isacharverbatimclose} 
  whenever we show code and its response.

  Once a portion of code is relatively stable, one usually wants to 
  export it to a separate ML-file. Such files can then be included in a 
  theory by using \isacommand{uses} in the header of the theory, like

  \begin{center}
  \begin{tabular}{@ {}l}
  \isacommand{theory} FirstSteps\\
  \isacommand{imports} Main\\
  \isacommand{uses} @{text "\"file_to_be_included.ML\""} @{text "\<dots>"}\\
  \isacommand{begin}\\
  \ldots
  \end{tabular}
  \end{center}
  
*}

section {* Debugging and Printing *}

text {*

  During development you might find it necessary to inspect some data
  in your code. This can be done in a ``quick-and-dirty'' fashion using 
  the function @{ML "warning"}. For example 

  @{ML_response_fake [display] "warning \"any string\"" "\"any string\""}

  will print out @{text [quotes] "any string"} inside the response buffer
  of Isabelle.  This function expects a string as argument. If you develop under PolyML,
  then there is a convenient, though again ``quick-and-dirty'', method for
  converting values into strings, namely @{ML makestring}:

  @{ML_response_fake [display] "warning (makestring 1)" "\"1\""} 

  However @{ML makestring} only works if the type of what is converted is monomorphic 
  and is not a function.

  The function @{ML "warning"} should only be used for testing purposes, because any
  output this function generates will be overwritten as soon as an error is
  raised. For printing anything more serious and elaborate, the
  function @{ML tracing} is more appropriate. This function writes all output into
  a separate tracing buffer. For example

  @{ML_response_fake [display] "tracing \"foo\"" "\"foo\""}

  It is also possible to redirect the ``channel'' where the string @{text "foo"} is 
  printed to a separate file, e.g. to prevent ProofGeneral from choking on massive 
  amounts of trace output. This redirection can be achieved using the code
*}

ML{* 
val strip_specials =
let
  fun strip ("\^A" :: _ :: cs) = strip cs
    | strip (c :: cs) = c :: strip cs
    | strip [] = [];
in implode o strip o explode end;

fun redirect_tracing stream =
 Output.tracing_fn := (fn s =>
    (TextIO.output (stream, (strip_specials s));
     TextIO.output (stream, "\n");
     TextIO.flushOut stream)); 
*}

text {*

  Calling @{ML "redirect_tracing"} with @{ML "(TextIO.openOut \"foo.bar\")"} 
  will cause that all tracing information is printed into the file @{text "foo.bar"}.

*}


section {* Antiquotations *}

text {*
  The main advantage of embedding all code in a theory is that the code can
  contain references to entities defined on the logical level of Isabelle. By
  this we mean definitions, theorems, terms and so on. This kind of reference is
  realised with antiquotations.  For example, one can print out the name of the current
  theory by typing

  
  @{ML_response [display] "Context.theory_name @{theory}" "\"FirstSteps\""}
 
  where @{text "@{theory}"} is an antiquotation that is substituted with the
  current theory (remember that we assumed we are inside the theory 
  @{text FirstSteps}). The name of this theory can be extracted with
  the function @{ML "Context.theory_name"}. 

  Note, however, that antiquotations are statically scoped, that is the value is
  determined at ``compile-time'', not ``run-time''. For example the function
*}

ML {*
fun not_current_thyname () = Context.theory_name @{theory}
*}

text {*

  does, as its name suggest, \emph{not} return the name of the current theory, if it is run in a 
  different theory. Instead, the code above defines the constant function 
  that always returns the string @{text [quotes] "FirstSteps"}, no matter where the
  function is called. Operationally speaking,  the antiquotation @{text "@{theory}"} is 
  \emph{not} replaced with code that will look up the current theory in 
  some data structure and return it. Instead, it is literally
  replaced with the value representing the theory name.

  In a similar way you can use antiquotations to refer to theorems or simpsets:

  @{ML_response_fake [display] "@{thm allI}" "(\<And>x. ?P x) \<Longrightarrow> \<forall>x. ?P x"}
  @{ML_response_fake [display] "@{simpset}" "\<dots>"}

  (FIXME: what does a simpset look like? It seems to be the same problem
  as with tokens.)

  While antiquotations have many applications, they were originally introduced to 
  avoid explicit bindings for theorems such as
*}

ML {*
val allI = thm "allI"
*}

text {*
  These bindings were difficult to maintain and also could accidentally
  be overwritten by the user. This usually broke definitional
  packages. Antiquotations solve this problem, since they are ``linked''
  statically at compile-time. However, that also sometimes limits their
  usefulness. In the course of this introduction, we will learn more about
  these antiquotations: they greatly simplify Isabelle programming since one
  can directly access all kinds of logical elements from ML.

*}

section {* Terms and Types *}

text {*
  One way to construct terms of Isabelle on the ML-level is by using the antiquotation 
  \mbox{@{text "@{term \<dots>}"}}. For example

  @{ML_response [display] "@{term \"(a::nat) + b = c\"}" 
                          "Const (\"op =\", \<dots>)  $ (Const (\"HOL.plus_class.plus\", \<dots>) $ \<dots> $ \<dots>) $ \<dots>"}

  This will show the term @{term "(a::nat) + b = c"}, but printed using the internal
  representation of this term. This internal representation corresponds to the 
  datatype @{ML_type "term"}.
  
  The internal representation of terms uses the usual de Bruijn index mechanism where bound 
  variables are represented by the constructor @{ML Bound}. The index in @{ML Bound} refers to
  the number of Abstractions (@{ML Abs}) we have to skip until we hit the @{ML Abs} that
  binds the corresponding variable. However, in Isabelle the names of bound variables are 
  kept at abstractions for printing purposes, and so should be treated only as comments. 

  \begin{readmore}
  Terms are described in detail in \isccite{sec:terms}. Their
  definition and many useful operations can be found in @{ML_file "Pure/term.ML"}.
  \end{readmore}

  Sometimes the internal representation of terms can be surprisingly different
  from what you see at the user level, because the layers of
  parsing/type-checking/pretty printing can be quite elaborate. 

  \begin{exercise}
  Look at the internal term representation of the following terms, and
  find out why they are represented like this.

  \begin{itemize}
  \item @{term "case x of 0 \<Rightarrow> 0 | Suc y \<Rightarrow> y"}  
  \item @{term "\<lambda>(x,y). P y x"}  
  \item @{term "{ [x::int] | x. x \<le> -2 }"}  
  \end{itemize}

  Hint: The third term is already quite big, and the pretty printer
  may omit parts of it by default. If you want to see all of it, you
  can use the following ML function to set the limit to a value high 
  enough:
  \end{exercise}

  @{ML [display] "print_depth 50"}

  The antiquotation @{text "@{prop \<dots>}"} constructs terms of propositional type, 
  inserting the invisible @{text "Trueprop"}-coercions whenever necessary. 
  Consider for example

  @{ML_response [display] "(@{term \"P x\"}, @{prop \"P x\"})" "(Free (\"P\", \<dots>) $ Free (\"x\", \<dots>),
 Const (\"Trueprop\", \<dots>) $ (Free (\"P\", \<dots>) $ Free (\"x\", \<dots>)))"}
 
  which inserts the coercion in the second component and 

  @{ML_response [display] "(@{term \"P x \<Longrightarrow> Q x\"}, @{prop \"P x \<Longrightarrow> Q x\"})" 
  "(Const (\"==>\", \<dots>) $ \<dots> $ \<dots>, Const (\"==>\", \<dots>) $ \<dots> $ \<dots>)"}

  which does not (since it is already constructed using the meta-implication). 

  Types can be constructed using the antiquotation @{text "@{typ \<dots>}"}. For example

  @{ML_response_fake [display] "@{typ \"bool \<Rightarrow> nat\"}" "bool \<Rightarrow> nat"}

  \begin{readmore}
  Types are described in detail in \isccite{sec:types}. Their
  definition and many useful operations can also be found in @{ML_file "Pure/type.ML"}.
  \end{readmore}
*}


section {* Constructing Terms and Types Manually *} 

text {*

  While antiquotations are very convenient for constructing terms and types, 
  they can only construct fixed terms (remember they are ``linked'' at compile-time). 
  However, one often needs to construct terms
  dynamically. For example, a function that returns the implication 
  @{text "\<And>(x::\<tau>). P x \<Longrightarrow> Q x"} taking @{term P}, @{term Q} and the type @{term "\<tau>"}
  as arguments can only be written as
*}

ML {*
fun make_imp P Q tau =
  let
    val x = Free ("x",tau)
  in Logic.all x (Logic.mk_implies (P $ x, Q $ x))
  end
*}

text {*

  The reason is that one cannot pass the arguments @{term P}, @{term Q} and 
  @{term "tau"} into an antiquotation. For example the following does \emph{not} work:
*}

ML {*
fun make_wrong_imp P Q tau = @{prop "\<And>x. P x \<Longrightarrow> Q x"} 
*}

text {*
  To see this apply @{text "@{term S}"}, @{text "@{term T}"} and @{text "@{typ nat}"} 
  to both functions: 

  @{ML_response [display] "make_imp @{term S} @{term T} @{typ nat}" 
    "Const \<dots> $ 
    Abs (\"x\", Type (\"nat\",[]),
      Const \<dots> $ (Free (\"S\",\<dots>) $ \<dots>) $ 
                  (Free (\"T\",\<dots>) $ \<dots>))"}
  @{ML_response [display] "make_wrong_imp @{term S} @{term T} @{typ nat}" 
    "Const \<dots> $ 
    Abs (\"x\", \<dots>,
      Const \<dots> $ (Const \<dots> $ (Free (\"P\",\<dots>) $ \<dots>)) $ 
                  (Const \<dots> $ (Free (\"Q\",\<dots>) $ \<dots>)))"}

  As can be seen, in the first case the arguments are correctly used, while the 
  second generates a term involving the @{term "P"} and @{text "Q"} from the 
  antiquotation.

  One tricky point in constructing terms by hand is to obtain the 
  fully qualified name for constants. For example the names for @{text "zero"} and
  @{text "+"} are more complex than one first expects, namely 

  \begin{center}
  @{text "HOL.zero_class.zero"} and @{text "HOL.plus_class.plus"}. 
  \end{center}
  
  The extra prefixes @{text zero_class} and @{text plus_class} are present because 
  these constants are defined within type classes; the prefix @{text "HOL"} indicates in 
  which theory they are defined. Guessing such internal names can sometimes be quite hard. 
  Therefore Isabelle provides the antiquotation @{text "@{const_name \<dots>}"} which does the 
  expansion automatically, for example:

  @{ML_response_fake [display] "@{const_name \"Nil\"}" "List.list.Nil"}

  (FIXME: Is it useful to explain @{text "@{const_syntax}"}?)

  Similarly, types can be constructed manually, for example as follows:

*} 

ML {*
fun make_fun_type tau1 tau2 = Type ("fun",[tau1,tau2])
*}

text {*
  which can be equally written as 
*}

ML {*
fun make_fun_type tau1 tau2 = tau1 --> tau2
*}

text {*

  \begin{readmore}
  There are many functions in @{ML_file "Pure/logic.ML"} and
  @{ML_file "HOL/hologic.ML"} that make such manual constructions of terms 
  and types easier.\end{readmore}

  Have a look at these files and try to solve the following two exercises:

*}

text {*

  \begin{exercise}\label{fun:revsum}
  Write a function @{text "rev_sum : term -> term"} that takes a
  term of the form @{text "t\<^isub>1 + t\<^isub>2 + \<dots> + t\<^isub>n"} (whereby @{text "i"} might be zero)
  and returns the reversed sum @{text "t\<^isub>n + \<dots> + t\<^isub>2 + t\<^isub>1"}. Assume
  the @{text "t\<^isub>i"} can be arbitrary expressions and also note that @{text "+"} 
  associates to the left. Try your function on some examples. 
  \end{exercise}

  \begin{exercise}\label{fun:makesum}
  Write a function which takes two terms representing natural numbers
  in unary (like @{term "Suc (Suc (Suc 0))"}), and produce the unary
  number representing their sum.
  \end{exercise}

*}

section {* Type-Checking *}

text {* 
  
  We can freely construct and manipulate terms, since they are just
  arbitrary unchecked trees. However, we eventually want to see if a
  term is well-formed, or type-checks, relative to a theory.
  Type-checking is done via the function @{ML cterm_of}, which converts 
  a @{ML_type term} into a  @{ML_type cterm}, a \emph{certified} term. 
  Unlike @{ML_type term}s, which are just trees, @{ML_type
  "cterm"}s are abstract objects that are guaranteed to be
  type-correct, and that can only be constructed via the ``official
  interfaces''.

  Type-checking is always relative to a theory context. For now we use
  the @{ML "@{theory}"} antiquotation to get hold of the current theory.
  For example we can write

  @{ML_response_fake [display] "cterm_of @{theory} @{term \"a + b = c\"}" "a + b = c"}

  or use the antiquotation

  @{ML_response_fake [display] "@{cterm \"(a::nat) + b = c\"}" "a + b = c"}

  Attempting

  @{ML_response_fake_both [display] "@{cterm \"1 + True\"}" "Type unification failed \<dots>"}

  yields an error. A slightly more elaborate example is

@{ML_response_fake [display] 
"let
  val natT = @{typ \"nat\"}
  val zero = @{term \"0::nat\"}
in
  cterm_of @{theory} 
  (Const (@{const_name plus}, natT --> natT --> natT) 
  $ zero $ zero)
end" "0 + 0"}

  \begin{exercise}
  Check that the function defined in Exercise~\ref{fun:revsum} returns a
  result that type-checks.
  \end{exercise}

*}

section {* Theorems *}

text {*
  Just like @{ML_type cterm}s, theorems are abstract objects of type @{ML_type thm} 
  that can only be built by going through interfaces. In effect all proofs 
  are checked. 

  To see theorems in ``action'', let us give a proof for the following statement
*}

  lemma 
   assumes assm\<^isub>1: "\<And>(x::nat). P x \<Longrightarrow> Q x" 
   and     assm\<^isub>2: "P t"
   shows "Q t" (*<*)oops(*>*) 

text {*
  on the ML-level:\footnote{Note that @{text "|>"} is reverse
  application. This combinator, and several variants are defined in
  @{ML_file "Pure/General/basics.ML"}.}

@{ML_response_fake [display]
"let
  val thy = @{theory}

  val assm1 = cterm_of thy @{prop \"\<And>(x::nat). P x \<Longrightarrow> Q x\"}
  val assm2 = cterm_of thy @{prop \"(P::nat\<Rightarrow>bool) t\"}

  val Pt_implies_Qt = 
        assume assm1
        |> forall_elim (cterm_of thy @{term \"t::nat\"});
  
  val Qt = implies_elim Pt_implies_Qt (assume assm2);
in

  Qt 
  |> implies_intr assm2
  |> implies_intr assm1
end" "\<lbrakk>\<And>x. P x \<Longrightarrow> Q x; P t\<rbrakk> \<Longrightarrow> Q t"}

  This code-snippet constructs the following proof:

  \[
  \infer[(@{text "\<Longrightarrow>"}$-$intro)]{\vdash @{prop "(\<And>x. P x \<Longrightarrow> Q x) \<Longrightarrow> P t \<Longrightarrow> Q t"}}
    {\infer[(@{text "\<Longrightarrow>"}$-$intro)]{@{prop "\<And>x. P x \<Longrightarrow> Q x"} \vdash @{prop "P t \<Longrightarrow> Q t"}}
       {\infer[(@{text "\<Longrightarrow>"}$-$elim)]{@{prop "\<And>x. P x \<Longrightarrow> Q x"}, @{prop "P t"} \vdash @{prop "Q t"}}
          {\infer[(@{text "\<And>"}$-$elim)]{@{prop "\<And>x. P x \<Longrightarrow> Q x"} \vdash @{prop "P t \<Longrightarrow> Q t"}}
                 {\infer[(assume)]{@{prop "\<And>x. P x \<Longrightarrow> Q x"} \vdash @{prop "\<And>x. P x \<Longrightarrow> Q x"}}{}}
                 &
           \infer[(assume)]{@{prop "P t"} \vdash @{prop "P t"}}{}
          }
       }
    }
  \]


  \begin{readmore}
  For the functions @{text "assume"}, @{text "forall_elim"} etc 
  see \isccite{sec:thms}. The basic functions for theorems are defined in
  @{ML_file "Pure/thm.ML"}. 
  \end{readmore}

*}


section {* Tactical Reasoning *}

text {*
  The goal-oriented tactical style reasoning of the ML level is similar 
  to the @{text apply}-style at the user level, i.e.~the reasoning is centred 
  around a \emph{goal}, which is modified in a sequence of proof steps 
  until it is solved.

  A goal (or goal state) is a special @{ML_type thm}, which by
  convention is an implication of the form:

  @{text[display] "A\<^isub>1 \<Longrightarrow> \<dots> \<Longrightarrow> A\<^isub>n \<Longrightarrow> #(C)"}

  where @{term C} is the goal to be proved and the @{term "A\<^isub>i"} are the open 
  subgoals. 
  Since the goal @{term C} can potentially be an implication, there is a
  @{text "#"} wrapped around it, which prevents that premises are 
  misinterpreted as open subgoals. The wrapper @{text "# :: prop \<Rightarrow>
  prop"} is just the identity function and used as a syntactic marker. 
  
  \begin{readmore}
  For more on goals see \isccite{sec:tactical-goals}. (FIXME: in which
  file is most code for dealing with goals?)
  \end{readmore}

  Tactics are functions that map a goal state to a (lazy)
  sequence of successor states, hence the type of a tactic is
  @{ML_type[display] "thm -> thm Seq.seq"}
  
  \begin{readmore}
  See @{ML_file "Pure/General/seq.ML"} for the implementation of lazy
  sequences. However in day-to-day Isabelle programming, one rarely 
  constructs sequences explicitly, but uses the predefined tactic 
  combinators (tacticals) instead. See @{ML_file "Pure/tctical.ML"} 
  for the code; see Chapters 3 and 4 in the old Isabelle Reference Manual.
  \end{readmore}

  While tactics can operate on the subgoals (the @{text "A\<^isub>i"} above), they 
  are expected to leave the conclusion @{term C} intact, with the 
  exception of possibly instantiating schematic variables. 
 
  To see how tactics work, let us transcribe a simple @{text apply}-style 
  proof into ML:
*}

lemma disj_swap: "P \<or> Q \<Longrightarrow> Q \<or> P"
apply (erule disjE)
 apply (rule disjI2)
 apply assumption
apply (rule disjI1)
apply assumption
done

text {*
  To start the proof, the function  @{ML "Goal.prove"}~@{text "ctxt xs As C tac"} sets 
  up a goal state for proving the goal @{text C} under the assumptions @{text As} 
  (empty in the proof at hand) 
  with the variables @{text xs} that will be generalised once the
  goal is proved. The @{text "tac"} is the tactic which proves the goal and which 
  can make use of the local assumptions (there are none in this example).

@{ML_response_fake [display]
"let
  val ctxt = @{context}
  val goal = @{prop \"P \<or> Q \<Longrightarrow> Q \<or> P\"}
in
  Goal.prove ctxt [\"P\", \"Q\"] [] goal (fn _ => 
    eresolve_tac [disjE] 1
    THEN resolve_tac [disjI2] 1
    THEN assume_tac 1
    THEN resolve_tac [disjI1] 1
    THEN assume_tac 1)
end" "?P \<or> ?Q \<Longrightarrow> ?Q \<or> ?P"}

  \begin{readmore}
  To learn more about the function @{ML Goal.prove} see \isccite{sec:results} and
  the file @{ML_file "Pure/goal.ML"}.
  \end{readmore}


  An alternative way to transcribe this proof is as follows 

@{ML_response_fake [display]
"let
  val ctxt = @{context}
  val goal = @{prop \"P \<or> Q \<Longrightarrow> Q \<or> P\"}
in
  Goal.prove ctxt [\"P\", \"Q\"] [] goal (fn _ => 
    (eresolve_tac [disjE] 
    THEN' resolve_tac [disjI2] 
    THEN' assume_tac 
    THEN' resolve_tac [disjI1] 
    THEN' assume_tac) 1)
end" "?P \<or> ?Q \<Longrightarrow> ?Q \<or> ?P"}

  (FIXME: are there any advantages/disadvantages about this way?) 
*}

section {* Storing Theorems *}

section {* Theorem Attributes *}




end