--- a/ProgTutorial/General.thy Mon Nov 09 10:07:08 2009 +0100
+++ b/ProgTutorial/General.thy Tue Nov 10 04:57:42 2009 +0100
@@ -677,25 +677,25 @@
section {* Unification and Matching (TBD) *}
text {*
- Isabelle's terms and types contain schematic term variables
+ Isabelle's terms and types may contain schematic term variables
(term-constructor @{ML Var}) and schematic type variables (term-constructor
- @{ML TVar}). Both are printed out as variables with a leading question
- mark. They stand for unknown terms and types, which can be made more concrete
- by instantiations. Such instantiations can be calculated by unification or
- matching. While in case of types, unification and matching is relatively
- straightforward, in case of terms the algorithms are substantially more
- complicated, because terms need higher-order versions of the unification and
- matching algorithms. Below we shall use the antiquotation @{text "@{typ_pat \<dots>}"} and
- @{text "@{term_pat \<dots>}"} from Section~\ref{sec:antiquote} in order to
- construct examples involving schematic variables.
-
- Let us begin with describing unification and matching of types. Both
- matching and unification of types return a type environment, ML-type
- @{ML_type "Type.tyenv"}, which maps schematic type variables to types (and
- sorts). Below we use the function @{ML_ind typ_unify in Sign} from the
- structure @{ML_struct Sign} for unifying
- the types @{text "?'a * ?'b"} and @{text "?'b list * nat"}. This will
- produce the mapping @{text "[?'a := ?'b list, ?'b := nat]"}.
+ @{ML TVar}). These variables stand for unknown entities, which can be made
+ more concrete by instantiations. Such instantiations might be a result of
+ unification or matching. While in case of types, unification and matching is
+ relatively straightforward, in case of terms the algorithms are
+ substantially more complicated, because terms need higher-order versions of
+ the unification and matching algorithms. Below we shall use the
+ antiquotations @{text "@{typ_pat \<dots>}"} and @{text "@{term_pat \<dots>}"} from
+ Section~\ref{sec:antiquote} in order to construct examples involving
+ schematic variables.
+
+ Let us begin with describing the unification and the matching function for
+ types. Both return type environments, ML-type @{ML_type "Type.tyenv"},
+ which map schematic type variables to types (and sorts). Below we use the
+ function @{ML_ind typ_unify in Sign} from the structure @{ML_struct Sign}
+ for unifying the types @{text "?'a * ?'b"} and @{text "?'b list *
+ nat"}. This will produce the mapping, or type environment, @{text "[?'a :=
+ ?'b list, ?'b := nat]"}.
*}
ML{*val (tyenv_unif, _) = let
@@ -706,49 +706,124 @@
end*}
text {*
- The value @{ML_ind empty in Vartab} stands for the empty type environment,
- which is the value for starting the unification without any
- instantiations.\footnote{\bf FIXME: what is 0?} In case of a failure
- @{ML typ_unify in Sign} will throw the exception @{ML_text TUNIFY}.
+ The environment @{ML_ind "Vartab.empty"} stands for the empty type environment,
+ which is needed for starting the unification without any
+ (pre)instantiations. In case of a failure
+ @{ML typ_unify in Sign} will throw the exception @{text TUNIFY}.
We can print out the resulting type environment @{ML tyenv_unif}
- with the built-in function @{ML_ind dest in Vartab}.
+ with the built-in function @{ML_ind dest in Vartab} from the structure
+ @{ML_struct Vartab}.
@{ML_response_fake [display,gray]
"Vartab.dest tyenv_unif"
"[((\"'a\", 0), ([\"HOL.type\"], \"?'b List.list\")),
- ((\"'b\", 0), ([\"HOL.type\"], \"nat\"))]"}
-
- For what follows we will use the following pretty-printing function
- for @{ML_type "Type.tyenv"}s
+ ((\"'b\", 0), ([\"HOL.type\"], \"nat\"))]"}
*}
-ML{*fun pretty_tyenv ctxt tyenv =
+text_raw {*
+ \begin{figure}[t]
+ \begin{minipage}{\textwidth}
+ \begin{isabelle}*}
+ML{*fun pretty_helper f env =
+ env |> Vartab.dest
+ |> map f
+ |> commas
+ |> enclose "[" "]"
+ |> tracing
+
+fun pretty_tyenv ctxt tyenv =
let
fun aux (v, (s, T)) =
Syntax.string_of_typ ctxt (TVar (v, s))
- ^ " := " ^ Syntax.string_of_typ ctxt T
-in
- tyenv |> Vartab.dest
- |> map aux
- |> commas
- |> enclose "[" "]"
- |> tracing
+ ^ " := " ^ Syntax.string_of_typ ctxt T
+in
+ pretty_helper aux tyenv
end*}
+text_raw {*
+ \end{isabelle}
+ \end{minipage}
+ \caption{A pretty printing function for type environments, which are
+ produced by unification and matching.\label{fig:prettyenv}}
+ \end{figure}
+*}
text {*
- which prints for the type environment above
+ The first components in this list stand for the schematic type variables and
+ the second are the associated sorts and types. In what follows we will use
+ the pretty-printing function in Figure~\ref{fig:prettyenv} for @{ML_type "Type.tyenv"}s.
+ For the type environment from the example this function prints out the
+ more legible:
@{ML_response_fake [display, gray]
"pretty_tyenv @{context} tyenv_unif"
"[?'a := ?'b list, ?'b := nat]"}
- Observe that the type environment that the function @{ML typ_unify in
+ The index @{text 0} in the example above is the maximal index of the schematic
+ type variables occuring in @{text ty1} and @{text ty2}. It will be increased
+ whenever a new schematic type variable is introduced during unification.
+ This is for example the case when two schematic type variables have different,
+ incomparable sorts. Then a new schematic type variable is introduced with
+ the combined sorts. To show this let us use two sorts, @{text "s1"} and @{text "s2"},
+ which we attach to the schematic type variables @{text "?'a"} and @{text "?'b"}.
+*}
+
+ML{*val (tyenv, index) = let
+ val ty2 = @{typ_pat "?'a::s1"}
+ val ty1 = @{typ_pat "?'b::s2"}
+in
+ Sign.typ_unify @{theory} (ty1, ty2) (Vartab.empty, 0)
+end*}
+
+text {*
+ To print out the calculated type environment we also switch on the printing
+ of sort information.
+*}
+
+ML{*show_sorts := true*}
+
+text {*
+ Now we can see
+
+ @{ML_response_fake [display,gray]
+ "pretty_tyenv @{context} tyenv"
+ "[?'a::s1 := ?'a1::{s1,s2}, ?'b::s2 := ?'a1::{s1,s2}]"}
+
+ the type variables @{text "?'a"} and @{text "?'b"} are instantiated
+ with a new type variable @{text "?'a1"} with sort @{text "{s1,s2}"}. Since a new
+ type variable has been introduced the @{ML index}, originally being @{text 0},
+ has been increased by @{text 1}.
+
+ @{ML_response [display,gray]
+ "index"
+ "1"}
+*}(*<*)ML %linenos{*show_sorts := false*}(*>*)
+
+text {*
+ The way the unification function @{ML typ_unify in Sign} is implemented
+ using an initial type environment and initial index makes it easy to
+ unifying of more than two terms. For example
+*}
+
+ML{*val (tyenvs, _) = let
+ val tys1 = (@{typ_pat "?'a"}, @{typ_pat "?'b list"})
+ val tys2 = (@{typ_pat "?'b"}, @{typ_pat "nat"})
+in
+ fold (Sign.typ_unify @{theory}) [tys1, tys2] (Vartab.empty, 0)
+end*}
+
+text {*
+ Let us now return again to the unifier from the first example.
+
+ @{ML_response_fake [display, gray]
+ "pretty_tyenv @{context} tyenv_unif"
+ "[?'a := ?'b list, ?'b := nat]"}
+
+ Observe that the type environment which the function @{ML typ_unify in
Sign} returns is in \emph{not} an instantiation in fully solved form: while @{text
"?'b"} is instantiated to @{typ nat}, this is not propagated to the
instantiation for @{text "?'a"}. In unification theory, this is often
- called an instantiation in \emph{triangular form}. These instantiations
- are used because of performance reasons.
-
+ called an instantiation in \emph{triangular form}. These not fully solved
+ instantiations are used because of performance reasons.
To apply a type environment in triangular form to a type, say @{text "?'a *
?'b"}, you should use the function @{ML_ind norm_type in Envir}:
@@ -756,12 +831,14 @@
@{ML_response_fake [display, gray]
"Envir.norm_type tyenv_unif @{typ_pat \"?'a * ?'b\"}"
"nat list * nat"}
-*}
-
-text {*
+
+ With this function the type variables @{text "?'a"} and @{text "?'b"} are
+ correctly instantiated.
+
Matching of types can be done with the function @{ML_ind typ_match in Sign}
from the structure @{ML_struct Sign}. It also returns a @{ML_type
- Type.tyenv}. For example
+ Type.tyenv} and might raise the exception @{text TYPE_MATCH} in case
+ of failure. For example
*}
ML{*val tyenv_match = let
@@ -772,23 +849,24 @@
end*}
text {*
- In case of failure, @{ML typ_match in Sign} will throw the exception
- @{ML_text TYPE_MATCH}. Printing out the calculated matcher gives
+ Printing out the calculated matcher gives
@{ML_response_fake [display,gray]
"pretty_tyenv @{context} tyenv_match"
"[?'a := bool list, ?'b := nat]"}
- Applying the matcher to a type needs to be done with the function
+ Unlike the unification, which uses the function @{ML norm_type in Envir},
+ applying the matcher to a type needs to be done with the function
@{ML_ind subst_type in Envir}.
@{ML_response_fake [display, gray]
"Envir.subst_type tyenv_match @{typ_pat \"?'a * ?'b\"}"
"bool list * nat"}
- Be careful to observe the difference: use @{ML subst_type in Envir}
- for matchers and @{ML norm_type in Envir} for unifiers. To make the
- difference explicit let us calculate the following matcher.
+ Be careful to observe the difference: use always
+ @{ML subst_type in Envir} for matchers and @{ML norm_type in Envir}
+ for unifiers. To make the difference explicit, let us calculate the
+ following matcher:
*}
ML{*val tyenv_match' = let
@@ -799,17 +877,15 @@
end*}
text {*
- Now @{ML tyenv_unif} is equal to @{ML tyenv_match'}. Consider
- again the type @{text "?'a * ?'b"}, which we used in the unification
- and matching problems. If we apply @{ML norm_type in Envir} to
- the matcher we obtain
+ Now @{ML tyenv_unif} is equal to @{ML tyenv_match'}. If we apply
+ @{ML norm_type in Envir} to the matcher we obtain
@{ML_response_fake [display, gray]
"Envir.norm_type tyenv_match' @{typ_pat \"?'a * ?'b\"}"
"nat list * nat"}
which is not a matcher for the second matching problem, and if
- we apply @{ML subst_type in Envir} to the unifier we obtain
+ we apply @{ML subst_type in Envir} to the unifyier we obtain
@{ML_response_fake [display, gray]
"Envir.subst_type tyenv_unif @{typ_pat \"?'a * ?'b\"}"
@@ -829,16 +905,129 @@
*}
text {*
- TBD below
-
-
+ Unification and matching of terms is substantially more complicated than the
+ type-case. The reason is that terms have abstractions and unification
+ or matching modulo plain equality is often not meaningful in this
+ context. Nevertheless, Isabelle implements the function @{ML_ind
+ first_order_match in Pattern} for terms. This matching function returns a
+ type environment and a term environment. To pretty print the latter we use
+ the function @{text "pretty_env"}:
+*}
+
+ML{*fun pretty_env ctxt env =
+let
+ fun aux (v, (T, t)) =
+ string_of_term ctxt (Var (v, T)) ^ " := " ^ string_of_term ctxt t
+in
+ pretty_helper aux env
+end*}
+
+text {*
+ As can be seen from the @{text "aux"}-function, a term environment associates
+ a schematic term variable with a type and a term. An example of a first-order
+ matching problem is the term @{term "P (\<lambda>a b. Q b a)"} and the pattern
+ @{text "?X ?Y"}.
+*}
+
+ML{*val (_, fo_env) = let
+ val fo_pat = @{term_pat "(?X::(nat\<Rightarrow>nat\<Rightarrow>nat)\<Rightarrow>bool) ?Y"}
+ val trm = @{term "P::(nat\<Rightarrow>nat\<Rightarrow>nat)\<Rightarrow>bool"}
+ $ @{term "\<lambda>a b. (Q::nat\<Rightarrow>nat\<Rightarrow>nat) b a"}
+ val init = (Vartab.empty, Vartab.empty)
+in
+ Pattern.first_order_match @{theory} (fo_pat, trm) init
+end *}
+
+text {*
+ In this example we annotated explicitly types because then
+ the type environment is empty and can be ignored. The
+ resulting term environment is:
+
+ @{ML_response_fake [display, gray]
+ "pretty_env @{context} fo_env"
+ "[?X := P, ?Y := \<lambda>a b. Q b a]"}
+
+ The matcher can be applied to a term using the function @{ML_ind subst_term
+ in Envir} (remember the same convention for types applies to terms: @{ML
+ subst_term in Envir} is for matchers and @{ML norm_term in Envir} is for
+ unifyiers). The function @{ML subst_term in Envir} expects a type environment,
+ which is set to empty in the example below, and a term environment.
+
+ @{ML_response_fake [display, gray]
+ "let
+ val trm = @{term_pat \"(?X::(nat\<Rightarrow>nat\<Rightarrow>nat)\<Rightarrow>bool) ?Y\"}
+in
+ Envir.subst_term (Vartab.empty, fo_env) trm
+ |> string_of_term @{context}
+ |> tracing
+end"
+ "P (\<lambda>a b. Q b a)"}
+
+ First-order matching is useful for matching against applications and
+ variables. It can deal also with abstractions and a limited form of
+ alpha-equivalence, but this kind of matching should be used with care, since
+ it is not clear whether the result is meaningful. A working example is
+ matching @{text "\<lambda>x. P x"} against the pattern @{text "\<lambda>y. ?X y"}. In this
+ case first-order matching produces @{text "[?X := P]"}.
+
+ @{ML_response_fake [display, gray]
+ "let
+ val fo_pat = @{term_pat \"\<lambda>y. (?X::nat\<Rightarrow>bool) y\"}
+ val trm = @{term \"\<lambda>x. (P::nat\<Rightarrow>bool) x\"}
+ val init = (Vartab.empty, Vartab.empty)
+in
+ Pattern.first_order_match @{theory} (fo_pat, trm) init
+ |> snd
+ |> pretty_env @{context}
+end"
+ "[?X := P]"}
+*}
+
+text {*
+ In the context of unifying abstractions, more thouroughly studied is
+ higher-order pattern unification and higher-order pattern matching. A
+ \emph{\index*{pattern}} is a lambda term whose ``head symbol'' (that is the
+ first symbol under abstractions) is either a constant, or a schematic or a free
+ variable. If it is a schematic variable then it can be only applied by
+ distinct bound variables. This excludes that a schematic variable is an
+ argument of a schematic variable and that a schematic variable is applied
+ twice to the same bound variable. The function @{ML_ind pattern in Pattern}
+ in the structure @{ML_struct Pattern} tests whether a term satisfies these
+ restrictions.
+
+
+ @{ML_response [display, gray]
+ "let
+ val trm_list =
+ [@{term_pat \"?X\"}, @{term_pat \"a\"},
+ @{term_pat \"\<lambda>a b. ?X a b\"}, @{term_pat \"\<lambda>a b. (op +) a b\"},
+ @{term_pat \"\<lambda>a. (op +) a ?Y\"}, @{term_pat \"?X ?Y\"},
+ @{term_pat \"\<lambda>a b. ?X a b ?Y\"}, @{term_pat \"\<lambda>a. ?X a a\"}]
+in
+ map Pattern.pattern trm_list
+end"
+ "[true, true, true, true, true, false, false, false]"}
+
+ The point of restricting unification and matching to patterns is that
+ it is decidable and produces most general unifiers. In this way
+ matching and unify can be implemented so that they produce a type
+ and term environment (the latter actually produces only a term
+ environment).
+*}
+
+
+text {*
\begin{readmore}
For term, unification and matching of higher-order patterns is implemented in
- @{ML_file "Pure/pattern.ML"}. Full higher-order unification is implemented
+ @{ML_file "Pure/pattern.ML"}. This file also contains a first-order matcher
+ for terms. Full higher-order unification is implemented
in @{ML_file "Pure/unify.ML"}.
\end{readmore}
*}
+
+
+
section {* Type-Checking\label{sec:typechecking} *}
text {*