isabelle-cookbook: comparison ProgTutorial/General.thy

equal deleted inserted replaced

-:0dc727055c11
+:97518188ef0e
 *}
 section {* Unification and Matching (TBD) *}
 text {*
-Isabelle's terms and types contain schematic term variables
+Isabelle's terms and types may contain schematic term variables
 (term-constructor @{ML Var}) and schematic type variables (term-constructor
-@{ML TVar}). Both are printed out as variables with a leading question
+@{ML TVar}). These variables stand for unknown entities, which can be made
-mark. They stand for unknown terms and types, which can be made more concrete
+more concrete by instantiations. Such instantiations might be a result of
-by instantiations. Such instantiations can be calculated by unification or
+unification or matching. While in case of types, unification and matching is
-matching. While in case of types, unification and matching is relatively
+relatively straightforward, in case of terms the algorithms are
-straightforward, in case of terms the algorithms are substantially more
+substantially more complicated, because terms need higher-order versions of
-complicated, because terms need higher-order versions of the unification and
+the unification and matching algorithms. Below we shall use the
-matching algorithms. Below we shall use the antiquotation @{text "@{typ_pat \<dots>}"} and
+antiquotations @{text "@{typ_pat \<dots>}"} and @{text "@{term_pat \<dots>}"} from
-@{text "@{term_pat \<dots>}"} from Section~\ref{sec:antiquote} in order to
+Section~\ref{sec:antiquote} in order to construct examples involving
-construct examples involving schematic variables.
+schematic variables.
-Let us begin with describing unification and matching of types.  Both
+Let us begin with describing the unification and the matching function for
-matching and unification of types return a type environment, ML-type
+types.  Both return type environments, ML-type @{ML_type "Type.tyenv"},
-@{ML_type "Type.tyenv"}, which maps schematic type variables to types (and
+which map schematic type variables to types (and sorts). Below we use the
-sorts). Below we use the function @{ML_ind typ_unify in Sign} from the
+function @{ML_ind typ_unify in Sign} from the structure @{ML_struct Sign}
-structure @{ML_struct Sign} for unifying
+for unifying the types @{text "?'a * ?'b"} and @{text "?'b list *
-the types @{text "?'a * ?'b"} and @{text "?'b list * nat"}. This will
+nat"}. This will produce the mapping, or type environment, @{text "[?'a :=
-produce the mapping @{text "[?'a := ?'b list, ?'b := nat]"}.
+?'b list, ?'b := nat]"}.
 *}
 ML{*val (tyenv_unif, _) = let
 val ty1 = @{typ_pat "?'a * ?'b"}
 val ty2 = @{typ_pat "?'b list * nat"}
 in
 Sign.typ_unify @{theory} (ty1, ty2) (Vartab.empty, 0)
 end*}
 text {*
-The value @{ML_ind empty in Vartab} stands for the empty type environment,
+The environment @{ML_ind "Vartab.empty"} stands for the empty type environment,
-which is the value for starting the unification without any
+which is needed for starting the unification without any
-instantiations.\footnote{\bf FIXME: what is 0?}  In case of a failure
+(pre)instantiations.  In case of a failure
-@{ML typ_unify in Sign} will throw the exception @{ML_text TUNIFY}.
+@{ML typ_unify in Sign} will throw the exception @{text TUNIFY}.
 We can print out the resulting type environment @{ML tyenv_unif}
-with the built-in function @{ML_ind dest in Vartab}.
+with the built-in function @{ML_ind dest in Vartab} from the structure
+@{ML_struct Vartab}.
 @{ML_response_fake [display,gray]
 "Vartab.dest tyenv_unif"
 "[((\"'a\", 0), ([\"HOL.type\"], \"?'b List.list\")),
 ((\"'b\", 0), ([\"HOL.type\"], \"nat\"))]"}
+*}
-For what follows we will use the following pretty-printing function
-for @{ML_type "Type.tyenv"}s
+text_raw {*
-*}
+\begin{figure}[t]
+\begin{minipage}{\textwidth}
-ML{*fun pretty_tyenv ctxt tyenv =
+\begin{isabelle}*}
+ML{*fun pretty_helper f env =
+env |> Vartab.dest
+|> map f
+|> commas
+|> enclose "[" "]"
+|> tracing
+fun pretty_tyenv ctxt tyenv =
 let
 fun aux (v, (s, T)) =
 Syntax.string_of_typ ctxt (TVar (v, s))
 ^ " := " ^ Syntax.string_of_typ ctxt T
 in
-tyenv |> Vartab.dest
+pretty_helper aux tyenv
-|> map aux
-|> commas
-|> enclose "[" "]"
-|> tracing
 end*}
+text_raw {*
-text {*
+\end{isabelle}
-which prints for the type environment above
+\end{minipage}
+\caption{A pretty printing function for type environments, which are
+produced by unification and matching.\label{fig:prettyenv}}
+\end{figure}
+*}
+text {*
+The first components in this list stand for the schematic type variables and
+the second are the associated sorts and types. In what follows we will use
+the pretty-printing function in Figure~\ref{fig:prettyenv} for @{ML_type "Type.tyenv"}s.
+For the type environment from the example this function prints out the
+more legible:
 @{ML_response_fake [display, gray]
 "pretty_tyenv @{context} tyenv_unif"
 "[?'a := ?'b list, ?'b := nat]"}
-Observe that the type environment that the function @{ML typ_unify in
+The index @{text 0} in the example above is the maximal index of the schematic
+type variables occuring in @{text ty1} and @{text ty2}. It will be increased
+whenever a new schematic type variable is introduced during unification.
+This is for example the case when two schematic type variables have different,
+incomparable sorts. Then a new schematic type variable is introduced with
+the combined sorts. To show this let us use two sorts, @{text "s1"} and @{text "s2"},
+which we attach to the schematic type variables @{text "?'a"} and @{text "?'b"}.
+*}
+ML{*val (tyenv, index) = let
+val ty2 = @{typ_pat "?'a::s1"}
+val ty1 = @{typ_pat "?'b::s2"}
+in
+Sign.typ_unify @{theory} (ty1, ty2) (Vartab.empty, 0)
+end*}
+text {*
+To print out the calculated type environment we also switch on the printing
+of sort information.
+*}
+ML{*show_sorts := true*}
+text {*
+Now we can see
+@{ML_response_fake [display,gray]
+"pretty_tyenv @{context} tyenv"
+"[?'a::s1 := ?'a1::{s1,s2}, ?'b::s2 := ?'a1::{s1,s2}]"}
+the type variables @{text "?'a"} and @{text "?'b"} are instantiated
+with a new type variable @{text "?'a1"} with sort @{text "{s1,s2}"}. Since a new
+type variable has been introduced the @{ML index}, originally being @{text 0},
+has been increased by @{text 1}.
+@{ML_response [display,gray]
+"index"
+"1"}
+*}(*<*)ML %linenos{*show_sorts := false*}(*>*)
+text {*
+The way the unification function @{ML typ_unify in Sign} is implemented
+using an initial type environment and initial index makes it easy to
+unifying of more than two terms. For example
+*}
+ML{*val (tyenvs, _) = let
+val tys1 = (@{typ_pat "?'a"}, @{typ_pat "?'b list"})
+val tys2 = (@{typ_pat "?'b"}, @{typ_pat "nat"})
+in
+fold (Sign.typ_unify @{theory}) [tys1, tys2] (Vartab.empty, 0)
+end*}
+text {*
+Let us now return again to the unifier from the first example.
+@{ML_response_fake [display, gray]
+"pretty_tyenv @{context} tyenv_unif"
+"[?'a := ?'b list, ?'b := nat]"}
+Observe that the type environment which the function @{ML typ_unify in
 Sign} returns is in \emph{not} an instantiation in fully solved form: while @{text
 "?'b"} is instantiated to @{typ nat}, this is not propagated to the
 instantiation for @{text "?'a"}.  In unification theory, this is often
-called an instantiation in \emph{triangular form}. These instantiations
+called an instantiation in \emph{triangular form}. These not fully solved
-are used because of performance reasons.
+instantiations are used because of performance reasons.
 To apply a type environment in triangular form to a type, say @{text "?'a *
 ?'b"}, you should use the function @{ML_ind norm_type in Envir}:
 @{ML_response_fake [display, gray]
 "Envir.norm_type tyenv_unif @{typ_pat \"?'a * ?'b\"}"
 "nat list * nat"}
-*}
+With this function the type variables @{text "?'a"} and @{text "?'b"} are
-text {*
+correctly instantiated.
 Matching of types can be done with the function @{ML_ind typ_match in Sign}
 from the structure @{ML_struct Sign}. It also returns a @{ML_type
-Type.tyenv}. For example
+Type.tyenv} and might raise the exception @{text TYPE_MATCH} in case
+of failure. For example
 *}
 ML{*val tyenv_match = let
 val pat = @{typ_pat "?'a * ?'b"}
 and ty  = @{typ_pat "bool list * nat"}
 in
 Sign.typ_match @{theory} (pat, ty) Vartab.empty
 end*}
 text {*
-In case of failure, @{ML typ_match in Sign} will throw the exception
+Printing out the calculated matcher gives
-@{ML_text TYPE_MATCH}. Printing out the calculated matcher gives
 @{ML_response_fake [display,gray]
 "pretty_tyenv @{context} tyenv_match"
 "[?'a := bool list, ?'b := nat]"}
-Applying the matcher to a type needs to be done with the function
+Unlike the unification, which uses the function @{ML norm_type in Envir},
+applying the matcher to a type needs to be done with the function
 @{ML_ind subst_type in Envir}.
 @{ML_response_fake [display, gray]
 "Envir.subst_type tyenv_match @{typ_pat \"?'a * ?'b\"}"
 "bool list * nat"}
-Be careful to observe the difference: use @{ML subst_type in Envir}
+Be careful to observe the difference: use always
-for matchers and @{ML norm_type in Envir} for unifiers. To make the
+@{ML subst_type in Envir} for matchers and @{ML norm_type in Envir}
-difference explicit let us calculate the following matcher.
+for unifiers. To make the difference explicit, let us calculate the
+following matcher:
 *}
 ML{*val tyenv_match' = let
 val pat = @{typ_pat "?'a * ?'b"}
 and ty  = @{typ_pat "?'b list * nat"}
 in
 Sign.typ_match @{theory} (pat, ty) Vartab.empty
 end*}
 text {*
-Now @{ML tyenv_unif} is equal to @{ML tyenv_match'}. Consider
+Now @{ML tyenv_unif} is equal to @{ML tyenv_match'}. If we apply
-again the type @{text "?'a * ?'b"}, which we used in the unification
+@{ML norm_type in Envir} to the matcher we obtain
-and matching problems. If we apply @{ML norm_type in Envir} to
-the matcher we obtain
 @{ML_response_fake [display, gray]
 "Envir.norm_type tyenv_match' @{typ_pat \"?'a * ?'b\"}"
 "nat list * nat"}
 which is not a matcher for the second matching problem, and if
-we apply @{ML subst_type in Envir} to the unifier we obtain
+we apply @{ML subst_type in Envir} to the unifyier we obtain
 @{ML_response_fake [display, gray]
 "Envir.subst_type tyenv_unif @{typ_pat \"?'a * ?'b\"}"
 "?'b list * nat"}
 tables which are implemented in @{ML_file "Pure/term_ord.ML"}.
 \end{readmore}
 *}
 text {*
-TBD below
+Unification and matching of terms is substantially more complicated than the
+type-case. The reason is that terms have abstractions and unification
+or matching modulo plain equality is often not meaningful in this
+context. Nevertheless, Isabelle implements the function @{ML_ind
+first_order_match in Pattern} for terms.  This matching function returns a
+type environment and a term environment.  To pretty print the latter we use
+the function @{text "pretty_env"}:
+*}
+ML{*fun pretty_env ctxt env =
+let
+fun aux (v, (T, t)) =
+string_of_term ctxt (Var (v, T)) ^ " := " ^ string_of_term ctxt t
+in
+pretty_helper aux env
+end*}
+text {*
+As can be seen from the @{text "aux"}-function, a term environment associates
+a schematic term variable with a type and a term.  An example of a first-order
+matching problem is the term @{term "P (\<lambda>a b. Q b a)"} and the pattern
+@{text "?X ?Y"}.
+*}
+ML{*val (_, fo_env) = let
+val fo_pat = @{term_pat "(?X::(nat\<Rightarrow>nat\<Rightarrow>nat)\<Rightarrow>bool) ?Y"}
+val trm = @{term "P::(nat\<Rightarrow>nat\<Rightarrow>nat)\<Rightarrow>bool"}
+$ @{term "\<lambda>a b. (Q::nat\<Rightarrow>nat\<Rightarrow>nat) b a"}
+val init = (Vartab.empty, Vartab.empty)
+in
+Pattern.first_order_match @{theory} (fo_pat, trm) init
+end *}
+text {*
+In this example we annotated explicitly types because then
+the type environment is empty and can be ignored. The
+resulting term environment is:
+@{ML_response_fake [display, gray]
+"pretty_env @{context} fo_env"
+"[?X := P, ?Y := \<lambda>a b. Q b a]"}
+The matcher can be applied to a term using the function @{ML_ind subst_term
+in Envir} (remember the same convention for types applies to terms: @{ML
+subst_term in Envir} is for matchers and @{ML norm_term in Envir} is for
+unifyiers). The function @{ML subst_term in Envir} expects a type environment,
+which is set to empty in the example below, and a term environment.
+@{ML_response_fake [display, gray]
+"let
+val trm = @{term_pat \"(?X::(nat\<Rightarrow>nat\<Rightarrow>nat)\<Rightarrow>bool) ?Y\"}
+in
+Envir.subst_term (Vartab.empty, fo_env) trm
+|> string_of_term @{context}
+|> tracing
+end"
+"P (\<lambda>a b. Q b a)"}
+First-order matching is useful for matching against applications and
+variables. It can deal also with abstractions and a limited form of
+alpha-equivalence, but this kind of matching should be used with care, since
+it is not clear whether the result is meaningful. A working example is
+matching @{text "\<lambda>x. P x"} against the pattern @{text "\<lambda>y. ?X y"}. In this
+case first-order matching produces @{text "[?X := P]"}.
+@{ML_response_fake [display, gray]
+"let
+val fo_pat = @{term_pat \"\<lambda>y. (?X::nat\<Rightarrow>bool) y\"}
+val trm = @{term \"\<lambda>x. (P::nat\<Rightarrow>bool) x\"}
+val init = (Vartab.empty, Vartab.empty)
+in
+Pattern.first_order_match @{theory} (fo_pat, trm) init
+|> snd
+|> pretty_env @{context}
+end"
+"[?X := P]"}
+*}
+text {*
+In the context of unifying abstractions, more thouroughly studied is
+higher-order pattern unification and higher-order pattern matching.  A
+\emph{\index*{pattern}} is a lambda term whose ``head symbol'' (that is the
+first symbol under abstractions) is either a constant, or a schematic or a free
+variable. If it is a schematic variable then it can be only applied by
+distinct bound variables. This excludes that a schematic variable is an
+argument of a schematic variable and that a schematic variable is applied
+twice to the same bound variable. The function @{ML_ind pattern in Pattern}
+in the structure @{ML_struct Pattern} tests whether a term satisfies these
+restrictions.
+@{ML_response [display, gray]
+"let
+val trm_list =
+[@{term_pat \"?X\"},              @{term_pat \"a\"},
+@{term_pat \"\<lambda>a b. ?X a b\"},    @{term_pat \"\<lambda>a b. (op +) a b\"},
+@{term_pat \"\<lambda>a. (op +) a ?Y\"}, @{term_pat \"?X ?Y\"},
+@{term_pat \"\<lambda>a b. ?X a b ?Y\"}, @{term_pat \"\<lambda>a. ?X a a\"}]
+in
+map Pattern.pattern trm_list
+end"
+"[true, true, true, true, true, false, false, false]"}
+The point of restricting unification and matching to patterns is that
+it is decidable and produces most general unifiers. In this way
+matching and unify can be implemented so that they produce a type
+and term environment (the latter actually produces only a term
+environment).
+*}
+text {*
 \begin{readmore}
 For term, unification and matching of higher-order patterns is implemented in
-@{ML_file "Pure/pattern.ML"}. Full higher-order unification is implemented
+@{ML_file "Pure/pattern.ML"}. This file also contains a first-order matcher
+for terms. Full higher-order unification is implemented
 in @{ML_file "Pure/unify.ML"}.
 \end{readmore}
 *}
 section {* Type-Checking\label{sec:typechecking} *}
 text {*
 Remember Isabelle follows the Church-style typing for terms, i.e., a term contains

changeset 381	97518188ef0e
parent 380	0dc727055c11
child 382	3f153aa4f231