ProgTutorial/General.thy
changeset 381 97518188ef0e
parent 380 0dc727055c11
child 382 3f153aa4f231
equal deleted inserted replaced
380:0dc727055c11 381:97518188ef0e
   675 *}
   675 *}
   676 
   676 
   677 section {* Unification and Matching (TBD) *}
   677 section {* Unification and Matching (TBD) *}
   678 
   678 
   679 text {* 
   679 text {* 
   680   Isabelle's terms and types contain schematic term variables
   680   Isabelle's terms and types may contain schematic term variables
   681   (term-constructor @{ML Var}) and schematic type variables (term-constructor
   681   (term-constructor @{ML Var}) and schematic type variables (term-constructor
   682   @{ML TVar}). Both are printed out as variables with a leading question
   682   @{ML TVar}). These variables stand for unknown entities, which can be made
   683   mark. They stand for unknown terms and types, which can be made more concrete
   683   more concrete by instantiations. Such instantiations might be a result of
   684   by instantiations. Such instantiations can be calculated by unification or
   684   unification or matching. While in case of types, unification and matching is
   685   matching. While in case of types, unification and matching is relatively
   685   relatively straightforward, in case of terms the algorithms are
   686   straightforward, in case of terms the algorithms are substantially more
   686   substantially more complicated, because terms need higher-order versions of
   687   complicated, because terms need higher-order versions of the unification and
   687   the unification and matching algorithms. Below we shall use the
   688   matching algorithms. Below we shall use the antiquotation @{text "@{typ_pat \<dots>}"} and
   688   antiquotations @{text "@{typ_pat \<dots>}"} and @{text "@{term_pat \<dots>}"} from
   689   @{text "@{term_pat \<dots>}"} from Section~\ref{sec:antiquote} in order to
   689   Section~\ref{sec:antiquote} in order to construct examples involving
   690   construct examples involving schematic variables.
   690   schematic variables.
   691 
   691 
   692   Let us begin with describing unification and matching of types.  Both
   692   Let us begin with describing the unification and the matching function for
   693   matching and unification of types return a type environment, ML-type
   693   types.  Both return type environments, ML-type @{ML_type "Type.tyenv"},
   694   @{ML_type "Type.tyenv"}, which maps schematic type variables to types (and
   694   which map schematic type variables to types (and sorts). Below we use the
   695   sorts). Below we use the function @{ML_ind typ_unify in Sign} from the 
   695   function @{ML_ind typ_unify in Sign} from the structure @{ML_struct Sign}
   696   structure @{ML_struct Sign} for unifying
   696   for unifying the types @{text "?'a * ?'b"} and @{text "?'b list *
   697   the types @{text "?'a * ?'b"} and @{text "?'b list * nat"}. This will
   697   nat"}. This will produce the mapping, or type environment, @{text "[?'a :=
   698   produce the mapping @{text "[?'a := ?'b list, ?'b := nat]"}.
   698   ?'b list, ?'b := nat]"}.
   699 *}
   699 *}
   700 
   700 
   701 ML{*val (tyenv_unif, _) = let
   701 ML{*val (tyenv_unif, _) = let
   702   val ty1 = @{typ_pat "?'a * ?'b"}
   702   val ty1 = @{typ_pat "?'a * ?'b"}
   703   val ty2 = @{typ_pat "?'b list * nat"}
   703   val ty2 = @{typ_pat "?'b list * nat"}
   704 in
   704 in
   705   Sign.typ_unify @{theory} (ty1, ty2) (Vartab.empty, 0) 
   705   Sign.typ_unify @{theory} (ty1, ty2) (Vartab.empty, 0) 
   706 end*}
   706 end*}
   707 
   707 
   708 text {* 
   708 text {* 
   709   The value @{ML_ind empty in Vartab} stands for the empty type environment, 
   709   The environment @{ML_ind "Vartab.empty"} stands for the empty type environment, 
   710   which is the value for starting the unification without any 
   710   which is needed for starting the unification without any 
   711   instantiations.\footnote{\bf FIXME: what is 0?}  In case of a failure 
   711   (pre)instantiations.  In case of a failure 
   712   @{ML typ_unify in Sign} will throw the exception @{ML_text TUNIFY}.
   712   @{ML typ_unify in Sign} will throw the exception @{text TUNIFY}.
   713   We can print out the resulting type environment @{ML tyenv_unif} 
   713   We can print out the resulting type environment @{ML tyenv_unif} 
   714   with the built-in function @{ML_ind dest in Vartab}.
   714   with the built-in function @{ML_ind dest in Vartab} from the structure
       
   715   @{ML_struct Vartab}.
   715 
   716 
   716   @{ML_response_fake [display,gray]
   717   @{ML_response_fake [display,gray]
   717   "Vartab.dest tyenv_unif"
   718   "Vartab.dest tyenv_unif"
   718   "[((\"'a\", 0), ([\"HOL.type\"], \"?'b List.list\")), 
   719   "[((\"'a\", 0), ([\"HOL.type\"], \"?'b List.list\")), 
   719  ((\"'b\", 0), ([\"HOL.type\"], \"nat\"))]"}
   720  ((\"'b\", 0), ([\"HOL.type\"], \"nat\"))]"} 
   720 
   721 *}
   721   For what follows we will use the following pretty-printing function
   722 
   722   for @{ML_type "Type.tyenv"}s
   723 text_raw {*
   723 *}
   724   \begin{figure}[t]
   724 
   725   \begin{minipage}{\textwidth}
   725 ML{*fun pretty_tyenv ctxt tyenv =
   726   \begin{isabelle}*}
       
   727 ML{*fun pretty_helper f env =
       
   728   env |> Vartab.dest  
       
   729       |> map f 
       
   730       |> commas 
       
   731       |> enclose "[" "]" 
       
   732       |> tracing
       
   733 
       
   734 fun pretty_tyenv ctxt tyenv =
   726 let
   735 let
   727   fun aux (v, (s, T)) =
   736   fun aux (v, (s, T)) =
   728         Syntax.string_of_typ ctxt (TVar (v, s)) 
   737         Syntax.string_of_typ ctxt (TVar (v, s)) 
   729         ^ " := " ^ Syntax.string_of_typ ctxt T
   738           ^ " := " ^ Syntax.string_of_typ ctxt T
   730 in
   739 in 
   731   tyenv |> Vartab.dest
   740   pretty_helper aux tyenv
   732         |> map aux 
       
   733         |> commas
       
   734         |> enclose "[" "]"
       
   735         |> tracing
       
   736 end*}
   741 end*}
   737 
   742 text_raw {*
   738 text {*
   743   \end{isabelle}
   739   which prints for the type environment above
   744   \end{minipage}
       
   745   \caption{A pretty printing function for type environments, which are 
       
   746   produced by unification and matching.\label{fig:prettyenv}}
       
   747   \end{figure}
       
   748 *}
       
   749 
       
   750 text {*
       
   751   The first components in this list stand for the schematic type variables and
       
   752   the second are the associated sorts and types. In what follows we will use
       
   753   the pretty-printing function in Figure~\ref{fig:prettyenv} for @{ML_type "Type.tyenv"}s.
       
   754   For the type environment from the example this function prints out the 
       
   755   more legible: 
   740 
   756 
   741   @{ML_response_fake [display, gray]
   757   @{ML_response_fake [display, gray]
   742   "pretty_tyenv @{context} tyenv_unif"
   758   "pretty_tyenv @{context} tyenv_unif"
   743   "[?'a := ?'b list, ?'b := nat]"}
   759   "[?'a := ?'b list, ?'b := nat]"}
   744 
   760 
   745   Observe that the type environment that the function @{ML typ_unify in
   761   The index @{text 0} in the example above is the maximal index of the schematic 
       
   762   type variables occuring in @{text ty1} and @{text ty2}. It will be increased
       
   763   whenever a new schematic type variable is introduced during unification.
       
   764   This is for example the case when two schematic type variables have different, 
       
   765   incomparable sorts. Then a new schematic type variable is introduced with 
       
   766   the combined sorts. To show this let us use two sorts, @{text "s1"} and @{text "s2"}, 
       
   767   which we attach to the schematic type variables @{text "?'a"} and @{text "?'b"}. 
       
   768 *}
       
   769 
       
   770 ML{*val (tyenv, index) = let
       
   771   val ty2 = @{typ_pat "?'a::s1"}
       
   772   val ty1 = @{typ_pat "?'b::s2"}
       
   773 in
       
   774   Sign.typ_unify @{theory} (ty1, ty2) (Vartab.empty, 0) 
       
   775 end*}
       
   776 
       
   777 text {*
       
   778   To print out the calculated type environment we also switch on the printing 
       
   779   of sort information.
       
   780 *}
       
   781 
       
   782 ML{*show_sorts := true*}
       
   783 
       
   784 text {*
       
   785   Now we can see
       
   786 
       
   787   @{ML_response_fake [display,gray]
       
   788   "pretty_tyenv @{context} tyenv"
       
   789   "[?'a::s1 := ?'a1::{s1,s2}, ?'b::s2 := ?'a1::{s1,s2}]"}
       
   790 
       
   791   the type variables @{text "?'a"} and @{text "?'b"} are instantiated
       
   792   with a new type variable @{text "?'a1"} with sort @{text "{s1,s2}"}. Since a new
       
   793   type variable has been introduced the @{ML index}, originally being @{text 0}, 
       
   794   has been increased by @{text 1}.
       
   795 
       
   796   @{ML_response [display,gray]
       
   797   "index"
       
   798   "1"}
       
   799 *}(*<*)ML %linenos{*show_sorts := false*}(*>*)
       
   800 
       
   801 text {*
       
   802   The way the unification function @{ML typ_unify in Sign} is implemented 
       
   803   using an initial type environment and initial index makes it easy to
       
   804   unifying of more than two terms. For example 
       
   805 *}
       
   806 
       
   807 ML{*val (tyenvs, _) = let
       
   808   val tys1 = (@{typ_pat "?'a"}, @{typ_pat "?'b list"})
       
   809   val tys2 = (@{typ_pat "?'b"}, @{typ_pat "nat"})
       
   810 in
       
   811   fold (Sign.typ_unify @{theory}) [tys1, tys2] (Vartab.empty, 0) 
       
   812 end*}
       
   813 
       
   814 text {*
       
   815   Let us now return again to the unifier from the first example.
       
   816 
       
   817   @{ML_response_fake [display, gray]
       
   818   "pretty_tyenv @{context} tyenv_unif"
       
   819   "[?'a := ?'b list, ?'b := nat]"}
       
   820 
       
   821   Observe that the type environment which the function @{ML typ_unify in
   746   Sign} returns is in \emph{not} an instantiation in fully solved form: while @{text
   822   Sign} returns is in \emph{not} an instantiation in fully solved form: while @{text
   747   "?'b"} is instantiated to @{typ nat}, this is not propagated to the
   823   "?'b"} is instantiated to @{typ nat}, this is not propagated to the
   748   instantiation for @{text "?'a"}.  In unification theory, this is often
   824   instantiation for @{text "?'a"}.  In unification theory, this is often
   749   called an instantiation in \emph{triangular form}. These instantiations
   825   called an instantiation in \emph{triangular form}. These not fully solved 
   750   are used because of performance reasons.
   826   instantiations are used because of performance reasons.
   751 
       
   752 
   827 
   753   To apply a type environment in triangular form to a type, say @{text "?'a *
   828   To apply a type environment in triangular form to a type, say @{text "?'a *
   754   ?'b"}, you should use the function @{ML_ind norm_type in Envir}:
   829   ?'b"}, you should use the function @{ML_ind norm_type in Envir}:
   755 
   830 
   756   @{ML_response_fake [display, gray]
   831   @{ML_response_fake [display, gray]
   757   "Envir.norm_type tyenv_unif @{typ_pat \"?'a * ?'b\"}"
   832   "Envir.norm_type tyenv_unif @{typ_pat \"?'a * ?'b\"}"
   758   "nat list * nat"}
   833   "nat list * nat"}
   759 *}
   834 
   760 
   835   With this function the type variables @{text "?'a"} and @{text "?'b"} are 
   761 text {* 
   836   correctly instantiated.
       
   837 
   762   Matching of types can be done with the function @{ML_ind typ_match in Sign}
   838   Matching of types can be done with the function @{ML_ind typ_match in Sign}
   763   from the structure @{ML_struct Sign}. It also returns a @{ML_type
   839   from the structure @{ML_struct Sign}. It also returns a @{ML_type
   764   Type.tyenv}. For example
   840   Type.tyenv} and might raise the exception @{text TYPE_MATCH} in case
       
   841   of failure. For example
   765 *}
   842 *}
   766 
   843 
   767 ML{*val tyenv_match = let
   844 ML{*val tyenv_match = let
   768   val pat = @{typ_pat "?'a * ?'b"}
   845   val pat = @{typ_pat "?'a * ?'b"}
   769   and ty  = @{typ_pat "bool list * nat"}
   846   and ty  = @{typ_pat "bool list * nat"}
   770 in
   847 in
   771   Sign.typ_match @{theory} (pat, ty) Vartab.empty 
   848   Sign.typ_match @{theory} (pat, ty) Vartab.empty 
   772 end*}
   849 end*}
   773 
   850 
   774 text {*
   851 text {*
   775   In case of failure, @{ML typ_match in Sign} will throw the exception 
   852   Printing out the calculated matcher gives
   776   @{ML_text TYPE_MATCH}. Printing out the calculated matcher gives
       
   777 
   853 
   778   @{ML_response_fake [display,gray]
   854   @{ML_response_fake [display,gray]
   779   "pretty_tyenv @{context} tyenv_match"
   855   "pretty_tyenv @{context} tyenv_match"
   780   "[?'a := bool list, ?'b := nat]"}
   856   "[?'a := bool list, ?'b := nat]"}
   781   
   857   
   782   Applying the matcher to a type needs to be done with the function
   858   Unlike the unification, which uses the function @{ML norm_type in Envir}, 
       
   859   applying the matcher to a type needs to be done with the function
   783   @{ML_ind subst_type in Envir}.
   860   @{ML_ind subst_type in Envir}.
   784 
   861 
   785   @{ML_response_fake [display, gray]
   862   @{ML_response_fake [display, gray]
   786   "Envir.subst_type tyenv_match @{typ_pat \"?'a * ?'b\"}"
   863   "Envir.subst_type tyenv_match @{typ_pat \"?'a * ?'b\"}"
   787   "bool list * nat"}
   864   "bool list * nat"}
   788 
   865 
   789   Be careful to observe the difference: use @{ML subst_type in Envir} 
   866   Be careful to observe the difference: use always 
   790   for matchers and @{ML norm_type in Envir} for unifiers. To make the 
   867   @{ML subst_type in Envir} for matchers and @{ML norm_type in Envir} 
   791   difference explicit let us calculate the following matcher.
   868   for unifiers. To make the difference explicit, let us calculate the 
       
   869   following matcher:
   792 *}
   870 *}
   793 
   871 
   794 ML{*val tyenv_match' = let
   872 ML{*val tyenv_match' = let
   795   val pat = @{typ_pat "?'a * ?'b"}
   873   val pat = @{typ_pat "?'a * ?'b"}
   796   and ty  = @{typ_pat "?'b list * nat"}
   874   and ty  = @{typ_pat "?'b list * nat"}
   797 in
   875 in
   798   Sign.typ_match @{theory} (pat, ty) Vartab.empty 
   876   Sign.typ_match @{theory} (pat, ty) Vartab.empty 
   799 end*}
   877 end*}
   800 
   878 
   801 text {*
   879 text {*
   802   Now @{ML tyenv_unif} is equal to @{ML tyenv_match'}. Consider
   880   Now @{ML tyenv_unif} is equal to @{ML tyenv_match'}. If we apply 
   803   again the type @{text "?'a * ?'b"}, which we used in the unification
   881   @{ML norm_type in Envir} to the matcher we obtain
   804   and matching problems. If we apply @{ML norm_type in Envir} to 
       
   805   the matcher we obtain
       
   806 
   882 
   807   @{ML_response_fake [display, gray]
   883   @{ML_response_fake [display, gray]
   808   "Envir.norm_type tyenv_match' @{typ_pat \"?'a * ?'b\"}"
   884   "Envir.norm_type tyenv_match' @{typ_pat \"?'a * ?'b\"}"
   809   "nat list * nat"}
   885   "nat list * nat"}
   810 
   886 
   811   which is not a matcher for the second matching problem, and if
   887   which is not a matcher for the second matching problem, and if
   812   we apply @{ML subst_type in Envir} to the unifier we obtain
   888   we apply @{ML subst_type in Envir} to the unifyier we obtain
   813 
   889 
   814   @{ML_response_fake [display, gray]
   890   @{ML_response_fake [display, gray]
   815   "Envir.subst_type tyenv_unif @{typ_pat \"?'a * ?'b\"}"
   891   "Envir.subst_type tyenv_unif @{typ_pat \"?'a * ?'b\"}"
   816   "?'b list * nat"}
   892   "?'b list * nat"}
   817   
   893   
   827   tables which are implemented in @{ML_file "Pure/term_ord.ML"}.
   903   tables which are implemented in @{ML_file "Pure/term_ord.ML"}.
   828   \end{readmore}
   904   \end{readmore}
   829 *}
   905 *}
   830 
   906 
   831 text {*
   907 text {*
   832   TBD below
   908   Unification and matching of terms is substantially more complicated than the
   833 
   909   type-case. The reason is that terms have abstractions and unification
   834 
   910   or matching modulo plain equality is often not meaningful in this
       
   911   context. Nevertheless, Isabelle implements the function @{ML_ind
       
   912   first_order_match in Pattern} for terms.  This matching function returns a
       
   913   type environment and a term environment.  To pretty print the latter we use
       
   914   the function @{text "pretty_env"}:
       
   915 *}
       
   916 
       
   917 ML{*fun pretty_env ctxt env =
       
   918 let
       
   919   fun aux (v, (T, t)) =
       
   920         string_of_term ctxt (Var (v, T)) ^ " := " ^ string_of_term ctxt t
       
   921 in
       
   922   pretty_helper aux env 
       
   923 end*}
       
   924 
       
   925 text {*
       
   926   As can be seen from the @{text "aux"}-function, a term environment associates 
       
   927   a schematic term variable with a type and a term.  An example of a first-order 
       
   928   matching problem is the term @{term "P (\<lambda>a b. Q b a)"} and the pattern 
       
   929   @{text "?X ?Y"}.
       
   930 *}
       
   931 
       
   932 ML{*val (_, fo_env) = let
       
   933   val fo_pat = @{term_pat "(?X::(nat\<Rightarrow>nat\<Rightarrow>nat)\<Rightarrow>bool) ?Y"}
       
   934   val trm = @{term "P::(nat\<Rightarrow>nat\<Rightarrow>nat)\<Rightarrow>bool"} 
       
   935               $ @{term "\<lambda>a b. (Q::nat\<Rightarrow>nat\<Rightarrow>nat) b a"}
       
   936   val init = (Vartab.empty, Vartab.empty) 
       
   937 in
       
   938   Pattern.first_order_match @{theory} (fo_pat, trm) init
       
   939 end *}
       
   940 
       
   941 text {*
       
   942   In this example we annotated explicitly types because then 
       
   943   the type environment is empty and can be ignored. The 
       
   944   resulting term environment is:
       
   945 
       
   946   @{ML_response_fake [display, gray]
       
   947   "pretty_env @{context} fo_env"
       
   948   "[?X := P, ?Y := \<lambda>a b. Q b a]"}
       
   949 
       
   950   The matcher can be applied to a term using the function @{ML_ind subst_term
       
   951   in Envir} (remember the same convention for types applies to terms: @{ML
       
   952   subst_term in Envir} is for matchers and @{ML norm_term in Envir} is for
       
   953   unifyiers). The function @{ML subst_term in Envir} expects a type environment,
       
   954   which is set to empty in the example below, and a term environment.
       
   955 
       
   956   @{ML_response_fake [display, gray]
       
   957   "let
       
   958   val trm = @{term_pat \"(?X::(nat\<Rightarrow>nat\<Rightarrow>nat)\<Rightarrow>bool) ?Y\"}
       
   959 in
       
   960   Envir.subst_term (Vartab.empty, fo_env) trm
       
   961   |> string_of_term @{context}
       
   962   |> tracing
       
   963 end"
       
   964   "P (\<lambda>a b. Q b a)"}
       
   965 
       
   966   First-order matching is useful for matching against applications and
       
   967   variables. It can deal also with abstractions and a limited form of
       
   968   alpha-equivalence, but this kind of matching should be used with care, since
       
   969   it is not clear whether the result is meaningful. A working example is
       
   970   matching @{text "\<lambda>x. P x"} against the pattern @{text "\<lambda>y. ?X y"}. In this
       
   971   case first-order matching produces @{text "[?X := P]"}.
       
   972 
       
   973   @{ML_response_fake [display, gray]
       
   974   "let
       
   975   val fo_pat = @{term_pat \"\<lambda>y. (?X::nat\<Rightarrow>bool) y\"}
       
   976   val trm = @{term \"\<lambda>x. (P::nat\<Rightarrow>bool) x\"} 
       
   977   val init = (Vartab.empty, Vartab.empty) 
       
   978 in
       
   979   Pattern.first_order_match @{theory} (fo_pat, trm) init
       
   980   |> snd 
       
   981   |> pretty_env @{context} 
       
   982 end"
       
   983   "[?X := P]"}
       
   984 *}
       
   985 
       
   986 text {*
       
   987   In the context of unifying abstractions, more thouroughly studied is
       
   988   higher-order pattern unification and higher-order pattern matching.  A
       
   989   \emph{\index*{pattern}} is a lambda term whose ``head symbol'' (that is the
       
   990   first symbol under abstractions) is either a constant, or a schematic or a free
       
   991   variable. If it is a schematic variable then it can be only applied by
       
   992   distinct bound variables. This excludes that a schematic variable is an
       
   993   argument of a schematic variable and that a schematic variable is applied
       
   994   twice to the same bound variable. The function @{ML_ind pattern in Pattern}
       
   995   in the structure @{ML_struct Pattern} tests whether a term satisfies these
       
   996   restrictions.
       
   997 
       
   998 
       
   999   @{ML_response [display, gray]
       
  1000   "let
       
  1001   val trm_list = 
       
  1002         [@{term_pat \"?X\"},              @{term_pat \"a\"},
       
  1003          @{term_pat \"\<lambda>a b. ?X a b\"},    @{term_pat \"\<lambda>a b. (op +) a b\"},
       
  1004          @{term_pat \"\<lambda>a. (op +) a ?Y\"}, @{term_pat \"?X ?Y\"},
       
  1005          @{term_pat \"\<lambda>a b. ?X a b ?Y\"}, @{term_pat \"\<lambda>a. ?X a a\"}] 
       
  1006 in
       
  1007   map Pattern.pattern trm_list
       
  1008 end"
       
  1009   "[true, true, true, true, true, false, false, false]"}
       
  1010 
       
  1011   The point of restricting unification and matching to patterns is that
       
  1012   it is decidable and produces most general unifiers. In this way 
       
  1013   matching and unify can be implemented so that they produce a type
       
  1014   and term environment (the latter actually produces only a term 
       
  1015   environment).
       
  1016 *}
       
  1017 
       
  1018 
       
  1019 text {*
   835   \begin{readmore}
  1020   \begin{readmore}
   836   For term, unification and matching of higher-order patterns is implemented in 
  1021   For term, unification and matching of higher-order patterns is implemented in 
   837   @{ML_file "Pure/pattern.ML"}. Full higher-order unification is implemented
  1022   @{ML_file "Pure/pattern.ML"}. This file also contains a first-order matcher
       
  1023   for terms. Full higher-order unification is implemented
   838   in @{ML_file "Pure/unify.ML"}. 
  1024   in @{ML_file "Pure/unify.ML"}. 
   839   \end{readmore}
  1025   \end{readmore}
   840 *}
  1026 *}
       
  1027 
       
  1028 
       
  1029 
   841 
  1030 
   842 section {* Type-Checking\label{sec:typechecking} *}
  1031 section {* Type-Checking\label{sec:typechecking} *}
   843 
  1032 
   844 text {* 
  1033 text {* 
   845   Remember Isabelle follows the Church-style typing for terms, i.e., a term contains 
  1034   Remember Isabelle follows the Church-style typing for terms, i.e., a term contains