675 *} |
675 *} |
676 |
676 |
677 section {* Unification and Matching *} |
677 section {* Unification and Matching *} |
678 |
678 |
679 text {* |
679 text {* |
680 Isabelle's terms and types may contain schematic term variables |
680 As seen earlier, Isabelle's terms and types may contain schematic term variables |
681 (term-constructor @{ML Var}) and schematic type variables (term-constructor |
681 (term-constructor @{ML Var}) and schematic type variables (term-constructor |
682 @{ML TVar}). These variables stand for unknown entities, which can be made |
682 @{ML TVar}). These variables stand for unknown entities, which can be made |
683 more concrete by instantiations. Such instantiations might be a result of |
683 more concrete by instantiations. Such instantiations might be a result of |
684 unification or matching. While in case of types, unification and matching is |
684 unification or matching. While in case of types, unification and matching is |
685 relatively straightforward, in case of terms the algorithms are |
685 relatively straightforward, in case of terms the algorithms are |
706 end*} |
706 end*} |
707 |
707 |
708 text {* |
708 text {* |
709 The environment @{ML_ind "Vartab.empty"} in line 5 stands for the empty type |
709 The environment @{ML_ind "Vartab.empty"} in line 5 stands for the empty type |
710 environment, which is needed for starting the unification without any |
710 environment, which is needed for starting the unification without any |
711 (pre)instantiations. The @{text 0} is an integer index that we will explain |
711 (pre)instantiations. The @{text 0} is an integer index that will be explained |
712 below. In case of failure @{ML typ_unify in Sign} will throw the exception |
712 below. In case of failure @{ML typ_unify in Sign} will throw the exception |
713 @{text TUNIFY}. We can print out the resulting type environment @{ML |
713 @{text TUNIFY}. We can print out the resulting type environment bound to |
714 tyenv_unif} with the built-in function @{ML_ind dest in Vartab} from the |
714 @{ML tyenv_unif} with the built-in function @{ML_ind dest in Vartab} from the |
715 structure @{ML_struct Vartab}. |
715 structure @{ML_struct Vartab}. |
716 |
716 |
717 @{ML_response_fake [display,gray] |
717 @{ML_response_fake [display,gray] |
718 "Vartab.dest tyenv_unif" |
718 "Vartab.dest tyenv_unif" |
719 "[((\"'a\", 0), ([\"HOL.type\"], \"?'b List.list\")), |
719 "[((\"'a\", 0), ([\"HOL.type\"], \"?'b List.list\")), |
748 *} |
748 *} |
749 |
749 |
750 text {* |
750 text {* |
751 The first components in this list stand for the schematic type variables and |
751 The first components in this list stand for the schematic type variables and |
752 the second are the associated sorts and types. In this example the sort is |
752 the second are the associated sorts and types. In this example the sort is |
753 the default sort @{text "HOL.type"}. We will use in what follows the |
753 the default sort @{text "HOL.type"}. Instead of @{ML "Vartab.dest"}, we will |
754 pretty-printing function from Figure~\ref{fig:prettyenv} for @{ML_type |
754 use in what follows our own pretty-printing function from |
755 "Type.tyenv"}s. For the type environment in the example this function prints |
755 Figure~\ref{fig:prettyenv} for @{ML_type "Type.tyenv"}s. For the type |
756 out the more legible: |
756 environment in the example this function prints out the more legible: |
|
757 |
757 |
758 |
758 @{ML_response_fake [display, gray] |
759 @{ML_response_fake [display, gray] |
759 "pretty_tyenv @{context} tyenv_unif" |
760 "pretty_tyenv @{context} tyenv_unif" |
760 "[?'a := ?'b list, ?'b := nat]"} |
761 "[?'a := ?'b list, ?'b := nat]"} |
761 |
762 |
806 |
807 |
807 @{ML_response [display,gray] |
808 @{ML_response [display,gray] |
808 "index" |
809 "index" |
809 "1"} |
810 "1"} |
810 |
811 |
811 Let us now return again to the unification problem @{text "?'a * ?'b"} and |
812 Let us now return to the unification problem @{text "?'a * ?'b"} and |
812 @{text "?'b list * nat"} from the beginning of the section, and the |
813 @{text "?'b list * nat"} from the beginning of this section, and the |
813 calculated type environment @{ML tyenv_unif}: |
814 calculated type environment @{ML tyenv_unif}: |
814 |
815 |
815 @{ML_response_fake [display, gray] |
816 @{ML_response_fake [display, gray] |
816 "pretty_tyenv @{context} tyenv_unif" |
817 "pretty_tyenv @{context} tyenv_unif" |
817 "[?'a := ?'b list, ?'b := nat]"} |
818 "[?'a := ?'b list, ?'b := nat]"} |
818 |
819 |
819 Observe that the type environment which the function @{ML typ_unify in |
820 Observe that the type environment which the function @{ML typ_unify in |
820 Sign} returns is in \emph{not} an instantiation in fully solved form: while @{text |
821 Sign} returns is in \emph{not} an instantiation in fully solved form: while @{text |
821 "?'b"} is instantiated to @{typ nat}, this is not propagated to the |
822 "?'b"} is instantiated to @{typ nat}, this is not propagated to the |
822 instantiation for @{text "?'a"}. In unification theory, this is often |
823 instantiation for @{text "?'a"}. In unification theory, this is often |
823 called an instantiation in \emph{triangular form}. These not fully solved |
824 called an instantiation in \emph{triangular form}. These triangular |
824 instantiations are used because of performance reasons. |
825 instantiations, or triangular type environments, are used because of |
825 To apply a type environment in triangular form to a type, say @{text "?'a * |
826 performance reasons. To apply such a type environment to a type, say @{text "?'a * |
826 ?'b"}, you should use the function @{ML_ind norm_type in Envir}: |
827 ?'b"}, you should use the function @{ML_ind norm_type in Envir}: |
827 |
828 |
828 @{ML_response_fake [display, gray] |
829 @{ML_response_fake [display, gray] |
829 "Envir.norm_type tyenv_unif @{typ_pat \"?'a * ?'b\"}" |
830 "Envir.norm_type tyenv_unif @{typ_pat \"?'a * ?'b\"}" |
830 "nat list * nat"} |
831 "nat list * nat"} |
831 |
|
832 With this function the type variables @{text "?'a"} and @{text "?'b"} are |
|
833 correctly instantiated. |
|
834 |
832 |
835 Matching of types can be done with the function @{ML_ind typ_match in Sign} |
833 Matching of types can be done with the function @{ML_ind typ_match in Sign} |
836 also from the structure @{ML_struct Sign}. This function returns a @{ML_type |
834 also from the structure @{ML_struct Sign}. This function returns a @{ML_type |
837 Type.tyenv} as well, but might raise the exception @{text TYPE_MATCH} in case |
835 Type.tyenv} as well, but might raise the exception @{text TYPE_MATCH} in case |
838 of failure. For example |
836 of failure. For example |
852 "pretty_tyenv @{context} tyenv_match" |
850 "pretty_tyenv @{context} tyenv_match" |
853 "[?'a := bool list, ?'b := nat]"} |
851 "[?'a := bool list, ?'b := nat]"} |
854 |
852 |
855 Unlike unification, which uses the function @{ML norm_type in Envir}, |
853 Unlike unification, which uses the function @{ML norm_type in Envir}, |
856 applying the matcher to a type needs to be done with the function |
854 applying the matcher to a type needs to be done with the function |
857 @{ML_ind subst_type in Envir}. |
855 @{ML_ind subst_type in Envir}. For example |
858 |
856 |
859 @{ML_response_fake [display, gray] |
857 @{ML_response_fake [display, gray] |
860 "Envir.subst_type tyenv_match @{typ_pat \"?'a * ?'b\"}" |
858 "Envir.subst_type tyenv_match @{typ_pat \"?'a * ?'b\"}" |
861 "bool list * nat"} |
859 "bool list * nat"} |
862 |
860 |
863 Be careful to observe the difference: use always |
861 Be careful to observe the difference: use always |
864 @{ML subst_type in Envir} for matchers and @{ML norm_type in Envir} |
862 @{ML subst_type in Envir} for matchers and @{ML norm_type in Envir} |
865 for unifiers. To make the difference explicit, let us calculate the |
863 for unifiers. To show the difference, let us calculate the |
866 following matcher: |
864 following matcher: |
867 *} |
865 *} |
868 |
866 |
869 ML{*val tyenv_match' = let |
867 ML{*val tyenv_match' = let |
870 val pat = @{typ_pat "?'a * ?'b"} |
868 val pat = @{typ_pat "?'a * ?'b"} |
894 Unification and matching for types is implemented |
892 Unification and matching for types is implemented |
895 in @{ML_file "Pure/type.ML"}. The ``interface'' functions for them |
893 in @{ML_file "Pure/type.ML"}. The ``interface'' functions for them |
896 are in @{ML_file "Pure/sign.ML"}. Matching and unification produce type environments |
894 are in @{ML_file "Pure/sign.ML"}. Matching and unification produce type environments |
897 as results. These are implemented in @{ML_file "Pure/envir.ML"}. |
895 as results. These are implemented in @{ML_file "Pure/envir.ML"}. |
898 This file also includes the substitution and normalisation functions, |
896 This file also includes the substitution and normalisation functions, |
899 that apply a type environment to a type. Type environments are lookup |
897 which apply a type environment to a type. Type environments are lookup |
900 tables which are implemented in @{ML_file "Pure/term_ord.ML"}. |
898 tables which are implemented in @{ML_file "Pure/term_ord.ML"}. |
901 \end{readmore} |
899 \end{readmore} |
902 *} |
900 *} |
903 |
901 |
904 text {* |
902 text {* |
982 |
980 |
983 text {* |
981 text {* |
984 Unification of abstractions is more thoroughly studied in the context |
982 Unification of abstractions is more thoroughly studied in the context |
985 of higher-order pattern unification and higher-order pattern matching. A |
983 of higher-order pattern unification and higher-order pattern matching. A |
986 \emph{\index*{pattern}} is a lambda term whose ``head symbol'' (that is the |
984 \emph{\index*{pattern}} is a lambda term whose ``head symbol'' (that is the |
987 first symbol under abstractions) is either a constant, or a schematic or a free |
985 first symbol under an abstraction) is either a constant, a schematic or a free |
988 variable. If it is a schematic variable then it can be only applied by |
986 variable. If it is a schematic variable then it can be only applied with |
989 distinct bound variables. This excludes that a schematic variable is an |
987 distinct bound variables. This excludes that a schematic variable is an |
990 argument of another one and that a schematic variable is applied |
988 argument of another one and that a schematic variable is applied |
991 twice with the same bound variable. The function @{ML_ind pattern in Pattern} |
989 twice with the same bound variable. The function @{ML_ind pattern in Pattern} |
992 in the structure @{ML_struct Pattern} tests whether a term satisfies these |
990 in the structure @{ML_struct Pattern} tests whether a term satisfies these |
993 restrictions. |
991 restrictions. |
1007 The point of the restriction to patterns is that unification and matching |
1005 The point of the restriction to patterns is that unification and matching |
1008 are decidable and produce most general unifiers, respectively matchers. |
1006 are decidable and produce most general unifiers, respectively matchers. |
1009 In this way, matching and unification can be implemented as functions that |
1007 In this way, matching and unification can be implemented as functions that |
1010 produce a type and term environment (unification actually returns a |
1008 produce a type and term environment (unification actually returns a |
1011 record of type @{ML_type Envir.env} containing a maxind, a type environment |
1009 record of type @{ML_type Envir.env} containing a maxind, a type environment |
1012 and a term environment). The former function is @{ML_ind match in Pattern}, |
1010 and a term environment). The corresponding functions are @{ML_ind match in Pattern}, |
1013 the latter @{ML_ind unify in Pattern} both implemented in the structure |
1011 and @{ML_ind unify in Pattern} both implemented in the structure |
1014 @{ML_struct Pattern}. An example for higher-order pattern unification is |
1012 @{ML_struct Pattern}. An example for higher-order pattern unification is |
1015 |
1013 |
1016 @{ML_response_fake [display, gray] |
1014 @{ML_response_fake [display, gray] |
1017 "let |
1015 "let |
1018 val trm1 = @{term_pat \"\<lambda>x y. g (?X y x) (f (?Y x))\"} |
1016 val trm1 = @{term_pat \"\<lambda>x y. g (?X y x) (f (?Y x))\"} |
1027 |
1025 |
1028 An assumption of this function is that the terms to be unified have |
1026 An assumption of this function is that the terms to be unified have |
1029 already the same type. In case of failure, the exceptions that are raised |
1027 already the same type. In case of failure, the exceptions that are raised |
1030 are either @{text Pattern}, @{text MATCH} or @{text Unif}. |
1028 are either @{text Pattern}, @{text MATCH} or @{text Unif}. |
1031 |
1029 |
1032 As mentioned before, ``full'' higher-order unification, respectively, matching |
1030 As mentioned before, ``full'' higher-order unification, respectively |
1033 problems are in general undecidable and might in general not posses a single |
1031 higher-order matching problems, are in general undecidable and might also not posses a |
1034 most general solution. Therefore Isabelle implements Huet's pre-unification |
1032 single most general solution. Therefore Isabelle implements the unification |
1035 algorithm which does not return a single result, rather a lazy list of potentially |
1033 function @{ML_ind unifiers in Unify} so that it returns a lazy list of |
1036 infinite unifiers. The corresponding function is called @{ML_ind unifiers in Unify}. |
1034 potentially infinite unifiers. An example is as follows |
1037 An example is as follows |
|
1038 |
1035 |
1039 \ldots |
1036 \ldots |
1040 |
1037 |
1041 In case of failure this function does not raise an exception, rather returns |
1038 In case of failure this function does not raise an exception, rather returns |
1042 the empty sequence. In order to find a reasonable solution for a unification |
1039 the empty sequence. In order to find a reasonable solution for a unification |
1043 problem, Isabelle also tries first to solve the problem by higher-order |
1040 problem, Isabelle also tries first to solve the problem by higher-order |
1044 pattern unification. |
1041 pattern unification. For higher-order matching the function is called |
|
1042 @{ML_ind matchers in Unify}. Also this function might potentially return |
|
1043 sequences with more than one matcher. |
1045 |
1044 |
1046 \ldots |
1045 \ldots |
1047 |
|
1048 For higher-order matching the function is called @{ML_ind matchers in Unify}. |
|
1049 Also this function might potentially return sequences with more than one |
|
1050 element. |
|
1051 |
|
1052 |
1046 |
1053 Like @{ML unifiers in Unify}, this function does not raise an exception |
1047 Like @{ML unifiers in Unify}, this function does not raise an exception |
1054 in case of failure. It also tries out first whether the matching problem |
1048 in case of failure. It also first tries out whether the matching problem |
1055 can be solved by first-order matching. |
1049 can be solved by first-order matching. |
1056 |
1050 |
1057 \begin{readmore} |
1051 \begin{readmore} |
1058 Unification and matching of higher-order patterns is implemented in |
1052 Unification and matching of higher-order patterns is implemented in |
1059 @{ML_file "Pure/pattern.ML"}. This file also contains a first-order matcher |
1053 @{ML_file "Pure/pattern.ML"}. This file also contains a first-order matcher |