675 *} |
675 *} |
676 |
676 |
677 section {* Unification and Matching (TBD) *} |
677 section {* Unification and Matching (TBD) *} |
678 |
678 |
679 text {* |
679 text {* |
680 Isabelle's terms and types contain schematic term variables |
680 Isabelle's terms and types may contain schematic term variables |
681 (term-constructor @{ML Var}) and schematic type variables (term-constructor |
681 (term-constructor @{ML Var}) and schematic type variables (term-constructor |
682 @{ML TVar}). Both are printed out as variables with a leading question |
682 @{ML TVar}). These variables stand for unknown entities, which can be made |
683 mark. They stand for unknown terms and types, which can be made more concrete |
683 more concrete by instantiations. Such instantiations might be a result of |
684 by instantiations. Such instantiations can be calculated by unification or |
684 unification or matching. While in case of types, unification and matching is |
685 matching. While in case of types, unification and matching is relatively |
685 relatively straightforward, in case of terms the algorithms are |
686 straightforward, in case of terms the algorithms are substantially more |
686 substantially more complicated, because terms need higher-order versions of |
687 complicated, because terms need higher-order versions of the unification and |
687 the unification and matching algorithms. Below we shall use the |
688 matching algorithms. Below we shall use the antiquotation @{text "@{typ_pat \<dots>}"} and |
688 antiquotations @{text "@{typ_pat \<dots>}"} and @{text "@{term_pat \<dots>}"} from |
689 @{text "@{term_pat \<dots>}"} from Section~\ref{sec:antiquote} in order to |
689 Section~\ref{sec:antiquote} in order to construct examples involving |
690 construct examples involving schematic variables. |
690 schematic variables. |
691 |
691 |
692 Let us begin with describing unification and matching of types. Both |
692 Let us begin with describing the unification and the matching function for |
693 matching and unification of types return a type environment, ML-type |
693 types. Both return type environments, ML-type @{ML_type "Type.tyenv"}, |
694 @{ML_type "Type.tyenv"}, which maps schematic type variables to types (and |
694 which map schematic type variables to types (and sorts). Below we use the |
695 sorts). Below we use the function @{ML_ind typ_unify in Sign} from the |
695 function @{ML_ind typ_unify in Sign} from the structure @{ML_struct Sign} |
696 structure @{ML_struct Sign} for unifying |
696 for unifying the types @{text "?'a * ?'b"} and @{text "?'b list * |
697 the types @{text "?'a * ?'b"} and @{text "?'b list * nat"}. This will |
697 nat"}. This will produce the mapping, or type environment, @{text "[?'a := |
698 produce the mapping @{text "[?'a := ?'b list, ?'b := nat]"}. |
698 ?'b list, ?'b := nat]"}. |
699 *} |
699 *} |
700 |
700 |
701 ML{*val (tyenv_unif, _) = let |
701 ML{*val (tyenv_unif, _) = let |
702 val ty1 = @{typ_pat "?'a * ?'b"} |
702 val ty1 = @{typ_pat "?'a * ?'b"} |
703 val ty2 = @{typ_pat "?'b list * nat"} |
703 val ty2 = @{typ_pat "?'b list * nat"} |
704 in |
704 in |
705 Sign.typ_unify @{theory} (ty1, ty2) (Vartab.empty, 0) |
705 Sign.typ_unify @{theory} (ty1, ty2) (Vartab.empty, 0) |
706 end*} |
706 end*} |
707 |
707 |
708 text {* |
708 text {* |
709 The value @{ML_ind empty in Vartab} stands for the empty type environment, |
709 The environment @{ML_ind "Vartab.empty"} stands for the empty type environment, |
710 which is the value for starting the unification without any |
710 which is needed for starting the unification without any |
711 instantiations.\footnote{\bf FIXME: what is 0?} In case of a failure |
711 (pre)instantiations. In case of a failure |
712 @{ML typ_unify in Sign} will throw the exception @{ML_text TUNIFY}. |
712 @{ML typ_unify in Sign} will throw the exception @{text TUNIFY}. |
713 We can print out the resulting type environment @{ML tyenv_unif} |
713 We can print out the resulting type environment @{ML tyenv_unif} |
714 with the built-in function @{ML_ind dest in Vartab}. |
714 with the built-in function @{ML_ind dest in Vartab} from the structure |
|
715 @{ML_struct Vartab}. |
715 |
716 |
716 @{ML_response_fake [display,gray] |
717 @{ML_response_fake [display,gray] |
717 "Vartab.dest tyenv_unif" |
718 "Vartab.dest tyenv_unif" |
718 "[((\"'a\", 0), ([\"HOL.type\"], \"?'b List.list\")), |
719 "[((\"'a\", 0), ([\"HOL.type\"], \"?'b List.list\")), |
719 ((\"'b\", 0), ([\"HOL.type\"], \"nat\"))]"} |
720 ((\"'b\", 0), ([\"HOL.type\"], \"nat\"))]"} |
720 |
721 *} |
721 For what follows we will use the following pretty-printing function |
722 |
722 for @{ML_type "Type.tyenv"}s |
723 text_raw {* |
723 *} |
724 \begin{figure}[t] |
724 |
725 \begin{minipage}{\textwidth} |
725 ML{*fun pretty_tyenv ctxt tyenv = |
726 \begin{isabelle}*} |
|
727 ML{*fun pretty_helper f env = |
|
728 env |> Vartab.dest |
|
729 |> map f |
|
730 |> commas |
|
731 |> enclose "[" "]" |
|
732 |> tracing |
|
733 |
|
734 fun pretty_tyenv ctxt tyenv = |
726 let |
735 let |
727 fun aux (v, (s, T)) = |
736 fun aux (v, (s, T)) = |
728 Syntax.string_of_typ ctxt (TVar (v, s)) |
737 Syntax.string_of_typ ctxt (TVar (v, s)) |
729 ^ " := " ^ Syntax.string_of_typ ctxt T |
738 ^ " := " ^ Syntax.string_of_typ ctxt T |
730 in |
739 in |
731 tyenv |> Vartab.dest |
740 pretty_helper aux tyenv |
732 |> map aux |
|
733 |> commas |
|
734 |> enclose "[" "]" |
|
735 |> tracing |
|
736 end*} |
741 end*} |
737 |
742 text_raw {* |
738 text {* |
743 \end{isabelle} |
739 which prints for the type environment above |
744 \end{minipage} |
|
745 \caption{A pretty printing function for type environments, which are |
|
746 produced by unification and matching.\label{fig:prettyenv}} |
|
747 \end{figure} |
|
748 *} |
|
749 |
|
750 text {* |
|
751 The first components in this list stand for the schematic type variables and |
|
752 the second are the associated sorts and types. In what follows we will use |
|
753 the pretty-printing function in Figure~\ref{fig:prettyenv} for @{ML_type "Type.tyenv"}s. |
|
754 For the type environment from the example this function prints out the |
|
755 more legible: |
740 |
756 |
741 @{ML_response_fake [display, gray] |
757 @{ML_response_fake [display, gray] |
742 "pretty_tyenv @{context} tyenv_unif" |
758 "pretty_tyenv @{context} tyenv_unif" |
743 "[?'a := ?'b list, ?'b := nat]"} |
759 "[?'a := ?'b list, ?'b := nat]"} |
744 |
760 |
745 Observe that the type environment that the function @{ML typ_unify in |
761 The index @{text 0} in the example above is the maximal index of the schematic |
|
762 type variables occuring in @{text ty1} and @{text ty2}. It will be increased |
|
763 whenever a new schematic type variable is introduced during unification. |
|
764 This is for example the case when two schematic type variables have different, |
|
765 incomparable sorts. Then a new schematic type variable is introduced with |
|
766 the combined sorts. To show this let us use two sorts, @{text "s1"} and @{text "s2"}, |
|
767 which we attach to the schematic type variables @{text "?'a"} and @{text "?'b"}. |
|
768 *} |
|
769 |
|
770 ML{*val (tyenv, index) = let |
|
771 val ty2 = @{typ_pat "?'a::s1"} |
|
772 val ty1 = @{typ_pat "?'b::s2"} |
|
773 in |
|
774 Sign.typ_unify @{theory} (ty1, ty2) (Vartab.empty, 0) |
|
775 end*} |
|
776 |
|
777 text {* |
|
778 To print out the calculated type environment we also switch on the printing |
|
779 of sort information. |
|
780 *} |
|
781 |
|
782 ML{*show_sorts := true*} |
|
783 |
|
784 text {* |
|
785 Now we can see |
|
786 |
|
787 @{ML_response_fake [display,gray] |
|
788 "pretty_tyenv @{context} tyenv" |
|
789 "[?'a::s1 := ?'a1::{s1,s2}, ?'b::s2 := ?'a1::{s1,s2}]"} |
|
790 |
|
791 the type variables @{text "?'a"} and @{text "?'b"} are instantiated |
|
792 with a new type variable @{text "?'a1"} with sort @{text "{s1,s2}"}. Since a new |
|
793 type variable has been introduced the @{ML index}, originally being @{text 0}, |
|
794 has been increased by @{text 1}. |
|
795 |
|
796 @{ML_response [display,gray] |
|
797 "index" |
|
798 "1"} |
|
799 *}(*<*)ML %linenos{*show_sorts := false*}(*>*) |
|
800 |
|
801 text {* |
|
802 The way the unification function @{ML typ_unify in Sign} is implemented |
|
803 using an initial type environment and initial index makes it easy to |
|
804 unifying of more than two terms. For example |
|
805 *} |
|
806 |
|
807 ML{*val (tyenvs, _) = let |
|
808 val tys1 = (@{typ_pat "?'a"}, @{typ_pat "?'b list"}) |
|
809 val tys2 = (@{typ_pat "?'b"}, @{typ_pat "nat"}) |
|
810 in |
|
811 fold (Sign.typ_unify @{theory}) [tys1, tys2] (Vartab.empty, 0) |
|
812 end*} |
|
813 |
|
814 text {* |
|
815 Let us now return again to the unifier from the first example. |
|
816 |
|
817 @{ML_response_fake [display, gray] |
|
818 "pretty_tyenv @{context} tyenv_unif" |
|
819 "[?'a := ?'b list, ?'b := nat]"} |
|
820 |
|
821 Observe that the type environment which the function @{ML typ_unify in |
746 Sign} returns is in \emph{not} an instantiation in fully solved form: while @{text |
822 Sign} returns is in \emph{not} an instantiation in fully solved form: while @{text |
747 "?'b"} is instantiated to @{typ nat}, this is not propagated to the |
823 "?'b"} is instantiated to @{typ nat}, this is not propagated to the |
748 instantiation for @{text "?'a"}. In unification theory, this is often |
824 instantiation for @{text "?'a"}. In unification theory, this is often |
749 called an instantiation in \emph{triangular form}. These instantiations |
825 called an instantiation in \emph{triangular form}. These not fully solved |
750 are used because of performance reasons. |
826 instantiations are used because of performance reasons. |
751 |
|
752 |
827 |
753 To apply a type environment in triangular form to a type, say @{text "?'a * |
828 To apply a type environment in triangular form to a type, say @{text "?'a * |
754 ?'b"}, you should use the function @{ML_ind norm_type in Envir}: |
829 ?'b"}, you should use the function @{ML_ind norm_type in Envir}: |
755 |
830 |
756 @{ML_response_fake [display, gray] |
831 @{ML_response_fake [display, gray] |
757 "Envir.norm_type tyenv_unif @{typ_pat \"?'a * ?'b\"}" |
832 "Envir.norm_type tyenv_unif @{typ_pat \"?'a * ?'b\"}" |
758 "nat list * nat"} |
833 "nat list * nat"} |
759 *} |
834 |
760 |
835 With this function the type variables @{text "?'a"} and @{text "?'b"} are |
761 text {* |
836 correctly instantiated. |
|
837 |
762 Matching of types can be done with the function @{ML_ind typ_match in Sign} |
838 Matching of types can be done with the function @{ML_ind typ_match in Sign} |
763 from the structure @{ML_struct Sign}. It also returns a @{ML_type |
839 from the structure @{ML_struct Sign}. It also returns a @{ML_type |
764 Type.tyenv}. For example |
840 Type.tyenv} and might raise the exception @{text TYPE_MATCH} in case |
|
841 of failure. For example |
765 *} |
842 *} |
766 |
843 |
767 ML{*val tyenv_match = let |
844 ML{*val tyenv_match = let |
768 val pat = @{typ_pat "?'a * ?'b"} |
845 val pat = @{typ_pat "?'a * ?'b"} |
769 and ty = @{typ_pat "bool list * nat"} |
846 and ty = @{typ_pat "bool list * nat"} |
770 in |
847 in |
771 Sign.typ_match @{theory} (pat, ty) Vartab.empty |
848 Sign.typ_match @{theory} (pat, ty) Vartab.empty |
772 end*} |
849 end*} |
773 |
850 |
774 text {* |
851 text {* |
775 In case of failure, @{ML typ_match in Sign} will throw the exception |
852 Printing out the calculated matcher gives |
776 @{ML_text TYPE_MATCH}. Printing out the calculated matcher gives |
|
777 |
853 |
778 @{ML_response_fake [display,gray] |
854 @{ML_response_fake [display,gray] |
779 "pretty_tyenv @{context} tyenv_match" |
855 "pretty_tyenv @{context} tyenv_match" |
780 "[?'a := bool list, ?'b := nat]"} |
856 "[?'a := bool list, ?'b := nat]"} |
781 |
857 |
782 Applying the matcher to a type needs to be done with the function |
858 Unlike the unification, which uses the function @{ML norm_type in Envir}, |
|
859 applying the matcher to a type needs to be done with the function |
783 @{ML_ind subst_type in Envir}. |
860 @{ML_ind subst_type in Envir}. |
784 |
861 |
785 @{ML_response_fake [display, gray] |
862 @{ML_response_fake [display, gray] |
786 "Envir.subst_type tyenv_match @{typ_pat \"?'a * ?'b\"}" |
863 "Envir.subst_type tyenv_match @{typ_pat \"?'a * ?'b\"}" |
787 "bool list * nat"} |
864 "bool list * nat"} |
788 |
865 |
789 Be careful to observe the difference: use @{ML subst_type in Envir} |
866 Be careful to observe the difference: use always |
790 for matchers and @{ML norm_type in Envir} for unifiers. To make the |
867 @{ML subst_type in Envir} for matchers and @{ML norm_type in Envir} |
791 difference explicit let us calculate the following matcher. |
868 for unifiers. To make the difference explicit, let us calculate the |
|
869 following matcher: |
792 *} |
870 *} |
793 |
871 |
794 ML{*val tyenv_match' = let |
872 ML{*val tyenv_match' = let |
795 val pat = @{typ_pat "?'a * ?'b"} |
873 val pat = @{typ_pat "?'a * ?'b"} |
796 and ty = @{typ_pat "?'b list * nat"} |
874 and ty = @{typ_pat "?'b list * nat"} |
797 in |
875 in |
798 Sign.typ_match @{theory} (pat, ty) Vartab.empty |
876 Sign.typ_match @{theory} (pat, ty) Vartab.empty |
799 end*} |
877 end*} |
800 |
878 |
801 text {* |
879 text {* |
802 Now @{ML tyenv_unif} is equal to @{ML tyenv_match'}. Consider |
880 Now @{ML tyenv_unif} is equal to @{ML tyenv_match'}. If we apply |
803 again the type @{text "?'a * ?'b"}, which we used in the unification |
881 @{ML norm_type in Envir} to the matcher we obtain |
804 and matching problems. If we apply @{ML norm_type in Envir} to |
|
805 the matcher we obtain |
|
806 |
882 |
807 @{ML_response_fake [display, gray] |
883 @{ML_response_fake [display, gray] |
808 "Envir.norm_type tyenv_match' @{typ_pat \"?'a * ?'b\"}" |
884 "Envir.norm_type tyenv_match' @{typ_pat \"?'a * ?'b\"}" |
809 "nat list * nat"} |
885 "nat list * nat"} |
810 |
886 |
811 which is not a matcher for the second matching problem, and if |
887 which is not a matcher for the second matching problem, and if |
812 we apply @{ML subst_type in Envir} to the unifier we obtain |
888 we apply @{ML subst_type in Envir} to the unifyier we obtain |
813 |
889 |
814 @{ML_response_fake [display, gray] |
890 @{ML_response_fake [display, gray] |
815 "Envir.subst_type tyenv_unif @{typ_pat \"?'a * ?'b\"}" |
891 "Envir.subst_type tyenv_unif @{typ_pat \"?'a * ?'b\"}" |
816 "?'b list * nat"} |
892 "?'b list * nat"} |
817 |
893 |
827 tables which are implemented in @{ML_file "Pure/term_ord.ML"}. |
903 tables which are implemented in @{ML_file "Pure/term_ord.ML"}. |
828 \end{readmore} |
904 \end{readmore} |
829 *} |
905 *} |
830 |
906 |
831 text {* |
907 text {* |
832 TBD below |
908 Unification and matching of terms is substantially more complicated than the |
833 |
909 type-case. The reason is that terms have abstractions and unification |
834 |
910 or matching modulo plain equality is often not meaningful in this |
|
911 context. Nevertheless, Isabelle implements the function @{ML_ind |
|
912 first_order_match in Pattern} for terms. This matching function returns a |
|
913 type environment and a term environment. To pretty print the latter we use |
|
914 the function @{text "pretty_env"}: |
|
915 *} |
|
916 |
|
917 ML{*fun pretty_env ctxt env = |
|
918 let |
|
919 fun aux (v, (T, t)) = |
|
920 string_of_term ctxt (Var (v, T)) ^ " := " ^ string_of_term ctxt t |
|
921 in |
|
922 pretty_helper aux env |
|
923 end*} |
|
924 |
|
925 text {* |
|
926 As can be seen from the @{text "aux"}-function, a term environment associates |
|
927 a schematic term variable with a type and a term. An example of a first-order |
|
928 matching problem is the term @{term "P (\<lambda>a b. Q b a)"} and the pattern |
|
929 @{text "?X ?Y"}. |
|
930 *} |
|
931 |
|
932 ML{*val (_, fo_env) = let |
|
933 val fo_pat = @{term_pat "(?X::(nat\<Rightarrow>nat\<Rightarrow>nat)\<Rightarrow>bool) ?Y"} |
|
934 val trm = @{term "P::(nat\<Rightarrow>nat\<Rightarrow>nat)\<Rightarrow>bool"} |
|
935 $ @{term "\<lambda>a b. (Q::nat\<Rightarrow>nat\<Rightarrow>nat) b a"} |
|
936 val init = (Vartab.empty, Vartab.empty) |
|
937 in |
|
938 Pattern.first_order_match @{theory} (fo_pat, trm) init |
|
939 end *} |
|
940 |
|
941 text {* |
|
942 In this example we annotated explicitly types because then |
|
943 the type environment is empty and can be ignored. The |
|
944 resulting term environment is: |
|
945 |
|
946 @{ML_response_fake [display, gray] |
|
947 "pretty_env @{context} fo_env" |
|
948 "[?X := P, ?Y := \<lambda>a b. Q b a]"} |
|
949 |
|
950 The matcher can be applied to a term using the function @{ML_ind subst_term |
|
951 in Envir} (remember the same convention for types applies to terms: @{ML |
|
952 subst_term in Envir} is for matchers and @{ML norm_term in Envir} is for |
|
953 unifyiers). The function @{ML subst_term in Envir} expects a type environment, |
|
954 which is set to empty in the example below, and a term environment. |
|
955 |
|
956 @{ML_response_fake [display, gray] |
|
957 "let |
|
958 val trm = @{term_pat \"(?X::(nat\<Rightarrow>nat\<Rightarrow>nat)\<Rightarrow>bool) ?Y\"} |
|
959 in |
|
960 Envir.subst_term (Vartab.empty, fo_env) trm |
|
961 |> string_of_term @{context} |
|
962 |> tracing |
|
963 end" |
|
964 "P (\<lambda>a b. Q b a)"} |
|
965 |
|
966 First-order matching is useful for matching against applications and |
|
967 variables. It can deal also with abstractions and a limited form of |
|
968 alpha-equivalence, but this kind of matching should be used with care, since |
|
969 it is not clear whether the result is meaningful. A working example is |
|
970 matching @{text "\<lambda>x. P x"} against the pattern @{text "\<lambda>y. ?X y"}. In this |
|
971 case first-order matching produces @{text "[?X := P]"}. |
|
972 |
|
973 @{ML_response_fake [display, gray] |
|
974 "let |
|
975 val fo_pat = @{term_pat \"\<lambda>y. (?X::nat\<Rightarrow>bool) y\"} |
|
976 val trm = @{term \"\<lambda>x. (P::nat\<Rightarrow>bool) x\"} |
|
977 val init = (Vartab.empty, Vartab.empty) |
|
978 in |
|
979 Pattern.first_order_match @{theory} (fo_pat, trm) init |
|
980 |> snd |
|
981 |> pretty_env @{context} |
|
982 end" |
|
983 "[?X := P]"} |
|
984 *} |
|
985 |
|
986 text {* |
|
987 In the context of unifying abstractions, more thouroughly studied is |
|
988 higher-order pattern unification and higher-order pattern matching. A |
|
989 \emph{\index*{pattern}} is a lambda term whose ``head symbol'' (that is the |
|
990 first symbol under abstractions) is either a constant, or a schematic or a free |
|
991 variable. If it is a schematic variable then it can be only applied by |
|
992 distinct bound variables. This excludes that a schematic variable is an |
|
993 argument of a schematic variable and that a schematic variable is applied |
|
994 twice to the same bound variable. The function @{ML_ind pattern in Pattern} |
|
995 in the structure @{ML_struct Pattern} tests whether a term satisfies these |
|
996 restrictions. |
|
997 |
|
998 |
|
999 @{ML_response [display, gray] |
|
1000 "let |
|
1001 val trm_list = |
|
1002 [@{term_pat \"?X\"}, @{term_pat \"a\"}, |
|
1003 @{term_pat \"\<lambda>a b. ?X a b\"}, @{term_pat \"\<lambda>a b. (op +) a b\"}, |
|
1004 @{term_pat \"\<lambda>a. (op +) a ?Y\"}, @{term_pat \"?X ?Y\"}, |
|
1005 @{term_pat \"\<lambda>a b. ?X a b ?Y\"}, @{term_pat \"\<lambda>a. ?X a a\"}] |
|
1006 in |
|
1007 map Pattern.pattern trm_list |
|
1008 end" |
|
1009 "[true, true, true, true, true, false, false, false]"} |
|
1010 |
|
1011 The point of restricting unification and matching to patterns is that |
|
1012 it is decidable and produces most general unifiers. In this way |
|
1013 matching and unify can be implemented so that they produce a type |
|
1014 and term environment (the latter actually produces only a term |
|
1015 environment). |
|
1016 *} |
|
1017 |
|
1018 |
|
1019 text {* |
835 \begin{readmore} |
1020 \begin{readmore} |
836 For term, unification and matching of higher-order patterns is implemented in |
1021 For term, unification and matching of higher-order patterns is implemented in |
837 @{ML_file "Pure/pattern.ML"}. Full higher-order unification is implemented |
1022 @{ML_file "Pure/pattern.ML"}. This file also contains a first-order matcher |
|
1023 for terms. Full higher-order unification is implemented |
838 in @{ML_file "Pure/unify.ML"}. |
1024 in @{ML_file "Pure/unify.ML"}. |
839 \end{readmore} |
1025 \end{readmore} |
840 *} |
1026 *} |
|
1027 |
|
1028 |
|
1029 |
841 |
1030 |
842 section {* Type-Checking\label{sec:typechecking} *} |
1031 section {* Type-Checking\label{sec:typechecking} *} |
843 |
1032 |
844 text {* |
1033 text {* |
845 Remember Isabelle follows the Church-style typing for terms, i.e., a term contains |
1034 Remember Isabelle follows the Church-style typing for terms, i.e., a term contains |