CookBook/Parsing.thy
changeset 121 26e5b41faa74
parent 120 c39f83d8daeb
child 122 79696161ae16
equal deleted inserted replaced
120:c39f83d8daeb 121:26e5b41faa74
     1 theory Parsing
     1 theory Parsing
     2 imports Base 
     2 imports Base "Package/Simple_Inductive_Package"
     3 
     3 
     4 begin
     4 begin
     5 
     5 
     6 
     6 
     7 chapter {* Parsing *}
     7 chapter {* Parsing *}
   526 text {* (FIXME funny output for a proposition) *}
   526 text {* (FIXME funny output for a proposition) *}
   527 
   527 
   528 section {* Parsing Specifications\label{sec:parsingspecs} *}
   528 section {* Parsing Specifications\label{sec:parsingspecs} *}
   529 
   529 
   530 text {*
   530 text {*
   531   There are a number of special purpose parsers that help with parsing specifications
   531   There are a number of special purpose parsers that help with parsing
   532   of functions, inductive definitions and so on. For example the 
   532   specifications for functions, inductive definitions and so on. In
   533   @{ML OuterParse.target} reads a target in order to indicate a locale.
   533   Capter~\ref{chp:package}, for example, we will need to parse specifications
   534 
   534   for inductive predicates of the form:
   535 @{ML_response [display,gray]
   535 *}
   536 "let 
   536 
   537   val input = filtered_input \"(in test)\"
   537 simple_inductive
   538 in 
   538   even and odd
   539   parse OuterParse.target input 
   539 where
   540 end" "(\"test\",[])"}
   540   even0: "even 0"
   541   
   541 | evenS: "odd n \<Longrightarrow> even (Suc n)"
   542   The function @{ML OuterParse.opt_target} makes this parser ``optional'', that
   542 | oddS: "even n \<Longrightarrow> odd (Suc n)"
   543   is wrapping the result into an option type and returning @{ML NONE} if no
   543 
   544   target is present.
   544 simple_inductive 
   545 
   545   trcl\<iota> for R :: "'a \<Rightarrow> 'a \<Rightarrow> bool"
   546   The function @{ML OuterParse.fixes} reads an \isacommand{and}-separated 
   546 where
   547   list of constants that can include type annotations and syntax translations. 
   547   base: "trcl\<iota> R x x"
   548   For example:\footnote{Note that in the code we need to write 
   548 | step: "trcl\<iota> R x y \<Longrightarrow> R y z \<Longrightarrow> trcl\<iota> R x z"
   549   @{text "\\\"int \<Rightarrow> bool\\\""} in order to properly escape the double quotes
   549 
   550   in the compound type.}
   550 text {*
   551 *}
   551   For this we are going to use the parser:
   552 
   552 *}
   553 text {*
   553 
   554 
   554 ML %linenosgray{*val spec_parser = 
   555 @{ML_response [display,gray]
       
   556 "let
       
   557   val input = filtered_input 
       
   558         \"foo::\\\"int \<Rightarrow> bool\\\" (\\\"FOO\\\" [100] 100) and bar::nat and blonk\"
       
   559 in
       
   560    parse OuterParse.fixes input
       
   561 end"
       
   562 "([(foo, SOME \<dots>, Mixfix (\"FOO\",[100],100)), 
       
   563   (bar, SOME \<dots>, NoSyn), 
       
   564   (blonk, NONE, NoSyn)],[])"}  
       
   565 
       
   566   Whenever types are given, then they are stored in the @{ML SOME}s.
       
   567   If a syntax translation is present for a constant, then it is
       
   568   stored in the @{ML Mixfix} data structure; no syntax translation is
       
   569   indicated by @{ML NoSyn}.
       
   570  
       
   571   (FIXME: should for-fixes take any syntax annotation?)
       
   572 
       
   573   @{ML OuterParse.for_fixes} is an ``optional'' that prefixes 
       
   574   @{ML OuterParse.fixes} with the command \isacommand{for}.
       
   575   (FIXME give an example and explain more)
       
   576 
       
   577 @{ML_response [display,gray]
       
   578 "let
       
   579   val input = filtered_input 
       
   580         \"for foo::\\\"int \<Rightarrow> bool\\\" (\\\"FOO\\\" [100] 100) and bar::nat and blonk\"
       
   581 in
       
   582    parse OuterParse.for_fixes input
       
   583 end"
       
   584 "([(foo, SOME \<dots>, Mixfix (\"FOO\",[100],100)), 
       
   585   (bar, SOME \<dots>, NoSyn), 
       
   586   (blonk, NONE, NoSyn)],[])"}  
       
   587 
       
   588 *}
       
   589 
       
   590 
       
   591 ML{*let 
       
   592   val input = filtered_input "test_lemma[intro,foo,elim,dest!,bar]:"
       
   593 in 
       
   594   parse (SpecParse.thm_name ":") input 
       
   595      |> fst |> snd |> (Attrib.pretty_attribs @{context}) |> (map Pretty.string_of)
       
   596 end*}
       
   597 
       
   598 text {*
       
   599   (FIXME: why is intro, elim and dest treated differently from bar?) 
       
   600 *}
       
   601 
       
   602 ML{*val spec_parser = 
       
   603     OuterParse.opt_target --
   555     OuterParse.opt_target --
   604     OuterParse.fixes -- 
   556     OuterParse.fixes -- 
   605     OuterParse.for_fixes --
   557     OuterParse.for_fixes --
   606     Scan.optional 
   558     Scan.optional 
   607         (OuterParse.$$$ "where" |--
   559         (OuterParse.$$$ "where" |--
   608            OuterParse.!!! 
   560            OuterParse.!!! 
   609              (OuterParse.enum1 "|" 
   561              (OuterParse.enum1 "|" 
   610                 (SpecParse.opt_thm_name ":" -- OuterParse.prop))) []*}
   562                 (SpecParse.opt_thm_name ":" -- OuterParse.prop))) []*}
   611 
   563 
   612 text {*
   564 text {*
       
   565   To see what it returns, let us parse the string corresponding to the 
       
   566   definition of @{term even} and @{term odd}:
       
   567 
   613 @{ML_response [display,gray]
   568 @{ML_response [display,gray]
   614 "let
   569 "let
   615   val input = filtered_input
   570   val input = filtered_input
   616      (\"even and odd \" ^  
   571      (\"even and odd \" ^  
   617       \"where \" ^
   572       \"where \" ^
   623 end"
   578 end"
   624 "((((NONE, [(even, NONE, NoSyn), (odd, NONE, NoSyn)]), []),
   579 "((((NONE, [(even, NONE, NoSyn), (odd, NONE, NoSyn)]), []),
   625      [((even0,\<dots>), \"\\^E\\^Ftoken\\^Eeven 0\\^E\\^F\\^E\"),
   580      [((even0,\<dots>), \"\\^E\\^Ftoken\\^Eeven 0\\^E\\^F\\^E\"),
   626       ((evenS,\<dots>), \"\\^E\\^Ftoken\\^Eodd n \<Longrightarrow> even (Suc n)\\^E\\^F\\^E\"),
   581       ((evenS,\<dots>), \"\\^E\\^Ftoken\\^Eodd n \<Longrightarrow> even (Suc n)\\^E\\^F\\^E\"),
   627       ((oddS,\<dots>), \"\\^E\\^Ftoken\\^Eeven n \<Longrightarrow> odd (Suc n)\\^E\\^F\\^E\")]), [])"}
   582       ((oddS,\<dots>), \"\\^E\\^Ftoken\\^Eeven n \<Longrightarrow> odd (Suc n)\\^E\\^F\\^E\")]), [])"}
   628 *}
   583 
   629 
   584   As can be seen, the result is a ``nested'' four-tuple consisting of an 
   630 text {* 
   585   optional locale; a list of variables with optional type-annotation and optional 
   631   The (outer?) parser for the package: returns optionally a locale; 
   586   syntax-annotation; a list of for-fixes (fixed variables); and a list of rules 
   632   a list of predicate constants with optional type-annotation and 
   587   where each rule has optionally a name and an attribute.
   633   optional syntax-annotation; a list of for-fixes (fixed parameters); 
   588 
   634   and a list of rules where each rule has optionally a name and an attribute.
   589   In Line 2 of the parser, the function @{ML OuterParse.opt_target} reads a target 
       
   590   in order to indicate a locale in which the specification is made. For example
       
   591 
       
   592 @{ML_response [display,gray]
       
   593 "parse OuterParse.opt_target (filtered_input \"(in test)\")" "(SOME \"test\",[])"}
       
   594   
       
   595   The function @{ML OuterParse.opt_target} returns @{ML NONE}, if no target is given.
       
   596 
       
   597   The function @{ML OuterParse.fixes} in Line 3 reads an \isacommand{and}-separated 
       
   598   list of variables that can include type annotations and syntax translations. 
       
   599   For example:\footnote{Note that in the code we need to write 
       
   600   @{text "\\\"int \<Rightarrow> bool\\\""} in order to properly escape the double quotes
       
   601   in the compound type.}
       
   602 
       
   603 @{ML_response [display,gray]
       
   604 "let
       
   605   val input = filtered_input 
       
   606         \"foo::\\\"int \<Rightarrow> bool\\\" and bar::nat (\\\"BAR\\\" 100) and blonk\"
       
   607 in
       
   608    parse OuterParse.fixes input
       
   609 end"
       
   610 "([(foo, SOME \"\\^E\\^Ftoken\\^Eint \<Rightarrow> bool\\^E\\^F\\^E\", NoSyn), 
       
   611   (bar, SOME \"\\^E\\^Ftoken\\^Enat\\^E\\^F\\^E\", Mixfix (\"BAR\", [], 100)), 
       
   612   (blonk, NONE, NoSyn)],[])"}  
       
   613 *}
       
   614 
       
   615 text {*
       
   616   Whenever types are given, they are stored in the @{ML SOME}s. Since types
       
   617   are part of the inner syntax they are strings with some printing directives. 
       
   618   If a syntax translation is present for a variable, then it is
       
   619   stored in the @{ML Mixfix} datastructure; no syntax translation is
       
   620   indicated by @{ML NoSyn}.
       
   621 
       
   622   \begin{readmore}
       
   623   The datastructre for sytax annotations is defined in @{ML_file "Pure/Syntax/mixfix.ML"}.
       
   624   \end{readmore}
       
   625 
       
   626   Similarly, the function @{ML OuterParse.for_fixes} in Line 4: it reads the
       
   627   same \isacommand{and}-separated list of variables as @{ML "fixes" in OuterParse}, 
       
   628   but requires that this list is prefixed by the keyword \isacommand{for}.
       
   629 
       
   630 @{ML_response [display,gray]
       
   631 "parse OuterParse.for_fixes (filtered_input \"for foo and bar\")"
       
   632 "([(foo, NONE, NoSyn), (bar, NONE, NoSyn)],[])"}  
       
   633 
       
   634   Lines 5 to 9 include the parser for a list of introduction rules, that is propositions 
       
   635   with theorem annotations. The introduction rules are propositions parsed 
       
   636   by @{ML OuterParse.prop}. However, they can include an optional theorem name plus
       
   637   some attributes. For example
       
   638 
       
   639 @{ML_response [display,gray] "let 
       
   640   val input = filtered_input \"foo_lemma[intro,dest!]:\"
       
   641   val ((name, attrib), _) = parse (SpecParse.thm_name \":\") input 
       
   642 in 
       
   643   (name, map Args.dest_src attrib)
       
   644 end" "(foo_lemma, [((\"intro\", []), \<dots>), ((\"dest\", [\<dots>]), \<dots>)])"}
       
   645  
       
   646   The function @{ML opt_thm_name in SpecParse} is the ``optional'' variant of
       
   647   @{ML thm_name in SpecParse}. As can be seen each theorem name can contain some 
       
   648   attributes. If you want to print out all currently known attributes a theorem 
       
   649   can have, you can use the function:
       
   650 *}
       
   651 
       
   652 ML{*Attrib.print_attributes @{theory}*}
       
   653 
       
   654 text {*
       
   655   For the inductive definitions described above only the attibutes @{text "[intro]"} and
       
   656   @{text "[simp]"} make sense.
       
   657 
       
   658   \begin{readmore}
       
   659   Attributes and arguments are implemented in the files @{ML_file "Pure/Isar/attrib.ML"} 
       
   660   and @{ML_file "Pure/Isar/args.ML"}.
       
   661   \end{readmore}
   635 *}
   662 *}
   636 
   663 
   637 section {* New Commands and Keyword Files\label{sec:newcommand} *}
   664 section {* New Commands and Keyword Files\label{sec:newcommand} *}
   638 
   665 
   639 text {*
   666 text {*