CookBook/Parsing.thy
changeset 75 f2dea0465bb4
parent 74 f6f8f8ba1eb1
child 80 95e9c4556221
equal deleted inserted replaced
74:f6f8f8ba1eb1 75:f2dea0465bb4
   187   
   187   
   188   @{ML_response_fake [display,gray] "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"world\")"
   188   @{ML_response_fake [display,gray] "(!! (fn _ => \"foo\") ($$ \"h\")) (explode \"world\")"
   189                                "Exception ABORT raised"}
   189                                "Exception ABORT raised"}
   190 
   190 
   191   then the parsing aborts and the error message @{text "foo"} is printed out. In order to
   191   then the parsing aborts and the error message @{text "foo"} is printed out. In order to
   192   see the error message properly, we need to prefix the parser with the function 
   192   see the error message, we need to prefix the parser with the function 
   193   @{ML "Scan.error"}. For example
   193   @{ML "Scan.error"}. For example
   194 
   194 
   195   @{ML_response_fake [display,gray] "Scan.error (!! (fn _ => \"foo\") ($$ \"h\"))"
   195   @{ML_response_fake [display,gray] "Scan.error (!! (fn _ => \"foo\") ($$ \"h\"))"
   196                                "Exception Error \"foo\" raised"}
   196                                "Exception Error \"foo\" raised"}
   197 
   197 
   351   The parser functions for the theory syntax are contained in the structure
   351   The parser functions for the theory syntax are contained in the structure
   352   @{ML_struct OuterParse} defined in the file @{ML_file  "Pure/Isar/outer_parse.ML"}.
   352   @{ML_struct OuterParse} defined in the file @{ML_file  "Pure/Isar/outer_parse.ML"}.
   353   The definition for tokens is in the file @{ML_file "Pure/Isar/outer_lex.ML"}.
   353   The definition for tokens is in the file @{ML_file "Pure/Isar/outer_lex.ML"}.
   354   \end{readmore}
   354   \end{readmore}
   355 
   355 
   356   The structure @{ML_struct OuterLex} defines several kinds of token (for example 
   356   The structure @{ML_struct OuterLex} defines several kinds of tokens (for example 
   357   @{ML "Ident" in OuterLex} for identifiers, @{ML "Keyword" in OuterLex} for keywords and
   357   @{ML "Ident" in OuterLex} for identifiers, @{ML "Keyword" in OuterLex} for keywords and
   358   @{ML "Command" in OuterLex} for commands). Some token parsers take into account the 
   358   @{ML "Command" in OuterLex} for commands). Some token parsers take into account the 
   359   kind of token.
   359   kind of tokens.
   360 *}  
   360 *}  
   361 
   361 
   362 text {*
   362 text {*
   363   For the examples below, we can generate a token list out of a string using
   363   For the examples below, we can generate a token list out of a string using
   364   the function @{ML "OuterSyntax.scan"}, which we give below @{ML
   364   the function @{ML "OuterSyntax.scan"}, which we give below @{ML
   365   "Position.none"} as argument since, at the moment, we are not interested in
   365   "Position.none"} as argument since, at the moment, we are not interested in
   366   generating precise error messages. The following
   366   generating precise error messages. The following code
   367 
   367 
   368 
   368 
   369 @{ML_response_fake [display,gray] "OuterSyntax.scan Position.none \"hello world\"" 
   369 @{ML_response_fake [display,gray] "OuterSyntax.scan Position.none \"hello world\"" 
   370 "[Token (\<dots>,(Ident, \"hello\"),\<dots>), 
   370 "[Token (\<dots>,(Ident, \"hello\"),\<dots>), 
   371  Token (\<dots>,(Space, \" \"),\<dots>), 
   371  Token (\<dots>,(Space, \" \"),\<dots>), 
   373 
   373 
   374   produces three tokens where the first and the last are identifiers, since
   374   produces three tokens where the first and the last are identifiers, since
   375   @{text [quotes] "hello"} and @{text [quotes] "world"} do not match any
   375   @{text [quotes] "hello"} and @{text [quotes] "world"} do not match any
   376   other syntactic category.\footnote{Note that because of a possible a bug in
   376   other syntactic category.\footnote{Note that because of a possible a bug in
   377   the PolyML runtime system the result is printed as @{text "?"}, instead of
   377   the PolyML runtime system the result is printed as @{text "?"}, instead of
   378   the token.} The second indicates a space.
   378   the tokens.} The second indicates a space.
   379 
   379 
   380   Many parsing functions later on will require spaces, comments and the like
   380   Many parsing functions later on will require spaces, comments and the like
   381   to have already been filtered out.  So from now on we are going to use the 
   381   to have already been filtered out.  So from now on we are going to use the 
   382   functions @{ML filter} and @{ML OuterLex.is_proper} do this. For example
   382   functions @{ML filter} and @{ML OuterLex.is_proper} do this. For example
   383 
   383 
   407  Token (\<dots>,(Keyword, \"|\"),\<dots>), 
   407  Token (\<dots>,(Keyword, \"|\"),\<dots>), 
   408  Token (\<dots>,(Keyword, \"for\"),\<dots>)]"}
   408  Token (\<dots>,(Keyword, \"for\"),\<dots>)]"}
   409 
   409 
   410   we obtain a list consisting of only a command and two keyword tokens.
   410   we obtain a list consisting of only a command and two keyword tokens.
   411   If you want to see which keywords and commands are currently known, type in
   411   If you want to see which keywords and commands are currently known, type in
   412   the following (you might have to adjust the @{ML print_depth} in order to
   412   the following code (you might have to adjust the @{ML print_depth} in order to
   413   see the complete list):
   413   see the complete list):
   414 
   414 
   415 @{ML_response_fake [display,gray] 
   415 @{ML_response_fake [display,gray] 
   416 "let 
   416 "let 
   417   val (keywords, commands) = OuterKeyword.get_lexicons ()
   417   val (keywords, commands) = OuterKeyword.get_lexicons ()
   452   (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input
   452   (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input
   453 end" 
   453 end" 
   454 "([\"in\",\"in\",\"in\"],[\<dots>])"}
   454 "([\"in\",\"in\",\"in\"],[\<dots>])"}
   455 
   455 
   456   @{ML "OuterParse.enum1"} works similarly, except that the parsed list must
   456   @{ML "OuterParse.enum1"} works similarly, except that the parsed list must
   457   be non-empty. Note that we had to add an @{text [quotes] "foo"} at the end
   457   be non-empty. Note that we had to add a string @{text [quotes] "foo"} at the end
   458   of the parsed string, otherwise the parser would have consumed all tokens
   458   of the parsed string, otherwise the parser would have consumed all tokens
   459   and then failed with the exception @{text "MORE"}. Like in the previous
   459   and then failed with the exception @{text "MORE"}. Like in the previous
   460   section, we can avoid this exception using the wrapper @{ML
   460   section, we can avoid this exception using the wrapper @{ML
   461   Scan.finite}. This time, however, we have to use the ``stopper-token'' @{ML
   461   Scan.finite}. This time, however, we have to use the ``stopper-token'' @{ML
   462   OuterLex.stopper}. We can write
   462   OuterLex.stopper}. We can write
   468   Scan.finite OuterLex.stopper 
   468   Scan.finite OuterLex.stopper 
   469          (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input
   469          (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input
   470 end" 
   470 end" 
   471 "([\"in\",\"in\",\"in\"],[])"}
   471 "([\"in\",\"in\",\"in\"],[])"}
   472 
   472 
   473   The following function will help us later to run examples
   473   The following function will help to run examples.
   474 
   474 
   475 *}
   475 *}
   476 
   476 
   477 ML{*fun parse p input = Scan.finite OuterLex.stopper (Scan.error p) input *}
   477 ML{*fun parse p input = Scan.finite OuterLex.stopper (Scan.error p) input *}
   478 
   478 
   583 
   583 
   584   @{text [display] "$ isabelle keywords -k foobar some_log_files"}
   584   @{text [display] "$ isabelle keywords -k foobar some_log_files"}
   585 
   585 
   586   The option @{text "-k foobar"} indicates which postfix the name of the keyword file 
   586   The option @{text "-k foobar"} indicates which postfix the name of the keyword file 
   587   will be assigned. In the case above the generated file will be named @{text
   587   will be assigned. In the case above the generated file will be named @{text
   588   "isar-keywords-foobar.el"}. As indicated, this command requires log files to be
   588   "isar-keywords-foobar.el"}. As can be seen, this command requires log files to be
   589   present (in order to extract the keywords from them). To generate these log
   589   present (in order to extract the keywords from them). To generate these log
   590   files, we first package the code above into a separate theory file named
   590   files, we first package the code above into a separate theory file named
   591   @{text "Command.thy"}, say---see Figure~\ref{fig:commandtheory} for the
   591   @{text "Command.thy"}, say---see Figure~\ref{fig:commandtheory} for the
   592   complete code.
   592   complete code.
   593 
   593 
   606 in
   606 in
   607   OuterSyntax.command \"foobar\" \"description of foobar\" kind do_nothing
   607   OuterSyntax.command \"foobar\" \"description of foobar\" kind do_nothing
   608 end"}\\
   608 end"}\\
   609   \isa{\isacharverbatimclose}\\
   609   \isa{\isacharverbatimclose}\\
   610   \isacommand{end}
   610   \isacommand{end}
   611   \end{graybox}
   611   \end{graybox}\\[-4mm]
   612   \caption{The file @{text "Command.thy"} is necessary for generating a log 
   612   \caption{\small The file @{text "Command.thy"} is necessary for generating a log 
   613   file. This log file enables Isabelle to generate a keyword file containing 
   613   file. This log file enables Isabelle to generate a keyword file containing 
   614   the command \isacommand{foobar}.\label{fig:commandtheory}}
   614   the command \isacommand{foobar}.\label{fig:commandtheory}}
   615   \end{figure}
   615   \end{figure}
   616   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
   616   %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
   617 
   617 
   618   For our purposes it is sufficient to use the log files for the theories
   618   For our purposes it is sufficient to use the log files of the theories
   619   @{text "Pure"}, @{text "HOL"} and @{text "Pure-ProofGeneral"}, as well as
   619   @{text "Pure"}, @{text "HOL"} and @{text "Pure-ProofGeneral"}, as well as
   620   the theory @{text "Command.thy"} containing the new
   620   the log file for the theory @{text "Command.thy"}, which contains the new
   621   \isacommand{foobar}-command. If you target another logics besides HOL, such
   621   \isacommand{foobar}-command. If you target other logics besides HOL, such
   622   as Nominal or ZF, then you need to adapt the log files appropriately.
   622   as Nominal or ZF, then you need to adapt the log files appropriately.
   623   @{text Pure} and @{text HOL} are usually compiled during the installation of
   623   @{text Pure} and @{text HOL} are usually compiled during the installation of
   624   Isabelle. So log files for them should be already available. If not, then
   624   Isabelle. So log files for them should be already available. If not, then
   625   they can be conveniently compiled using build-script from the Isabelle
   625   they can be conveniently compiled with the help of the build-script from the Isabelle
   626   distribution
   626   distribution
   627 
   627 
   628 
   628 
   629   @{text [display] 
   629   @{text [display] 
   630 "$ ./build -m \"Pure\"
   630 "$ ./build -m \"Pure\"
   673 "Pure.gz
   673 "Pure.gz
   674 HOL.gz
   674 HOL.gz
   675 Pure-ProofGeneral.gz
   675 Pure-ProofGeneral.gz
   676 HOL-FoobarCommand.gz"} 
   676 HOL-FoobarCommand.gz"} 
   677 
   677 
   678   They are the ones we need for creating the keyword files. Assuming the name 
   678   From them we create the keyword files. Assuming the name 
   679   of the directory is in  @{text "ISABELLE_LOGS"},
   679   of the directory is in  @{text "$ISABELLE_LOGS"},
   680   then the Unix command for creating the keyword file is:
   680   then the Unix command for creating the keyword file is:
   681 
   681 
   682 @{text [display]
   682 @{text [display]
   683 "$ isabelle keywords -k foobar 
   683 "$ isabelle keywords -k foobar 
   684      $ISABELLE_LOGS/{Pure.gz,HOL.gz,Pure-ProofGeneral.gz,HOL-FoobarCommand.gz}"}
   684      $ISABELLE_LOGS/{Pure.gz,HOL.gz,Pure-ProofGeneral.gz,HOL-FoobarCommand.gz}"}
   685 
   685 
   686   The result is the file @{text "isar-keywords-foobar.el"}. It should contain the
   686   The result is the file @{text "isar-keywords-foobar.el"}. It should contain the
   687   string @{text "foobar"} twice (check for example that @{text "grep foobar
   687   string @{text "foobar"} twice (check for example that @{text "grep foobar
   688   isar-keywords-foobar.el"} returns something non-empty).  This keyword file
   688   isar-keywords-foobar.el"} returns something non-empty).  This keyword file
   689   needs to be copied into the directory @{text "~/.isabelle/etc"}. To make
   689   needs to be copied into the directory @{text "~/.isabelle/etc"}. To make
   690   Isabelle aware of this keyword file, we have to start it with the option @{text
   690   Isabelle aware of this keyword file, we have to start Isabelle with the option @{text
   691   "-k foobar"}, i.e.
   691   "-k foobar"}, i.e.
   692 
   692 
   693   @{text [display] "$ isabelle -k foobar a_theory_file"}
   693   @{text [display] "$ isabelle -k foobar a_theory_file"}
   694 
   694 
   695   If we now run the original code
   695   If we now build a theory on top of @{text "Command.thy"}, 
       
   696   then we can make use of the command \isacommand{foobar}. 
       
   697   Similarly with any other new command. 
       
   698 
       
   699 
       
   700   At the moment \isacommand{foobar} is not very useful. Let us refine it a bit 
       
   701   next by taking a proposition as argument and printing this proposition inside
       
   702   the tracing buffer. 
       
   703 
       
   704   The crucial part of a command is the function that determines the behaviour
       
   705   of the command. In the code above we used a ``do-nothing''-function, which
       
   706   because of @{ML Scan.succeed} does not parse any argument, but immediately
       
   707   returns the simple toplevel function @{ML "Toplevel.theory I"}. We can
       
   708   replace this code by a function that first parses a proposition (using the
       
   709   parser @{ML OuterParse.prop}), then prints out the tracing
       
   710   information (using a new top-level function @{text trace_top_lvl}) and 
       
   711   finally does nothing. For this we can write
       
   712 
   696 *}
   713 *}
   697 
   714 
   698 ML{*let
   715 ML{*let
   699   val do_nothing = Scan.succeed (Toplevel.theory I)
   716   fun trace_top_lvl str = 
   700   val kind = OuterKeyword.thy_decl
   717      Toplevel.theory (fn thy => (tracing str; thy))
   701 in
   718 
   702   OuterSyntax.command "foobar" "description of foobar" kind do_nothing
   719   val trace_prop = OuterParse.prop >> trace_top_lvl
   703 end *}
   720 
   704 
       
   705 text {*
       
   706   then we can make use of \isacommand{foobar} in a theory that builds on @{text "Command.thy"}! 
       
   707   Similarly with any other new command. 
       
   708 
       
   709 
       
   710   At the moment \isacommand{foobar} is not very useful. Let us next refine it a bit 
       
   711   by taking a proposition as argument and printing this proposition inside
       
   712   the tracing buffer. 
       
   713 
       
   714   The crucial part of a command is the function that determines
       
   715   the behaviour of the command. In the code above we used a
       
   716   ``do-nothing''-function, which because of @{ML Scan.succeed} does not parse 
       
   717   any argument, but immediately returns the simple toplevel function 
       
   718   @{ML "Toplevel.theory I"}. We can replace this code by a function that first 
       
   719   parses a proposition (using the parser @{ML OuterParse.prop}), then prints out 
       
   720   the tracing information and finally does nothing. For this we can write
       
   721 *}
       
   722 
       
   723 ML{*let
       
   724   val trace_prop = 
       
   725          OuterParse.prop >> (fn str => (tracing str; (Toplevel.theory I)))
       
   726   val kind = OuterKeyword.thy_decl
   721   val kind = OuterKeyword.thy_decl
   727 in
   722 in
   728   OuterSyntax.command "foobar" "traces a proposition" kind trace_prop
   723   OuterSyntax.command "foobar" "traces a proposition" kind trace_prop
   729 end *}
   724 end *}
   730 
   725 
   731 text {*
   726 text {*
   732   Now we can type for example
   727   Now we can type
   733 
   728 
   734   \begin{isabelle}
   729   \begin{isabelle}
   735   \isacommand{foobar}~@{text [quotes] "True \<and> False"}\\
   730   \isacommand{foobar}~@{text [quotes] "True \<and> False"}\\
   736   @{text "> True \<and> False"}
   731   @{text "> \"True \<and> False\""}
   737   \end{isabelle}
   732   \end{isabelle}
   738   
   733   
   739   and see the proposition in the tracing buffer.  
   734   and see the proposition in the tracing buffer.  
   740 
   735 
   741   Note that so far we used @{ML thy_decl in OuterKeyword} as the kind indicator
   736   Note that so far we used @{ML thy_decl in OuterKeyword} as the kind indicator
   742   for the command.  This means that the command finishes as soon as the
   737   for the command.  This means that the command finishes as soon as the
   743   arguments are processed. Examples of this kind of commands are
   738   arguments are processed. Examples of this kind of commands are
   744   \isacommand{definition} and \isacommand{declare}.  In other cases
   739   \isacommand{definition} and \isacommand{declare}.  In other cases,
   745   commands are expected to parse some arguments, for example a proposition,
   740   commands are expected to parse some arguments, for example a proposition,
   746   and then ``open up'' a proof in order to prove the proposition (think of
   741   and then ``open up'' a proof in order to prove the proposition (for example
   747   \isacommand{lemma}) or prove some other properties (for example in
   742   \isacommand{lemma}) or prove some other properties (for example in
   748   \isacommand{function}). To achieve this behaviour, we have to use the kind
   743   \isacommand{function}). To achieve this kind of behaviour, we have to use the kind
   749   indicator @{ML thy_goal in OuterKeyword}.
   744   indicator @{ML thy_goal in OuterKeyword}.
   750 
   745 
   751   Below we change \isacommand{foobar} so that it expects a proposition as
   746   Below we change \isacommand{foobar} so that it takes a proposition as
   752   argument and then starts a proof in order to prove it. Therefore in Line 13
   747   argument and then starts a proof in order to prove it. Therefore in Line 13
   753   below, we set the kind indicator to @{ML thy_goal in OuterKeyword}.
   748   below, we set the kind indicator to @{ML thy_goal in OuterKeyword}.
   754 *}
   749 *}
   755 
   750 
   756 ML%linenumbers{*let
   751 ML%linenumbers{*let
   757   fun set_up_thm str ctxt =
   752   fun set_up_thm str ctxt =
   758     let
   753     let
   759       val prop = Syntax.read_prop ctxt str
   754       val prop = Syntax.read_prop ctxt str
   760     in
   755     in
   761       Proof.theorem_i NONE (K I) [[(Syntax.read_prop ctxt str,[])]] ctxt
   756       Proof.theorem_i NONE (K I) [[(prop,[])]] ctxt
   762     end;
   757     end;
   763   
   758   
   764   val prove_prop = OuterParse.prop >>  
   759   val prove_prop = OuterParse.prop >>  
   765       (fn str => Toplevel.print o 
   760       (fn str => Toplevel.print o 
   766                     Toplevel.local_theory_to_proof NONE (set_up_thm str))
   761                     Toplevel.local_theory_to_proof NONE (set_up_thm str))
   772 
   767 
   773 text {*
   768 text {*
   774   The function @{text set_up_thm} takes a string (the proposition to be
   769   The function @{text set_up_thm} takes a string (the proposition to be
   775   proved) and a context.  The context is necessary in order to be able to use
   770   proved) and a context.  The context is necessary in order to be able to use
   776   @{ML Syntax.read_prop}, which converts a string into a proper proposition.
   771   @{ML Syntax.read_prop}, which converts a string into a proper proposition.
   777   In Line 6 the function @{ML Proof.theorem_i} starts the proof for the 
   772   In Line 6 the function @{ML Proof.theorem_i} starts the proof for the
   778   proposition. In Lines 9 to 11 contain the parser for the proposition.
   773   proposition. Its argument @{ML NONE} stands for a locale (which we chose to
       
   774   omit); the argument @{ML "(K I)"} stands for a function that determines what
       
   775   should be done with the theorem once it is proved (we chose to just forget
       
   776   about it). In Lines 9 to 11 contain the parser for the proposition.
   779 
   777 
   780   If we now type \isacommand{foobar}~@{text [quotes] "True \<and> True"}, we obtain the following 
   778   If we now type \isacommand{foobar}~@{text [quotes] "True \<and> True"}, we obtain the following 
   781   proof state:
   779   proof state:
   782  
   780  
   783   \begin{isabelle}
   781   \begin{isabelle}
   794   \isacommand{apply}@{text "(rule TrueI)+"}\\
   792   \isacommand{apply}@{text "(rule TrueI)+"}\\
   795   \isacommand{done}
   793   \isacommand{done}
   796   \end{isabelle}
   794   \end{isabelle}
   797 
   795 
   798   
   796   
   799   (FIXME What does @{text "Toplevel.theory"}?)
   797   (FIXME What does @{text "Toplevel.theory"} @{text "Toplevel.print"}?)
   800 
   798 
   801   (FIXME Explain @{text "OuterKeyword.thy_decl"} and parser)
   799   (FIXME Explain @{text "OuterKeyword.thy_decl"} and parser)
   802 
   800 
   803 *}
   801   (FIXME read a name and show how to store theorems)
   804 
   802 
       
   803   (FIXME possibly also read a locale)
       
   804 *}
       
   805 
       
   806 (*<*)
   805 chapter {* Parsing *}
   807 chapter {* Parsing *}
   806 
   808 
   807 text {*
   809 text {*
   808 
   810 
   809   Lots of Standard ML code is given in this document, for various reasons,
   811   Lots of Standard ML code is given in this document, for various reasons,
   850   Naturally one may convert between the two different sorts of parsing functions
   852   Naturally one may convert between the two different sorts of parsing functions
   851   as follows:
   853   as follows:
   852   \begin{verbatim}
   854   \begin{verbatim}
   853   open StringCvt ;
   855   open StringCvt ;
   854   type ('res, 'src) ex_reader = 'src -> 'res * 'src
   856   type ('res, 'src) ex_reader = 'src -> 'res * 'src
   855   (* ex_reader : ('res, 'src) reader -> ('res, 'src) ex_reader *)
   857   ex_reader : ('res, 'src) reader -> ('res, 'src) ex_reader 
   856   fun ex_reader rdr src = Option.valOf (rdr src) ;
   858   fun ex_reader rdr src = Option.valOf (rdr src) ;
   857   (* reader : ('res, 'src) ex_reader -> ('res, 'src) reader *)
   859   reader : ('res, 'src) ex_reader -> ('res, 'src) reader 
   858   fun reader exrdr src = SOME (exrdr src) handle _ => NONE ;
   860   fun reader exrdr src = SOME (exrdr src) handle _ => NONE ;
   859   \end{verbatim}
   861   \end{verbatim}
   860   
   862   
   861 *}
   863 *}
   862 
   864 
  1467   The source file \texttt{src/Pure/Isar/method.ML} shows the use of 
  1469   The source file \texttt{src/Pure/Isar/method.ML} shows the use of 
  1468   \texttt{Method.add\_method} to add a number of methods.
  1470   \texttt{Method.add\_method} to add a number of methods.
  1469 
  1471 
  1470 
  1472 
  1471 *}
  1473 *}
  1472 
  1474 (*>*)
  1473 
  1475 
  1474 
  1476 end
  1475 end