diff -r 74846cb0fff9 -r 693711a0c702 CookBook/Parsing.thy --- a/CookBook/Parsing.thy Fri Feb 20 23:19:41 2009 +0000 +++ b/CookBook/Parsing.thy Sat Feb 21 11:38:14 2009 +0000 @@ -45,6 +45,11 @@ @{ML_response [display,gray] "($$ \"w\") (explode \"world\")" "(\"w\", [\"o\", \"r\", \"l\", \"d\"])"} + The type of a parser is defined as + + + + This function will either succeed (as in the two examples above) or raise the exception @{text "FAIL"} if no string can be consumed. For example trying to parse @@ -93,7 +98,7 @@ Note how the result of consumed strings builds up on the left as nested pairs. If, as in the previous example, you want to parse a particular string, - then one should use the function @{ML Scan.this_string}: + then you should use the function @{ML Scan.this_string}: @{ML_response [display,gray] "Scan.this_string \"hell\" (explode \"hello\")" "(\"hell\", [\"o\"])"} @@ -244,7 +249,7 @@ "([\"h\", \"h\", \"h\", \"h\"], [])"} @{ML Symbol.stopper} is the ``end-of-input'' indicator for parsing strings; - other stoppers need to be used when parsing tokens, for example. However, this kind of + other stoppers need to be used when parsing, for example, tokens. However, this kind of manually wrapping is often already done by the surrounding infrastructure. The function @{ML Scan.repeat} can be used with @{ML Scan.one} to read any @@ -348,10 +353,17 @@ text {* Most of the time, however, Isabelle developers have to deal with parsing - tokens, not strings. This is because the parsers for the theory syntax, as - well as the parsers for the arguments of proof methods the type @{ML_type OuterLex.token} - (which is identical to the type @{ML_type OuterParse.token}). There are also handy - parsers for ML-expressions and ML-files. + tokens, not strings. These token parsers will have the type +*} + +ML{*type 'a parser = OuterLex.token list -> 'a * OuterLex.token list*} + +text {* + This reason for using token parsers is that theory syntax, as well as the + parsers for the arguments of proof methods, use the type @{ML_type + OuterLex.token} (which is identical to the type @{ML_type + OuterParse.token}). However, there are also handy parsers for + ML-expressions and ML-files. \begin{readmore} The parser functions for the theory syntax are contained in the structure @@ -367,8 +379,8 @@ text {* The first example shows how to generate a token list out of a string using - the function @{ML "OuterSyntax.scan"}. It is given below @{ML "Position.none"} - as argument since, at the moment, we are not interested in generating + the function @{ML "OuterSyntax.scan"}. It is given the argument @{ML "Position.none"} + since, at the moment, we are not interested in generating precise error messages. The following code @{ML_response_fake [display,gray] "OuterSyntax.scan Position.none \"hello world\"" @@ -422,7 +434,7 @@ in (Scan.dest_lexicon commands, Scan.dest_lexicon keywords) end" -"([\"}\",\"{\",\],[\"\\",\"\\",\])"} +"([\"}\", \"{\", \],[\"\\", \"\\", \])"} The parser @{ML "OuterParse.$$$"} parses a single keyword. For example: @@ -433,7 +445,7 @@ in (OuterParse.$$$ \"where\" input1, OuterParse.$$$ \"|\" input2) end" -"((\"where\",\),(\"|\",\))"} +"((\"where\",\), (\"|\",\))"} Like before, you can sequentially connect parsers with @{ML "--"}. For example: @@ -443,7 +455,7 @@ in (OuterParse.$$$ \"|\" -- OuterParse.$$$ \"in\") input end" -"((\"|\",\"in\"),[])"} +"((\"|\", \"in\"),[])"} The parser @{ML "OuterParse.enum s p" for s p} parses a possibly empty list of items recognised by the parser @{text p}, where the items being parsed @@ -455,7 +467,7 @@ in (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input end" -"([\"in\",\"in\",\"in\"],[\])"} +"([\"in\", \"in\", \"in\"],[\])"} @{ML "OuterParse.enum1"} works similarly, except that the parsed list must be non-empty. Note that we had to add a string @{text [quotes] "foo"} at the @@ -472,7 +484,7 @@ Scan.finite OuterLex.stopper (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input end" -"([\"in\",\"in\",\"in\"],[])"} +"([\"in\", \"in\", \"in\"],[])"} The following function will help to run examples. @@ -510,7 +522,7 @@ it is a good idea to give as much information about where the error occured. For this Isabelle can attach positional information to tokens and then thread this information up the processing chain. To see this, - modify the function @{ML filtered_input} described earlier to be + modify the function @{ML filtered_input} described earlier to *} ML{*fun filtered_input' str = @@ -584,7 +596,7 @@ able to give more precise error messages. \begin{readmore} - The functions to do with input and outout of XML and YXML are defined + The functions to do with input and output of XML and YXML are defined in @{ML_file "Pure/General/xml.ML"} and @{ML_file "Pure/General/yxml.ML"}. \end{readmore} *} @@ -629,7 +641,7 @@ text {* Note that the parser does not parse the keyword \simpleinductive, even if it is meant to process definitions as shown above. The parser of the keyword - will be given by the infrastructure that will eventually calls @{ML spec_parser}. + will be given by the infrastructure that will eventually call @{ML spec_parser}. To see what the parser returns, let us parse the string corresponding to the @@ -718,10 +730,8 @@ The function @{ML opt_thm_name in SpecParse} is the ``optional'' variant of @{ML thm_name in SpecParse}. As can be seen each theorem name can contain some - attributes. - - For the inductive definitions described above only the attibutes @{text "[intro]"} and - @{text "[simp]"} make sense. + attributes. The name has to end with @{text [quotes] ":"}---see argument of + @{ML SpecParse.opt_thm_name} in Line 9. \begin{readmore} Attributes and arguments are implemented in the files @{ML_file "Pure/Isar/attrib.ML"} @@ -847,7 +857,8 @@ @{text [display] "$ ISABELLE_LOGS=\"$(isabelle getenv -b ISABELLE_OUTPUT)\"/log"} - on the Unix prompt. The directory should include the files: + on the Unix prompt. If you now also type @{text "ls $ISABELLE_LOGS"}, then the + directory should include the files: @{text [display] "Pure.gz @@ -956,8 +967,6 @@ should be done with the theorem once it is proved (we chose to just forget about it). Lines 9 to 11 contain the parser for the proposition. - (FIXME: explain @{ML Toplevel.print} etc) - If you now type \isacommand{foobar}~@{text [quotes] "True \ True"}, you obtain the following proof state: @@ -976,10 +985,12 @@ \isacommand{done} \end{isabelle} + However, once you change the ``kind'' of a command from @{ML thy_decl in OuterKeyword} + to @{ML thy_goal in OuterKeyword} then the keyword file needs to be re-created. (FIXME What do @{ML "Toplevel.theory"} @{ML "Toplevel.print"} - @{ML Toplevel.local_theory}?) + @{ML Toplevel.local_theory} do?) (FIXME read a name and show how to store theorems)