diff -r 7e04dc2368b0 -r fbcb89d84ba6 ProgTutorial/Parsing.thy --- a/ProgTutorial/Parsing.thy Mon Mar 30 09:33:50 2009 +0100 +++ b/ProgTutorial/Parsing.thy Mon Mar 30 15:23:19 2009 +0200 @@ -83,9 +83,9 @@ end" "((\"h\", [\"e\", \"l\", \"l\", \"o\"]),(\"w\", [\"o\", \"r\", \"l\", \"d\"]))"} - Two parser can be connected in sequence by using the function @{ML "--"}. - For example parsing @{text "h"}, @{text "e"} and @{text "l"} in this - sequence you can achieve by: + Two parsers can be connected in sequence by using the function @{ML "--"}. + For example parsing @{text "h"}, @{text "e"} and @{text "l"} (in this + order) you can achieve by: @{ML_response [display,gray] "(($$ \"h\") -- ($$ \"e\") -- ($$ \"l\")) (explode \"hello\")" "(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"} @@ -99,7 +99,7 @@ "(\"hell\", [\"o\"])"} Parsers that explore alternatives can be constructed using the function @{ML - "||"}. For example, the parser @{ML "(p || q)" for p q} returns the + "||"}. The parser @{ML "(p || q)" for p q} returns the result of @{text "p"}, in case it succeeds, otherwise it returns the result of @{text "q"}. For example: @@ -155,24 +155,24 @@ end" "((SOME \"h\", [\"e\", \"l\", \"l\", \"o\"]), (NONE, [\"w\", \"o\", \"r\", \"l\", \"d\"]))"} The function @{ML "!!"} helps to produce appropriate error messages - during parsing. For example if you want to parse that @{text p} is immediately + for parsing. For example if you want to parse @{text p} immediately followed by @{text q}, or start a completely different parser @{text r}, you might write: @{ML [display,gray] "(p -- q) || r" for p q r} However, this parser is problematic for producing an appropriate error - message, in case the parsing of @{ML "(p -- q)" for p q} fails. Because in - that case you lose the information that @{text p} should be followed by - @{text q}. To see this consider the case in which @{text p} is present in - the input, but not @{text q}. That means @{ML "(p -- q)" for p q} will fail - and the alternative parser @{text r} will be tried. However in many - circumstance this will be the wrong parser for the input ``p-followed-by-q'' - and therefore will also fail. The error message is then caused by the - failure of @{text r}, not by the absence of @{text q} in the input. This - kind of situation can be avoided when using the function @{ML "!!"}. - This function aborts the whole process of parsing in case of a - failure and prints an error message. For example if you invoke the parser + message, if the parsing of @{ML "(p -- q)" for p q} fails. Because in that + case you lose the information that @{text p} should be followed by @{text q}. + To see this assume that @{text p} is present in the input, but it is not + followed by @{text q}. That means @{ML "(p -- q)" for p q} will fail and + hence the alternative parser @{text r} will be tried. However, in many + circumstances this will be the wrong parser for the input ``p-followed-by-something'' + and therefore will also fail. The error message is then caused by the failure + of @{text r}, not by the absence of @{text q} in the input. This kind of + situation can be avoided when using the function @{ML "!!"}. This function + aborts the whole process of parsing in case of a failure and prints an error + message. For example if you invoke the parser @{ML [display,gray] "(!! (fn _ => \"foo\") ($$ \"h\"))"} @@ -212,7 +212,8 @@ text {* - Running this parser with the @{text [quotes] "h"} and @{text [quotes] "e"}, and + Running this parser with the arguments + @{text [quotes] "h"}, @{text [quotes] "e"} and @{text [quotes] "w"}, and the input @{text [quotes] "holle"} @{ML_response_fake [display,gray] "Scan.error (p_followed_by_q \"h\" \"e\" \"w\") (explode \"holle\")" @@ -291,7 +292,7 @@ "(([\"f\", \"o\", \"o\", \"o\", \"o\", \"o\"], []), ([\"f\", \"o\", \"o\"], [\"*\", \"o\", \"o\", \"o\"]))"} - After parsing is done, you nearly always want to apply a function on the parsed + After parsing is done, you almost always want to apply a function to the parsed items. One way to do this is the function @{ML "(p >> f)" for p f}, which runs first the parser @{text p} and upon successful completion applies the function @{text f} to the result. For example @@ -339,7 +340,7 @@ \begin{exercise}\label{ex:scancmts} Write a parser that parses an input string so that any comment enclosed - inside @{text "(*\*)"} is replaced by a the same comment but enclosed inside + within @{text "(*\*)"} is replaced by the same comment but enclosed within @{text "(**\**)"} in the output string. To enclose a string, you can use the function @{ML "enclose s1 s2 s" for s1 s2 s} which produces the string @{ML "s1 ^ s ^ s2" for s1 s2 s}. @@ -520,7 +521,7 @@ Whenever there is a possibility that the processing of user input can fail, it is a good idea to give as much information about where the error - occured. For this Isabelle can attach positional information to tokens + occurred. For this Isabelle can attach positional information to tokens and then thread this information up the processing chain. To see this, modify the function @{ML filtered_input} described earlier to *} @@ -615,7 +616,7 @@ text {* There are a number of special purpose parsers that help with parsing specifications of function definitions, inductive predicates and so on. In - Capter~\ref{chp:package}, for example, we will need to parse specifications + Chapter~\ref{chp:package}, for example, we will need to parse specifications for inductive predicates of the form: *} @@ -691,11 +692,11 @@ not yet used to type the variables: this must be done by type-inference later on. Since types are part of the inner syntax they are strings with some encoded information (see previous section). If a syntax translation is - present for a variable, then it is stored in the @{ML Mixfix} datastructure; + present for a variable, then it is stored in the @{ML Mixfix} data structure; no syntax translation is indicated by @{ML NoSyn}. \begin{readmore} - The datastructre for sytax annotations is defined in @{ML_file "Pure/Syntax/mixfix.ML"}. + The data structure for syntax annotations is defined in @{ML_file "Pure/Syntax/mixfix.ML"}. \end{readmore} Lines 3 to 7 in the function @{ML spec_parser} implement the parser for a @@ -1283,7 +1284,7 @@ Input text is split up into tokens, and the input source type for many parsing functions is \texttt{token list}. - The datatype definition (which is not published in the signature) is + The data type definition (which is not published in the signature) is \begin{verbatim} datatype token = Token of Position.T * (token_kind * string); \end{verbatim} @@ -1468,7 +1469,7 @@ text {* The source file is \texttt{src/Pure/Isar/args.ML}. - The primary type of this structure is the \texttt{src} datatype; + The primary type of this structure is the \texttt{src} data type; the single constructors not published in the signature, but \texttt{Args.src} and \texttt{Args.dest\_src} are in fact the constructor and destructor functions. @@ -1732,4 +1733,4 @@ *} (*>*) -end \ No newline at end of file +end