CookBook/Parsing.thy
changeset 43 02f76f1b6e7b
parent 42 cd612b489504
child 44 dee4b3e66dfe
equal deleted inserted replaced
42:cd612b489504 43:02f76f1b6e7b
    52                                "Exception FAIL raised"}
    52                                "Exception FAIL raised"}
    53   
    53   
    54   will raise the exception @{ML_text "FAIL"}.
    54   will raise the exception @{ML_text "FAIL"}.
    55   There are three exceptions used in the parsing combinators:
    55   There are three exceptions used in the parsing combinators:
    56 
    56 
    57   (FIXME: do the exception need to be explained, because the user cannot use them from ``outside''?)
       
    58 
       
    59   \begin{itemize}
    57   \begin{itemize}
    60   \item @{ML_text "FAIL"} is used to indicate that alternative routes of parsing 
    58   \item @{ML_text "FAIL"} is used to indicate that alternative routes of parsing 
    61   might be explored. 
    59   might be explored. 
    62   \item @{ML_text "MORE"} indicates that there is not enough input for the parser. For example 
    60   \item @{ML_text "MORE"} indicates that there is not enough input for the parser. For example 
    63   in @{ML_text "($$ \"h\") []"}.
    61   in @{ML_text "($$ \"h\") []"}.
    64   \item @{ML_text "ABORT"} is the exception which is raised when a dead end is reached. 
    62   \item @{ML_text "ABORT"} is the exception which is raised when a dead end is reached. 
    65   It is used for example in the function @{ML "(op !!)"} (see below).
    63   It is used for example in the function @{ML "(op !!)"} (see below).
    66   \end{itemize}
    64   \end{itemize}
    67 
    65 
       
    66   (FIXME: do the exception need to be explained, because the user cannot use them from ``outside''?)
    68 
    67 
    69   Slightly more general than the parser @{ML "(op $$)"} is the function @{ML Scan.one}, in that it 
    68   Slightly more general than the parser @{ML "(op $$)"} is the function @{ML Scan.one}, in that it 
    70   takes a predicate as argument and then parses exactly one item from the input list 
    69   takes a predicate as argument and then parses exactly one item from the input list 
    71   satisfying this prediate. For example the following parser either consumes an @{ML_text "h"}
    70   satisfying this prediate. For example the following parser either consumes an 
    72   or a @{ML_text "w"}:
    71   @{ML_text [quotes] "h"} or a @{ML_text [quotes] "w"}:
    73 
    72 
    74 @{ML_response [display] 
    73 @{ML_response [display] 
    75 "let 
    74 "let 
    76   val hw = Scan.one (fn x => x = \"h\" orelse x = \"w\")
    75   val hw = Scan.one (fn x => x = \"h\" orelse x = \"w\")
    77   val input1 = (explode \"hello\")
    76   val input1 = (explode \"hello\")
   132   (p input1, p input2)
   131   (p input1, p input2)
   133 end" 
   132 end" 
   134  "((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"x\", [\"w\", \"o\", \"r\", \"l\", \"d\"]))"}
   133  "((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"x\", [\"w\", \"o\", \"r\", \"l\", \"d\"]))"}
   135 
   134 
   136   The function @{ML "(op !!)"} helps to produce appropriate error messages
   135   The function @{ML "(op !!)"} helps to produce appropriate error messages
   137   during parsing. For example if one wants to parse @{ML_text p} immediately 
   136   during parsing. For example if one wants to parse that @{ML_text p} is immediately 
   138   followed by @{ML_text q}, or start a completely different parser @{ML_text r},
   137   followed by @{ML_text q}, or start a completely different parser @{ML_text r},
   139   one might write
   138   one might write
   140 
   139 
   141   @{ML_open [display] "(p -- q) || r" for p q r}
   140   @{ML_open [display] "(p -- q) || r" for p q r}
   142 
   141 
   146   that @{ML_text p} should be followed by @{ML_text q}. To see this consider the case in
   145   that @{ML_text p} should be followed by @{ML_text q}. To see this consider the case in
   147   which @{ML_text p} 
   146   which @{ML_text p} 
   148   is present in the input, but not @{ML_text q}. That means @{ML_open "(p -- q)" for p q} will fail 
   147   is present in the input, but not @{ML_text q}. That means @{ML_open "(p -- q)" for p q} will fail 
   149   and the 
   148   and the 
   150   alternative parser @{ML_text r} will be tried. However in many circumstanes this will be the wrong
   149   alternative parser @{ML_text r} will be tried. However in many circumstanes this will be the wrong
   151   parser for the input ``p-followed-by-q'' and therefore will fail. The error message is then caused by the
   150   parser for the input ``p-followed-by-q'' and therefore will also fail. The error message is then 
       
   151   caused by the
   152   failure of @{ML_text r}, not by the absense of @{ML_text q} in the input. This kind of situation
   152   failure of @{ML_text r}, not by the absense of @{ML_text q} in the input. This kind of situation
   153   can be avoided by using the funtion @{ML "(op !!)"}. This function aborts the whole process of
   153   can be avoided by using the funtion @{ML "(op !!)"}. This function aborts the whole process of
   154   parsing in case of failure and invokes an error message. For example if we invoke the parser
   154   parsing in case of failure and invokes an error message. For example if we invoke the parser
   155   
   155   
   156   @{ML [display] "(!! (fn _ => \"foo\") ($$ \"h\"))"}
   156   @{ML [display] "(!! (fn _ => \"foo\") ($$ \"h\"))"}
   176   This ``prefixing'' is usually done by wrappers such as @{ML "OuterSyntax.command"} 
   176   This ``prefixing'' is usually done by wrappers such as @{ML "OuterSyntax.command"} 
   177   (FIXME: give reference to later place). 
   177   (FIXME: give reference to later place). 
   178   
   178   
   179   Returning to our example of parsing @{ML_open "(p -- q) || r" for p q r}. If we want
   179   Returning to our example of parsing @{ML_open "(p -- q) || r" for p q r}. If we want
   180   to generate the correct error message for p-followed-by-q, then
   180   to generate the correct error message for p-followed-by-q, then
   181   we have to write
   181   we have to write, for example
   182 *}
   182 *}
   183 
   183 
   184 ML {* 
   184 ML {* 
   185   fun p_followed_by_q p q r =
   185   fun p_followed_by_q p q r =
   186   let 
   186   let 
   237 
   237 
   238 @{ML_response [display]
   238 @{ML_response [display]
   239 "Scan.lift (($$ \"h\") -- ($$ \"e\")) (1,(explode \"hello\"))"
   239 "Scan.lift (($$ \"h\") -- ($$ \"e\")) (1,(explode \"hello\"))"
   240 "((\"h\", \"e\"), (1, [\"l\", \"l\", \"o\"]))"}
   240 "((\"h\", \"e\"), (1, [\"l\", \"l\", \"o\"]))"}
   241 
   241 
   242   (FIXME: In which situations is this useful?) 
   242   (FIXME: In which situations is this useful? Give examples.) 
   243 *}
   243 *}
   244 
   244 
   245 section {* Parsing Theory Syntax *}
   245 section {* Parsing Theory Syntax *}
   246 
   246 
   247 text {*
   247 text {*
   248   Most of the time, however, Isabelle developers have to deal with parsing tokens, not strings.
   248   Most of the time, however, Isabelle developers have to deal with parsing tokens, not strings.
   249   This is because the parsers for the theory syntax, as well as the parsers for the 
   249   This is because the parsers for the theory syntax, as well as the parsers for the 
   250   argument syntax of proof methods and attributes, use the type 
   250   argument syntax of proof methods and attributes use the type 
   251   @{ML_type OuterParse.token} (which is identical to the type @{ML_type OuterLex.token}).
   251   @{ML_type OuterParse.token} (which is identical to the type @{ML_type OuterLex.token}).
   252 
   252 
   253   \begin{readmore}
   253   \begin{readmore}
   254   The parser functions for the theory syntax are contained in the structure
   254   The parser functions for the theory syntax are contained in the structure
   255   @{ML_struct OuterParse} defined in the file @{ML_file  "Pure/Isar/outer_parse.ML"}.
   255   @{ML_struct OuterParse} defined in the file @{ML_file  "Pure/Isar/outer_parse.ML"}.
   256   The definition for tokens is in the file @{ML_file "Pure/Isar/outer_lex.ML"}.
   256   The definition for tokens is in the file @{ML_file "Pure/Isar/outer_lex.ML"}.
   257   \end{readmore}
   257   \end{readmore}
   258 *}
   258 *}
   259 
   259 
   260 ML {* OuterLex.mk_text "hello" *}
       
   261 
       
   262 ML {* print_depth 50 *}
       
   263 
       
   264 ML {* 
   260 ML {* 
   265   let open OuterLex in 
   261   let open OuterLex in 
   266   OuterSyntax.scan Position.none "for" 
   262   OuterSyntax.scan Position.none "for" 
   267   end;
   263   end;
   268 
   264 
   269 *}
   265 *}
   270 
       
   271 ML {* map OuterLex.mk_text (explode "hello") *} 
       
   272 
   266 
   273 ML {*
   267 ML {*
   274 
   268 
   275 fun my_scan lex pos str =
   269 fun my_scan lex pos str =
   276    Source.of_string str
   270    Source.of_string str
   280    |> Source.exhaust;
   274    |> Source.exhaust;
   281 
   275 
   282 *}
   276 *}
   283 
   277 
   284 ML {*
   278 ML {*
   285 let val toks = my_scan (["hello"], []) Position.none "hello"
   279 let val toks = my_scan (["hello"], []) Position.none "hello foo bar"
   286 in (OuterParse.$$$ "hello") toks end
   280 in (OuterParse.$$$ "hello") toks end
   287 *}
       
   288 
       
   289 text {* 
       
   290 
       
   291   @{ML_response_fake [display] "map OuterLex.mk_text (explode \"hello\")"  
       
   292        "[Token (\<dots>), Token (\<dots>), Token (\<dots>), Token (\<dots>), Token (\<dots>)]"} 
       
   293 
       
   294 *}
       
   295 
       
   296 ML {*
       
   297   OuterLex.mk_text "hello"
       
   298 *}
   281 *}
   299 
   282 
   300 ML {*
   283 ML {*
   301 
   284 
   302   let val input = [OuterLex.mk_text "hello", OuterLex.mk_text "world"];
   285   let val input = [OuterLex.mk_text "hello", OuterLex.mk_text "world"];
   303   in (Scan.one (fn _ => true)) input end
   286   in (Scan.one (fn _ => true)) input end
   304 
   287 
   305 *}
   288 *}
   306 
   289 
   307 
   290 text {*
       
   291   explain @{ML "OuterParse.enum1"}, @{ML "OuterParse.prop"} 
       
   292    @{ML "OuterParse.opt_target"}, @{ML "OuterParse.fixes"} 
       
   293    @{ML "OuterParse.!!!"}, @{ML "OuterParse.for_fixes"} 
       
   294 *}
   308 
   295 
   309 chapter {* Parsing *}
   296 chapter {* Parsing *}
   310 
   297 
   311 text {*
   298 text {*
   312 
   299