CookBook/Parsing.thy
changeset 52 a04bdee4fb1e
parent 50 3d4b49921cdb
child 53 0c3580c831a4
equal deleted inserted replaced
51:c346c156a7cd 52:a04bdee4fb1e
    43   For example:
    43   For example:
    44 
    44 
    45   @{ML_response [display] "($$ \"h\") (explode \"hello\")" "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"}
    45   @{ML_response [display] "($$ \"h\") (explode \"hello\")" "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"}
    46   @{ML_response [display] "($$ \"w\") (explode \"world\")" "(\"w\", [\"o\", \"r\", \"l\", \"d\"])"}
    46   @{ML_response [display] "($$ \"w\") (explode \"world\")" "(\"w\", [\"o\", \"r\", \"l\", \"d\"])"}
    47 
    47 
    48   This function will either succeed (as in the two examples above) or raise the exeption 
    48   This function will either succeed (as in the two examples above) or raise the exception 
    49   @{ML_text "FAIL"} if no string can be consumed. For example trying to parse
    49   @{ML_text "FAIL"} if no string can be consumed. For example trying to parse
    50 
    50 
    51   @{ML_response_fake [display] "($$ \"x\") (explode \"world\")" 
    51   @{ML_response_fake [display] "($$ \"x\") (explode \"world\")" 
    52                                "Exception FAIL raised"}
    52                                "Exception FAIL raised"}
    53   
    53   
    66   However, note that these exceptions are private to the parser and cannot be accessed
    66   However, note that these exceptions are private to the parser and cannot be accessed
    67   by the programmer (for example to handle them).
    67   by the programmer (for example to handle them).
    68   
    68   
    69   Slightly more general than the parser @{ML "(op $$)"} is the function @{ML
    69   Slightly more general than the parser @{ML "(op $$)"} is the function @{ML
    70   Scan.one}, in that it takes a predicate as argument and then parses exactly
    70   Scan.one}, in that it takes a predicate as argument and then parses exactly
    71   one item from the input list satisfying this prediate. For example the
    71   one item from the input list satisfying this predicate. For example the
    72   following parser either consumes an @{ML_text [quotes] "h"} or a @{ML_text
    72   following parser either consumes an @{ML_text [quotes] "h"} or a @{ML_text
    73   [quotes] "w"}:
    73   [quotes] "w"}:
    74 
    74 
    75 
    75 
    76 @{ML_response [display] 
    76 @{ML_response [display] 
    81 in
    81 in
    82     (hw input1, hw input2)
    82     (hw input1, hw input2)
    83 end"
    83 end"
    84     "((\"h\", [\"e\", \"l\", \"l\", \"o\"]),(\"w\", [\"o\", \"r\", \"l\", \"d\"]))"}
    84     "((\"h\", [\"e\", \"l\", \"l\", \"o\"]),(\"w\", [\"o\", \"r\", \"l\", \"d\"]))"}
    85 
    85 
    86   Two parser can be connected in sequence by using the funtion @{ML "(op --)"}. 
    86   Two parser can be connected in sequence by using the function @{ML "(op --)"}. 
    87   For example parsing @{ML_text "h"}, @{ML_text "e"} and @{ML_text "l"} in this 
    87   For example parsing @{ML_text "h"}, @{ML_text "e"} and @{ML_text "l"} in this 
    88   sequence can be achieved by
    88   sequence can be achieved by
    89 
    89 
    90   @{ML_response [display] "(($$ \"h\") -- ($$ \"e\") -- ($$ \"l\")) (explode \"hello\")"
    90   @{ML_response [display] "(($$ \"h\") -- ($$ \"e\") -- ($$ \"l\")) (explode \"hello\")"
    91                           "(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"}
    91                           "(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"}
   105 in
   105 in
   106   (hw input1, hw input2)
   106   (hw input1, hw input2)
   107 end"
   107 end"
   108   "((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"w\", [\"o\", \"r\", \"l\", \"d\"]))"}
   108   "((\"h\", [\"e\", \"l\", \"l\", \"o\"]), (\"w\", [\"o\", \"r\", \"l\", \"d\"]))"}
   109 
   109 
   110   The functions @{ML "(op |--)"} and @{ML "(op --|)"} work like the sequencing funtion 
   110   The functions @{ML "(op |--)"} and @{ML "(op --|)"} work like the sequencing function 
   111   for parsers, except that they discard the item being parsed by the first (respectively second)
   111   for parsers, except that they discard the item being parsed by the first (respectively second)
   112   parser. For example
   112   parser. For example
   113   
   113   
   114 @{ML_response [display]
   114 @{ML_response [display]
   115 "let 
   115 "let 
   159   above the information 
   159   above the information 
   160   that @{ML_text p} should be followed by @{ML_text q}. To see this consider the case in
   160   that @{ML_text p} should be followed by @{ML_text q}. To see this consider the case in
   161   which @{ML_text p} 
   161   which @{ML_text p} 
   162   is present in the input, but not @{ML_text q}. That means @{ML_open "(p -- q)" for p q} will fail 
   162   is present in the input, but not @{ML_text q}. That means @{ML_open "(p -- q)" for p q} will fail 
   163   and the 
   163   and the 
   164   alternative parser @{ML_text r} will be tried. However in many circumstanes this will be the wrong
   164   alternative parser @{ML_text r} will be tried. However in many circumstance this will be the wrong
   165   parser for the input ``p-followed-by-q'' and therefore will also fail. The error message is then 
   165   parser for the input ``p-followed-by-q'' and therefore will also fail. The error message is then 
   166   caused by the
   166   caused by the
   167   failure of @{ML_text r}, not by the absense of @{ML_text q} in the input. This kind of situation
   167   failure of @{ML_text r}, not by the absence of @{ML_text q} in the input. This kind of situation
   168   can be avoided by using the funtion @{ML "(op !!)"}. This function aborts the whole process of
   168   can be avoided by using the function @{ML "(op !!)"}. This function aborts the whole process of
   169   parsing in case of a failure and invokes an error message. For example if we invoke the parser
   169   parsing in case of a failure and invokes an error message. For example if we invoke the parser
   170   
   170   
   171   @{ML [display] "(!! (fn _ => \"foo\") ($$ \"h\"))"}
   171   @{ML [display] "(!! (fn _ => \"foo\") ($$ \"h\"))"}
   172 
   172 
   173   on @{ML_text [quotes] "hello"}, the parsing succeeds
   173   on @{ML_text [quotes] "hello"}, the parsing succeeds
   331 "filtered_input \"inductive | for\"" 
   331 "filtered_input \"inductive | for\"" 
   332 "[Token (\<dots>,(Command, \"inductive\"),\<dots>), 
   332 "[Token (\<dots>,(Command, \"inductive\"),\<dots>), 
   333  Token (\<dots>,(Keyword, \"|\"),\<dots>), 
   333  Token (\<dots>,(Keyword, \"|\"),\<dots>), 
   334  Token (\<dots>,(Keyword, \"for\"),\<dots>)]"}
   334  Token (\<dots>,(Keyword, \"for\"),\<dots>)]"}
   335 
   335 
   336   we obtain a list consiting of only a command and two keyword tokens.
   336   we obtain a list consisting of only a command and two keyword tokens.
   337   If you want to see which keywords and commands are currently known, use
   337   If you want to see which keywords and commands are currently known, use
   338   the following (you might have to adjust the @{ML print_depth} in order to
   338   the following (you might have to adjust the @{ML print_depth} in order to
   339   see the complete list):
   339   see the complete list):
   340 
   340 
   341 @{ML_response_fake [display] 
   341 @{ML_response_fake [display]