CookBook/Parsing.thy
changeset 48 609f9ef73494
parent 47 4daf913fdbe1
child 49 a0edabf14457
equal deleted inserted replaced
47:4daf913fdbe1 48:609f9ef73494
   211                 "([\"h\", \"h\", \"h\", \"h\"], [\"e\", \"l\", \"l\", \"o\"])"}
   211                 "([\"h\", \"h\", \"h\", \"h\"], [\"e\", \"l\", \"l\", \"o\"])"}
   212   
   212   
   213   Note that @{ML "Scan.repeat"} stores the parsed items in a list. The function
   213   Note that @{ML "Scan.repeat"} stores the parsed items in a list. The function
   214   @{ML "Scan.repeat1"} is similar, but requires that the parser @{ML_text "p"} 
   214   @{ML "Scan.repeat1"} is similar, but requires that the parser @{ML_text "p"} 
   215   succeeds at least once.
   215   succeeds at least once.
   216 *}
   216 
   217  
       
   218 text {* 
       
   219   After parsing succeeded, one nearly always wants to apply a function on the parsed 
   217   After parsing succeeded, one nearly always wants to apply a function on the parsed 
   220   items. This is done using the function @{ML_open "(p >> f)" for p f} which runs 
   218   items. This is done using the function @{ML_open "(p >> f)" for p f} which runs 
   221   first the parser @{ML_text p} and upon successful completion applies the 
   219   first the parser @{ML_text p} and upon successful completion applies the 
   222   function @{ML_text f} to the result. For example
   220   function @{ML_text f} to the result. For example
   223 
   221 
   271  OuterLex.Token (\<dots>,(OuterLex.Space, \" \"),\<dots>), 
   269  OuterLex.Token (\<dots>,(OuterLex.Space, \" \"),\<dots>), 
   272  OuterLex.Token (\<dots>,(OuterLex.Ident, \"world\"),\<dots>)]"}
   270  OuterLex.Token (\<dots>,(OuterLex.Ident, \"world\"),\<dots>)]"}
   273 
   271 
   274   produces three tokens where the first and the last are identifiers, since 
   272   produces three tokens where the first and the last are identifiers, since 
   275   @{ML_text [quotes] "hello"} and @{ML_text [quotes] "world"} do not match 
   273   @{ML_text [quotes] "hello"} and @{ML_text [quotes] "world"} do not match 
   276   any other category. The second indicates a space. If we parse
   274   any other category. The second indicates a space. Many parsing functions 
       
   275   later on will require spaces, comments and the like to have been filtered out.
       
   276   If we parse
   277 
   277 
   278 @{ML_response [display] "OuterSyntax.scan Position.none \"inductive|for\"" 
   278 @{ML_response [display] "OuterSyntax.scan Position.none \"inductive|for\"" 
   279 "[OuterLex.Token (\<dots>,(OuterLex.Command, \"inductive\"),\<dots>), 
   279 "[OuterLex.Token (\<dots>,(OuterLex.Command, \"inductive\"),\<dots>), 
   280  OuterLex.Token (\<dots>,(OuterLex.Keyword, \"|\"),\<dots>), 
   280  OuterLex.Token (\<dots>,(OuterLex.Keyword, \"|\"),\<dots>), 
   281  OuterLex.Token (\<dots>,(OuterLex.Keyword, \"for\"),\<dots>)]"}
   281  OuterLex.Token (\<dots>,(OuterLex.Keyword, \"for\"),\<dots>)]"}
   282 
   282 
   283   we obtain a list of command and keyword tokens.
   283   we obtain a list consiting of only command and keyword tokens.
   284   If you want to see which keywords and commands are currently known, use
   284   If you want to see which keywords and commands are currently known, use
   285   the following (you might have to adjust the @{ML print_depth} in order to
   285   the following (you might have to adjust the @{ML print_depth} in order to
   286   see the complete list):
   286   see the complete list):
   287 
   287 
   288 @{ML_response_fake [display] 
   288 @{ML_response_fake [display] 
   303   (OuterParse.$$$ \"where\" input1, OuterParse.$$$ \"|\" input2)
   303   (OuterParse.$$$ \"where\" input1, OuterParse.$$$ \"|\" input2)
   304 end"
   304 end"
   305 "((\"where\",\<dots>),(\"|\",\<dots>))"}
   305 "((\"where\",\<dots>),(\"|\",\<dots>))"}
   306 
   306 
   307   Like before, we can sequentially connect parsers with @{ML "(op --)"}. For example 
   307   Like before, we can sequentially connect parsers with @{ML "(op --)"}. For example 
   308   follows
       
   309 
   308 
   310 @{ML_response [display]
   309 @{ML_response [display]
   311 "let 
   310 "let 
   312   val input = OuterSyntax.scan Position.none \"|in\"
   311   val input = OuterSyntax.scan Position.none \"|in\"
   313 in 
   312 in 
   314   (OuterParse.$$$ \"|\" -- OuterParse.$$$ \"in\") input
   313   (OuterParse.$$$ \"|\" -- OuterParse.$$$ \"in\") input
   315 end"
   314 end"
   316 "((\"|\",\"in\"),[])"}
   315 "((\"|\",\"in\"),[])"}
   317 
   316 
   318   The parser @{ML_open "OuterParse.enum s p" for s p} parses a possibly empty 
   317   The parser @{ML_open "OuterParse.enum s p" for s p} parses a possibly empty 
   319   list of items recognized by the parser @{ML_text p}, where the items are 
   318   list of items recognised by the parser @{ML_text p}, where the items are 
   320   separated by @{ML_text s}. For example
   319   separated by @{ML_text s}. For example
   321 
   320 
   322 @{ML_response [display]
   321 @{ML_response [display]
   323 "let 
   322 "let 
   324   val input = OuterSyntax.scan Position.none \"in|in|in\\n\"
   323   val input = OuterSyntax.scan Position.none \"in|in|in\\n\"
   327 end" 
   326 end" 
   328 "([\"in\",\"in\",\"in\"],[\<dots>])"}
   327 "([\"in\",\"in\",\"in\"],[\<dots>])"}
   329 
   328 
   330   Note that we had to add a @{ML_text [quotes] "\\n"} at the end of the parsed
   329   Note that we had to add a @{ML_text [quotes] "\\n"} at the end of the parsed
   331   string, otherwise the parser would have consumed all tokens and then failed with
   330   string, otherwise the parser would have consumed all tokens and then failed with
   332   the exception @{ML_text "FAIL"}. @{ML "OuterParse.enum1"} works similarly, 
   331   the exception @{ML_text "MORE"}. @{ML "OuterParse.enum1"} works similarly, 
   333   except that the parsed list must be non-empty.
   332   except that the parsed list must be non-empty.
   334 
   333 
   335 *}
   334 *}
   336 
   335 
   337 text {* FIXME explain @{ML "OuterParse.!!!"} *}
   336 text {* FIXME explain @{ML "OuterParse.!!!"} *}