CookBook/Parsing.thy
changeset 125 748d9c1a32fb
parent 124 0b9fa606a746
child 126 fcc0e6e54dca
equal deleted inserted replaced
124:0b9fa606a746 125:748d9c1a32fb
   324   @{text "(**\<dots>**)"} in the output string. To enclose a string, you can use the
   324   @{text "(**\<dots>**)"} in the output string. To enclose a string, you can use the
   325   function @{ML "enclose s1 s2 s" for s1 s2 s} which produces the string @{ML
   325   function @{ML "enclose s1 s2 s" for s1 s2 s} which produces the string @{ML
   326   "s1 ^ s ^ s2" for s1 s2 s}.
   326   "s1 ^ s ^ s2" for s1 s2 s}.
   327   \end{exercise}
   327   \end{exercise}
   328 
   328 
       
   329   The function @{ML Scan.ahead} parses some input, but leaves the original
       
   330   input unchanged. For example:
       
   331 
       
   332   @{ML_response [display,gray]
       
   333   "Scan.ahead (Scan.this_string \"foo\") (explode \"foo\")" 
       
   334   "(\"foo\", [\"f\", \"o\", \"o\"])"} 
   329 
   335 
   330   The function @{ML Scan.lift} takes a parser and a pair as arguments. This function applies
   336   The function @{ML Scan.lift} takes a parser and a pair as arguments. This function applies
   331   the given parser to the second component of the pair and leaves the  first component 
   337   the given parser to the second component of the pair and leaves the  first component 
   332   untouched. For example
   338   untouched. For example
   333 
   339 
   498   the function @{ML OuterParse.type_ident}.
   504   the function @{ML OuterParse.type_ident}.
   499   \end{exercise}
   505   \end{exercise}
   500 
   506 
   501   (FIXME: or give parser for numbers)
   507   (FIXME: or give parser for numbers)
   502 
   508 
   503 *}
   509   Whenever there is a possibility that the processing of user input can fail, 
   504 
   510   it is a good idea to give as much information about where the error 
   505 
   511   occured. For this Isabelle can attach positional information to tokens
   506 
   512   and then thread this information up the processing chain. To see this,
   507 
   513   modify the function @{ML filtered_input} described earlier to be
   508 
   514 *}
   509 section {* Positional Information *}
   515 
   510 
   516 ML{*fun filtered_input' str = 
   511 text {*
   517        filter OuterLex.is_proper (OuterSyntax.scan (Position.line 7) str) *}
   512 
   518 
   513   @{ML OuterParse.position}
   519 text {*
       
   520   where we pretend the parsed string starts on line 7. An example is
       
   521 
       
   522 @{ML_response_fake [display,gray]
       
   523 "filtered_input' \"foo \\n bar\""
       
   524 "[Token ((\"foo\", ({line=7, end_line=7}, {line=7})), (Ident, \"foo\"), \<dots>),
       
   525  Token ((\"bar\", ({line=8, end_line=8}, {line=8})), (Ident, \"bar\"), \<dots>)]"}
       
   526 
       
   527   in which the @{text [quotes] "\\n"} causes the second token to be in 
       
   528   line 8.
       
   529 
       
   530   Now by using the parser @{ML OuterParse.position} you can decode the positional
       
   531   information and return it as part of the parsed input. For example
       
   532 
       
   533 @{ML_response_fake [display,gray]
       
   534 "let
       
   535   val input = (filtered_input' \"where\")
       
   536 in 
       
   537   parse (OuterParse.position (OuterParse.$$$ \"where\")) input
       
   538 end"
       
   539 "((\"where\", {line=7, end_line=7}), [])"}
       
   540 
       
   541   \begin{readmore}
       
   542   The functions related to positions are implemented in the file
       
   543   @{ML_file "Pure/General/position.ML"}.
       
   544   \end{readmore}
   514 
   545 
   515 *}
   546 *}
   516 
   547 
   517 section {* Parsing Inner Syntax *}
   548 section {* Parsing Inner Syntax *}
   518 
   549 
   519 ML{*let 
   550 text {*
   520   val input = OuterSyntax.scan Position.none "0"
   551   There is usually no need to write your own parser for parsing inner syntax, that is 
       
   552   for terms and  types: you can just call the pre-defined parsers. Terms can 
       
   553   be parsed using the function @{ML OuterParse.term}. For example:
       
   554 
       
   555 @{ML_response [display,gray]
       
   556 "let 
       
   557   val input = OuterSyntax.scan Position.none \"foo\"
   521 in 
   558 in 
   522   OuterParse.prop input
   559   OuterParse.term input
   523 end*}
   560 end"
   524 
   561 "(\"\\^E\\^Ftoken\\^Efoo\\^E\\^F\\^E\", [])"}
   525 text {* (FIXME funny output for a proposition) *}
   562 
       
   563   The function @{ML OuterParse.prop} is similar, except that it gives a different
       
   564   error message, when parsing fails. Looking closer at the result string you will
       
   565   have noticed that the parser not just returns the parsed string, but also some 
       
   566   encoded positional information. You can see this better if you replace 
       
   567   @{ML Position.none} by @{ML "(Position.line 42)"}, say.
       
   568 
       
   569 @{ML_response [display,gray]
       
   570 "let 
       
   571   val input = OuterSyntax.scan (Position.line 42) \"foo\"
       
   572 in 
       
   573   OuterParse.term input
       
   574 end"
       
   575 "(\"\\^E\\^Ftoken\\^Fline=42\\^Fend_line=42\\^Efoo\\^E\\^F\\^E\", [])"}
       
   576   
       
   577   The positional information is stored so that code called later will be
       
   578   able to give more precise error messages. 
       
   579 
       
   580 *}
   526 
   581 
   527 section {* Parsing Specifications\label{sec:parsingspecs} *}
   582 section {* Parsing Specifications\label{sec:parsingspecs} *}
   528 
   583 
   529 text {*
   584 text {*
   530   There are a number of special purpose parsers that help with parsing
   585   There are a number of special purpose parsers that help with parsing