43 |
43 |
44 @{ML_response [display,gray] "($$ \"h\") (explode \"hello\")" "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"} |
44 @{ML_response [display,gray] "($$ \"h\") (explode \"hello\")" "(\"h\", [\"e\", \"l\", \"l\", \"o\"])"} |
45 |
45 |
46 @{ML_response [display,gray] "($$ \"w\") (explode \"world\")" "(\"w\", [\"o\", \"r\", \"l\", \"d\"])"} |
46 @{ML_response [display,gray] "($$ \"w\") (explode \"world\")" "(\"w\", [\"o\", \"r\", \"l\", \"d\"])"} |
47 |
47 |
|
48 The type of a parser is defined as |
|
49 |
|
50 |
|
51 |
|
52 |
48 This function will either succeed (as in the two examples above) or raise the exception |
53 This function will either succeed (as in the two examples above) or raise the exception |
49 @{text "FAIL"} if no string can be consumed. For example trying to parse |
54 @{text "FAIL"} if no string can be consumed. For example trying to parse |
50 |
55 |
51 @{ML_response_fake [display,gray] "($$ \"x\") (explode \"world\")" |
56 @{ML_response_fake [display,gray] "($$ \"x\") (explode \"world\")" |
52 "Exception FAIL raised"} |
57 "Exception FAIL raised"} |
91 "(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"} |
96 "(((\"h\", \"e\"), \"l\"), [\"l\", \"o\"])"} |
92 |
97 |
93 Note how the result of consumed strings builds up on the left as nested pairs. |
98 Note how the result of consumed strings builds up on the left as nested pairs. |
94 |
99 |
95 If, as in the previous example, you want to parse a particular string, |
100 If, as in the previous example, you want to parse a particular string, |
96 then one should use the function @{ML Scan.this_string}: |
101 then you should use the function @{ML Scan.this_string}: |
97 |
102 |
98 @{ML_response [display,gray] "Scan.this_string \"hell\" (explode \"hello\")" |
103 @{ML_response [display,gray] "Scan.this_string \"hell\" (explode \"hello\")" |
99 "(\"hell\", [\"o\"])"} |
104 "(\"hell\", [\"o\"])"} |
100 |
105 |
101 Parsers that explore alternatives can be constructed using the function @{ML |
106 Parsers that explore alternatives can be constructed using the function @{ML |
242 |
247 |
243 @{ML_response [display,gray] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (explode \"hhhh\")" |
248 @{ML_response [display,gray] "Scan.finite Symbol.stopper (Scan.repeat ($$ \"h\")) (explode \"hhhh\")" |
244 "([\"h\", \"h\", \"h\", \"h\"], [])"} |
249 "([\"h\", \"h\", \"h\", \"h\"], [])"} |
245 |
250 |
246 @{ML Symbol.stopper} is the ``end-of-input'' indicator for parsing strings; |
251 @{ML Symbol.stopper} is the ``end-of-input'' indicator for parsing strings; |
247 other stoppers need to be used when parsing tokens, for example. However, this kind of |
252 other stoppers need to be used when parsing, for example, tokens. However, this kind of |
248 manually wrapping is often already done by the surrounding infrastructure. |
253 manually wrapping is often already done by the surrounding infrastructure. |
249 |
254 |
250 The function @{ML Scan.repeat} can be used with @{ML Scan.one} to read any |
255 The function @{ML Scan.repeat} can be used with @{ML Scan.one} to read any |
251 string as in |
256 string as in |
252 |
257 |
346 |
351 |
347 section {* Parsing Theory Syntax *} |
352 section {* Parsing Theory Syntax *} |
348 |
353 |
349 text {* |
354 text {* |
350 Most of the time, however, Isabelle developers have to deal with parsing |
355 Most of the time, however, Isabelle developers have to deal with parsing |
351 tokens, not strings. This is because the parsers for the theory syntax, as |
356 tokens, not strings. These token parsers will have the type |
352 well as the parsers for the arguments of proof methods the type @{ML_type OuterLex.token} |
357 *} |
353 (which is identical to the type @{ML_type OuterParse.token}). There are also handy |
358 |
354 parsers for ML-expressions and ML-files. |
359 ML{*type 'a parser = OuterLex.token list -> 'a * OuterLex.token list*} |
|
360 |
|
361 text {* |
|
362 This reason for using token parsers is that theory syntax, as well as the |
|
363 parsers for the arguments of proof methods, use the type @{ML_type |
|
364 OuterLex.token} (which is identical to the type @{ML_type |
|
365 OuterParse.token}). However, there are also handy parsers for |
|
366 ML-expressions and ML-files. |
355 |
367 |
356 \begin{readmore} |
368 \begin{readmore} |
357 The parser functions for the theory syntax are contained in the structure |
369 The parser functions for the theory syntax are contained in the structure |
358 @{ML_struct OuterParse} defined in the file @{ML_file "Pure/Isar/outer_parse.ML"}. |
370 @{ML_struct OuterParse} defined in the file @{ML_file "Pure/Isar/outer_parse.ML"}. |
359 The definition for tokens is in the file @{ML_file "Pure/Isar/outer_lex.ML"}. |
371 The definition for tokens is in the file @{ML_file "Pure/Isar/outer_lex.ML"}. |
365 kind of tokens. |
377 kind of tokens. |
366 *} |
378 *} |
367 |
379 |
368 text {* |
380 text {* |
369 The first example shows how to generate a token list out of a string using |
381 The first example shows how to generate a token list out of a string using |
370 the function @{ML "OuterSyntax.scan"}. It is given below @{ML "Position.none"} |
382 the function @{ML "OuterSyntax.scan"}. It is given the argument @{ML "Position.none"} |
371 as argument since, at the moment, we are not interested in generating |
383 since, at the moment, we are not interested in generating |
372 precise error messages. The following code |
384 precise error messages. The following code |
373 |
385 |
374 @{ML_response_fake [display,gray] "OuterSyntax.scan Position.none \"hello world\"" |
386 @{ML_response_fake [display,gray] "OuterSyntax.scan Position.none \"hello world\"" |
375 "[Token (\<dots>,(Ident, \"hello\"),\<dots>), |
387 "[Token (\<dots>,(Ident, \"hello\"),\<dots>), |
376 Token (\<dots>,(Space, \" \"),\<dots>), |
388 Token (\<dots>,(Space, \" \"),\<dots>), |
420 "let |
432 "let |
421 val (keywords, commands) = OuterKeyword.get_lexicons () |
433 val (keywords, commands) = OuterKeyword.get_lexicons () |
422 in |
434 in |
423 (Scan.dest_lexicon commands, Scan.dest_lexicon keywords) |
435 (Scan.dest_lexicon commands, Scan.dest_lexicon keywords) |
424 end" |
436 end" |
425 "([\"}\",\"{\",\<dots>],[\"\<rightleftharpoons>\",\"\<leftharpoondown>\",\<dots>])"} |
437 "([\"}\", \"{\", \<dots>],[\"\<rightleftharpoons>\", \"\<leftharpoondown>\", \<dots>])"} |
426 |
438 |
427 The parser @{ML "OuterParse.$$$"} parses a single keyword. For example: |
439 The parser @{ML "OuterParse.$$$"} parses a single keyword. For example: |
428 |
440 |
429 @{ML_response [display,gray] |
441 @{ML_response [display,gray] |
430 "let |
442 "let |
431 val input1 = filtered_input \"where for\" |
443 val input1 = filtered_input \"where for\" |
432 val input2 = filtered_input \"| in\" |
444 val input2 = filtered_input \"| in\" |
433 in |
445 in |
434 (OuterParse.$$$ \"where\" input1, OuterParse.$$$ \"|\" input2) |
446 (OuterParse.$$$ \"where\" input1, OuterParse.$$$ \"|\" input2) |
435 end" |
447 end" |
436 "((\"where\",\<dots>),(\"|\",\<dots>))"} |
448 "((\"where\",\<dots>), (\"|\",\<dots>))"} |
437 |
449 |
438 Like before, you can sequentially connect parsers with @{ML "--"}. For example: |
450 Like before, you can sequentially connect parsers with @{ML "--"}. For example: |
439 |
451 |
440 @{ML_response [display,gray] |
452 @{ML_response [display,gray] |
441 "let |
453 "let |
442 val input = filtered_input \"| in\" |
454 val input = filtered_input \"| in\" |
443 in |
455 in |
444 (OuterParse.$$$ \"|\" -- OuterParse.$$$ \"in\") input |
456 (OuterParse.$$$ \"|\" -- OuterParse.$$$ \"in\") input |
445 end" |
457 end" |
446 "((\"|\",\"in\"),[])"} |
458 "((\"|\", \"in\"),[])"} |
447 |
459 |
448 The parser @{ML "OuterParse.enum s p" for s p} parses a possibly empty |
460 The parser @{ML "OuterParse.enum s p" for s p} parses a possibly empty |
449 list of items recognised by the parser @{text p}, where the items being parsed |
461 list of items recognised by the parser @{text p}, where the items being parsed |
450 are separated by the string @{text s}. For example: |
462 are separated by the string @{text s}. For example: |
451 |
463 |
453 "let |
465 "let |
454 val input = filtered_input \"in | in | in foo\" |
466 val input = filtered_input \"in | in | in foo\" |
455 in |
467 in |
456 (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input |
468 (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input |
457 end" |
469 end" |
458 "([\"in\",\"in\",\"in\"],[\<dots>])"} |
470 "([\"in\", \"in\", \"in\"],[\<dots>])"} |
459 |
471 |
460 @{ML "OuterParse.enum1"} works similarly, except that the parsed list must |
472 @{ML "OuterParse.enum1"} works similarly, except that the parsed list must |
461 be non-empty. Note that we had to add a string @{text [quotes] "foo"} at the |
473 be non-empty. Note that we had to add a string @{text [quotes] "foo"} at the |
462 end of the parsed string, otherwise the parser would have consumed all |
474 end of the parsed string, otherwise the parser would have consumed all |
463 tokens and then failed with the exception @{text "MORE"}. Like in the |
475 tokens and then failed with the exception @{text "MORE"}. Like in the |
470 val input = filtered_input \"in | in | in\" |
482 val input = filtered_input \"in | in | in\" |
471 in |
483 in |
472 Scan.finite OuterLex.stopper |
484 Scan.finite OuterLex.stopper |
473 (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input |
485 (OuterParse.enum \"|\" (OuterParse.$$$ \"in\")) input |
474 end" |
486 end" |
475 "([\"in\",\"in\",\"in\"],[])"} |
487 "([\"in\", \"in\", \"in\"],[])"} |
476 |
488 |
477 The following function will help to run examples. |
489 The following function will help to run examples. |
478 |
490 |
479 *} |
491 *} |
480 |
492 |
508 |
520 |
509 Whenever there is a possibility that the processing of user input can fail, |
521 Whenever there is a possibility that the processing of user input can fail, |
510 it is a good idea to give as much information about where the error |
522 it is a good idea to give as much information about where the error |
511 occured. For this Isabelle can attach positional information to tokens |
523 occured. For this Isabelle can attach positional information to tokens |
512 and then thread this information up the processing chain. To see this, |
524 and then thread this information up the processing chain. To see this, |
513 modify the function @{ML filtered_input} described earlier to be |
525 modify the function @{ML filtered_input} described earlier to |
514 *} |
526 *} |
515 |
527 |
516 ML{*fun filtered_input' str = |
528 ML{*fun filtered_input' str = |
517 filter OuterLex.is_proper (OuterSyntax.scan (Position.line 7) str) *} |
529 filter OuterLex.is_proper (OuterSyntax.scan (Position.line 7) str) *} |
518 |
530 |
627 (SpecParse.opt_thm_name ":" -- OuterParse.prop))) []*} |
639 (SpecParse.opt_thm_name ":" -- OuterParse.prop))) []*} |
628 |
640 |
629 text {* |
641 text {* |
630 Note that the parser does not parse the keyword \simpleinductive, even if it is |
642 Note that the parser does not parse the keyword \simpleinductive, even if it is |
631 meant to process definitions as shown above. The parser of the keyword |
643 meant to process definitions as shown above. The parser of the keyword |
632 will be given by the infrastructure that will eventually calls @{ML spec_parser}. |
644 will be given by the infrastructure that will eventually call @{ML spec_parser}. |
633 |
645 |
634 |
646 |
635 To see what the parser returns, let us parse the string corresponding to the |
647 To see what the parser returns, let us parse the string corresponding to the |
636 definition of @{term even} and @{term odd}: |
648 definition of @{term even} and @{term odd}: |
637 |
649 |
716 (name, map Args.dest_src attrib) |
728 (name, map Args.dest_src attrib) |
717 end" "(foo_lemma, [((\"intro\", []), \<dots>), ((\"dest\", [\<dots>]), \<dots>)])"} |
729 end" "(foo_lemma, [((\"intro\", []), \<dots>), ((\"dest\", [\<dots>]), \<dots>)])"} |
718 |
730 |
719 The function @{ML opt_thm_name in SpecParse} is the ``optional'' variant of |
731 The function @{ML opt_thm_name in SpecParse} is the ``optional'' variant of |
720 @{ML thm_name in SpecParse}. As can be seen each theorem name can contain some |
732 @{ML thm_name in SpecParse}. As can be seen each theorem name can contain some |
721 attributes. |
733 attributes. The name has to end with @{text [quotes] ":"}---see argument of |
722 |
734 @{ML SpecParse.opt_thm_name} in Line 9. |
723 For the inductive definitions described above only the attibutes @{text "[intro]"} and |
|
724 @{text "[simp]"} make sense. |
|
725 |
735 |
726 \begin{readmore} |
736 \begin{readmore} |
727 Attributes and arguments are implemented in the files @{ML_file "Pure/Isar/attrib.ML"} |
737 Attributes and arguments are implemented in the files @{ML_file "Pure/Isar/attrib.ML"} |
728 and @{ML_file "Pure/Isar/args.ML"}. |
738 and @{ML_file "Pure/Isar/args.ML"}. |
729 \end{readmore} |
739 \end{readmore} |
845 or something similar depending on your Isabelle distribution and architecture. |
855 or something similar depending on your Isabelle distribution and architecture. |
846 One quick way to assign a shell variable to this directory is by typing |
856 One quick way to assign a shell variable to this directory is by typing |
847 |
857 |
848 @{text [display] "$ ISABELLE_LOGS=\"$(isabelle getenv -b ISABELLE_OUTPUT)\"/log"} |
858 @{text [display] "$ ISABELLE_LOGS=\"$(isabelle getenv -b ISABELLE_OUTPUT)\"/log"} |
849 |
859 |
850 on the Unix prompt. The directory should include the files: |
860 on the Unix prompt. If you now also type @{text "ls $ISABELLE_LOGS"}, then the |
|
861 directory should include the files: |
851 |
862 |
852 @{text [display] |
863 @{text [display] |
853 "Pure.gz |
864 "Pure.gz |
854 HOL.gz |
865 HOL.gz |
855 Pure-ProofGeneral.gz |
866 Pure-ProofGeneral.gz |
954 proposition. Its argument @{ML NONE} stands for a locale (which we chose to |
965 proposition. Its argument @{ML NONE} stands for a locale (which we chose to |
955 omit); the argument @{ML "(K I)"} stands for a function that determines what |
966 omit); the argument @{ML "(K I)"} stands for a function that determines what |
956 should be done with the theorem once it is proved (we chose to just forget |
967 should be done with the theorem once it is proved (we chose to just forget |
957 about it). Lines 9 to 11 contain the parser for the proposition. |
968 about it). Lines 9 to 11 contain the parser for the proposition. |
958 |
969 |
959 (FIXME: explain @{ML Toplevel.print} etc) |
|
960 |
|
961 If you now type \isacommand{foobar}~@{text [quotes] "True \<and> True"}, you obtain the following |
970 If you now type \isacommand{foobar}~@{text [quotes] "True \<and> True"}, you obtain the following |
962 proof state: |
971 proof state: |
963 |
972 |
964 \begin{isabelle} |
973 \begin{isabelle} |
965 \isacommand{foobar}~@{text [quotes] "True \<and> True"}\\ |
974 \isacommand{foobar}~@{text [quotes] "True \<and> True"}\\ |
974 \isacommand{apply}@{text "(rule conjI)"}\\ |
983 \isacommand{apply}@{text "(rule conjI)"}\\ |
975 \isacommand{apply}@{text "(rule TrueI)+"}\\ |
984 \isacommand{apply}@{text "(rule TrueI)+"}\\ |
976 \isacommand{done} |
985 \isacommand{done} |
977 \end{isabelle} |
986 \end{isabelle} |
978 |
987 |
|
988 However, once you change the ``kind'' of a command from @{ML thy_decl in OuterKeyword} |
|
989 to @{ML thy_goal in OuterKeyword} then the keyword file needs to be re-created. |
979 |
990 |
980 (FIXME What do @{ML "Toplevel.theory"} |
991 (FIXME What do @{ML "Toplevel.theory"} |
981 @{ML "Toplevel.print"} |
992 @{ML "Toplevel.print"} |
982 @{ML Toplevel.local_theory}?) |
993 @{ML Toplevel.local_theory} do?) |
983 |
994 |
984 (FIXME read a name and show how to store theorems) |
995 (FIXME read a name and show how to store theorems) |
985 |
996 |
986 *} |
997 *} |
987 |
998 |