24 |
24 |
25 \begin{center} |
25 \begin{center} |
26 \begin{tabular}{c} |
26 \begin{tabular}{c} |
27 \\[-5mm] |
27 \\[-5mm] |
28 \LARGE\bf Certified Parsing\\[-10mm] |
28 \LARGE\bf Certified Parsing\\[-10mm] |
29 \mbox{} |
29 \mbox{} |
30 \end{tabular} |
30 \end{tabular} |
31 \end{center} |
31 \end{center} |
32 \thispagestyle{empty} |
32 \thispagestyle{empty} |
33 \mbox{}\\[-5mm] |
33 \mbox{}\\[-5mm] |
34 |
34 |
35 \begin{multicols}{2} |
35 \begin{multicols}{2} |
36 \section*{Background} |
36 \section*{Background} |
37 \noindent |
37 \noindent |
38 Parsing is the act of transforming plain text into some |
38 Parsing is the act of transforming plain text into some |
39 structure that can be analysed by computers for further processing. |
39 structure that can be analysed by computers for further processing. |
40 One might think that parsing has been studied to death and after |
40 One might think that parsing has been studied to death and after |
41 \emph{yacc} and \emph{lex} no new results can be obtained in this area. |
41 \emph{yacc} and \emph{lex} no new results can be obtained in this area. |
42 However recent results and novel approaches make it increasingly clear, |
42 However recent results and novel approaches make it increasingly clear, |
43 that this is not true anymore. |
43 that this is not true anymore. |
44 |
44 |
45 We propose to approach the subject of parsing from a certification point |
45 We propose to approach the subject of parsing from a certification point |
46 of view. Parsers are increasingly part of certified compilers, like \mbox{\emph{CompCert}}, |
46 of view. Parsers are increasingly part of certified compilers, like \mbox{\emph{CompCert}}, |
47 which are guaranteed to be correct and bug-free. Such certified compilers are |
47 which are guaranteed to be correct and bug-free. Such certified compilers are |
48 crucial in areas where software just cannot fail. However, so far the |
48 crucial in areas where software just cannot fail. However, so far the |
49 parsers of these compilers have been left out of the certification. |
49 parsers of these compilers have been left out of the certification. |
50 This is because parsing algorithms are often adhoc and their semantics |
50 This is because parsing algorithms are often adhoc and their semantics |
51 is not clearly specified. Unfortunately, this means parsers can harbour |
51 is not clearly specified. Unfortunately, this means parsers can harbour |
52 errors that potentially invalidate the whole certification and correctness |
52 errors that potentially invalidate the whole certification and correctness |
53 of the compiler. In this project, we like to change that. |
53 of the compiler. In this project, we like to change that. |
54 |
54 |
55 Only in the last few years, theorem provers have become good enough |
55 Only in the last few years, theorem provers have become good enough |
56 for establishing the correctness of some standard lexing and |
56 for establishing the correctness of some standard lexing and |
57 parsing algorithms. For this, the algorithms need to be formulated |
57 parsing algorithms. For this, the algorithms need to be formulated |
58 in way so that it is easy to reason about them. In earlier work |
58 in way so that it is easy to reason about them. In earlier work |
59 about lexing and regular languages, the authors showed that this |
59 about lexing and regular languages, the authors showed that this |
60 precludes well-known algorithms working over graphs. However regular |
60 precludes well-known algorithms working over graphs. However regular |
61 languages can be formulated and reasoned about entirely in terms |
61 languages can be formulated and reasoned about entirely in terms |
62 regular expressions, which can be easily represented in theorem |
62 regular expressions, which can be easily represented in theorem |
63 provers. This work uses the device of derivatives of regular |
63 provers. This work uses the device of derivatives of regular |
64 expressions. We like to extend this device to parsers and grammars. |
64 expressions. We like to extend this device to parsers and grammars. |
65 The aim is to come up with elegant and useful parsing algorithms |
65 The aim is to come up with elegant and useful parsing algorithms |
66 whose correctness and the absence of bugs can be certified in a |
66 whose correctness and the absence of bugs can be certified in a |
67 theorem prover. |
67 theorem prover. |
68 |
68 |
69 \section*{Proposed Work} |
69 \section*{Proposed Work} |
|
70 |
|
71 One new development in formal grammar is the Parsing Expression Grammar (PEG) which is proposed as an refinement of standard Context Free Grammar. The idea is to introduce negative, conjunctive operators as well as production priorities, so that ambiguity abound in CFG can be eliminated. Another benefit of PEG is that it admits a very efficient linear parsing algorithm. |
|
72 |
|
73 |
70 \mbox{}\\[15cm] |
74 \mbox{}\\[15cm] |
71 |
75 |
72 \noindent |
76 \noindent |
|
77 |
|
78 |
73 |
79 |
74 %\small |
80 %\small |
75 %\bibliography{../../bib/all} |
81 %\bibliography{../../bib/all} |
76 %\bibliographystyle{abbrv} |
82 %\bibliographystyle{abbrv} |
77 \end{multicols} |
83 \end{multicols} |
78 |
84 |
79 % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
85 % %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
80 % \noindent {\bf Objectives:} The overall goals of the project are as follows: |
86 % \noindent {\bf Objectives:} The overall goals of the project are as follows: |
85 % nominal datatype package. |
91 % nominal datatype package. |
86 % \item To explore the strengths of this package by proving the |
92 % \item To explore the strengths of this package by proving the |
87 % safety of SML. |
93 % safety of SML. |
88 % \item To provide a basis for extracting programs from safety proofs. |
94 % \item To provide a basis for extracting programs from safety proofs. |
89 |
95 |
90 % \item To make the nominal datatype package usable for teaching |
96 % \item To make the nominal datatype package usable for teaching |
91 % students about the lambda-calculus and the theory of programming |
97 % students about the lambda-calculus and the theory of programming |
92 % languages. \smallskip |
98 % languages. \smallskip |
93 % \end{itemize} |
99 % \end{itemize} |
94 |
100 |
95 |
101 |
96 |
102 |
97 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
103 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
98 \end{document} |
104 \end{document} |
99 |
105 |
100 %%% Local Variables: |
106 %%% Local Variables: |
101 %%% mode: latex |
107 %%% mode: latex |
102 %%% TeX-master: t |
108 %%% TeX-master: t |
103 %%% TeX-command-default: "PdfLaTeX" |
109 %%% TeX-command-default: "PdfLaTeX" |
104 %%% TeX-view-style: (("." "kpdf %s.pdf")) |
110 %%% TeX-view-style: (("." "kpdf %s.pdf")) |
105 %%% End: |
111 %%% End: |