225
|
1 |
\documentclass{article}
|
|
2 |
\usepackage{a4wide,ot1patch}
|
|
3 |
\usepackage[latin1]{inputenc}
|
|
4 |
\usepackage{multicol}
|
|
5 |
\usepackage{charter}
|
|
6 |
\usepackage{amsmath,amssymb,amsthm}
|
|
7 |
\usepackage{fancyheadings}
|
|
8 |
|
|
9 |
\addtolength{\oddsidemargin}{-6mm}
|
|
10 |
\addtolength{\evensidemargin}{-6mm}
|
|
11 |
\addtolength{\textwidth}{11mm}
|
|
12 |
\addtolength{\columnsep}{3mm}
|
|
13 |
\addtolength{\textheight}{8mm}
|
|
14 |
\addtolength{\topmargin}{-7.5mm}
|
|
15 |
|
|
16 |
\pagestyle{fancyplain}
|
|
17 |
\lhead[\fancyplain{}{A}]{\fancyplain{}{}}
|
|
18 |
\rhead[\fancyplain{}{C}]{\fancyplain{}{}}
|
|
19 |
\renewcommand{\headrulewidth}{0pt}
|
|
20 |
|
|
21 |
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
22 |
\begin{document}
|
|
23 |
|
|
24 |
|
|
25 |
\begin{center}
|
|
26 |
\begin{tabular}{c}
|
|
27 |
\\[-5mm]
|
|
28 |
\LARGE\bf Certified Parsing\\[-10mm]
|
|
29 |
\mbox{}
|
|
30 |
\end{tabular}
|
|
31 |
\end{center}
|
|
32 |
\thispagestyle{empty}
|
|
33 |
|
|
34 |
\section*{Background}
|
|
35 |
|
|
36 |
\mbox{}\\[-14mm]
|
|
37 |
|
|
38 |
\begin{multicols}{2}
|
|
39 |
\noindent
|
|
40 |
Parsing is the act of transforming plain text into some
|
|
41 |
structure that can be analysed by computers for further processing.
|
|
42 |
One might think that parsing has been studied to death and after
|
|
43 |
\emph{yacc} and \emph{lex} no new results can be obtained in this area.
|
|
44 |
However recent results and novel approaches make it increasingly clear,
|
|
45 |
that this is not true anymore.
|
|
46 |
|
|
47 |
We propose to approach the subject of parsing from a certification point
|
|
48 |
of view. Parsers are increasingly part of certified compilers, like CompCert,
|
|
49 |
which are guaranteed to be correct and bug-free. Such certified compilers are
|
|
50 |
important in areas where software just cannot fail. However, so far the
|
|
51 |
parsers of these compilers have been left out of the certification.
|
|
52 |
This is because parsing algorithms are often adhoc and their semantics
|
|
53 |
is not clearly specified. Unfortunately, this means parsers can harbour
|
|
54 |
errors that potentially invalidate the whole certification and correctness
|
|
55 |
of the compiler. In this project, we like to change that.
|
|
56 |
|
|
57 |
Only in the last few years, theorem provers have become good enough
|
|
58 |
for establishing the correctness of some standard lexing and
|
|
59 |
parsing algorithms. For this, the algorithms need to be formulated
|
|
60 |
in way so that it is easy to reason about them. In earlier work
|
|
61 |
about lexing and regular languages, the authors showed that this
|
|
62 |
precludes algorithms working over graphs. However regular
|
|
63 |
languages can be formulated and reasoned about entirely in terms
|
|
64 |
regular expressions, which can be easily represented in theorem
|
|
65 |
provers. This work uses the device of derivatives of regular
|
|
66 |
languages. We like to extend this work to parsers and grammars.
|
|
67 |
The aim is to come up with elegant and useful parsing algorithms
|
|
68 |
whose correctness and absence of bugs can be certified in a theorem
|
|
69 |
prover.
|
|
70 |
\mbox{}\\[15cm]
|
|
71 |
|
|
72 |
\noindent
|
|
73 |
|
|
74 |
%\small
|
|
75 |
%\bibliography{../../bib/all}
|
|
76 |
%\bibliographystyle{abbrv}
|
|
77 |
\end{multicols}
|
|
78 |
|
|
79 |
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
80 |
% \noindent {\bf Objectives:} The overall goals of the project are as follows:
|
|
81 |
% \begin{itemize}
|
|
82 |
% \item To solve the POPLmark challenge.
|
|
83 |
|
|
84 |
% \item To complete and greatly improve the existing implementation of the
|
|
85 |
% nominal datatype package.
|
|
86 |
% \item To explore the strengths of this package by proving the
|
|
87 |
% safety of SML.
|
|
88 |
% \item To provide a basis for extracting programs from safety proofs.
|
|
89 |
|
|
90 |
% \item To make the nominal datatype package usable for teaching
|
|
91 |
% students about the lambda-calculus and the theory of programming
|
|
92 |
% languages. \smallskip
|
|
93 |
% \end{itemize}
|
|
94 |
|
|
95 |
|
|
96 |
|
|
97 |
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
98 |
\end{document}
|
|
99 |
|
|
100 |
%%% Local Variables:
|
|
101 |
%%% mode: latex
|
|
102 |
%%% TeX-master: t
|
|
103 |
%%% TeX-command-default: "PdfLaTeX"
|
|
104 |
%%% TeX-view-style: (("." "kpdf %s.pdf"))
|
|
105 |
%%% End:
|