ChengsongTanPhdThesis/Chapters/Introduction.tex
changeset 664 ba44144875b1
parent 654 2ad20ba5b178
child 665 3bedbdce3a3b
%TODO: get source for SNORT/BRO's regex matching engine/speed


%----------------------------------------------------------------------------------------
\section{Contribution}
%{\color{red} \rule{\linewidth}{0.5mm}}
%\textbf{issue with this part: way too short, not enough details of what I have done.}
%{\color{red} \rule{\linewidth}{0.5mm}}
\marginpar{\em Georg: not sufficient context on related work, making the contribution hard to digest.}

%In this thesis,
%we propose a solution to catastrophic
%backtracking and error-prone matchers: a formally verified
%regular expression lexing algorithm
%that is both fast
%and correct. 
%%{\color{red} \rule{\linewidth}{0.5mm}}
%\HandRight Added content:
%Package \verb`pifont`: \ding{43}
%Package \verb`utfsym`: \usym{261E} 
%Package \verb`dingbat`: \leftpointright 
%Package \verb`dingbat`: \rightpointright 
       
We have made two main contributions in this thesis: %the
%lexer we developed based on Brzozowski derivatives and 
%Sulzmann and Lu's developments called 
proving that the lexer $\blexersimp$ is both correct and fast.
It is correct w.r.t.\ the formalisation of POSIX lexing by Ausaf et al.~\cite{AusafDyckhoffUrban2016}.
It is fast compared with unoptimised implementations such as Sulzmann and Lu's original
development, by our metric of the internal data structures not growing unboundedly.
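
Schematically, and only to convey the shape of the statement (the names
$\textit{ders\_simp}$, $\llbracket \cdot \rrbracket$ and $N_r$ stand in for the
precise definitions given later in this thesis), this boundedness property reads
\[
  \forall r.\;\exists N_r.\;\forall s.\quad
  \llbracket \textit{ders\_simp}\;r\;s \rrbracket \;\leq\; N_r
\]
where $\llbracket \cdot \rrbracket$ measures the size of the internal data
structure: for each regular expression $r$ there is a bound $N_r$, independent
of the string $s$, on the size of its simplified derivatives.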
       
Our formalisation of complexity is unique among similar works in the sense that,
%is about the size of internal data structures.
to our knowledge, %we don't know of a 
there is no other lexing/parsing algorithm with similarly formalised complexity theorems.
Common practice is to carry out an empirical analysis of the complexity of the algorithm
in question \cite{Verbatim,Verbatimpp}, or to rely 
on prior (unformalised) complexity analyses of well-known algorithms \cite{ValiantParsing},
which makes such work prone to complexity bugs.
%TODO: add citation
%, for example in the Verbatim work \cite{Verbatim}
       
%While formalised proofs have not included 
%Issues known as "performance bugs" can
Whilst formalised complexity theorems 
have not yet appeared in other certified lexers/parsers,
%while this work is done,
they do appear in the broader theorem-proving literature:
\emph{time credits} have been formalised for separation logic in Coq 
\cite{atkey2010amortised} %not used in 
to characterise the runtime complexity of an algorithm:
an integer number of credits is recorded %from the beginning of an execution
as part of the program state
and decremented at each basic step. 
The total number of decrements of the time credits 
during execution then represents the complexity of the algorithm.
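
As a schematic illustration (our rendering; the exact notation of
\cite{atkey2010amortised} differs), a specification in such a logic pairs the
usual pre- and postconditions with a credit assertion $\$\,n$:
\[
  \{\, P \ast \$\,n \,\}\;\;\textit{prog}\;\;\{\, Q \,\}
\]
meaning that $\textit{prog}$, started in a state satisfying $P$ and endowed with
$n$ time credits, terminates in a state satisfying $Q$ while consuming at most
$n$ basic steps.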
       
%each time a basic step is executed. 
%The way to use a time credit Time credit is an integer
%is assigned to a program prior to execution, and each step in the program consumes
%one credit.
This idea has been further developed to incorporate big-O notation,
so that one can prove both the functional correctness and the asymptotic complexity
of higher-order imperative algorithms \cite{bigOImperative}.
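
Schematically (again our rendering rather than the notation of
\cite{bigOImperative}), such a specification hides the concrete cost function
behind an existential that is constrained only asymptotically:
\[
  \exists c.\;\; c \in O(f) \;\wedge\;
  \forall n.\;\{\, P(n) \ast \$\,c(n) \,\}\;\;\textit{prog}(n)\;\;\{\, Q(n) \,\}
\]
which separates the functional-correctness obligations ($P$ and $Q$) from the
asymptotic cost bound $f$.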
       
%for example the complexity of
%the Quicksort algorithm 
%is $\Theta(n\ln n)$ 

\marginpar{more work on formalising complexity}
\marginpar{new changes up to this point.}

In particular, the main problem we solved on top of previous work was
coming up with a formally verified algorithm called $\blexersimp$ based
on Brzozowski derivatives. It calculates a POSIX
lexical value from a string and a regular expression. This algorithm was originally