thys2/Paper/document/root.tex
changeset 402 1612f2a77bf6
parent 401 8bbe2468fedc
child 405 3cfea5bb5e23
equal deleted inserted replaced
401:8bbe2468fedc 402:1612f2a77bf6
    61   Sulzmann and Lu described a lexing algorithm that calculates
    61   Sulzmann and Lu described a lexing algorithm that calculates
    62   Brzozowski derivatives using bitcodes annotated to regular
    62   Brzozowski derivatives using bitcodes annotated to regular
    63   expressions.  Their algorithm generates POSIX values which encode
    63   expressions.  Their algorithm generates POSIX values which encode
    64   the information of \emph{how} a regular expression matches a
    64   the information of \emph{how} a regular expression matches a
    65   string---that is, which part of the string is matched by which part
    65   string---that is, which part of the string is matched by which part
    66   of the regular expression.  The purpose of the bitcodes in Sulzmann
    66   of the regular expression.  The purpose of the bitcodes is to generate POSIX values incrementally while
    67   and Lu's algorithm is to generate POSIX values incrementally while
    67   derivatives are calculated. They also help with designing
    68   derivatives are calculated. However they also help with designing
    68   an `aggressive' simplification function that keeps the size of
    69   `aggressive' simplification functions that keep the size of
       
    70   derivatives small. Without simplification derivatives can grow
    69   derivatives small. Without simplification derivatives can grow
    71   exponentially resulting in an extremely slow lexing algorithm.  In this
    70   exponentially resulting in an extremely slow lexing algorithm.  In this
    72   paper we describe a variant of Sulzmann and Lu's algorithm: Our
    71   paper we describe a variant of Sulzmann and Lu's algorithm: Our
    73   algorithm is a small, recursive functional program, whereas Sulzmann
    72   algorithm is a recursive functional program, whereas Sulzmann
    74   and Lu's version involves a fixpoint construction. We \textit{(i)}
    73   and Lu's version involves a fixpoint construction. We \textit{(i)}
    75   prove in Isabelle/HOL that our program is correct and generates
    74   prove in Isabelle/HOL that our program is correct and generates
    76   unique POSIX values; we also \textit{(ii)} establish a polynomial
    75   unique POSIX values; we also \textit{(ii)} establish a polynomial
    77   bound for the size of the derivatives. The size can be seen as a
    76   bound for the size of the derivatives. The size can be seen as a
    78   proxy measure for the efficiency of the lexing algorithm---that means
    77   proxy measure for the efficiency of the lexing algorithm: because of
    79   because of the polynomial bound our algorithm does not suffer from
    78   the polynomial bound our algorithm does not suffer from
    80   the exponential blowup in earlier works.
    79   the exponential blowup in earlier works.
    81   
    80   
    82   % Brzozowski introduced the notion of derivatives for regular
    81   % Brzozowski introduced the notion of derivatives for regular
    83   % expressions. They can be used for a very simple regular expression
    82   % expressions. They can be used for a very simple regular expression
    84   % matching algorithm.  Sulzmann and Lu cleverly extended this
    83   % matching algorithm.  Sulzmann and Lu cleverly extended this