thys2/Paper/document/root.tex
changeset 401 8bbe2468fedc
parent 400 46e5566ad4ba
child 402 1612f2a77bf6
equal deleted inserted replaced
400:46e5566ad4ba 401:8bbe2468fedc
    64   the information of \emph{how} a regular expression matches a
    64   the information of \emph{how} a regular expression matches a
    65   string---that is, which part of the string is matched by which part
    65   string---that is, which part of the string is matched by which part
    66   of the regular expression.  The purpose of the bitcodes in Sulzmann
    66   of the regular expression.  The purpose of the bitcodes in Sulzmann
    67   and Lu's algorithm is to generate POSIX values incrementally while
    67   and Lu's algorithm is to generate POSIX values incrementally while
    68   derivatives are calculated. However they also help with designing
    68   derivatives are calculated. However they also help with designing
    69   `aggressive' simplification methods that keep the size of
    69   `aggressive' simplification functions that keep the size of
    70   derivatives small. Without simplification derivatives can grow
    70   derivatives small. Without simplification derivatives can grow
    71   exponentially resulting in an extremely slow lexing algorithm.  In this
    71   exponentially resulting in an extremely slow lexing algorithm.  In this
    72   paper we describe a variant of Sulzmann and Lu's algorithm: Our
    72   paper we describe a variant of Sulzmann and Lu's algorithm: Our
    73   algorithm is a small, recursive functional program, whereas Sulzmann
    73   algorithm is a small, recursive functional program, whereas Sulzmann
    74   and Lu's version involves a fixpoint construction. We \textit{(i)}
    74   and Lu's version involves a fixpoint construction. We \textit{(i)}
    75   prove in Isabelle/HOL that our program is correct and generates
    75   prove in Isabelle/HOL that our program is correct and generates
    76   unique POSIX values; we also \textit{(ii)} establish a polynomial
    76   unique POSIX values; we also \textit{(ii)} establish a polynomial
    77   bound for the size of the derivatives. The size can be seen as a
    77   bound for the size of the derivatives. The size can be seen as a
    78   proxy measure for the effeciency of the lexing algorithm---that means
    78   proxy measure for the efficiency of the lexing algorithm---that means
    79   our algorithm does not suffer from the exponential blowup.
    79   because of the polynomial bound our algorithm does not suffer from
       
    80   the exponential blowup in earlier works.
    80   
    81   
    81   % Brzozowski introduced the notion of derivatives for regular
    82   % Brzozowski introduced the notion of derivatives for regular
    82   % expressions. They can be used for a very simple regular expression
    83   % expressions. They can be used for a very simple regular expression
    83   % matching algorithm.  Sulzmann and Lu cleverly extended this
    84   % matching algorithm.  Sulzmann and Lu cleverly extended this
    84   % algorithm in order to deal with POSIX matching, which is the
    85   % algorithm in order to deal with POSIX matching, which is the