ChengsongTanPhdThesis/Chapters/Finite.tex
changeset 624 8ffa28fce271
parent 620 ae6010c14e49
child 625 b797c9a709d9
equal deleted inserted replaced
623:c0c1ebe09c7d 624:8ffa28fce271
     6 %  In Chapter 4 \ref{Chapter4} we give the second guarantee
     6 %  In Chapter 4 \ref{Chapter4} we give the second guarantee
     7 %of our bitcoded algorithm, that is a finite bound on the size of any 
     7 %of our bitcoded algorithm, that is a finite bound on the size of any 
     8 %regex's derivatives. 
     8 %regex's derivatives. 
     9 
     9 
    10 
    10 
    11 In this chapter we give a guarantee in terms of size of derivatives: 
    11 In this chapter we give a bound in terms of size of 
       
    12 the calculated derivatives: 
    12 given an annotated regular expression $a$, for any string $s$
    13 given an annotated regular expression $a$, for any string $s$
    13 our algorithm $\blexersimp$'s internal annotated regular expression 
    14 our algorithm $\blexersimp$'s derivatives
    14 size  is finitely bounded
    15 are finitely bounded
    15 by a constant  that only depends on $a$.
    16 by a constant  that only depends on $a$.
    16 Formally we show that there exists an $N_a$ such that
    17 Formally we show that there exists an $N_a$ such that
    17 \begin{center}
    18 \begin{center}
    18 	$\llbracket \bderssimp{a}{s} \rrbracket \leq N_a$
    19 	$\llbracket \bderssimp{a}{s} \rrbracket \leq N_a$
    19 \end{center}
    20 \end{center}
    30 		of eliminating ``catastrophic backtracking''. 
    31 		of eliminating ``catastrophic backtracking''. 
    31 		If the internal data structures used by our algorithm
    32 		If the internal data structures used by our algorithm
    32 		grows beyond a finite bound, then clearly 
    33 		grows beyond a finite bound, then clearly 
    33 		the algorithm (which traverses these structures) will
    34 		the algorithm (which traverses these structures) will
    34 		be slow.
    35 		be slow.
    35 		The next step would be to refine the bound $N_a$ so that it
    36 		The next step is to refine the bound $N_a$ so that it
    36 		is polynomial on $\llbracket a\rrbracket$.
    37 		is not just finite but polynomial in $\llbracket a\rrbracket$.
    37 	\item
    38 	\item
    38 		Having the finite bound formalised 
    39 		Having the finite bound formalised 
    39 		gives us a higher confidence that
    40 		gives us a higher confidence that
    40 		our simplification algorithm $\simp$ does not ``mis-behave''
    41 		our simplification algorithm $\simp$ does not ``mis-behave''
    41 		like $\simpsulz$ does.
    42 		like $\simpsulz$ does.
    42 		The bound is universal for a given regular expression, 
    43 		The bound is universal for a given regular expression, 
    43 		which is an advantage over work which 
    44 		which is an advantage over work which 
    44 		only gives empirical evidence on some test cases (see Verbatim++\cite{Verbatimpp}).
    45 		only gives empirical evidence on 
       
    46 		some test cases (see Verbatim++ work\cite{Verbatimpp}).
    45 \end{itemize}
    47 \end{itemize}
    46 In the next section we describe in more detail
    48 In the next section we describe in more detail
    47 what the finite bound means in our algorithm
    49 what the finite bound means in our algorithm
    48 and why the size of the internal data structures of
    50 and why the size of the internal data structures of
    49 a typical derivative-based lexer such as
    51 a typical derivative-based lexer such as