ChengsongTanPhdThesis/Chapters/Cubic.tex
changeset 639 80cc6dc4c98b
parent 638 dd9dde2d902b
equal deleted inserted replaced
638:dd9dde2d902b 639:80cc6dc4c98b
    25 extensions to our $\blexersimp$.
    25 extensions to our $\blexersimp$.
    26 We make a conjecture that the finite
    26 We make a conjecture that the finite
    27 size bound from the previous chapter can be 
    27 size bound from the previous chapter can be 
    28 improved to a cubic bound.
    28 improved to a cubic bound.
    29 We implemented our conjecture in Scala.
    29 We implemented our conjecture in Scala.
    30 We intend to formalise this part in Isabelle/HOL at a
    30 We have not formalised this part in Isabelle/HOL. 
    31 later stage.
       
    32 %we have not been able to finish due to time constraints of the PhD.
    31 %we have not been able to finish due to time constraints of the PhD.
    33 Nevertheless, we outline the ideas we intend to use for the proof.
    32 Nevertheless, we outline the ideas we intend to use for the proof.
    34 
    33 
    35 \section{A Stronger Version of Simplification}
    34 \section{A Stronger Version of Simplification}
    36 
    35 
    37 We present further improvements
    36 Let us first present further improvements
    38 for our lexer algorithm $\blexersimp$.
    37 for our lexer algorithm $\blexersimp$.
    39 We devise a stronger simplification algorithm,
    38 We devise a stronger simplification algorithm,
    40 called $\bsimpStrong$, which can prune away
    39 called $\bsimpStrong$, which can prune away
    41 similar components in two regular expressions at the same 
    40 similar components in two regular expressions at the same 
    42 alternative level,
    41 alternative level,
    68 For example, the regular expression 
    67 For example, the regular expression 
    69 \[
    68 \[
    70 	aa \cdot a^*+ a \cdot a^* + aa\cdot a^*
    69 	aa \cdot a^*+ a \cdot a^* + aa\cdot a^*
    71 \]
    70 \]
    72 contains three terms, 
    71 contains three terms, 
    73 expressing three possibilities for how it can match some input.
    72 expressing three possibilities for how it can match some more
       
    73 input of the form $a \ldots a$.
    74 The first and the third terms are identical, which means we can eliminate
    74 The first and the third terms are identical, which means we can eliminate
    75 the latter as it will not contribute to a POSIX value.
    75 the latter as it will not contribute to a POSIX value.
    76 In $\bsimps$, the $\distinctBy$ function takes care of 
    76 In $\bsimps$, the $\distinctBy$ function takes care of 
    77 such instances.
    77 such instances.
    78 The criteria $\distinctBy$ uses for removing a duplicate
    78 The criteria $\distinctBy$ uses for removing a duplicate
   300       }
   300       }
   301 
   301 
   302 
   302 
   303 \end{lstlisting}
   303 \end{lstlisting}
   304 \caption{The helper functions of $\textit{prune}$: 
   304 \caption{The helper functions of $\textit{prune}$: 
   305   $\textit{atMostEmpty}$, $\textit{isOne}$ and $\textit{removeSeqTail}$}
   305 $\textit{atMostEmpty}$, $\textit{isOne}$ and $\textit{removeSeqTail}$. $\textit{atMostEmpty}
       
   306 $ is a function that takes a regular expression and returns true only in case that it
       
   307 contains nothing more than the empty string in its language. $\textit{isOne}$ tests whether
       
   308 a regular expression is equivalent to $\ONE$. $\textit{removeSeqTail}$ is a function that
       
   309 takes away the tail of a sequence regular expression.}
   306 \end{figure}
   310 \end{figure}
   307 \noindent
   311 \noindent
   308 Suppose we feed 
   312 Suppose we feed 
   309 \begin{center}
   313 \begin{center}
   310 	$r= (\underline{\ONE}+(\underline{f}+b)\cdot g)\cdot (a\cdot(d\cdot e))$
   314 	$r= (\underline{\ONE}+(\underline{f}+b)\cdot g)\cdot (a\cdot(d\cdot e))$
   386                 )
   390                 )
   387             }
   391             }
   388         }
   392         }
   389 
   393 
   390 \end{lstlisting}
   394 \end{lstlisting}
   391 \caption{A Stronger Version of $\textit{distinctBy}$ XXX}
   395 \caption{A Stronger Version of $\textit{distinctBy}$. This function allows ``partial de-duplication''
       
   396 of a regular expression. When part of a regular expression has appeared before in the accumulator, we 
       
   397 can remove that verbose component.}
   392 \end{figure}
   398 \end{figure}
   393 \noindent
   399 \noindent
   394 Once a regular expression has been pruned,
   400 Once a regular expression has been pruned,
   395 all its components will be added to the accumulator
   401 all its components will be added to the accumulator
   396 to remove any future regular expressions' duplicate components.
   402 to remove any future regular expressions' duplicate components.
   435 \noindent
   441 \noindent
   436 The benefits of using 
   442 The benefits of using 
   437 $\textit{prune}$ refining the finiteness bound
   443 $\textit{prune}$ refining the finiteness bound
   438 to a cubic bound has not been formalised yet.
   444 to a cubic bound has not been formalised yet.
   439 Therefore we choose to use Scala code rather than an Isabelle-style formal 
   445 Therefore we choose to use Scala code rather than an Isabelle-style formal 
   440 definition like we did for $\simp$, as the definitions might change
   446 definition like we did for $\textit{bsimp}$, as the definitions might change
   441 to suit our proof needs.
   447 to suit our proof needs.
   442 In the rest of the chapter we will use this convention consistently.
   448 %In the rest of the chapter we will use this convention consistently.
   443 
   449 
   444 %The function $\textit{prune}$ is used in $\distinctWith$.
   450 %The function $\textit{prune}$ is used in $\distinctWith$.
   445 %$\distinctWith$ is a stronger version of $\distinctBy$
   451 %$\distinctWith$ is a stronger version of $\distinctBy$
   446 %which not only removes duplicates as $\distinctBy$ would
   452 %which not only removes duplicates as $\distinctBy$ would
   447 %do, but also uses the $\textit{pruneFunction}$
   453 %do, but also uses the $\textit{pruneFunction}$
   667 The union on the right-hand-side has $n * (n + 1) / 2$ terms.
   673 The union on the right-hand-side has $n * (n + 1) / 2$ terms.
   668 This leads us to believe that the maximum number of terms needed
   674 This leads us to believe that the maximum number of terms needed
   669 in our derivative is also only $n * (n + 1) / 2$.
   675 in our derivative is also only $n * (n + 1) / 2$.
   670 Therefore
   676 Therefore
   671 we conjecture that $\bsimpStrong$ is also able to achieve this
   677 we conjecture that $\bsimpStrong$ is also able to achieve this
   672 upper limit in general
   678 upper limit in general.
   673 \begin{conjecture}\label{bsimpStrongInclusionPder}
   679 \begin{conjecture}\label{bsimpStrongInclusionPder}
   674 	Using a suitable transformation $f$, we have that
   680 	Using a suitable transformation $f$, we have that
   675 	\begin{center}
   681 	\begin{center}
   676 		$\forall s.\; f \; (r \bdersStrong  \; s) \subseteq
   682 		$\forall s.\; f \; (r \bdersStrong  \; s) \subseteq
   677 		 \textit{PDER}_{\Sigma^*} \; r$
   683 		 \textit{PDER}_{\Sigma^*} \; r$