diff -r 3cbcd7cda0a9 -r 0497408a3598 ChengsongTanPhdThesis/Chapters/Chapter5.tex --- a/ChengsongTanPhdThesis/Chapters/Chapter5.tex Wed Jul 13 08:27:28 2022 +0100 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,87 +0,0 @@ -% Chapter Template - -\chapter{A Better Bound and Other Extensions} % Main chapter title - -\label{Chapter5} %In Chapter 5\ref{Chapter5} we discuss stronger simplifications to improve the finite bound -%in Chapter 4 to a polynomial one, and demonstrate how one can extend the -%algorithm to include constructs such as bounded repetitions and negations. - -%---------------------------------------------------------------------------------------- -% SECTION strongsimp -%---------------------------------------------------------------------------------------- -\section{A Stronger Version of Simplification Inspired by Antimirov} -%TODO: search for isabelle proofs of algorithms that check equivalence - - - -%---------------------------------------------------------------------------------------- -% SECTION 1 -%---------------------------------------------------------------------------------------- - -\section{Adding Support for the Negation Construct, and its Correctness Proof} -We now add support for the negation regular expression: -\[ r ::= \ZERO \mid \ONE - \mid c - \mid r_1 \cdot r_2 - \mid r_1 + r_2 - \mid r^* - \mid \sim r -\] -The $\textit{nullable}$ function's clause for it would be -\[ -\textit{nullable}(~r) = \neg \nullable(r) -\] -The derivative would be -\[ -~r \backslash c = ~ (r \backslash c) -\] - -The most tricky part of lexing for the $~r$ regex - is creating a value for it. - For other regular expressions, the value aligns with the - structure of the regex: - \[ - \vdash \Seq(\Char(a), \Char(b)) : a \cdot b - \] -But for the $~r$ regex, $s$ is a member of it if and only if -$s$ does not belong to $L(r)$. -That means when there -is a match for the not regex, it is not possible to generate how the string $s$ matched -with $r$. -What we can do is preserve the information of how $s$ was not matched by $r$, -and there are a number of options to do this. - -We could give a partial value when there is a partial match for the regex inside -the $\mathbf{not}$ construct. -For example, the string $ab$ is not in the language of $(a\cdot b) \cdot c$, -A value for it could be - \[ - \vdash \textit{Not}(\Seq(\Char(a), \Char(b))) : ~((a \cdot b ) \cdot c) - \] - The above example demonstrates what value to construct - when the string $s$ is at most a real prefix - of the strings in $L(r)$. When $s$ instead is not a prefix of any strings - in $L(r)$, it becomes unclear what to return as a value inside the $\textit{Not}$ - constructor. - - Another option would be to either store the string $s$ that resulted in - a mis-match for $r$ or a dummy value as a placeholder: - \[ - \vdash \textit{Not}(abcd) : ~(a^*) - \] -or - \[ - \vdash \textit{Not}(\textit{Dummy}) : ~(a^*) - \] - We choose to implement this as it is most straightforward: - \[ - \mkeps(~(r)) = \textit{if}(\nullable(r)) \; \textit{Error} \; \textit{else} \; \textit{Not}(\textit{Dummy}) - \] - -%---------------------------------------------------------------------------------------- -% SECTION 2 -%---------------------------------------------------------------------------------------- - -\section{Bounded Repetitions} - -