773 |
773 |
774 \subsection{Why Lemma \ref{retrieveStepwise}'s Requirement is too Strong} |
774 \subsection{Why Lemma \ref{retrieveStepwise}'s Requirement is too Strong} |
775 |
775 |
776 %From this chapter we start with the main contribution of this thesis, which |
776 %From this chapter we start with the main contribution of this thesis, which |
777 |
777 |
778 The $\blexer$ proof relies on a lockstep POSIX |
778 The $\blexer$ proof relies on $r_i, \; v_i$ to match each other in lockstep |
779 correspondence between the lexical value and the |
779 for each derivative step $i$, however only $v_0$ is needed and intermediate |
780 regular expression in each derivative and injection. |
780 $v_i$s are purely proof scaffolding. |
781 If we zoom into the diagram \ref{graph:inj} and look specifically at |
781 Moreover property \ref{eq:stepwise} |
782 the pairs $v_i, r_i$ and $v_{i+1},\, r_{i+1}$, we get the diagram demonstrating |
782 is stronger than needed for POSIX lexing: the precondition |
783 the invariant that the same bitcodes can be extracted from the pairs: |
783 $\vdash v_{i+1}:r_{i+1}$ is too general in the sense that it allows arbitrary |
784 \tikzset{three sided/.style={ |
784 values inhabiting in $r_i$ to retrieve bitcodes. |
785 draw=none, |
785 %correspondence between the lexical value and the |
786 append after command={ |
786 %regular expression in derivative and injection operations at the same step $i$. |
787 [-,shorten <= -0.5\pgflinewidth] |
787 %If we revisit the diagram \ref{graph:injZoom} with an example |
788 ([shift={(-1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) |
788 Consider a concrete example where $a_i = (_{ZZ}x + _{ZS}y + _{S}x)$ and |
789 edge([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) |
789 $a_{i+1} = (_{ZZ}\ONE + \ZERO + _{S}\ONE)$. |
790 ([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) |
790 What is required in the proof of $\blexer$ is that for the POSIX value |
791 edge([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west) |
791 $v_i = \Left \; (\Left \; Empty)$, |
792 ([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west) |
792 the property |
793 edge([shift={(-1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east) |
793 \[ |
794 } |
794 %\vdash \Left \; (\Left \; Empty) : (\ONE+\ZERO+\ONE) \implies |
795 } |
795 \retrieve \; (_{ZZ}\ONE + \ZERO + _{S}\ONE) \; (\Left \; (\Left \; \Empty) ) = |
796 } |
796 \retrieve \; (_{ZZ}x + _{ZS}y + _{S}x ) \; (\Left \; (\Left \; \Char\; x) ) |
797 |
797 \] |
798 \tikzset{three sided1/.style={ |
798 holds, and for $\blexersimp$ a property of similar shape to |
799 draw=none, |
799 \[ |
800 append after command={ |
800 \retrieve \; _{ZZ}\ONE \; \; Empty = |
801 [-,shorten <= -0.5\pgflinewidth] |
801 \retrieve \; _{ZZ}x \; (\Char\; x) |
802 ([shift={(1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) |
802 \] |
803 edge([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) |
803 needs to hold as well. |
804 ([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) |
804 However for the definitely non-POSIX value $v_i' = \Right \; \Empty$ |
805 edge([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east) |
805 the precondition $\vdash \Right \; \Empty : x+y+x$ holds as well, and therefore |
806 ([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east) |
806 the following equality |
807 edge([shift={(1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west) |
807 \[ |
808 } |
808 \retrieve \; (_{ZZ}\ONE + \ZERO + _{S}\ONE) \; (\Right \; \Empty) = |
809 } |
809 \retrieve \; (_{ZZ}x + _{ZS}y + _{S}x ) \; (\Right \; (\Char\; x)) |
810 } |
810 \] |
811 |
811 by lemma \ref{retrieveStepwise} holds for $\blexer$ as well. |
812 \begin{center} |
812 This cannot hold or be proven anymore with $\blexersimp$ as the corresponding |
813 \begin{tikzpicture}[->, >=stealth', shorten >= 1pt, auto, thick] |
813 sub-regular expressions and values have been eliminated during the |
814 %\node [rectangle ] (1) at (-7, 2) {$\ldots$}; |
814 de-duplication procedure of our smplification. |
815 %\node [rectangle, draw] (2) at (-4, 2) {$r_i = _{bs'}(_Za+_Saa)^*$}; |
815 We are stuck with a property that holds in $\blexer$ but does not have a counterpart |
816 %\node [rectangle, draw] (3) at (4, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_Za+_Saa)^*$}; |
816 in $\blexersimp$. |
817 %\node [rectangle] (4) at (9, 2) {$\ldots$}; |
817 This needs not hold for the purpose of POSIX lexing though--we know the rightmost |
818 %\node [rectangle] (5) at (-7, -2) {$\ldots$}; |
818 subexpression $x$ is not POSIX by the left priority rule. |
819 %\node [rectangle, draw] (6) at (-4, -2) {$v_i = \Stars \; [\Left (a)]$}; |
819 The inductive invariant \ref{eq:stepwise} can be weakened by restricting the precondition |
820 %\node [rectangle, draw] (7) at ( 4, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$}; |
820 $\vdash v_i:r_i$ to $\exists s_i. \; (s_i, r_i) \rightarrow v_i$. |
821 %\node [rectangle] (8) at ( 9, -2) {$\ldots$}; |
821 We tried this route but it did not work well since we need to |
822 %\node [rectangle] (9) at (-7, -6) {$\ldots$}; |
822 use a similar technique as the rectification functions by Ausaf et al, |
823 %\node [rectangle, draw] (10) at (-4, -6) {$\textit{bits}_{i} = bs' @ ZZS$}; |
823 and they can get very complicated with our simplifications. |
824 %\node [rectangle, draw] (11) at (4, -6) {$\textit{bits}_{i+1} = bs'@ ZZS$}; |
824 After some trial-and-error we found a property of the form |
825 %\node [rectangle] (12) at (9, -6) {$\ldots$}; |
825 \begin{property} |
826 |
826 If a POSIX value can be extracted from $a \backslash s$, then |
827 \node [rectangle ] (1) at (-8, 2) {$\ldots$}; |
827 it can be extracted from $a \backslash_{bsimps} s$ as well. |
828 \node [rectangle, draw] (2) at (-5, 2) {$r_i = _{bs'}(_Za+_Saa)^*$}; |
828 \end{property} |
829 \node [rectangle, draw] (3) at (3, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_Za+_Saa)^*$}; |
829 \noindent |
830 \node [rectangle] (4) at (8, 2) {$\ldots$}; |
830 most natural to work with, and we defined a binary relation to capture |
831 \node [rectangle] (5) at (-8, -2) {$\ldots$}; |
831 the connection between $a\backslash s$ and $a \backslash_{bsimps} s$. |
832 \node [rectangle, draw] (6) at (-5, -2) {$v_i = \Stars \; [\Left (a)]$}; |
832 %and look specifically at |
833 \node [rectangle, draw] (7) at ( 3, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$}; |
833 %the pairs $v_i, r_i$ and $v_{i+1},\, r_{i+1}$, we get the diagram demonstrating |
834 \node [rectangle] (8) at ( 8, -2) {$\ldots$}; |
834 %the invariant that the same bitcodes can be extracted from the pairs: |
835 \node [rectangle] (9) at (-8, -6) {$\ldots$}; |
835 %\tikzset{three sided/.style={ |
836 \node [rectangle, draw] (10) at (-5, -6) {$\textit{bits}_{i} = bs' @ ZZS$}; |
836 % draw=none, |
837 \node [rectangle, draw] (11) at (3, -6) {$\textit{bits}_{i+1} = bs'@ ZZS$}; |
837 % append after command={ |
838 \node [rectangle] (12) at (8, -6) {$\ldots$}; |
838 % [-,shorten <= -0.5\pgflinewidth] |
839 |
839 % ([shift={(-1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) |
840 |
840 % edge([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) |
841 |
841 % ([shift={( 0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) |
842 \path (1) edge [] node {} (2); |
842 % edge([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west) |
843 \path (5) edge [] node {} (6); |
843 % ([shift={( 0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west) |
844 \path (9) edge [] node {} (10); |
844 % edge([shift={(-1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east) |
845 |
845 % } |
846 \path (11) edge [] node {} (12); |
846 % } |
847 \path (7) edge [] node {} (8); |
847 %} |
848 \path (3) edge [] node {} (4); |
848 % |
849 |
849 %\tikzset{three sided1/.style={ |
850 |
850 % draw=none, |
851 \path (6) edge [dashed,bend right = 30] node {$\retrieve \; r_i \; v_i$} (10); |
851 % append after command={ |
852 \path (2) edge [dashed,bend left = 48] node {} (10); |
852 % [-,shorten <= -0.5\pgflinewidth] |
853 |
853 % ([shift={(1.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north west) |
854 \path (7) edge [dashed,bend right = 30] node {$\retrieve \; r_{i+1} \; v_{i+1}$} (11); |
854 % edge([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) |
855 \path (3) edge [dashed,bend left = 45] node {} (11); |
855 % ([shift={(-0.5\pgflinewidth,-0.5\pgflinewidth)}]\tikzlastnode.north east) |
856 |
856 % edge([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east) |
857 \path (2) edge [] node {$\backslash a$} (3); |
857 % ([shift={(-0.5\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south east) |
858 \path (2) edge [dashed, <->] node {$\vdash v_i : r_i$} (6); |
858 % edge([shift={(1.0\pgflinewidth,+0.5\pgflinewidth)}]\tikzlastnode.south west) |
859 \path (3) edge [dashed, <->] node {$\vdash v_{i+1} : r_{i+1}$} (7); |
859 % } |
860 %\path (6) edge [] node {$\vdash v_i : r_i$} (10); |
860 % } |
861 %\path (7) edge [dashed, <->] node {$\vdash v_i : r_i$} (11); |
861 %} |
862 |
862 % |
863 \path (10) edge [dashed, <->] node {$=$} (11); |
863 %\begin{center} |
864 |
864 % \begin{tikzpicture}[->, >=stealth', shorten >= 1pt, auto, thick] |
865 \path (7) edge [] node {$\inj \; r_{i+1} \; a \; v_i$} (6); |
865 % %\node [rectangle ] (1) at (-7, 2) {$\ldots$}; |
866 |
866 % %\node [rectangle, draw] (2) at (-4, 2) {$r_i = _{bs'}(_Za+_Saa)^*$}; |
867 % \node [rectangle, draw] (r) at (-6, -1) {$(aa)^*(b+c)$}; |
867 % %\node [rectangle, draw] (3) at (4, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_Za+_Saa)^*$}; |
868 % \node [rectangle, draw] (a) at (-6, 4) {$(aa)^*(_{Z}b + _{S}c)$}; |
868 % %\node [rectangle] (4) at (9, 2) {$\ldots$}; |
869 % \path (r) |
869 % %\node [rectangle] (5) at (-7, -2) {$\ldots$}; |
870 % edge [] node {$\internalise$} (a); |
870 % %\node [rectangle, draw] (6) at (-4, -2) {$v_i = \Stars \; [\Left (a)]$}; |
871 % \node [rectangle, draw] (a1) at (-3, 1) {$(_{Z}(\ONE \cdot a) \cdot (aa)^*) (_{Z}b + _Sc)$}; |
871 % %\node [rectangle, draw] (7) at ( 4, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$}; |
872 % \path (a) |
872 % %\node [rectangle] (8) at ( 9, -2) {$\ldots$}; |
873 % edge [] node {$\backslash a$} (a1); |
873 % %\node [rectangle] (9) at (-7, -6) {$\ldots$}; |
874 % |
874 % %\node [rectangle, draw] (10) at (-4, -6) {$\textit{bits}_{i} = bs' @ ZZS$}; |
875 % \node [rectangle, draw, three sided] (a21) at (-2.5, 4) {$(_{Z}\ONE \cdot (aa)^*)$}; |
875 % %\node [rectangle, draw] (11) at (4, -6) {$\textit{bits}_{i+1} = bs'@ ZZS$}; |
876 % \node [rectangle, draw, three sided1] (a22) at (-0.8, 4) {$(_{Z}b + _{S}c)$}; |
876 % %\node [rectangle] (12) at (9, -6) {$\ldots$}; |
877 % \path (a1) |
877 % |
878 % edge [] node {$\backslash a$} (a21); |
878 % \node [rectangle ] (1) at (-8, 2) {$\ldots$}; |
879 % \node [rectangle, draw] (a3) at (0.5, 2) {$_{ZS}(_{Z}\ONE + \ZERO)$}; |
879 % \node [rectangle, draw] (2) at (-5, 2) {$r_i = _{bs'}(_Za+_Saa)^*$}; |
880 % \path (a22) |
880 % \node [rectangle, draw] (3) at (3, 2) {$r_{i+1} = _{bs'}(_Z(_Z\ONE + _S(\ONE \cdot a)))\cdot(_Za+_Saa)^*$}; |
881 % edge [] node {$\backslash b$} (a3); |
881 % \node [rectangle] (4) at (8, 2) {$\ldots$}; |
882 % \path (a21) |
882 % \node [rectangle] (5) at (-8, -2) {$\ldots$}; |
883 % edge [dashed, bend right] node {} (a3); |
883 % \node [rectangle, draw] (6) at (-5, -2) {$v_i = \Stars \; [\Left (a)]$}; |
884 % \node [rectangle, draw] (bs) at (2, 4) {$ZSZ$}; |
884 % \node [rectangle, draw] (7) at ( 3, -2) {$v_{i+1} = \Seq (\Alt (\Left \; \Empty)) \; \Stars \, []$}; |
885 % \path (a3) |
885 % \node [rectangle] (8) at ( 8, -2) {$\ldots$}; |
886 % edge [below] node {$\bmkeps$} (bs); |
886 % \node [rectangle] (9) at (-8, -6) {$\ldots$}; |
887 % \node [rectangle, draw] (v) at (3, 0) {$\Seq \; (\Stars\; [\Seq \; a \; a]) \; (\Left \; b)$}; |
887 % \node [rectangle, draw] (10) at (-5, -6) {$\textit{bits}_{i} = bs' @ ZZS$}; |
888 % \path (bs) |
888 % \node [rectangle, draw] (11) at (3, -6) {$\textit{bits}_{i+1} = bs'@ ZZS$}; |
889 % edge [] node {$\decode$} (v); |
889 % \node [rectangle] (12) at (8, -6) {$\ldots$}; |
890 |
890 % |
891 |
891 % |
892 \end{tikzpicture} |
892 % |
893 %\caption{$\blexer$ with the regular expression $(aa)^*(b+c)$ and $aab$} |
893 % \path (1) edge [] node {} (2); |
894 \end{center} |
894 % \path (5) edge [] node {} (6); |
895 |
895 % \path (9) edge [] node {} (10); |
896 When simplifications are added, the inhabitation relation no longer holds, |
896 % |
897 causing the above diagram to break. |
897 % \path (11) edge [] node {} (12); |
898 |
898 % \path (7) edge [] node {} (8); |
899 Ausaf addressed this with an augmented lexer he called $\textit{slexer}$. |
899 % \path (3) edge [] node {} (4); |
900 |
900 % |
901 |
901 % |
902 |
902 % \path (6) edge [dashed,bend right = 30] node {$\retrieve \; r_i \; v_i$} (10); |
903 we note that the invariant |
903 % \path (2) edge [dashed,bend left = 48] node {} (10); |
904 $\vdash v_{i+1}: r_{i+1} \implies \retrieve \; r_{i+1} \; v_{i+1} $ is too strong |
904 % |
905 to maintain because the precondition $\vdash v_i : r_i$ is too weak. |
905 % \path (7) edge [dashed,bend right = 30] node {$\retrieve \; r_{i+1} \; v_{i+1}$} (11); |
906 It does not require $v_i$ to be a POSIX value |
906 % \path (3) edge [dashed,bend left = 45] node {} (11); |
907 |
907 % |
908 |
908 % \path (2) edge [] node {$\backslash a$} (3); |
909 |
909 % \path (2) edge [dashed, <->] node {$\vdash v_i : r_i$} (6); |
910 |
910 % \path (3) edge [dashed, <->] node {$\vdash v_{i+1} : r_{i+1}$} (7); |
911 |
911 % %\path (6) edge [] node {$\vdash v_i : r_i$} (10); |
|
912 % %\path (7) edge [dashed, <->] node {$\vdash v_i : r_i$} (11); |
|
913 % |
|
914 % \path (10) edge [dashed, <->] node {$=$} (11); |
|
915 % |
|
916 % \path (7) edge [] node {$\inj \; r_{i+1} \; a \; v_i$} (6); |
|
917 % |
|
918 %% \node [rectangle, draw] (r) at (-6, -1) {$(aa)^*(b+c)$}; |
|
919 %% \node [rectangle, draw] (a) at (-6, 4) {$(aa)^*(_{Z}b + _{S}c)$}; |
|
920 %% \path (r) |
|
921 %% edge [] node {$\internalise$} (a); |
|
922 %% \node [rectangle, draw] (a1) at (-3, 1) {$(_{Z}(\ONE \cdot a) \cdot (aa)^*) (_{Z}b + _Sc)$}; |
|
923 %% \path (a) |
|
924 %% edge [] node {$\backslash a$} (a1); |
|
925 %% |
|
926 %% \node [rectangle, draw, three sided] (a21) at (-2.5, 4) {$(_{Z}\ONE \cdot (aa)^*)$}; |
|
927 %% \node [rectangle, draw, three sided1] (a22) at (-0.8, 4) {$(_{Z}b + _{S}c)$}; |
|
928 %% \path (a1) |
|
929 %% edge [] node {$\backslash a$} (a21); |
|
930 %% \node [rectangle, draw] (a3) at (0.5, 2) {$_{ZS}(_{Z}\ONE + \ZERO)$}; |
|
931 %% \path (a22) |
|
932 %% edge [] node {$\backslash b$} (a3); |
|
933 %% \path (a21) |
|
934 %% edge [dashed, bend right] node {} (a3); |
|
935 %% \node [rectangle, draw] (bs) at (2, 4) {$ZSZ$}; |
|
936 %% \path (a3) |
|
937 %% edge [below] node {$\bmkeps$} (bs); |
|
938 %% \node [rectangle, draw] (v) at (3, 0) {$\Seq \; (\Stars\; [\Seq \; a \; a]) \; (\Left \; b)$}; |
|
939 %% \path (bs) |
|
940 %% edge [] node {$\decode$} (v); |
|
941 % |
|
942 % |
|
943 % \end{tikzpicture} |
|
944 % %\caption{$\blexer$ with the regular expression $(aa)^*(b+c)$ and $aab$} |
|
945 %\end{center} |
|
946 |
|
947 %When simplifications are added, the inhabitation relation no longer holds, |
|
948 %causing the above diagram to break. |
|
949 % |
|
950 %Ausaf addressed this with an augmented lexer he called $\textit{slexer}$. |
|
951 % |
|
952 % |
|
953 % |
|
954 %we note that the invariant |
|
955 %$\vdash v_{i+1}: r_{i+1} \implies \retrieve \; r_{i+1} \; v_{i+1} $ is too strong |
|
956 %to maintain because the precondition $\vdash v_i : r_i$ is too weak. |
|
957 %It does not require $v_i$ to be a POSIX value |
912 % |
958 % |
913 % |
959 % |
914 %which is essential for getting an understanding this thesis |
960 %which is essential for getting an understanding this thesis |
915 %in chapter \ref{Bitcoded1}, which is necessary for understanding why |
961 %in chapter \ref{Bitcoded1}, which is necessary for understanding why |
916 %the proof |
962 %the proof |