thys/Paper/Paper.thy
changeset 113 90fe1a1d7d0e
parent 112 698967eceaf1
child 114 8b41d01b5e5d
equal deleted inserted replaced
112:698967eceaf1 113:90fe1a1d7d0e
    17   CHAR ("_" [1000] 80) and
    17   CHAR ("_" [1000] 80) and
    18   ALT ("_ + _" [77,77] 78) and
    18   ALT ("_ + _" [77,77] 78) and
    19   SEQ ("_ \<cdot> _" [77,77] 78) and
    19   SEQ ("_ \<cdot> _" [77,77] 78) and
    20   STAR ("_\<^sup>\<star>" [1000] 78) and
    20   STAR ("_\<^sup>\<star>" [1000] 78) and
    21   
    21   
    22   val.Void ("'(')" 78) and
    22   val.Void ("'(')" 79) and
    23   val.Char ("Char _" [1000] 78) and
    23   val.Char ("Char _" [1000] 79) and
    24   val.Left ("Left _" [1000] 78) and
    24   val.Left ("Left _" [1000] 78) and
    25   val.Right ("Right _" [1000] 78) and
    25   val.Right ("Right _" [1000] 78) and
    26   
    26   
    27   L ("L'(_')" [10] 78) and
    27   L ("L'(_')" [10] 78) and
    28   der_syn ("_\\_" [79, 1000] 76) and
    28   der_syn ("_\\_" [79, 1000] 76) and
    29   flat ("|_|" [70] 73) and
    29   flat ("|_|" [75] 73) and
    30   Sequ ("_ @ _" [78,77] 63) and
    30   Sequ ("_ @ _" [78,77] 63) and
    31   injval ("inj _ _ _" [1000,77,1000] 77) and 
    31   injval ("inj _ _ _" [78,77,78] 77) and 
    32   projval ("proj _ _ _" [1000,77,1000] 77) and 
    32   projval ("proj _ _ _" [1000,77,1000] 77) and 
    33   length ("len _" [78] 73) 
    33   length ("len _" [78] 73) and
       
    34 
       
    35   Prf ("\<triangleright> _ : _" [75,75] 75) and  
       
    36   PMatch ("_ : _ \<rightarrow> _" [75,75,75] 75)
    34   (* and ValOrd ("_ \<succeq>\<^bsub>_\<^esub> _" [78,77,77] 73) *)
    37   (* and ValOrd ("_ \<succeq>\<^bsub>_\<^esub> _" [78,77,77] 73) *)
    35 (*>*)
    38 (*>*)
    36 
    39 
    37 section {* Introduction *}
    40 section {* Introduction *}
       
    41 
    38 
    42 
    39 text {*
    43 text {*
    40 
    44 
    41 Brzozowski \cite{Brzozowski1964} introduced the notion of the {\em
    45 Brzozowski \cite{Brzozowski1964} introduced the notion of the {\em
    42 derivative} @{term "der c r"} of a regular expression @{text r} w.r.t.\ a
    46 derivative} @{term "der c r"} of a regular expression @{text r} w.r.t.\ a
    57 One limitation of Brzozowski's matcher is that it only generates a YES/NO
    61 One limitation of Brzozowski's matcher is that it only generates a YES/NO
    58 answer for whether a string is being matched by a regular expression.
    62 answer for whether a string is being matched by a regular expression.
    59 Sulzmann and Lu \cite{Sulzmann2014} extended this matcher to allow
    63 Sulzmann and Lu \cite{Sulzmann2014} extended this matcher to allow
    60 generation not just of a YES/NO answer but of an actual matching, called a
    64 generation not just of a YES/NO answer but of an actual matching, called a
    61 [lexical] {\em value}. They give a simple algorithm to calculate a value
    65 [lexical] {\em value}. They give a simple algorithm to calculate a value
    62 that appears to be the value associated with POSIX lexing
    66 that appears to be the value associated with POSIX matching
    63 \cite{Kuklewicz,Vansummeren2006}. The challenge then is to specify that
    67 \cite{Kuklewicz,Vansummeren2006}. The challenge then is to specify that
    64 value, in an algorithm-independent fashion, and to show that Sulzamann and
    68 value, in an algorithm-independent fashion, and to show that Sulzamann and
    65 Lu's derivative-based algorithm does indeed calculate a value that is
    69 Lu's derivative-based algorithm does indeed calculate a value that is
    66 correct according to the specification.
    70 correct according to the specification.
    67 
    71 
    84 experience of doing our proofs has been that this mechanical checking was
    88 experience of doing our proofs has been that this mechanical checking was
    85 absolutely essential: this subject area has hidden snares. This was also
    89 absolutely essential: this subject area has hidden snares. This was also
    86 noted by Kuklewitz \cite{Kuklewicz} who found that nearly all POSIX matching
    90 noted by Kuklewitz \cite{Kuklewicz} who found that nearly all POSIX matching
    87 implementations are ``buggy'' \cite[Page 203]{Sulzmann2014}.
    91 implementations are ``buggy'' \cite[Page 203]{Sulzmann2014}.
    88 
    92 
    89 If a regular expression matches a string, then in general there is more
    93 If a regular expression matches a string, then in general there is more than
    90 than one way of how the string is matched. There are two commonly used
    94 one way of how the string is matched. There are two commonly used
    91 disambiguation strategies to generate a unique answer: one is called GREEDY
    95 disambiguation strategies to generate a unique answer: one is called GREEDY
    92 matching \cite{Frisch2004} and the other is POSIX
    96 matching \cite{Frisch2004} and the other is POSIX
    93 matching~\cite{Kuklewicz,Sulzmann2014}. For example consider the string
    97 matching~\cite{Kuklewicz,Sulzmann2014,Vansummeren2006}. For example consider
    94 @{term xy} and the regular expression \mbox{@{term "STAR (ALT (ALT x y)
    98 the string @{term xy} and the regular expression \mbox{@{term "STAR (ALT
    95 xy)"}}. Either the string can be matched in two `iterations' by the single
    99 (ALT x y) xy)"}}. Either the string can be matched in two `iterations' by
    96 letter-regular expressions @{term x} and @{term y}, or directly in one
   100 the single letter-regular expressions @{term x} and @{term y}, or directly
    97 iteration by @{term xy}. The first case corresponds to GREEDY matching,
   101 in one iteration by @{term xy}. The first case corresponds to GREEDY
    98 which first matches with the left-most symbol and only matches the next
   102 matching, which first matches with the left-most symbol and only matches the
    99 symbol in case of a mismatch (this is greedy in the sense of preferring
   103 next symbol in case of a mismatch (this is greedy in the sense of preferring
   100 instant gratification to delayed repletion). The second case is POSIX
   104 instant gratification to delayed repletion). The second case is POSIX
   101 matching, which prefers the longest match.
   105 matching, which prefers the longest match.
   102 
   106 
   103 In the context of lexing, where an input string needs to be split up into a
   107 In the context of lexing, where an input string needs to be split up into a
   104 sequence of tokens, POSIX is the more natural disambiguation strategy for
   108 sequence of tokens, POSIX is the more natural disambiguation strategy for
   118 
   122 
   119 For a particular longest initial substring, the first regular expression
   123 For a particular longest initial substring, the first regular expression
   120 that can match determines the token.
   124 that can match determines the token.
   121 \end{itemize}
   125 \end{itemize}
   122  
   126  
   123 \noindent Consider for example @{text "r\<^bsub>key\<^esub>"} recognising
   127 \noindent Consider for example @{text "r\<^bsub>key\<^esub>"} recognising keywords such as
   124 keywords such as @{text "if"}, @{text "then"} and so on; and @{text
   128 @{text "if"}, @{text "then"} and so on; and @{text "r\<^bsub>id\<^esub>"} recognising
   125 "r\<^bsub>id\<^esub>"} recognising identifiers (a single character followed
   129 identifiers (a single character followed by characters or numbers). Then we
   126 by characters or numbers). Then we can form the regular expression @{text
   130 can form the regular expression @{text "(r\<^bsub>key\<^esub> + r\<^bsub>id\<^esub>)\<^sup>\<star>"} and use POSIX
   127 "(r\<^bsub>key\<^esub> + r\<^bsub>id\<^esub>)\<^sup>\<star>"} and use POSIX
       
   128 matching to tokenise strings, say @{text "iffoo"} and @{text "if"}. In the
   131 matching to tokenise strings, say @{text "iffoo"} and @{text "if"}. In the
   129 first case we obtain by the longest match rule a single identifier token,
   132 first case we obtain by the longest match rule a single identifier token,
   130 not a keyword followed by identifier. In the second case we obtain by rule
   133 not a keyword followed by an identifier. In the second case we obtain by rule
   131 priority a keyword token, not an identifier token---even if @{text
   134 priority a keyword token, not an identifier token---even if @{text "r\<^bsub>id\<^esub>"}
   132 "r\<^bsub>id\<^esub>"} matches also.\bigskip
   135 matches also.\bigskip
   133 
   136 
   134 \noindent\textcolor{green}{Not Done Yet}
   137 \noindent {\bf Contributions:} We have implemented in Isabelle/HOL the
   135 
   138 derivative-based regular expression matching algorithm as described by
   136 
   139 Sulzmann and Lu \cite{Sulzmann2014}. We have proved the correctness of this
   137 \medskip\noindent
   140 algorithms according to our specification what a POSIX value is. The
   138 {\bf Contributions:}
   141 informal correctness proof given in \cite{Sulzmann2014} is in final
   139 
   142 form\footnote{} and to us contains unfillable gaps. Our specification of a
   140 Derivatives as calculated by Brzozowski's method are usually more complex
   143 POSIX value consists of a simple inductive definition that given a string
   141 regular expressions than the initial one; various optimisations are
   144 and a regular expression uniquely determines this value. Derivatives as
   142 possible, such as the simplifications of @{term "ALT ZERO r"}, @{term "ALT r
   145 calculated by Brzozowski's method are usually more complex regular
   143 ZERO"}, @{term "SEQ ONE r"} and @{term "SEQ r ONE"} to @{term r}. One of the
   146 expressions than the initial one; various optimisations are possible, such
   144 advantages of having a simple specification and correctness proof is that
   147 as the simplifications of @{term "ALT ZERO r"}, @{term "ALT r ZERO"}, @{term
   145 the latter can be refined to allow for such optimisations and simple
   148 "SEQ ONE r"} and @{term "SEQ r ONE"} to @{term r}. One of the advantages of
   146 correctness proof.
   149 having a simple specification and correctness proof is that the latter can
   147 
   150 be refined to allow for such optimisations and simple correctness proof.
   148 An extended version of \cite{Sulzmann2014} is available at the website
   151 
   149 of its first author; this includes some ``proofs'', claimed in
   152 An extended version of \cite{Sulzmann2014} is available at the website of
       
   153 its first author; this includes some ``proofs'', claimed in
   150 \cite{Sulzmann2014} to be ``rigorous''. Since these are evidently not in
   154 \cite{Sulzmann2014} to be ``rigorous''. Since these are evidently not in
   151 final form, we make no comment thereon, preferring to give general
   155 final form, we make no comment thereon, preferring to give general reasons
   152 reasons for our belief that the approach of \cite{Sulzmann2014} is
   156 for our belief that the approach of \cite{Sulzmann2014} is problematic
   153 problematic rather than to discuss details of unpublished work.
   157 rather than to discuss details of unpublished work.
   154 
       
   155 
   158 
   156 *}
   159 *}
   157 
   160 
   158 section {* Preliminaries *}
   161 section {* Preliminaries *}
   159 
   162 
   160 text {* \noindent Strings in Isabelle/HOL are lists of characters with the
   163 text {* \noindent Strings in Isabelle/HOL are lists of characters with the
   161 empty string being represented by the empty list, written @{term "[]"}, and
   164 empty string being represented by the empty list, written @{term "[]"}, and
   162 list-cons being written as @{term "DUMMY # DUMMY"}. Often we use the usual
   165 list-cons being written as @{term "DUMMY # DUMMY"}. Often we use the usual
   163 bracket notation for strings; for example a string consisting of just a
   166 bracket notation for lists also for strings; for example a string consisting
   164 single character is written @{term "[c]"}. By using the type @{type char}
   167 of just a single character @{term c} is written @{term "[c]"}. By using the
   165 for characters we have a supply of finitely many characters roughly
   168 type @{type char} for characters we have a supply of finitely many
   166 corresponding to the ASCII character set. Regular expressions are defined as
   169 characters roughly corresponding to the ASCII character set. Regular
   167 usual as the following inductive datatype:
   170 expressions are defined as usual as the following inductive datatype:
   168 
   171 
   169   \begin{center}
   172   \begin{center}
   170   @{text "r :="}
   173   @{text "r :="}
   171   @{const "ZERO"} $\mid$
   174   @{const "ZERO"} $\mid$
   172   @{const "ONE"} $\mid$
   175   @{const "ONE"} $\mid$
   174   @{term "ALT r\<^sub>1 r\<^sub>2"} $\mid$
   177   @{term "ALT r\<^sub>1 r\<^sub>2"} $\mid$
   175   @{term "SEQ r\<^sub>1 r\<^sub>2"} $\mid$
   178   @{term "SEQ r\<^sub>1 r\<^sub>2"} $\mid$
   176   @{term "STAR r"} 
   179   @{term "STAR r"} 
   177   \end{center}
   180   \end{center}
   178 
   181 
   179   \noindent where @{const ZERO} stands for the regular expression that
   182   \noindent where @{const ZERO} stands for the regular expression that does
   180   does not match any string and @{const ONE} for the regular
   183   not match any string, @{const ONE} for the regular expression that matches
   181   expression that matches only the empty string. The language of a regular expression
   184   only the empty string and @{term c} for matching a character literal. The
   182   is again defined routinely by the recursive function @{term L} with the
   185   language of a regular expression is again defined as usual by the
   183   clauses:
   186   recursive function @{term L} with the clauses:
   184 
   187 
   185   \begin{center}
   188   \begin{center}
   186   \begin{tabular}{rcl}
   189   \begin{tabular}{rcl}
   187   @{thm (lhs) L.simps(1)} & $\dn$ & @{thm (rhs) L.simps(1)}\\
   190   @{thm (lhs) L.simps(1)} & $\dn$ & @{thm (rhs) L.simps(1)}\\
   188   @{thm (lhs) L.simps(2)} & $\dn$ & @{thm (rhs) L.simps(2)}\\
   191   @{thm (lhs) L.simps(2)} & $\dn$ & @{thm (rhs) L.simps(2)}\\
   193   \end{tabular}
   196   \end{tabular}
   194   \end{center}
   197   \end{center}
   195   
   198   
   196   \noindent In the fourth clause we use @{term "DUMMY ;; DUMMY"} for the
   199   \noindent In the fourth clause we use @{term "DUMMY ;; DUMMY"} for the
   197   concatenation of two languages (it is also list-append for strings). We
   200   concatenation of two languages (it is also list-append for strings). We
   198   use the star-notation for regular expressions and also for languages (in
   201   use the star-notation for regular expressions and languages (in the last
   199   the last clause). The star for languages is defined inductively as usual
   202   clause above). The star on languages is defined inductively by two
   200   by two clauses for the empty string being in the star of a language and if
   203   clauses: @{text "(i)"} for the empty string being in the star of a
   201   @{term "s\<^sub>1"} is in a language and @{term "s\<^sub>2"} in the star of this
   204   language and @{text "(ii)"} if @{term "s\<^sub>1"} is in a language and @{term
   202   language, then also @{term "s\<^sub>1 @ s\<^sub>2"} is in the star of this language.
   205   "s\<^sub>2"} in the star of this language, then also @{term "s\<^sub>1 @ s\<^sub>2"} is in
   203   
   206   the star of this language. It will also be convenient to use the following
   204   \emph{Semantic derivatives} of sets of strings are defined as
   207   notion of a \emph{semantic derivative} (or left quotient) of a language,
       
   208   say @{text A}, defined as:
   205 
   209 
   206   \begin{center}
   210   \begin{center}
   207   \begin{tabular}{lcl}
   211   \begin{tabular}{lcl}
   208   @{thm (lhs) Der_def} & $\dn$ & @{thm (rhs) Der_def}\\
   212   @{thm (lhs) Der_def} & $\dn$ & @{thm (rhs) Der_def}\\
   209   \end{tabular}
   213   \end{tabular}
   210   \end{center}
   214   \end{center}
   211   
   215   
   212   
   216   \noindent 
   213 
   217   For semantic derivatives we have the following equations (for example
   214   \noindent
   218   \cite{Krauss2011}):
   215   The nullable function
   219 
       
   220   \begin{equation}
       
   221   \begin{array}{lcl}
       
   222   @{thm (lhs) Der_null}  & \dn & @{thm (rhs) Der_null}\\
       
   223   \end{array}
       
   224   \end{equation}
       
   225 
       
   226 
       
   227   \noindent \emph{\Brz's derivatives} of regular expressions
       
   228   \cite{Brzozowski1964} can be easily defined by two recursive functions:
       
   229   the first is from regular expressions to booleans (implementing a test
       
   230   when a regular expression can match the empty string), and the second
       
   231   takes a regular expression and a character to a (derivative) regular
       
   232   expression:
   216 
   233 
   217   \begin{center}
   234   \begin{center}
   218   \begin{tabular}{lcl}
   235   \begin{tabular}{lcl}
   219   @{thm (lhs) nullable.simps(1)} & $\dn$ & @{thm (rhs) nullable.simps(1)}\\
   236   @{thm (lhs) nullable.simps(1)} & $\dn$ & @{thm (rhs) nullable.simps(1)}\\
   220   @{thm (lhs) nullable.simps(2)} & $\dn$ & @{thm (rhs) nullable.simps(2)}\\
   237   @{thm (lhs) nullable.simps(2)} & $\dn$ & @{thm (rhs) nullable.simps(2)}\\
   221   @{thm (lhs) nullable.simps(3)} & $\dn$ & @{thm (rhs) nullable.simps(3)}\\
   238   @{thm (lhs) nullable.simps(3)} & $\dn$ & @{thm (rhs) nullable.simps(3)}\\
   222   @{thm (lhs) nullable.simps(4)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) nullable.simps(4)[of "r\<^sub>1" "r\<^sub>2"]}\\
   239   @{thm (lhs) nullable.simps(4)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) nullable.simps(4)[of "r\<^sub>1" "r\<^sub>2"]}\\
   223   @{thm (lhs) nullable.simps(5)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) nullable.simps(5)[of "r\<^sub>1" "r\<^sub>2"]}\\
   240   @{thm (lhs) nullable.simps(5)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) nullable.simps(5)[of "r\<^sub>1" "r\<^sub>2"]}\\
   224   @{thm (lhs) nullable.simps(6)} & $\dn$ & @{thm (rhs) nullable.simps(6)}\\
   241   @{thm (lhs) nullable.simps(6)} & $\dn$ & @{thm (rhs) nullable.simps(6)}\medskip\\
   225   \end{tabular}
       
   226   \end{center}
       
   227 
       
   228   \noindent
       
   229   The derivative function for characters and strings
       
   230 
       
   231   \begin{center}
       
   232   \begin{tabular}{lcp{8cm}}
       
   233   @{thm (lhs) der.simps(1)} & $\dn$ & @{thm (rhs) der.simps(1)}\\
   242   @{thm (lhs) der.simps(1)} & $\dn$ & @{thm (rhs) der.simps(1)}\\
   234   @{thm (lhs) der.simps(2)} & $\dn$ & @{thm (rhs) der.simps(2)}\\
   243   @{thm (lhs) der.simps(2)} & $\dn$ & @{thm (rhs) der.simps(2)}\\
   235   @{thm (lhs) der.simps(3)} & $\dn$ & @{thm (rhs) der.simps(3)}\\
   244   @{thm (lhs) der.simps(3)} & $\dn$ & @{thm (rhs) der.simps(3)}\\
   236   @{thm (lhs) der.simps(4)[of c "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) der.simps(4)[of c "r\<^sub>1" "r\<^sub>2"]}\\
   245   @{thm (lhs) der.simps(4)[of c "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) der.simps(4)[of c "r\<^sub>1" "r\<^sub>2"]}\\
   237   @{thm (lhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]}\\
   246   @{thm (lhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) der.simps(5)[of c "r\<^sub>1" "r\<^sub>2"]}\\
   238   @{thm (lhs) der.simps(6)} & $\dn$ & @{thm (rhs) der.simps(6)}
   247   @{thm (lhs) der.simps(6)} & $\dn$ & @{thm (rhs) der.simps(6)}
   239   \end{tabular}
   248   \end{tabular}
   240   \end{center}
   249   \end{center}
   241 
   250 
   242   It is a relatively easy exercise to prove that
   251   \noindent
   243 
   252   Given the equations in ???, 
   244   \begin{center}
   253   it is a relatively easy exercise in mechanical reasoning to establish that
   245   \begin{tabular}{l}
   254 
   246   @{thm[mode=IfThen] nullable_correctness}\\
   255   \begin{proposition} 
   247   @{thm[mode=IfThen] der_correctness}\\
   256   \begin{tabular}{ll}
   248   \end{tabular}
   257   @{text "(1)"} & @{thm (lhs) nullable_correctness} if and only if
   249   \end{center}
   258   @{thm (rhs) nullable_correctness} \\ 
       
   259   @{text "(2)"} & @{thm[mode=IfThen] der_correctness}
       
   260   \end{tabular}
       
   261   \end{proposition}
       
   262 
       
   263   \noindent With this in place it is also very routine to prove that the
       
   264   regular expression matcher defined as
       
   265 
       
   266   \begin{center}
       
   267 
       
   268   \end{center}
       
   269 
       
   270   While the matcher above calculates a provably correct a YES/NO answer for
       
   271   whether a regular expression matches a string, the novel idea of Sulzmann
       
   272   and Lu \cite{Sulzmann2014} is to append another phase to calculate a
       
   273   value.
       
   274 
   250 *}
   275 *}
   251 
   276 
   252 section {* POSIX Regular Expression Matching *}
   277 section {* POSIX Regular Expression Matching *}
   253 
   278 
   254 text {* 
   279 text {* 
   255 
   280 
   256 The clever idea in \cite{Sulzmann2014} is to define a function on values that mirrors
   281 The clever idea in \cite{Sulzmann2014} is to introduce values for encoding
   257 (but inverts) the construction of the derivative on regular expressions. We
   282 \emph{how} a regular expression matches a string and then define a function
   258 begin with the case of a nullable regular expression: from the nullability
   283 on values that mirrors (but inverts) the construction of the derivative on
   259 we need to construct a value that witnesses the nullability. This is as
   284 regular expressions. Values are defined as the inductive datatype
   260 follows. The @{const mkeps} function (from \cite{Sulzmann2014}) is a partial (but
   285 
   261 unambiguous) function from regular expressions to values, total on exactly
   286   \begin{center}
   262 the set of nullable regular expressions. 
   287   @{text "v :="}
       
   288   @{const "Void"} $\mid$
       
   289   @{term "val.Char c"} $\mid$
       
   290   @{term "Left v"} $\mid$
       
   291   @{term "Right v"} $\mid$
       
   292   @{term "Seq v\<^sub>1 v\<^sub>2"} $\mid$ 
       
   293   @{term "Stars vs"} 
       
   294   \end{center}  
       
   295 
       
   296 The inhabitation relation:
       
   297 
       
   298   \begin{center}
       
   299   \begin{tabular}{c}
       
   300   @{thm[mode=Rule] Prf.intros(1)[of "v\<^sub>1" "r\<^sub>1" "v\<^sub>2" "r\<^sub>2"]}\medskip\\ 
       
   301   @{thm[mode=Rule] Prf.intros(2)[of "v\<^sub>1" "r\<^sub>1" "r\<^sub>2"]} \qquad 
       
   302   @{thm[mode=Rule] Prf.intros(3)[of "v\<^sub>2" "r\<^sub>1" "r\<^sub>2"]}\medskip\\
       
   303   @{thm[mode=Axiom] Prf.intros(4)} \qquad 
       
   304   @{thm[mode=Axiom] Prf.intros(5)[of "c"]}\medskip\\
       
   305   @{thm[mode=Axiom] Prf.intros(6)[of "r"]} \qquad 
       
   306   @{thm[mode=Rule] Prf.intros(7)[of "v" "r" "vs"]}\medskip\\
       
   307   \end{tabular}
       
   308   \end{center}
       
   309 
       
   310 
       
   311 Note that no values are associated with the regular expression $Zero$, and
       
   312 that the only value associated with the regular expression $One$ is $Void$,
       
   313 pronounced (if one must) as {\em ``Void''}. We use $\in$ for the usual
       
   314 membership relation from set theory and take $[]$ as the standard name for
       
   315 the empty string (rather than, as in \cite{Sulzmann2014}, the regular
       
   316 expression that we call $One$).
       
   317 
       
   318 We begin with the case of a nullable regular expression: from
       
   319 the nullability we need to construct a value that witnesses the nullability.
       
   320 The @{const mkeps} function (from \cite{Sulzmann2014})
       
   321 is a partial (but unambiguous) function from regular expressions to values,
       
   322 total on exactly the set of nullable regular expressions.
   263 
   323 
   264  \begin{center}
   324  \begin{center}
   265   \begin{tabular}{lcl}
   325   \begin{tabular}{lcl}
   266   @{thm (lhs) mkeps.simps(1)} & $\dn$ & @{thm (rhs) mkeps.simps(1)}\\
   326   @{thm (lhs) mkeps.simps(1)} & $\dn$ & @{thm (rhs) mkeps.simps(1)}\\
   267   @{thm (lhs) mkeps.simps(2)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) mkeps.simps(2)[of "r\<^sub>1" "r\<^sub>2"]}\\
   327   @{thm (lhs) mkeps.simps(2)[of "r\<^sub>1" "r\<^sub>2"]} & $\dn$ & @{thm (rhs) mkeps.simps(2)[of "r\<^sub>1" "r\<^sub>2"]}\\
   321 
   381 
   322 
   382 
   323   \noindent
   383   \noindent
   324   Values
   384   Values
   325 
   385 
   326   \begin{center}
   386   
   327   @{text "v :="}
       
   328   @{const "Void"} $\mid$
       
   329   @{term "val.Char c"} $\mid$
       
   330   @{term "Left v"} $\mid$
       
   331   @{term "Right v"} $\mid$
       
   332   @{term "Seq v\<^sub>1 v\<^sub>2"} $\mid$ 
       
   333   @{term "Stars vs"} 
       
   334   \end{center}  
       
   335 
   387 
   336  
   388  
   337 
   389 
   338   \noindent
   390   \noindent
   339   The @{const flat} function for values
   391   The @{const flat} function for values