147 $r + r$ & $\equiv$ & $r$ |
147 $r + r$ & $\equiv$ & $r$ |
148 \end{tabular} |
148 \end{tabular} |
149 \end{center} |
149 \end{center} |
150 |
150 |
151 \noindent which always hold no matter what the regular |
151 \noindent which always hold no matter what the regular |
152 expression $r$ looks like. The first are easy to verify since |
152 expression $r$ looks like. The first two are easy to verify since |
153 $L(\varnothing)$ is the empty set. The next two are also easy |
153 $L(\varnothing)$ is the empty set. The next two are also easy |
154 to verify since $L(\epsilon) = \{[]\}$ and appending the empty |
154 to verify since $L(\epsilon) = \{[]\}$ and appending the empty |
155 string to every string of another set, leaves the set |
155 string to every string of another set, leaves the set |
156 unchanged. Be careful to fully comprehend the fifth and |
156 unchanged. Be careful to fully comprehend the fifth and |
157 sixth equivalence: if you concatenate two sets of strings |
157 sixth equivalence: if you concatenate two sets of strings |
158 and one is the empty set, then the concatenation will also be |
158 and one is the empty set, then the concatenation will also be |
159 the empty set. Check the definition of \pcode{_ @ _}. |
159 the empty set. Check the definition of \pcode{_ @ _}. |
160 The last equivalence is again trivial. |
160 The last equivalence is again trivial. |
161 |
161 |
162 |
|
163 What will be important later on is that we can orient these |
162 What will be important later on is that we can orient these |
164 equivalences and read them from left to right. In this way we |
163 equivalences and read them from left to right. In this way we |
165 can view them as \emph{simplification rules}. Suppose for |
164 can view them as \emph{simplification rules}. Suppose for |
166 example the regular expression |
165 example the regular expression |
167 |
166 |
170 \label{big} |
169 \label{big} |
171 \end{equation} |
170 \end{equation} |
172 |
171 |
173 \noindent If we can find an equivalent regular expression that |
172 \noindent If we can find an equivalent regular expression that |
174 is simpler (smaller for example), then this might potentially |
173 is simpler (smaller for example), then this might potentially |
175 make our matching algorithm is faster. The reason is that |
174 make our matching algorithm run faster. The reason is that |
176 whether a string $s$ is in $L(r)$ or in $L(r')$ with $r\equiv r'$ |
175 whether a string $s$ is in $L(r)$ or in $L(r')$ with $r\equiv r'$ |
177 will always give the same answer. In the example above you |
176 will always give the same answer. In the example above you |
178 will see that the regular expression is equivalent to $r_1$ |
177 will see that the regular expression is equivalent to $r_1$ |
179 if you iteratively apply the simplification rules from above: |
178 if you iteratively apply the simplification rules from above: |
180 |
179 |
189 $\equiv$ & $\underline{r_1 + \varnothing}$\smallskip\\ |
188 $\equiv$ & $\underline{r_1 + \varnothing}$\smallskip\\ |
190 $\equiv$ & $r_1$\ |
189 $\equiv$ & $r_1$\ |
191 \end{tabular} |
190 \end{tabular} |
192 \end{center} |
191 \end{center} |
193 |
192 |
194 \noindent In each step I underlined where a simplification |
193 \noindent In each step, I underlined where a simplification |
195 rule is applied. Our matching algorithm in the next section |
194 rule is applied. Our matching algorithm in the next section |
196 will often generate such ``useless'' $\epsilon$s and |
195 will often generate such ``useless'' $\epsilon$s and |
197 $\varnothing$s, therefore simplifying them away will make the |
196 $\varnothing$s, therefore simplifying them away will make the |
198 algorithm quite a bit faster. |
197 algorithm quite a bit faster. |
199 |
198 |
390 |
389 |
391 \subsection*{The Matching Algorithm in Scala} |
390 \subsection*{The Matching Algorithm in Scala} |
392 |
391 |
393 Another attraction of the algorithm is that it can be easily |
392 Another attraction of the algorithm is that it can be easily |
394 implemented in a functional programming language, like Scala. |
393 implemented in a functional programming language, like Scala. |
395 Given the implementation of regular expressions in Scala given |
394 Given the implementation of regular expressions in Scala shown |
396 in the first lecture and handout, the functions for |
395 in the first lecture and handout, the functions and subfunctions |
397 \pcode{matches} are shown in Figure~\ref{scala1}. |
396 for \pcode{matches} are shown in Figure~\ref{scala1}. |
398 |
397 |
399 \begin{figure}[p] |
398 \begin{figure}[p] |
400 \lstinputlisting{../progs/app5.scala} |
399 \lstinputlisting{../progs/app5.scala} |
401 \caption{Scala implementation of the nullable and |
400 \caption{Scala implementation of the nullable and |
402 derivatives functions.\label{scala1}} |
401 derivatives functions.\label{scala1}} |
535 \end{center} |
534 \end{center} |
536 |
535 |
537 \noindent I leave it to you to contemplate whether such a |
536 \noindent I leave it to you to contemplate whether such a |
538 simplification can have any impact on the correctness of our |
537 simplification can have any impact on the correctness of our |
539 algorithm (will it change any answers?). Figure~\ref{scala2} |
538 algorithm (will it change any answers?). Figure~\ref{scala2} |
540 give a simplification function that recursively traverses a |
539 gives a simplification function that recursively traverses a |
541 regular expression and simplifies it according to the rules |
540 regular expression and simplifies it according to the rules |
542 given at the beginning. There are only rules for $+$, $\cdot$ |
541 given at the beginning. There are only rules for $+$, $\cdot$ |
543 and $n$-times (the latter because we added it in the second |
542 and $n$-times (the latter because we added it in the second |
544 version of our matcher). There is no rule for a star, because |
543 version of our matcher). There is no rule for a star, because |
545 empirical data and also a little thought showed that |
544 empirical data and also a little thought showed that |