308 might look arbitrary, but there is in fact a relatively simple idea |
308 might look arbitrary, but there is in fact a relatively simple idea |
309 behind it. Ultimately the $\inj$-functions is determined by the |
309 behind it. Ultimately the $\inj$-functions is determined by the |
310 derivative functions. For this consider one of the ``squares'' from |
310 derivative functions. For this consider one of the ``squares'' from |
311 Figure~\ref{Sulz}: |
311 Figure~\ref{Sulz}: |
312 |
312 |
313 |
313 \begin{center} |
314 \begin{center} |
314 \begin{tikzpicture}[scale=2,node distance=1.2cm, |
315 \begin{tikzpicture}[scale=2,node distance=1.2cm, |
315 every node/.style={minimum size=7mm}] |
316 every node/.style={minimum size=7mm}] |
316 \node (r) {$r$}; |
317 \node (r) {$r$}; |
317 \node (rd) [right=of r]{$r_{der}$}; |
318 \node (rd) [right=of r]{$r_{der}$}; |
318 \draw[->,line width=1mm](r)--(rd) node[above,midway] {$\der\,c$}; |
319 \draw[->,line width=1mm](r)--(rd) node[above,midway] {$\der\,c$}; |
319 \node (vd) [below=of r2]{$v_{der}$}; |
320 \node (vd) [below=of r2]{$v_{der}$}; |
320 \draw[->,line width=1mm](rd) -- (vd); |
321 \draw[->,line width=1mm](rd) -- (vd); |
321 \node (v) [left=of vd] {$v$}; |
322 \node (v) [left=of vd] {$v$}; |
322 \draw[->,line width=1mm](vd)--(v) node[below,midway] {$\inj\,c$}; |
323 \draw[->,line width=1mm](vd)--(v) node[below,midway] {$\inj\,c$}; |
323 \draw[->,line width=0.5mm,dotted](r) -- (v) node[left,midway,red] {\bf ?}; |
324 \draw[->,line width=0.5mm,dotted](r) -- (v) node[left,midway,red] {\bf ?}; |
324 \end{tikzpicture} |
325 \end{tikzpicture} |
325 \end{center} |
326 \end{center} |
|
327 |
326 |
328 \noindent |
327 \noindent |
329 The input to the $\inj$-function is $r$ (on the top left), $c$ (the |
328 The input to the $\inj$-function is $r$ (on the top left), $c$ (the |
330 character to be injected) and $v_{der}$ (bottom right). The output is |
329 character to be injected) and $v_{der}$ (bottom right). The output is |
331 the value $v$ for how the regular expression $r$ matched the |
330 the value $v$ for how the regular expression $r$ matched the |
334 the regular expression $r$. The value $v_{der}$ corresponds to the |
333 the regular expression $r$. The value $v_{der}$ corresponds to the |
335 $r_{der}$-regular expression which is the derivative of $r$. Remember |
334 $r_{der}$-regular expression which is the derivative of $r$. Remember |
336 that $v_{der}$ is the value for how $r_{der}$ matches the corresponding string |
335 that $v_{der}$ is the value for how $r_{der}$ matches the corresponding string |
337 where $c$ has been chopped off. |
336 where $c$ has been chopped off. |
338 |
337 |
339 Let $r$ be $r_1 + r_2$. Then $r_{der}$ |
338 For a concrete example, let $r$ be $r_1 + r_2$. Then $r_{der}$ |
340 is of the form $(\der\,c\,r_1) + (\der\,c\,r_2)$. What are the possible |
339 is of the form $(\der\,c\,r_1) + (\der\,c\,r_2)$. What are the possible |
341 values corresponding to $r_{der}$? Well, they can be only of the form |
340 values corresponding to $r_{der}$? Well, they can be only of the form |
342 $\Left(\ldots)$ and $\Right(\ldots)$. Therefore you have two |
341 $\Left(\ldots)$ and $\Right(\ldots)$. Therefore you have two |
343 cases in the $\inj$ function -- one for $\Left$ and one for $\Right$. |
342 cases in the $\inj$ function -- one for $\Left$ and one for $\Right$. |
344 |
343 |
430 the other clauses in $\inj$. |
429 the other clauses in $\inj$. |
431 |
430 |
432 |
431 |
433 Phew\ldots{}Sweat\ldots!\#@$\skull$\%\ldots Unfortunately, there is |
432 Phew\ldots{}Sweat\ldots!\#@$\skull$\%\ldots Unfortunately, there is |
434 a gigantic problem with the described algorithm so far: it is very |
433 a gigantic problem with the described algorithm so far: it is very |
435 slow. We need to include in all this the simplification from Lecture |
434 slow. To make it faster, we have to include in all this the simplification |
436 2. And what rotten luck: simplification messes things up and we need |
435 from Lecture 2\ldots{}and what rotten luck: simplification messes things |
437 to rectify the mess. This is what we shall do next. |
436 up and we need to rectify the mess. This is what we shall do next. |
438 |
437 |
439 |
438 |
440 \subsubsection*{Simplification} |
439 \subsubsection*{Simplification} |
441 |
440 |
442 Generally the matching algorithms based on derivatives do |
441 Generally the matching algorithms based on derivatives do |