diff -r 0b5f06539a84 -r dc5ab66b11cc handouts/ho05.tex --- a/handouts/ho05.tex Mon Oct 02 23:10:56 2023 +0100 +++ b/handouts/ho05.tex Tue Oct 03 14:29:12 2023 +0100 @@ -84,7 +84,7 @@ \noindent from natural languages were the meaning of \emph{flies} depends on the surrounding \emph{context} are avoided as much as possible. Here is -an interesting video about C++ being not a context-free language +an interesting video about C++ not being a context-free language \begin{center} \url{https://www.youtube.com/watch?v=OzK8pUu4UfM} @@ -99,7 +99,7 @@ \end{plstx} \noindent -or a grammar for recognising strings consisting of ones is +or a grammar for recognising strings consisting of ones (at least one) is \begin{plstx}[margin=3cm] : \meta{O} ::= 1 \cdot \meta{O} @@ -378,10 +378,11 @@ \end{plstx} \noindent -In this grammar all $\meta{E}$xpressions, $\meta{T}$erms and $\meta{F}$actors -are in some way protected from being left-recusive. For example if you -start $\meta{E}$ you can derive another one by going through $\meta{T}$, then -$\meta{F}$, but then $\meta{E}$ is protected by the open-parenthesis. +In this grammar all $\meta{E}$xpressions, $\meta{T}$erms and +$\meta{F}$actors are in some way protected from being +left-recusive. For example if you start $\meta{E}$ you can derive +another one by going through $\meta{T}$, then $\meta{F}$, but then +$\meta{E}$ is protected by the open-parenthesis in the last rule. \subsection*{Removing $\epsilon$-Rules and CYK-Algorithm} @@ -395,7 +396,7 @@ \noindent The transformation made the original grammar non-left-recursive, but at the expense of introducing an $\epsilon$ in the second rule. Having an -explicit $\epsilon$-rule is annoying to, not in terms of looping, but in +explicit $\epsilon$-rule is annoying, not in terms of looping, but in terms of efficiency. The reason is that the $\epsilon$-rule always applies but since it recognises the empty string, it does not make any progress with recognising a string. Better are rules like $( \cdot @@ -455,14 +456,15 @@ \noindent I let you think about whether this grammar can still recognise all binary numbers and whether this grammar is non-left-recursive. The -precise statement for the transformation of removing $\epsilon$-rules is -that if the original grammar was able to recognise only non-empty +precise statement for the transformation of removing $\epsilon$-rules +is that if the original grammar was able to recognise only non-empty strings, then the transformed grammar will be equivalent (matching the same set of strings); if the original grammar was able to match the empty string, then the transformed grammar will be able to match the -same strings, \emph{except} the empty string. So the $\epsilon$-removal -does not preserve equivalence of grammars, but the small defect with the -empty string is not important for practical purposes. +same strings, \emph{except} the empty string. So the +$\epsilon$-removal does not preserve equivalence of grammars in +general, but the small defect with the empty string is not important +for practical purposes. So why are these transformations all useful? Well apart from making the parser combinators work (remember they cannot deal with left-recursion and @@ -563,10 +565,10 @@ \noindent The last row contains the information about all words and their corresponding non-terminals. For example the field for \texttt{trains} -contains the information $\meta{N}$ and $\meta{V}$ because it can be a +contains the information $\meta{N}$ and $\meta{V}$ because \texttt{trains} can be a ``verb'' and a ``noun'' according to the grammar. The row above, let's call the corresponding fields 5a to 5e, contains information -about 2-word parts of the sentence, namely +about \underline{2-word} parts of the sentence, namely \begin{center} \begin{tabular}{llll}