afl-material: comparison handouts/ho05.tex


-:0b5f06539a84
+:dc5ab66b11cc
 \end{center}
 \noindent
 from natural languages where the meaning of \emph{flies} depends on the
 surrounding \emph{context} are avoided as much as possible. Here is
-an interesting video about C++ being not a context-free language
+an interesting video about C++ not being a context-free language
 \begin{center}
 \url{https://www.youtube.com/watch?v=OzK8pUu4UfM}
 \end{center}
 : \meta{P} ::=  ( \cdot  \meta{P} \cdot ) \cdot \meta{P}
 | \epsilon\\
 \end{plstx}
 \noindent
-or a grammar for recognising strings consisting of ones is
+or a grammar for recognising strings consisting of ones (at least one) is
 \begin{plstx}[margin=3cm]
 : \meta{O} ::= 1 \cdot  \meta{O}
 | 1\\
 \end{plstx}
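The grammar for strings of ones can be read directly as a recursive recogniser. The following is only a minimal sketch in Python; the function name \texttt{match\_O} is made up for this illustration and is not part of the handout's code. Each alternative of the rule becomes one case, and since both alternatives start with a \texttt{1}, the empty string is rejected.

```python
def match_O(s: str) -> bool:
    """Recognise the strings derivable from  O ::= 1 . O | 1  (hypothetical helper)."""
    # Second alternative: the string is exactly a single 1.
    if s == "1":
        return True
    # First alternative: a 1 followed by something that is again an O.
    # Every recursive call consumes one character, so the recursion terminates.
    return s.startswith("1") and match_O(s[1:])
```

For instance, \texttt{match\_O("111")} succeeds, whereas \texttt{match\_O("")} and \texttt{match\_O("110")} fail.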
 : \meta{T} ::= \meta{F} | \meta{F} \cdot * \cdot \meta{T}\\
 : \meta{F} ::= num\_token | ( \cdot \meta{E} \cdot )\\
 \end{plstx}
 \noindent
-In this grammar all $\meta{E}$xpressions, $\meta{T}$erms and $\meta{F}$actors
+In this grammar all $\meta{E}$xpressions, $\meta{T}$erms and
-are in some way protected from being left-recusive. For example if you
+$\meta{F}$actors are in some way protected from being
-start $\meta{E}$ you can derive another one by going through $\meta{T}$, then
+left-recursive. For example, if you start with $\meta{E}$ you can derive
-$\meta{F}$, but then $\meta{E}$ is protected by the open-parenthesis.
+another one by going through $\meta{T}$, then $\meta{F}$, but then
+$\meta{E}$ is protected by the open-parenthesis in the last rule.
 \subsection*{Removing $\epsilon$-Rules and CYK-Algorithm}
 I showed above that the non-left-recursive grammar for binary numbers is
 \end{plstx}
 \noindent
 The transformation made the original grammar non-left-recursive, but at
 the expense of introducing an $\epsilon$ in the second rule. Having an
-explicit $\epsilon$-rule is annoying to, not in terms of looping, but in
+explicit $\epsilon$-rule is annoying, not in terms of looping, but in
 terms of efficiency. The reason is that the $\epsilon$-rule always
 applies but since it recognises the empty string, it does not make any
 progress with recognising a string. Better are rules like $( \cdot
 \meta{E} \cdot )$ where something of the input is consumed. Getting
 rid of $\epsilon$-rules is also important for the CYK parsing algorithm,
 \end{plstx}
 \noindent
 I let you think about whether this grammar can still recognise all
 binary numbers and whether this grammar is non-left-recursive. The
-precise statement for the transformation of removing $\epsilon$-rules is
+precise statement for the transformation of removing $\epsilon$-rules
-that if the original grammar was able to recognise only non-empty
+is that if the original grammar was able to recognise only non-empty
 strings, then the transformed grammar will be equivalent (matching the
 same set of strings); if the original grammar was able to match the
 empty string, then the transformed grammar will be able to match the
-same strings, \emph{except} the empty string. So the  $\epsilon$-removal
+same strings, \emph{except} the empty string. So the
-does not preserve equivalence of grammars, but the small defect with the
+$\epsilon$-removal does not preserve equivalence of grammars in
-empty string is not important for practical purposes.
+general, but the small defect with the empty string is not important
+for practical purposes.
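One common way of carrying out the $\epsilon$-removal can be sketched as follows. This is a sketch under assumptions, not necessarily the construction intended in the lectures: a grammar is represented as a Python dictionary from non-terminals to lists of right-hand sides, each right-hand side is a tuple of symbols, and the empty tuple stands for $\epsilon$. The function names are invented for this illustration. The idea is to first find all non-terminals that can derive the empty string, and then to add, for every rule, all variants in which some of these non-terminals are left out.

```python
from itertools import product

def nullable_nonterminals(grammar):
    """Return the set of non-terminals that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:  # iterate until no new nullable non-terminal is found
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs in nullable:
                continue
            # lhs is nullable if one alternative consists only of nullable
            # symbols; this includes the empty alternative (), i.e. epsilon
            if any(all(sym in nullable for sym in rhs) for rhs in alternatives):
                nullable.add(lhs)
                changed = True
    return nullable

def remove_epsilon_rules(grammar):
    """Return a grammar without epsilon-rules (assumed representation, see text)."""
    nullable = nullable_nonterminals(grammar)
    result = {}
    for lhs, alternatives in grammar.items():
        new_alternatives = set()
        for rhs in alternatives:
            # every nullable symbol may either be kept or be dropped
            choices = [((sym,), ()) if sym in nullable else ((sym,),)
                       for sym in rhs]
            for pick in product(*choices):
                variant = tuple(sym for part in pick for sym in part)
                if variant:  # never re-introduce an epsilon-rule
                    new_alternatives.add(variant)
        result[lhs] = new_alternatives
    return result

# the grammar for well-balanced parentheses: P ::= ( P ) P | epsilon
parentheses = {"P": [("(", "P", ")", "P"), ()]}
without_epsilon = remove_epsilon_rules(parentheses)
```

For the parentheses grammar this yields the four alternatives $( \cdot \meta{P} \cdot ) \cdot \meta{P}$, $( \cdot ) \cdot \meta{P}$, $( \cdot \meta{P} \cdot )$ and $( \cdot )$. The resulting grammar matches the same strings as before, except the empty string. The sketch does not clean up non-terminals that end up with no alternatives at all.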
 So why are all these transformations useful? Well, apart from making the
 parser combinators work (remember they cannot deal with left-recursion and
 are inefficient with $\epsilon$-rules), a second reason is that they help
 with gaining insight into the complexity of the parsing problem.
 \noindent
 The last row contains the information about all words and their
 corresponding non-terminals. For example the field for \texttt{trains}
-contains the information $\meta{N}$ and $\meta{V}$ because it can be a
+contains the information $\meta{N}$ and $\meta{V}$ because \texttt{trains} can be a
 ``verb'' and a ``noun'' according to the grammar.  The row above,
 let's call the corresponding fields 5a to 5e, contains information
-about 2-word parts of the sentence, namely
+about \underline{2-word} parts of the sentence, namely
 \begin{center}
 \begin{tabular}{llll}
 5a) & $\underbrace{\texttt{The}}_{A}$ $\mid$ $\underbrace{\texttt{trainer}}_{N}$   \\
 5b) & $\underbrace{\texttt{trainer}}_{N}$ $\mid$ $\underbrace{\texttt{trains}}_{N,V}$\\

changeset 937	dc5ab66b11cc
parent 798	aaf0bd0a211d
child 941	66adcae6c762