handouts/ho05.tex
changeset 682 553b4d4e3719
parent 681 7b7736bea3ca
child 686 05cfce0fdef7
equal deleted inserted replaced
681:7b7736bea3ca 682:553b4d4e3719
    15 \section*{Handout 5 (Grammars \& Parser)}
    15 \section*{Handout 5 (Grammars \& Parser)}
    16 
    16 
    17 While regular expressions are very useful for lexing and for recognising
    17 While regular expressions are very useful for lexing and for recognising
    18 many patterns in strings (like email addresses), they have their
    18 many patterns in strings (like email addresses), they have their
    19 limitations. For example there is no regular expression that can
    19 limitations. For example there is no regular expression that can
    20 recognise the language $a^nb^n$ (where you have strings with $n$ $a$'s
    20 recognise the language $a^nb^n$ (where you have strings starting with $n$ $a$'s
    21 followed by the same amount of $b$'s). Another example for which there
    21 followed by the same amount of $b$'s). Another example for which there
    22 exists no regular expression is the language of well-parenthesised
    22 exists no regular expression is the language of well-parenthesised
    23 expressions. In languages like Lisp, which use parentheses rather
    23 expressions. In languages like Lisp, which use parentheses rather
    24 extensively, it might be of interest to know whether the following two
    24 extensively, it might be of interest to know whether the following two
    25 expressions are well-parenthesised or not (the left one is, the right
    25 expressions are well-parenthesised or not (the left one is, the right
    64 languages. Context-free in this setting means that ``words'' have one
    64 languages. Context-free in this setting means that ``words'' have one
    65 meaning only and this meaning is independent from the context
    65 meaning only and this meaning is independent from the context
    66 the ``words'' appear in. For example ambiguity issues like
    66 the ``words'' appear in. For example ambiguity issues like
    67 
    67 
    68 \begin{center}
    68 \begin{center}
    69 \tt Time flies like an arrow; fruit flies like bananas.
    69 \tt Time flies like an arrow. Fruit flies like bananas.
    70 \end{center}  
    70 \end{center}  
    71 
    71 
    72 \noindent
    72 \noindent
    73 from natural languages were the meaning of \emph{flies} depends on the
    73 from natural languages were the meaning of \emph{flies} depends on the
    74 surrounding \emph{context} are avoided as much as possible.
    74 surrounding \emph{context} are avoided as much as possible.
   464 a terminal, or $\meta{A} ::= \meta{B}\cdot \meta{C}$, where $\meta{B}$ and
   464 a terminal, or $\meta{A} ::= \meta{B}\cdot \meta{C}$, where $\meta{B}$ and
   465 $\meta{B}$ need to be non-terminals. And no rule can contain $\epsilon$.
   465 $\meta{B}$ need to be non-terminals. And no rule can contain $\epsilon$.
   466 The following grammar is in Chomsky normalform:
   466 The following grammar is in Chomsky normalform:
   467 
   467 
   468 \begin{plstx}[margin=1cm]
   468 \begin{plstx}[margin=1cm]
   469   : \meta{S\/} ::= \meta{N}\cdot \meta{P}\\
   469   : \meta{S} ::= \meta{N}\cdot \meta{P}\\
   470   : \meta{P\/} ::= \meta{V}\cdot \meta{N}\\
   470   : \meta{P} ::= \meta{V}\cdot \meta{N}\\
   471   : \meta{N\/} ::= \meta{N}\cdot \meta{N}\\
   471   : \meta{N} ::= \meta{N}\cdot \meta{N}\\
   472   : \meta{N\/} ::= \meta{A}\cdot \meta{N}\\
   472   : \meta{N} ::= \meta{A}\cdot \meta{N}\\
   473   : \meta{N\/} ::= \texttt{student} | \texttt{trainer} | \texttt{team} 
   473   : \meta{N} ::= \texttt{student} | \texttt{trainer} | \texttt{team} 
   474                    | \texttt{trains}\\
   474                 | \texttt{trains}\\
   475   : \meta{V\/} ::= \texttt{trains} | \texttt{team}\\
   475   : \meta{V} ::= \texttt{trains} | \texttt{team}\\
   476   : \meta{A\/} ::= \texttt{The} | \texttt{the}\\
   476   : \meta{A} ::= \texttt{The} | \texttt{the}\\
   477 \end{plstx}
   477 \end{plstx}
   478 
   478 
   479 \noindent
   479 \noindent
   480 where $\meta{S}$ is the start symbol and $\meta{S}$, $\meta{P}$,
   480 where $\meta{S}$ is the start symbol and $\meta{S}$, $\meta{P}$,
   481 $\meta{N}$, $\meta{V}$ and $\meta{A}$ are non-terminals. The ``words''
   481 $\meta{N}$, $\meta{V}$ and $\meta{A}$ are non-terminals. The ``words''
   491 
   491 
   492 \noindent
   492 \noindent
   493 is recognised by the grammar. The CYK algorithm starts with the
   493 is recognised by the grammar. The CYK algorithm starts with the
   494 following triangular data structure.
   494 following triangular data structure.
   495 
   495 
   496 TBD
   496 \begin{figure}[t]
       
   497 \begin{center}
       
   498   \begin{tikzpicture}[scale=0.8,line width=0.8mm]
       
   499   \draw (-2,0) -- (4,0);
       
   500   \draw (-2,1) -- (4,1);
       
   501   \draw (-2,2) -- (3,2);
       
   502   \draw (-2,3) -- (2,3);
       
   503   \draw (-2,4) -- (1,4);
       
   504   \draw (-2,5) -- (0,5);
       
   505   \draw (-2,6) -- (-1,6);
       
   506   
       
   507   \draw (0,0) -- (0, 5);
       
   508   \draw (1,0) -- (1, 4);
       
   509   \draw (2,0) -- (2, 3);
       
   510   \draw (3,0) -- (3, 2);
       
   511   \draw (4,0) -- (4, 1);
       
   512   \draw (-1,0) -- (-1, 6);
       
   513   \draw (-2,0) -- (-2, 6);
       
   514   
       
   515   \draw (-1.5,-0.5) node {\footnotesize{}\texttt{The}}; 
       
   516   \draw (-0.5,-1.0) node {\footnotesize{}\texttt{trainer}}; 
       
   517   \draw ( 0.5,-0.5) node {\footnotesize{}\texttt{trains}}; 
       
   518   \draw ( 1.5,-1.0) node {\footnotesize{}\texttt{the}}; 
       
   519   \draw ( 2.5,-0.5) node {\footnotesize{}\texttt{student}}; 
       
   520   \draw ( 3.5,-1.0) node {\footnotesize{}\texttt{team}};
       
   521   
       
   522   \draw (-1.5,0.5) node {$A$}; 
       
   523   \draw (-0.5,0.5) node {$N$}; 
       
   524   \draw ( 0.5,0.5) node {$N,V$}; 
       
   525   \draw ( 1.5,0.5) node {$A$}; 
       
   526   \draw ( 2.5,0.5) node {$N$}; 
       
   527   \draw ( 3.5,0.5) node {$N,V$};
       
   528 
       
   529   \draw (-2.4, 5.5) node {$1$}; 
       
   530   \draw (-2.4, 4.5) node {$2$}; 
       
   531   \draw (-2.4, 3.5) node {$3$}; 
       
   532   \draw (-2.4, 2.5) node {$4$}; 
       
   533   \draw (-2.4, 1.5) node {$5$}; 
       
   534   \draw (-2.4, 0.5) node {$6$}; 
       
   535   \end{tikzpicture}
       
   536   \end{center}
       
   537 \end{figure}
   497 
   538 
   498 \end{document}
   539 \end{document}
   499 
   540 
   500 
   541 
   501 %%% Parser combinators are now part of handout 6
   542 %%% Parser combinators are now part of handout 6