10 16:00. You are asked to implement a regular expression matcher |
10 16:00. You are asked to implement a regular expression matcher |
11 and submit a document containing the answers for the questions |
11 and submit a document containing the answers for the questions |
12 below. You can do the implementation in any programming |
12 below. You can do the implementation in any programming |
13 language you like, but you need to submit the source code with |
13 language you like, but you need to submit the source code with |
14 which you answered the questions, otherwise a mark of 0\% will |
14 which you answered the questions, otherwise a mark of 0\% will |
15 be awarded. However, the coursework will \emph{only} be judged |
15 be awarded. You can submit your answers in a txt-file or pdf. |
16 according to the answers. You can submit your answers in a |
|
17 txt-file or pdf. |
|
18 |
16 |
19 |
17 |
20 |
18 |
21 \subsubsection*{Disclaimer} |
19 \subsubsection*{Disclaimer} |
22 |
20 |
23 It should be understood that the work you submit represents your own |
21 It should be understood that the work you submit represents |
24 effort. You have not copied from anyone else. An exception is the |
22 your own effort. You have not copied from anyone else. An |
25 Scala code I showed during the lectures, which you can use.\bigskip |
23 exception is the Scala code I showed during the lectures or |
|
24 uploaded to KEATS, which you can use.\bigskip |
26 |
25 |
27 |
26 |
28 \subsubsection*{Tasks} |
27 \subsubsection*{Tasks} |
29 |
28 |
30 The task is to implement a regular expression matcher based on |
29 The task is to implement a regular expression matcher based on |
46 $r^{\{n,m\}}$ & at least $n$-times $r$ but no more than $m$-times\\ |
45 $r^{\{n,m\}}$ & at least $n$-times $r$ but no more than $m$-times\\ |
47 $\sim{}r$ & not-regular expression of $r$\\ |
46 $\sim{}r$ & not-regular expression of $r$\\ |
48 \end{tabular} |
47 \end{tabular} |
49 \end{center} |
48 \end{center} |
50 |
49 |
51 \noindent |
50 \noindent In the case of $r^{\{n,m\}}$ we have the convention |
52 In the case of $r^{\{n,m\}}$ we have the convention that $0 \le n \le m$. |
51 that $0 \le n \le m$. The meanings of the extended regular |
53 The meaning of these regular expressions is |
52 expressions are |
54 |
53 |
55 \begin{center} |
54 \begin{center} |
56 \begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l} |
55 \begin{tabular}{r@{\hspace{2mm}}c@{\hspace{2mm}}l} |
57 $L([c_1 c_2 \ldots c_n])$ & $\dn$ & $\{[c_1], [c_2], \ldots, [c_n]\}$\\ |
56 $L([c_1 c_2 \ldots c_n])$ & $\dn$ & $\{[c_1], [c_2], \ldots, [c_n]\}$\\ |
58 $L(r^+)$ & $\dn$ & $\bigcup_{1\le i}. L(r)^i$\\ |
57 $L(r^+)$ & $\dn$ & $\bigcup_{1\le i}. L(r)^i$\\ |
60 $L(r^{\{n,m\}})$ & $\dn$ & $\bigcup_{n\le i \le m}. L(r)^i$\\ |
59 $L(r^{\{n,m\}})$ & $\dn$ & $\bigcup_{n\le i \le m}. L(r)^i$\\ |
61 $L(\sim{}r)$ & $\dn$ & $\Sigma^* - L(r)$ |
60 $L(\sim{}r)$ & $\dn$ & $\Sigma^* - L(r)$ |
62 \end{tabular} |
61 \end{tabular} |
63 \end{center} |
62 \end{center} |
64 |
63 |
65 \noindent |
64 \noindent whereby in the last clause the set $\Sigma^*$ stands |
66 whereby in the last clause the set $\Sigma^*$ stands for the set of |
65 for the set of \emph{all} strings over the alphabet $\Sigma$ |
67 \emph{all} strings over the alphabet $\Sigma$ (in the implementation the |
66 (in the implementation the alphabet can be just what is |
68 alphabet can be just what is represented by, say, the type \pcode{char})). |
67 represented by, say, the type \pcode{char}). So $\sim{}r$ |
69 So $\sim{}r$ means `all the strings that $r$ |
68 means `all the strings that $r$ cannot match'. |
70 cannot match'. |
69 |
71 |
70 Be careful that your implementation of \textit{nullable} and |
72 Be careful that your implementation of $nullable$ and $der$ satisfies |
71 \textit{der} satisfies for every $r$ the following two |
73 for every $r$ the following two properties (see also Question 2): |
72 properties (see also Question 2): |
74 |
73 |
75 \begin{itemize} |
74 \begin{itemize} |
76 \item $nullable(r)$ if and only if $[]\in L(r)$ |
75 \item $\textit{nullable}(r)$ if and only if $[]\in L(r)$ |
77 \item $L(der\,c\,r)) = Der\,c\,(L(r))$ |
76 \item $L(der\,c\,r) = Der\,c\,(L(r))$ |
78 \end{itemize} |
77 \end{itemize} |
79 |
78 |
80 \noindent |
79 \noindent {\bf Important!} Your implementation should have |
81 {\bf Important!} Your implementation should have explicit cases for |
80 explicit cases for the basic regular expressions, but also |
82 the basic regular expressions, but also explicit cases for the |
81 explicit cases for the extended regular expressions. That |
83 extended regular expressions. That means do not treat the extended |
82 means do not treat the extended regular expressions by just |
84 regular expressions by just translating them into the basic ones. See |
83 translating them into the basic ones. See also Question 2, |
85 also Question 2, where you are asked to explicitly give the rules for |
84 where you are asked to explicitly give the rules for |
86 \textit{nullable} and \textit{der} for the extended regular |
85 \textit{nullable} and \textit{der} for the extended regular |
87 expressions. |
86 expressions. |
88 |
87 |
89 |
88 |
90 \subsection*{Question 1 (unmarked)} |
89 \subsection*{Question 1 (unmarked)} |
91 |
90 |
92 What is your King's email address (you will need it in Question 3)? |
91 What is your King's email address (you will need it in |
|
92 Question 3)? |
93 |
93 |
94 \subsection*{Question 2 (marked with 2\%)} |
94 \subsection*{Question 2 (marked with 2\%)} |
95 |
95 |
96 This question does not require any implementation. From the lectures |
96 This question does not require any implementation. From the |
97 you have seen the definitions for the functions \textit{nullable} and |
97 lectures you have seen the definitions for the functions |
98 \textit{der} for the basic regular expressions. Give the rules for |
98 \textit{nullable} and \textit{der} for the basic regular |
99 the extended regular expressions: |
99 expressions. Give the rules for the extended regular |
|
100 expressions: |
100 |
101 |
101 \begin{center} |
102 \begin{center} |
102 \begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}} |
103 \begin{tabular}{@ {}l@ {\hspace{2mm}}c@ {\hspace{2mm}}l@ {}} |
103 $nullable([c_1 c_2 \ldots c_n])$ & $\dn$ & $?$\\ |
104 $\textit{nullable}([c_1 c_2 \ldots c_n])$ & $\dn$ & $?$\\ |
104 $nullable(r^+)$ & $\dn$ & $?$\\ |
105 $\textit{nullable}(r^+)$ & $\dn$ & $?$\\ |
105 $nullable(r^?)$ & $\dn$ & $?$\\ |
106 $\textit{nullable}(r^?)$ & $\dn$ & $?$\\ |
106 $nullable(r^{\{n,m\}})$ & $\dn$ & $?$\\ |
107 $\textit{nullable}(r^{\{n,m\}})$ & $\dn$ & $?$\\ |
107 $nullable(\sim{}r)$ & $\dn$ & $?$\medskip\\ |
108 $\textit{nullable}(\sim{}r)$ & $\dn$ & $?$\medskip\\ |
108 $der\, c\, ([c_1 c_2 \ldots c_n])$ & $\dn$ & $?$\\ |
109 $der\, c\, ([c_1 c_2 \ldots c_n])$ & $\dn$ & $?$\\ |
109 $der\, c\, (r^+)$ & $\dn$ & $?$\\ |
110 $der\, c\, (r^+)$ & $\dn$ & $?$\\ |
110 $der\, c\, (r^?)$ & $\dn$ & $?$\\ |
111 $der\, c\, (r^?)$ & $\dn$ & $?$\\ |
111 $der\, c\, (r^{\{n,m\}})$ & $\dn$ & $?$\\ |
112 $der\, c\, (r^{\{n,m\}})$ & $\dn$ & $?$\\ |
112 $der\, c\, (\sim{}r)$ & $\dn$ & $?$\\ |
113 $der\, c\, (\sim{}r)$ & $\dn$ & $?$\\ |
127 |
128 |
128 \[ |
129 \[ |
129 ([a\mbox{-}z0\mbox{-}9\_\!\_\,.-]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.-]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2,6\}}) |
130 ([a\mbox{-}z0\mbox{-}9\_\!\_\,.-]^+)\cdot @\cdot ([a\mbox{-}z0\mbox{-}9\,.-]^+)\cdot .\cdot ([a\mbox{-}z\,.]^{\{2,6\}}) |
130 \] |
131 \] |
131 |
132 |
132 \noindent |
133 \noindent and calculate the derivative according to your email |
133 and calculate the derivative according to your email address. When |
134 address. When calculating the derivative, simplify all regular |
134 calculating the derivative, simplify all regular expressions as much |
135 expressions as much as possible, but at least apply the |
135 as possible, but at least apply the following six simplification |
136 following six simplification rules: |
136 rules: |
|
137 |
137 |
138 \begin{center} |
138 \begin{center} |
139 \begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}ll} |
139 \begin{tabular}{l@{\hspace{2mm}}c@{\hspace{2mm}}ll} |
140 $r \cdot \varnothing$ & $\mapsto$ & $\varnothing$\\ |
140 $r \cdot \varnothing$ & $\mapsto$ & $\varnothing$\\ |
141 $\varnothing \cdot r$ & $\mapsto$ & $\varnothing$\\ |
141 $\varnothing \cdot r$ & $\mapsto$ & $\varnothing$\\ |