774
|
1 |
1
|
|
2 |
00:00:06,410 --> 00:00:09,390
|
|
3 |
The final video for this week.
|
|
4 |
|
|
5 |
2
|
|
6 |
00:00:09,390 --> 00:00:12,465
|
|
7 |
And let me say something
|
|
8 |
about the coursework.
|
|
9 |
|
|
10 |
3
|
|
11 |
00:00:12,465 --> 00:00:15,300
|
|
12 |
First. You can solve
|
|
13 |
|
|
14 |
4
|
|
15 |
00:00:15,300 --> 00:00:17,745
|
|
16 |
the coursework in any programming
|
|
17 |
language you're like,
|
|
18 |
|
|
19 |
5
|
|
20 |
00:00:17,745 --> 00:00:22,440
|
|
21 |
I already said that. You
|
|
22 |
have to submit a PDF file.
|
|
23 |
|
|
24 |
6
|
|
25 |
00:00:22,440 --> 00:00:23,850
|
|
26 |
There will be some questions,
|
|
27 |
|
|
28 |
7
|
|
29 |
00:00:23,850 --> 00:00:26,250
|
|
30 |
so you have to write down the
|
|
31 |
solution to this questions.
|
|
32 |
|
|
33 |
8
|
|
34 |
00:00:26,250 --> 00:00:30,315
|
|
35 |
Please use a PDF and you have
|
|
36 |
to submit your source code.
|
|
37 |
|
|
38 |
9
|
|
39 |
00:00:30,315 --> 00:00:35,580
|
|
40 |
And yes, if you do use a
|
|
41 |
language which isn't Scala,
|
|
42 |
|
|
43 |
10
|
|
44 |
00:00:35,580 --> 00:00:39,450
|
|
45 |
it might help to tell me
|
|
46 |
how I can run your code.
|
|
47 |
|
|
48 |
11
|
|
49 |
00:00:39,450 --> 00:00:42,550
|
|
50 |
If I can't run your code,
|
|
51 |
I will ask you anyway.
|
|
52 |
|
|
53 |
12
|
|
54 |
00:00:42,550 --> 00:00:50,044
|
|
55 |
Also, please submit both the
|
|
56 |
PDF and the code in a file,
|
|
57 |
|
|
58 |
13
|
|
59 |
00:00:50,044 --> 00:00:52,730
|
|
60 |
in a zip file, which generates
|
|
61 |
|
|
62 |
14
|
|
63 |
00:00:52,730 --> 00:00:55,835
|
|
64 |
a subdirectory with your
|
|
65 |
name and your family name.
|
|
66 |
|
|
67 |
15
|
|
68 |
00:00:55,835 --> 00:00:58,970
|
|
69 |
That we'll just save a
|
|
70 |
lot of time for me to try
|
|
71 |
|
|
72 |
16
|
|
73 |
00:00:58,970 --> 00:01:02,030
|
|
74 |
to figure out who
|
|
75 |
has submitted what.
|
|
76 |
|
|
77 |
17
|
|
78 |
00:01:02,030 --> 00:01:04,445
|
|
79 |
Please do that.
|
|
80 |
|
|
81 |
18
|
|
82 |
00:01:04,445 --> 00:01:07,789
|
|
83 |
So what is the task
|
|
84 |
into coursework?
|
|
85 |
|
|
86 |
19
|
|
87 |
00:01:07,789 --> 00:01:09,885
|
|
88 |
I essentially showed you how
|
|
89 |
|
|
90 |
20
|
|
91 |
00:01:09,885 --> 00:01:11,995
|
|
92 |
the Brzozowski matcher
|
|
93 |
|
|
94 |
21
|
|
95 |
00:01:11,995 --> 00:01:14,645
|
|
96 |
works for the basic
|
|
97 |
regular expressions.
|
|
98 |
|
|
99 |
22
|
|
100 |
00:01:14,645 --> 00:01:16,295
|
|
101 |
I also showed you actually how it
|
|
102 |
|
|
103 |
23
|
|
104 |
00:01:16,295 --> 00:01:18,110
|
|
105 |
works for the n-times.
|
|
106 |
|
|
107 |
24
|
|
108 |
00:01:18,110 --> 00:01:20,300
|
|
109 |
And the task in coursework
|
|
110 |
|
|
111 |
25
|
|
112 |
00:01:20,300 --> 00:01:22,970
|
|
113 |
is that you extend the
|
|
114 |
Brzozowski matcher to
|
|
115 |
|
|
116 |
26
|
|
117 |
00:01:22,970 --> 00:01:25,820
|
|
118 |
the other regular expressions
|
|
119 |
|
|
120 |
27
|
|
121 |
00:01:25,820 --> 00:01:27,800
|
|
122 |
from the extended
|
|
123 |
regular expressions.
|
|
124 |
|
|
125 |
28
|
|
126 |
00:01:27,800 --> 00:01:30,245
|
|
127 |
So you're supposed
|
|
128 |
to extended with
|
|
129 |
|
|
130 |
29
|
|
131 |
00:01:30,245 --> 00:01:34,490
|
|
132 |
r-plus, optional r, for n-times.
|
|
133 |
|
|
134 |
30
|
|
135 |
00:01:34,490 --> 00:01:38,420
|
|
136 |
You've already seen that.
|
|
137 |
For 0 or more times,
|
|
138 |
|
|
139 |
31
|
|
140 |
00:01:38,420 --> 00:01:41,120
|
|
141 |
but not more than m
|
|
142 |
times regular expression.
|
|
143 |
|
|
144 |
32
|
|
145 |
00:01:41,120 --> 00:01:45,890
|
|
146 |
For at least n-times regular
|
|
147 |
expression and for between
|
|
148 |
|
|
149 |
33
|
|
150 |
00:01:45,890 --> 00:01:47,480
|
|
151 |
n times and m times
|
|
152 |
|
|
153 |
34
|
|
154 |
00:01:47,480 --> 00:01:50,810
|
|
155 |
regular expression and also this
|
|
156 |
NOT regular expression.
|
|
157 |
|
|
158 |
35
|
|
159 |
00:01:50,810 --> 00:01:52,790
|
|
160 |
So your task is to
|
|
161 |
|
|
162 |
36
|
|
163 |
00:01:52,790 --> 00:01:55,310
|
|
164 |
essentially find the definition
|
|
165 |
|
|
166 |
37
|
|
167 |
00:01:55,310 --> 00:01:57,155
|
|
168 |
for nullable in these cases.
|
|
169 |
|
|
170 |
38
|
|
171 |
00:01:57,155 --> 00:02:00,215
|
|
172 |
Find a definition for derivative,
|
|
173 |
|
|
174 |
39
|
|
175 |
00:02:00,215 --> 00:02:02,480
|
|
176 |
implement them,
|
|
177 |
write them down in
|
|
178 |
|
|
179 |
40
|
|
180 |
00:02:02,480 --> 00:02:06,065
|
|
181 |
a PDF and then do some
|
|
182 |
experiments with them.
|
|
183 |
|
|
184 |
41
|
|
185 |
00:02:06,065 --> 00:02:08,875
|
|
186 |
Well, how can you do that?
|
|
187 |
|
|
188 |
42
|
|
189 |
00:02:08,875 --> 00:02:10,370
|
|
190 |
Well I've given you
|
|
191 |
|
|
192 |
43
|
|
193 |
00:02:10,370 --> 00:02:13,565
|
|
194 |
the meaning of these additional
|
|
195 |
regular expressions.
|
|
196 |
|
|
197 |
44
|
|
198 |
00:02:13,565 --> 00:02:16,730
|
|
199 |
So here's, for example, the
|
|
200 |
meaning of this r-plus.
|
|
201 |
|
|
202 |
45
|
|
203 |
00:02:16,730 --> 00:02:18,290
|
|
204 |
So that would be, I have
|
|
205 |
|
|
206 |
46
|
|
207 |
00:02:18,290 --> 00:02:22,115
|
|
208 |
at least one copy up to infinity.
|
|
209 |
|
|
210 |
47
|
|
211 |
00:02:22,115 --> 00:02:25,070
|
|
212 |
And the optional-r would be it's
|
|
213 |
|
|
214 |
48
|
|
215 |
00:02:25,070 --> 00:02:28,370
|
|
216 |
the language of r plus
|
|
217 |
the empty string.
|
|
218 |
|
|
219 |
49
|
|
220 |
00:02:28,370 --> 00:02:30,440
|
|
221 |
If I have it exactly n times,
|
|
222 |
|
|
223 |
50
|
|
224 |
00:02:30,440 --> 00:02:31,985
|
|
225 |
then it would be
|
|
226 |
just the language
|
|
227 |
|
|
228 |
51
|
|
229 |
00:02:31,985 --> 00:02:34,010
|
|
230 |
of r exactly n-times.
|
|
231 |
|
|
232 |
52
|
|
233 |
00:02:34,010 --> 00:02:38,105
|
|
234 |
And here you have union
|
|
235 |
from 0 to m and so on.
|
|
236 |
|
|
237 |
53
|
|
238 |
00:02:38,105 --> 00:02:42,560
|
|
239 |
This might be a slightly
|
|
240 |
interesting regular expression.
|
|
241 |
|
|
242 |
54
|
|
243 |
00:02:42,560 --> 00:02:46,580
|
|
244 |
So that's supposed to be the
|
|
245 |
character set that is very
|
|
246 |
|
|
247 |
55
|
|
248 |
00:02:46,580 --> 00:02:48,410
|
|
249 |
similar to character ranges like
|
|
250 |
|
|
251 |
56
|
|
252 |
00:02:48,410 --> 00:02:51,005
|
|
253 |
in the existing regular
|
|
254 |
expression matchers.
|
|
255 |
|
|
256 |
57
|
|
257 |
00:02:51,005 --> 00:02:52,820
|
|
258 |
So this just says
|
|
259 |
|
|
260 |
58
|
|
261 |
00:02:52,820 --> 00:02:55,565
|
|
262 |
...this regular
|
|
263 |
expression can match
|
|
264 |
|
|
265 |
59
|
|
266 |
00:02:55,565 --> 00:03:00,860
|
|
267 |
either the character c1 or
|
|
268 |
the character c2 up to cn.
|
|
269 |
|
|
270 |
60
|
|
271 |
00:03:00,860 --> 00:03:03,620
|
|
272 |
Why do I include
|
|
273 |
that regular expression?
|
|
274 |
|
|
275 |
61
|
|
276 |
00:03:03,620 --> 00:03:09,050
|
|
277 |
I could also just say
|
|
278 |
c1 plus c2 plus c3...
|
|
279 |
|
|
280 |
62
|
|
281 |
00:03:09,050 --> 00:03:12,889
|
|
282 |
a big alternative.
|
|
283 |
Well that's possible.
|
|
284 |
|
|
285 |
63
|
|
286 |
00:03:12,889 --> 00:03:16,595
|
|
287 |
But remember the Achilles'
|
|
288 |
heel of this algorithm is,
|
|
289 |
|
|
290 |
64
|
|
291 |
00:03:16,595 --> 00:03:18,425
|
|
292 |
if the regular expression is large,
|
|
293 |
|
|
294 |
65
|
|
295 |
00:03:18,425 --> 00:03:20,825
|
|
296 |
then the computation
|
|
297 |
take always a long time.
|
|
298 |
|
|
299 |
66
|
|
300 |
00:03:20,825 --> 00:03:23,840
|
|
301 |
So I'm trying to compress
|
|
302 |
that as much as I can.
|
|
303 |
|
|
304 |
67
|
|
305 |
00:03:23,840 --> 00:03:25,370
|
|
306 |
So instead of giving a big
|
|
307 |
|
|
308 |
68
|
|
309 |
00:03:25,370 --> 00:03:29,134
|
|
310 |
alternative, I am trying to
|
|
311 |
give one regular expression,
|
|
312 |
|
|
313 |
69
|
|
314 |
00:03:29,134 --> 00:03:31,340
|
|
315 |
which can not just match
|
|
316 |
a single character,
|
|
317 |
|
|
318 |
70
|
|
319 |
00:03:31,340 --> 00:03:34,230
|
|
320 |
but a set of characters.
|
|
321 |
|
|
322 |
71
|
|
323 |
00:03:34,630 --> 00:03:36,980
|
|
324 |
How can you be sure that
|
|
325 |
|
|
326 |
72
|
|
327 |
00:03:36,980 --> 00:03:39,215
|
|
328 |
definition you come
|
|
329 |
up with are correct?
|
|
330 |
|
|
331 |
73
|
|
332 |
00:03:39,215 --> 00:03:41,450
|
|
333 |
Well, I showed you which are
|
|
334 |
|
|
335 |
74
|
|
336 |
00:03:41,450 --> 00:03:45,170
|
|
337 |
the properties these two
|
|
338 |
functions need to satisfy.
|
|
339 |
|
|
340 |
75
|
|
341 |
00:03:45,170 --> 00:03:47,060
|
|
342 |
I've given you here what
|
|
343 |
|
|
344 |
76
|
|
345 |
00:03:47,060 --> 00:03:49,325
|
|
346 |
the meaning of these
|
|
347 |
expressions are.
|
|
348 |
|
|
349 |
77
|
|
350 |
00:03:49,325 --> 00:03:52,700
|
|
351 |
So you will always know what's
|
|
352 |
on the right-hand side.
|
|
353 |
|
|
354 |
78
|
|
355 |
00:03:52,700 --> 00:03:54,515
|
|
356 |
This is completely determined.
|
|
357 |
|
|
358 |
79
|
|
359 |
00:03:54,515 --> 00:03:56,720
|
|
360 |
You essentially
|
|
361 |
have to fill something
|
|
362 |
|
|
363 |
80
|
|
364 |
00:03:56,720 --> 00:03:58,910
|
|
365 |
equivalent on the left-hand side.
|
|
366 |
|
|
367 |
81
|
|
368 |
00:03:58,910 --> 00:04:02,105
|
|
369 |
That's the main task
|
|
370 |
of the coursework.
|
|
371 |
|
|
372 |
82
|
|
373 |
00:04:02,105 --> 00:04:08,090
|
|
374 |
And you can start from the
|
|
375 |
files I provided on KEATS.
|
|
376 |
|
|
377 |
83
|
|
378 |
00:04:08,090 --> 00:04:12,125
|
|
379 |
So you would just have to
|
|
380 |
add these additional cases.
|
|
381 |
|
|
382 |
84
|
|
383 |
00:04:12,125 --> 00:04:15,020
|
|
384 |
When you add these
|
|
385 |
additional cases
|
|
386 |
|
|
387 |
85
|
|
388 |
00:04:15,020 --> 00:04:17,330
|
|
389 |
and you're using the Scala language,
|
|
390 |
|
|
391 |
86
|
|
392 |
00:04:17,330 --> 00:04:18,980
|
|
393 |
you can do me a favour.
|
|
394 |
|
|
395 |
87
|
|
396 |
00:04:18,980 --> 00:04:22,550
|
|
397 |
You can call these
|
|
398 |
constructors for
|
|
399 |
|
|
400 |
88
|
|
401 |
00:04:22,550 --> 00:04:25,400
|
|
402 |
these different regular
|
|
403 |
expressions as RANGE,
|
|
404 |
|
|
405 |
89
|
|
406 |
00:04:25,400 --> 00:04:28,985
|
|
407 |
PLUS, OPTIONAL and NTIMES,
|
|
408 |
UPTO, FROM and BETWEEN.
|
|
409 |
|
|
410 |
90
|
|
411 |
00:04:28,985 --> 00:04:31,370
|
|
412 |
Remember I have this
|
|
413 |
convention to always
|
|
414 |
|
|
415 |
91
|
|
416 |
00:04:31,370 --> 00:04:34,025
|
|
417 |
use capital letters
|
|
418 |
for regular expressions.
|
|
419 |
|
|
420 |
92
|
|
421 |
00:04:34,025 --> 00:04:36,680
|
|
422 |
It would be nice if you could use
|
|
423 |
|
|
424 |
93
|
|
425 |
00:04:36,680 --> 00:04:39,260
|
|
426 |
these names because
|
|
427 |
then it will be
|
|
428 |
|
|
429 |
94
|
|
430 |
00:04:39,260 --> 00:04:42,335
|
|
431 |
very easy for me
|
|
432 |
to test your code.
|
|
433 |
|
|
434 |
95
|
|
435 |
00:04:42,335 --> 00:04:44,690
|
|
436 |
If you're using a different
|
|
437 |
programming language
|
|
438 |
|
|
439 |
96
|
|
440 |
00:04:44,690 --> 00:04:46,370
|
|
441 |
like let's say Rust,
|
|
442 |
|
|
443 |
97
|
|
444 |
00:04:46,370 --> 00:04:48,860
|
|
445 |
I expect some people will use that, where I
|
|
446 |
|
|
447 |
98
|
|
448 |
00:04:48,860 --> 00:04:51,380
|
|
449 |
have no idea what are the
|
|
450 |
conventions in your language,
|
|
451 |
|
|
452 |
99
|
|
453 |
00:04:51,380 --> 00:04:53,420
|
|
454 |
how you have to call
|
|
455 |
these constructors,
|
|
456 |
|
|
457 |
100
|
|
458 |
00:04:53,420 --> 00:04:56,420
|
|
459 |
we will see whatever you
|
|
460 |
submit. I'll have a look.
|
|
461 |
|
|
462 |
101
|
|
463 |
00:04:56,420 --> 00:04:59,360
|
|
464 |
There's one more
|
|
465 |
constraint I have to
|
|
466 |
|
|
467 |
102
|
|
468 |
00:04:59,360 --> 00:05:02,194
|
|
469 |
impose to make this
|
|
470 |
coursework interesting.
|
|
471 |
|
|
472 |
103
|
|
473 |
00:05:02,194 --> 00:05:04,730
|
|
474 |
I do not want you
|
|
475 |
that you just take
|
|
476 |
|
|
477 |
104
|
|
478 |
00:05:04,730 --> 00:05:07,370
|
|
479 |
these extended regular
|
|
480 |
expressions and that you
|
|
481 |
|
|
482 |
105
|
|
483 |
00:05:07,370 --> 00:05:10,550
|
|
484 |
expand them using
|
|
485 |
their definition.
|
|
486 |
|
|
487 |
106
|
|
488 |
00:05:10,550 --> 00:05:12,320
|
|
489 |
Because that would make the regular
|
|
490 |
|
|
491 |
107
|
|
492 |
00:05:12,320 --> 00:05:13,955
|
|
493 |
expression matcher very slow.
|
|
494 |
|
|
495 |
108
|
|
496 |
00:05:13,955 --> 00:05:16,160
|
|
497 |
So for example, if
|
|
498 |
you want to define
|
|
499 |
|
|
500 |
109
|
|
501 |
00:05:16,160 --> 00:05:18,785
|
|
502 |
what's the derivative for r-plus,
|
|
503 |
|
|
504 |
110
|
|
505 |
00:05:18,785 --> 00:05:21,560
|
|
506 |
then what does not
|
|
507 |
count as a solution:
|
|
508 |
|
|
509 |
111
|
|
510 |
00:05:21,560 --> 00:05:24,770
|
|
511 |
if you just expand that
|
|
512 |
to the definition that r
|
|
513 |
|
|
514 |
112
|
|
515 |
00:05:24,770 --> 00:05:28,935
|
|
516 |
plus is equivalent to
|
|
517 |
r followed by r-star.
|
|
518 |
|
|
519 |
113
|
|
520 |
00:05:28,935 --> 00:05:32,845
|
|
521 |
So in code, what I
|
|
522 |
would like to not see,
|
|
523 |
|
|
524 |
114
|
|
525 |
00:05:32,845 --> 00:05:35,680
|
|
526 |
I would not give any
|
|
527 |
marks for that is,
|
|
528 |
|
|
529 |
115
|
|
530 |
00:05:35,680 --> 00:05:37,780
|
|
531 |
if you add the plus as follows,
|
|
532 |
|
|
533 |
116
|
|
534 |
00:05:37,780 --> 00:05:39,910
|
|
535 |
so that is still perfectly fine.
|
|
536 |
|
|
537 |
117
|
|
538 |
00:05:39,910 --> 00:05:42,160
|
|
539 |
There's nothing you
|
|
540 |
can do differently.
|
|
541 |
|
|
542 |
118
|
|
543 |
00:05:42,160 --> 00:05:44,065
|
|
544 |
That would be the constructor.
|
|
545 |
|
|
546 |
119
|
|
547 |
00:05:44,065 --> 00:05:46,480
|
|
548 |
But when you define nullable,
|
|
549 |
|
|
550 |
120
|
|
551 |
00:05:46,480 --> 00:05:49,180
|
|
552 |
I do not want to see
|
|
553 |
that you defined
|
|
554 |
|
|
555 |
121
|
|
556 |
00:05:49,180 --> 00:05:51,790
|
|
557 |
this plus r as nullable
|
|
558 |
|
|
559 |
122
|
|
560 |
00:05:51,790 --> 00:05:54,385
|
|
561 |
of the sequence of r and star-r,
|
|
562 |
|
|
563 |
123
|
|
564 |
00:05:54,385 --> 00:05:58,075
|
|
565 |
just unfolding
|
|
566 |
the definition.
|
|
567 |
|
|
568 |
124
|
|
569 |
00:05:58,075 --> 00:06:00,415
|
|
570 |
As you can see, when you come
|
|
571 |
|
|
572 |
125
|
|
573 |
00:06:00,415 --> 00:06:02,815
|
|
574 |
up with a much better
|
|
575 |
solution for that.
|
|
576 |
|
|
577 |
126
|
|
578 |
00:06:02,815 --> 00:06:05,110
|
|
579 |
This is a very inefficient way
|
|
580 |
|
|
581 |
127
|
|
582 |
00:06:05,110 --> 00:06:07,090
|
|
583 |
for how to calculate nullable.
|
|
584 |
|
|
585 |
128
|
|
586 |
00:06:07,090 --> 00:06:10,825
|
|
587 |
And so the same for derivative
|
|
588 |
would not be allowed.
|
|
589 |
|
|
590 |
129
|
|
591 |
00:06:10,825 --> 00:06:13,895
|
|
592 |
If you, I have the plus r here,
|
|
593 |
|
|
594 |
130
|
|
595 |
00:06:13,895 --> 00:06:16,685
|
|
596 |
you can't just unfold
|
|
597 |
the definition,
|
|
598 |
|
|
599 |
131
|
|
600 |
00:06:16,685 --> 00:06:19,790
|
|
601 |
with r-plus
|
|
602 |
being defined as r
|
|
603 |
|
|
604 |
132
|
|
605 |
00:06:19,790 --> 00:06:21,350
|
|
606 |
followed by r star and
|
|
607 |
|
|
608 |
133
|
|
609 |
00:06:21,350 --> 00:06:23,375
|
|
610 |
then just build the
|
|
611 |
derivative of that.
|
|
612 |
|
|
613 |
134
|
|
614 |
00:06:23,375 --> 00:06:25,280
|
|
615 |
You have to find something more
|
|
616 |
|
|
617 |
135
|
|
618 |
00:06:25,280 --> 00:06:28,730
|
|
619 |
primitive that involves
|
|
620 |
only the derivative of r,
|
|
621 |
|
|
622 |
136
|
|
623 |
00:06:28,730 --> 00:06:31,790
|
|
624 |
essentially, nothing
|
|
625 |
more involved. The same
|
|
626 |
|
|
627 |
137
|
|
628 |
00:06:31,790 --> 00:06:33,815
|
|
629 |
if you have n-times, for example,
|
|
630 |
|
|
631 |
138
|
|
632 |
00:06:33,815 --> 00:06:39,935
|
|
633 |
you can't just unfold this
|
|
634 |
n-times in this sequence
|
|
635 |
|
|
636 |
139
|
|
637 |
00:06:39,935 --> 00:06:43,310
|
|
638 |
of .... n-copies
|
|
639 |
|
|
640 |
140
|
|
641 |
00:06:43,310 --> 00:06:47,285
|
|
642 |
of this r. You have to get
|
|
643 |
something more primitive.
|
|
644 |
|
|
645 |
141
|
|
646 |
00:06:47,285 --> 00:06:49,760
|
|
647 |
Otherwise, as you've
|
|
648 |
seen earlier,
|
|
649 |
|
|
650 |
142
|
|
651 |
00:06:49,760 --> 00:06:53,135
|
|
652 |
your regular expression matcher
|
|
653 |
would be totally slow.
|
|
654 |
|
|
655 |
143
|
|
656 |
00:06:53,135 --> 00:06:55,475
|
|
657 |
When you submit your solution,
|
|
658 |
|
|
659 |
144
|
|
660 |
00:06:55,475 --> 00:06:58,445
|
|
661 |
please submit this
|
|
662 |
solution in the PDF.
|
|
663 |
|
|
664 |
145
|
|
665 |
00:06:58,445 --> 00:07:01,580
|
|
666 |
So give the cases of your definition
|
|
667 |
|
|
668 |
146
|
|
669 |
00:07:01,580 --> 00:07:06,004
|
|
670 |
in a form like that for
|
|
671 |
nullable and derivative.
|
|
672 |
|
|
673 |
147
|
|
674 |
00:07:06,004 --> 00:07:08,405
|
|
675 |
And also implement it in code.
|
|
676 |
|
|
677 |
148
|
|
678 |
00:07:08,405 --> 00:07:10,910
|
|
679 |
That is just helping me to
|
|
680 |
|
|
681 |
149
|
|
682 |
00:07:10,910 --> 00:07:13,400
|
|
683 |
find out whether you have
|
|
684 |
the correct solution or not.
|
|
685 |
|
|
686 |
150
|
|
687 |
00:07:13,400 --> 00:07:15,710
|
|
688 |
So you needed a kind of
|
|
689 |
mathematical notation of
|
|
690 |
|
|
691 |
151
|
|
692 |
00:07:15,710 --> 00:07:18,695
|
|
693 |
your solution, and
|
|
694 |
an actual code.
|
|
695 |
|
|
696 |
152
|
|
697 |
00:07:18,695 --> 00:07:22,414
|
|
698 |
And then once you
|
|
699 |
have your definition,
|
|
700 |
|
|
701 |
153
|
|
702 |
00:07:22,414 --> 00:07:25,699
|
|
703 |
also make sure you try
|
|
704 |
it out on some examples.
|
|
705 |
|
|
706 |
154
|
|
707 |
00:07:25,699 --> 00:07:28,970
|
|
708 |
These will be the examples
|
|
709 |
I will definitely try out,
|
|
710 |
|
|
711 |
155
|
|
712 |
00:07:28,970 --> 00:07:30,560
|
|
713 |
but probably also more
|
|
714 |
|
|
715 |
156
|
|
716 |
00:07:30,560 --> 00:07:33,215
|
|
717 |
depending on what
|
|
718 |
definitions you give me.
|
|
719 |
|
|
720 |
157
|
|
721 |
00:07:33,215 --> 00:07:38,300
|
|
722 |
There's one more question I
|
|
723 |
ask you to do. You've seen
|
|
724 |
|
|
725 |
158
|
|
726 |
00:07:38,300 --> 00:07:40,130
|
|
727 |
there's some regular
|
|
728 |
expressions that
|
|
729 |
|
|
730 |
159
|
|
731 |
00:07:40,130 --> 00:07:42,215
|
|
732 |
are involved for characters,
|
|
733 |
|
|
734 |
160
|
|
735 |
00:07:42,215 --> 00:07:46,925
|
|
736 |
for character ranges or
|
|
737 |
character sets essentially.
|
|
738 |
|
|
739 |
161
|
|
740 |
00:07:46,925 --> 00:07:48,665
|
|
741 |
And sometimes I also want to have
|
|
742 |
|
|
743 |
162
|
|
744 |
00:07:48,665 --> 00:07:51,665
|
|
745 |
just a reg expression which
|
|
746 |
can match any character!!
|
|
747 |
|
|
748 |
163
|
|
749 |
00:07:51,665 --> 00:07:56,195
|
|
750 |
And I could have for all of
|
|
751 |
them separate definitions.
|
|
752 |
|
|
753 |
164
|
|
754 |
00:07:56,195 --> 00:07:57,710
|
|
755 |
And that would mean I also need
|
|
756 |
|
|
757 |
165
|
|
758 |
00:07:57,710 --> 00:07:59,645
|
|
759 |
separate definitions for nullable,
|
|
760 |
|
|
761 |
166
|
|
762 |
00:07:59,645 --> 00:08:02,405
|
|
763 |
separate definitions
|
|
764 |
for derivative.
|
|
765 |
|
|
766 |
167
|
|
767 |
00:08:02,405 --> 00:08:05,825
|
|
768 |
But actually they can be
|
|
769 |
generalized to just one constructor.
|
|
770 |
|
|
771 |
168
|
|
772 |
00:08:05,825 --> 00:08:08,210
|
|
773 |
And that is if we introduce
|
|
774 |
|
|
775 |
169
|
|
776 |
00:08:08,210 --> 00:08:11,630
|
|
777 |
a constructor for regular expressions,
|
|
778 |
|
|
779 |
170
|
|
780 |
00:08:11,630 --> 00:08:13,760
|
|
781 |
which not just takes
|
|
782 |
a single character
|
|
783 |
|
|
784 |
171
|
|
785 |
00:08:13,760 --> 00:08:15,469
|
|
786 |
or set of characters.
|
|
787 |
|
|
788 |
172
|
|
789 |
00:08:15,469 --> 00:08:18,245
|
|
790 |
But, which I call here CFUN.
|
|
791 |
|
|
792 |
173
|
|
793 |
00:08:18,245 --> 00:08:23,165
|
|
794 |
And it takes a function from
|
|
795 |
characters to booleans.
|
|
796 |
|
|
797 |
174
|
|
798 |
00:08:23,165 --> 00:08:24,470
|
|
799 |
So if I want to match
|
|
800 |
|
|
801 |
175
|
|
802 |
00:08:24,470 --> 00:08:26,900
|
|
803 |
just a single character
|
|
804 |
then this function f
|
|
805 |
|
|
806 |
176
|
|
807 |
00:08:26,900 --> 00:08:29,060
|
|
808 |
would only say true
|
|
809 |
|
|
810 |
177
|
|
811 |
00:08:29,060 --> 00:08:32,225
|
|
812 |
if it gets as argument
|
|
813 |
the single character.
|
|
814 |
|
|
815 |
178
|
|
816 |
00:08:32,225 --> 00:08:34,850
|
|
817 |
If I have a character set,
|
|
818 |
|
|
819 |
179
|
|
820 |
00:08:34,850 --> 00:08:36,515
|
|
821 |
then I would say, well,
|
|
822 |
|
|
823 |
180
|
|
824 |
00:08:36,515 --> 00:08:38,300
|
|
825 |
I need a function
|
|
826 |
which says true
|
|
827 |
|
|
828 |
181
|
|
829 |
00:08:38,300 --> 00:08:40,970
|
|
830 |
for all the characters
|
|
831 |
in this set.
|
|
832 |
|
|
833 |
182
|
|
834 |
00:08:40,970 --> 00:08:43,460
|
|
835 |
And otherwise it's false.
|
|
836 |
|
|
837 |
183
|
|
838 |
00:08:43,460 --> 00:08:46,205
|
|
839 |
And if I want to
|
|
840 |
match any character!!,
|
|
841 |
|
|
842 |
184
|
|
843 |
00:08:46,205 --> 00:08:48,500
|
|
844 |
then they should get a function
|
|
845 |
|
|
846 |
185
|
|
847 |
00:08:48,500 --> 00:08:53,450
|
|
848 |
which says true for
|
|
849 |
all characters.
|
|
850 |
|
|
851 |
186
|
|
852 |
00:08:53,450 --> 00:08:56,630
|
|
853 |
Okay? If I have
|
|
854 |
such a constructor
|
|
855 |
|
|
856 |
187
|
|
857 |
00:08:56,630 --> 00:09:00,140
|
|
858 |
that actually I can save
|
|
859 |
myself a bit of work.
|
|
860 |
|
|
861 |
188
|
|
862 |
00:09:00,140 --> 00:09:03,200
|
|
863 |
And I can just have one case
|
|
864 |
|
|
865 |
189
|
|
866 |
00:09:03,200 --> 00:09:06,920
|
|
867 |
for nullable and one
|
|
868 |
case for CFUNS.
|
|
869 |
|
|
870 |
190
|
|
871 |
00:09:06,920 --> 00:09:09,800
|
|
872 |
So that would then subsume
|
|
873 |
the character ranges and
|
|
874 |
|
|
875 |
191
|
|
876 |
00:09:09,800 --> 00:09:13,385
|
|
877 |
the character and also
|
|
878 |
this ALL regular expression.
|
|
879 |
|
|
880 |
192
|
|
881 |
00:09:13,385 --> 00:09:15,380
|
|
882 |
Ok, this was not explicitly
|
|
883 |
included at the beginning,
|
|
884 |
|
|
885 |
193
|
|
886 |
00:09:15,380 --> 00:09:17,510
|
|
887 |
but that's what I can now define.
|
|
888 |
|
|
889 |
194
|
|
890 |
00:09:17,510 --> 00:09:21,740
|
|
891 |
And then I can define
|
|
892 |
this for characters,
|
|
893 |
|
|
894 |
195
|
|
895 |
00:09:21,740 --> 00:09:23,885
|
|
896 |
just as special cases
|
|
897 |
for these functions.
|
|
898 |
|
|
899 |
196
|
|
900 |
00:09:23,885 --> 00:09:25,610
|
|
901 |
And I told you already
|
|
902 |
what this function
|
|
903 |
|
|
904 |
197
|
|
905 |
00:09:25,610 --> 00:09:28,025
|
|
906 |
should look like in
|
|
907 |
these three cases.
|
|
908 |
|
|
909 |
198
|
|
910 |
00:09:28,025 --> 00:09:30,350
|
|
911 |
So I let ...
|
|
912 |
|
|
913 |
199
|
|
914 |
00:09:30,350 --> 00:09:33,515
|
|
915 |
you decide how you're
|
|
916 |
going to implement that.
|
|
917 |
|
|
918 |
200
|
|
919 |
00:09:33,515 --> 00:09:35,450
|
|
920 |
If you first define
|
|
921 |
|
|
922 |
201
|
|
923 |
00:09:35,450 --> 00:09:38,495
|
|
924 |
your implementation is
|
|
925 |
all these explicit cases.
|
|
926 |
|
|
927 |
202
|
|
928 |
00:09:38,495 --> 00:09:41,900
|
|
929 |
Because in these cases it's
|
|
930 |
actually more intuitive,
|
|
931 |
|
|
932 |
203
|
|
933 |
00:09:41,900 --> 00:09:45,980
|
|
934 |
what nullable and
|
|
935 |
derivative should be.
|
|
936 |
|
|
937 |
204
|
|
938 |
00:09:45,980 --> 00:09:48,035
|
|
939 |
And then in a second step,
|
|
940 |
|
|
941 |
205
|
|
942 |
00:09:48,035 --> 00:09:51,140
|
|
943 |
you implement these
|
|
944 |
more general cases.
|
|
945 |
|
|
946 |
206
|
|
947 |
00:09:51,140 --> 00:09:53,360
|
|
948 |
And just keep the original ones
|
|
949 |
|
|
950 |
207
|
|
951 |
00:09:53,360 --> 00:09:54,500
|
|
952 |
in your implementation if you
|
|
953 |
|
|
954 |
208
|
|
955 |
00:09:54,500 --> 00:09:57,710
|
|
956 |
want, or if you feel
|
|
957 |
adventurous enough,
|
|
958 |
|
|
959 |
209
|
|
960 |
00:09:57,710 --> 00:09:59,225
|
|
961 |
and want to be lazy,
|
|
962 |
|
|
963 |
210
|
|
964 |
00:09:59,225 --> 00:10:01,115
|
|
965 |
that you just implement that.
|
|
966 |
|
|
967 |
211
|
|
968 |
00:10:01,115 --> 00:10:02,660
|
|
969 |
And then you have already done
|
|
970 |
|
|
971 |
212
|
|
972 |
00:10:02,660 --> 00:10:06,530
|
|
973 |
at least two constructors
|
|
974 |
by just implementing one.
|
|
975 |
|
|
976 |
213
|
|
977 |
00:10:06,530 --> 00:10:08,915
|
|
978 |
But as said that
|
|
979 |
requires a bit skill
|
|
980 |
|
|
981 |
214
|
|
982 |
00:10:08,915 --> 00:10:11,450
|
|
983 |
of generalizing how
|
|
984 |
that should be.
|
|
985 |
|
|
986 |
215
|
|
987 |
00:10:11,450 --> 00:10:14,180
|
|
988 |
And the other questions
|
|
989 |
are just then
|
|
990 |
|
|
991 |
216
|
|
992 |
00:10:14,180 --> 00:10:16,745
|
|
993 |
trying out your
|
|
994 |
expression matcher on
|
|
995 |
|
|
996 |
217
|
|
997 |
00:10:16,745 --> 00:10:19,580
|
|
998 |
some examples, more
|
|
999 |
power examples,
|
|
1000 |
|
|
1001 |
218
|
|
1002 |
00:10:19,580 --> 00:10:22,400
|
|
1003 |
and they should be
|
|
1004 |
very easy to solve.
|
|
1005 |
|
|
1006 |
219
|
|
1007 |
00:10:22,400 --> 00:10:24,605
|
|
1008 |
So I hope it's not
|
|
1009 |
too much asked for
|
|
1010 |
|
|
1011 |
220
|
|
1012 |
00:10:24,605 --> 00:10:26,615
|
|
1013 |
and I hope you have fun.
|
|
1014 |
|
|
1015 |
221
|
|
1016 |
00:10:26,615 --> 00:10:29,675
|
|
1017 |
It is not too much ask
|
|
1018 |
because you can,
|
|
1019 |
|
|
1020 |
222
|
|
1021 |
00:10:29,675 --> 00:10:32,420
|
|
1022 |
I hope it's not too much
|
|
1023 |
because you can start from
|
|
1024 |
|
|
1025 |
223
|
|
1026 |
00:10:32,420 --> 00:10:35,615
|
|
1027 |
my definitions or
|
|
1028 |
my implementation.
|
|
1029 |
|
|
1030 |
224
|
|
1031 |
00:10:35,615 --> 00:10:39,425
|
|
1032 |
It's really essentially
|
|
1033 |
mostly is brain work,
|
|
1034 |
|
|
1035 |
225
|
|
1036 |
00:10:39,425 --> 00:10:42,170
|
|
1037 |
how these nullable and
|
|
1038 |
derivative should work.
|
|
1039 |
|
|
1040 |
226
|
|
1041 |
00:10:42,170 --> 00:10:44,510
|
|
1042 |
If you're in a
|
|
1043 |
language like Scala,
|
|
1044 |
|
|
1045 |
227
|
|
1046 |
00:10:44,510 --> 00:10:48,875
|
|
1047 |
the implementation should
|
|
1048 |
then be easy-peasy.
|
|
1049 |
|
|
1050 |
228
|
|
1051 |
00:10:48,875 --> 00:10:51,365
|
|
1052 |
If you are in a different language
|
|
1053 |
|
|
1054 |
229
|
|
1055 |
00:10:51,365 --> 00:10:52,610
|
|
1056 |
I assume you also
|
|
1057 |
|
|
1058 |
230
|
|
1059 |
00:10:52,610 --> 00:10:54,890
|
|
1060 |
dedicated to that
|
|
1061 |
language to start with,
|
|
1062 |
|
|
1063 |
231
|
|
1064 |
00:10:54,890 --> 00:10:58,475
|
|
1065 |
so it should be also very
|
|
1066 |
easy for you to translate
|
|
1067 |
|
|
1068 |
232
|
|
1069 |
00:10:58,475 --> 00:11:01,100
|
|
1070 |
my Scala code into whatever
|
|
1071 |
language you want to
|
|
1072 |
|
|
1073 |
233
|
|
1074 |
00:11:01,100 --> 00:11:04,280
|
|
1075 |
do, first and then
|
|
1076 |
start from there.
|
|
1077 |
|
|
1078 |
234
|
|
1079 |
00:11:04,280 --> 00:11:06,230
|
|
1080 |
If there are any questions,
|
|
1081 |
|
|
1082 |
235
|
|
1083 |
00:11:06,230 --> 00:11:08,360
|
|
1084 |
please ask me or the TAs.
|
|
1085 |
|
|
1086 |
236
|
|
1087 |
00:11:08,360 --> 00:11:10,040
|
|
1088 |
We are trying to be as helpful
|
|
1089 |
|
|
1090 |
237
|
|
1091 |
00:11:10,040 --> 00:11:12,900
|
|
1092 |
as possible with the coursework.
|
|
1093 |
|
|
1094 |
238
|
|
1095 |
00:11:13,240 --> 00:11:17,885
|
|
1096 |
Bear in mind, this is the
|
|
1097 |
first step in our compiler.
|
|
1098 |
|
|
1099 |
239
|
|
1100 |
00:11:17,885 --> 00:11:21,005
|
|
1101 |
The coursework 2 will then
|
|
1102 |
build on top of that.
|
|
1103 |
|
|
1104 |
240
|
|
1105 |
00:11:21,005 --> 00:11:25,770
|
|
1106 |
So it's better to get this
|
|
1107 |
correct first. Thanks.
|