|
1 1 |
|
2 00:00:06,410 --> 00:00:09,390 |
|
3 The final video for this week. |
|
4 |
|
5 2 |
|
6 00:00:09,390 --> 00:00:12,465 |
|
7 And let me say something |
|
8 about the coursework. |
|
9 |
|
10 3 |
|
11 00:00:12,465 --> 00:00:15,300 |
|
12 First. You can solve |
|
13 |
|
14 4 |
|
15 00:00:15,300 --> 00:00:17,745 |
|
16 the coursework in any programming |
|
17 language you're like, |
|
18 |
|
19 5 |
|
20 00:00:17,745 --> 00:00:22,440 |
|
21 I already said that. You |
|
22 have to submit a PDF file. |
|
23 |
|
24 6 |
|
25 00:00:22,440 --> 00:00:23,850 |
|
26 There will be some questions, |
|
27 |
|
28 7 |
|
29 00:00:23,850 --> 00:00:26,250 |
|
30 so you have to write down the |
|
31 solution to this questions. |
|
32 |
|
33 8 |
|
34 00:00:26,250 --> 00:00:30,315 |
|
35 Please use a PDF and you have |
|
36 to submit your source code. |
|
37 |
|
38 9 |
|
39 00:00:30,315 --> 00:00:35,580 |
|
40 And yes, if you do use a |
|
41 language which isn't Scala, |
|
42 |
|
43 10 |
|
44 00:00:35,580 --> 00:00:39,450 |
|
45 it might help to tell me |
|
46 how I can run your code. |
|
47 |
|
48 11 |
|
49 00:00:39,450 --> 00:00:42,550 |
|
50 If I can't run your code, |
|
51 I will ask you anyway. |
|
52 |
|
53 12 |
|
54 00:00:42,550 --> 00:00:50,044 |
|
55 Also, please submit both the |
|
56 PDF and the code in a file, |
|
57 |
|
58 13 |
|
59 00:00:50,044 --> 00:00:52,730 |
|
60 in a zip file, which generates |
|
61 |
|
62 14 |
|
63 00:00:52,730 --> 00:00:55,835 |
|
64 a subdirectory with your |
|
65 name and your family name. |
|
66 |
|
67 15 |
|
68 00:00:55,835 --> 00:00:58,970 |
|
69 That we'll just save a |
|
70 lot of time for me to try |
|
71 |
|
72 16 |
|
73 00:00:58,970 --> 00:01:02,030 |
|
74 to figure out who |
|
75 has submitted what. |
|
76 |
|
77 17 |
|
78 00:01:02,030 --> 00:01:04,445 |
|
79 Please do that. |
|
80 |
|
81 18 |
|
82 00:01:04,445 --> 00:01:07,789 |
|
83 So what is the task |
|
84 into coursework? |
|
85 |
|
86 19 |
|
87 00:01:07,789 --> 00:01:09,885 |
|
88 I essentially showed you how |
|
89 |
|
90 20 |
|
91 00:01:09,885 --> 00:01:11,995 |
|
92 the Brzozowski matcher |
|
93 |
|
94 21 |
|
95 00:01:11,995 --> 00:01:14,645 |
|
96 works for the basic |
|
97 regular expressions. |
|
98 |
|
99 22 |
|
100 00:01:14,645 --> 00:01:16,295 |
|
101 I also showed you actually how it |
|
102 |
|
103 23 |
|
104 00:01:16,295 --> 00:01:18,110 |
|
105 works for the n-times. |
|
106 |
|
107 24 |
|
108 00:01:18,110 --> 00:01:20,300 |
|
109 And the task in coursework |
|
110 |
|
111 25 |
|
112 00:01:20,300 --> 00:01:22,970 |
|
113 is that you extend the |
|
114 Brzozowski matcher to |
|
115 |
|
116 26 |
|
117 00:01:22,970 --> 00:01:25,820 |
|
118 the other regular expressions |
|
119 |
|
120 27 |
|
121 00:01:25,820 --> 00:01:27,800 |
|
122 from the extended |
|
123 regular expressions. |
|
124 |
|
125 28 |
|
126 00:01:27,800 --> 00:01:30,245 |
|
127 So you're supposed |
|
128 to extended with |
|
129 |
|
130 29 |
|
131 00:01:30,245 --> 00:01:34,490 |
|
132 r-plus, optional r, for n-times. |
|
133 |
|
134 30 |
|
135 00:01:34,490 --> 00:01:38,420 |
|
136 You've already seen that. |
|
137 For 0 or more times, |
|
138 |
|
139 31 |
|
140 00:01:38,420 --> 00:01:41,120 |
|
141 but not more than m |
|
142 times regular expression. |
|
143 |
|
144 32 |
|
145 00:01:41,120 --> 00:01:45,890 |
|
146 For at least n-times regular |
|
147 expression and for between |
|
148 |
|
149 33 |
|
150 00:01:45,890 --> 00:01:47,480 |
|
151 n times and m times |
|
152 |
|
153 34 |
|
154 00:01:47,480 --> 00:01:50,810 |
|
155 regular expression and also this |
|
156 NOT regular expression. |
|
157 |
|
158 35 |
|
159 00:01:50,810 --> 00:01:52,790 |
|
160 So your task is to |
|
161 |
|
162 36 |
|
163 00:01:52,790 --> 00:01:55,310 |
|
164 essentially find the definition |
|
165 |
|
166 37 |
|
167 00:01:55,310 --> 00:01:57,155 |
|
168 for nullable in these cases. |
|
169 |
|
170 38 |
|
171 00:01:57,155 --> 00:02:00,215 |
|
172 Find a definition for derivative, |
|
173 |
|
174 39 |
|
175 00:02:00,215 --> 00:02:02,480 |
|
176 implement them, |
|
177 write them down in |
|
178 |
|
179 40 |
|
180 00:02:02,480 --> 00:02:06,065 |
|
181 a PDF and then do some |
|
182 experiments with them. |
|
183 |
|
184 41 |
|
185 00:02:06,065 --> 00:02:08,875 |
|
186 Well, how can you do that? |
|
187 |
|
188 42 |
|
189 00:02:08,875 --> 00:02:10,370 |
|
190 Well I've given you |
|
191 |
|
192 43 |
|
193 00:02:10,370 --> 00:02:13,565 |
|
194 the meaning of these additional |
|
195 regular expressions. |
|
196 |
|
197 44 |
|
198 00:02:13,565 --> 00:02:16,730 |
|
199 So here's, for example, the |
|
200 meaning of this r-plus. |
|
201 |
|
202 45 |
|
203 00:02:16,730 --> 00:02:18,290 |
|
204 So that would be, I have |
|
205 |
|
206 46 |
|
207 00:02:18,290 --> 00:02:22,115 |
|
208 at least one copy up to infinity. |
|
209 |
|
210 47 |
|
211 00:02:22,115 --> 00:02:25,070 |
|
212 And the optional-r would be it's |
|
213 |
|
214 48 |
|
215 00:02:25,070 --> 00:02:28,370 |
|
216 the language of r plus |
|
217 the empty string. |
|
218 |
|
219 49 |
|
220 00:02:28,370 --> 00:02:30,440 |
|
221 If I have it exactly n times, |
|
222 |
|
223 50 |
|
224 00:02:30,440 --> 00:02:31,985 |
|
225 then it would be |
|
226 just the language |
|
227 |
|
228 51 |
|
229 00:02:31,985 --> 00:02:34,010 |
|
230 of r exactly n-times. |
|
231 |
|
232 52 |
|
233 00:02:34,010 --> 00:02:38,105 |
|
234 And here you have union |
|
235 from 0 to m and so on. |
|
236 |
|
237 53 |
|
238 00:02:38,105 --> 00:02:42,560 |
|
239 This might be a slightly |
|
240 interesting regular expression. |
|
241 |
|
242 54 |
|
243 00:02:42,560 --> 00:02:46,580 |
|
244 So that's supposed to be the |
|
245 character set that is very |
|
246 |
|
247 55 |
|
248 00:02:46,580 --> 00:02:48,410 |
|
249 similar to character ranges like |
|
250 |
|
251 56 |
|
252 00:02:48,410 --> 00:02:51,005 |
|
253 in the existing regular |
|
254 expression matchers. |
|
255 |
|
256 57 |
|
257 00:02:51,005 --> 00:02:52,820 |
|
258 So this just says |
|
259 |
|
260 58 |
|
261 00:02:52,820 --> 00:02:55,565 |
|
262 ...this regular |
|
263 expression can match |
|
264 |
|
265 59 |
|
266 00:02:55,565 --> 00:03:00,860 |
|
267 either the character c1 or |
|
268 the character c2 up to cn. |
|
269 |
|
270 60 |
|
271 00:03:00,860 --> 00:03:03,620 |
|
272 Why do I include |
|
273 that regular expression? |
|
274 |
|
275 61 |
|
276 00:03:03,620 --> 00:03:09,050 |
|
277 I could also just say |
|
278 c1 plus c2 plus c3... |
|
279 |
|
280 62 |
|
281 00:03:09,050 --> 00:03:12,889 |
|
282 a big alternative. |
|
283 Well that's possible. |
|
284 |
|
285 63 |
|
286 00:03:12,889 --> 00:03:16,595 |
|
287 But remember the Achilles' |
|
288 heel of this algorithm is, |
|
289 |
|
290 64 |
|
291 00:03:16,595 --> 00:03:18,425 |
|
292 if the regular expression is large, |
|
293 |
|
294 65 |
|
295 00:03:18,425 --> 00:03:20,825 |
|
296 then the computation |
|
297 take always a long time. |
|
298 |
|
299 66 |
|
300 00:03:20,825 --> 00:03:23,840 |
|
301 So I'm trying to compress |
|
302 that as much as I can. |
|
303 |
|
304 67 |
|
305 00:03:23,840 --> 00:03:25,370 |
|
306 So instead of giving a big |
|
307 |
|
308 68 |
|
309 00:03:25,370 --> 00:03:29,134 |
|
310 alternative, I am trying to |
|
311 give one regular expression, |
|
312 |
|
313 69 |
|
314 00:03:29,134 --> 00:03:31,340 |
|
315 which can not just match |
|
316 a single character, |
|
317 |
|
318 70 |
|
319 00:03:31,340 --> 00:03:34,230 |
|
320 but a set of characters. |
|
321 |
|
322 71 |
|
323 00:03:34,630 --> 00:03:36,980 |
|
324 How can you be sure that |
|
325 |
|
326 72 |
|
327 00:03:36,980 --> 00:03:39,215 |
|
328 definition you come |
|
329 up with are correct? |
|
330 |
|
331 73 |
|
332 00:03:39,215 --> 00:03:41,450 |
|
333 Well, I showed you which are |
|
334 |
|
335 74 |
|
336 00:03:41,450 --> 00:03:45,170 |
|
337 the properties these two |
|
338 functions need to satisfy. |
|
339 |
|
340 75 |
|
341 00:03:45,170 --> 00:03:47,060 |
|
342 I've given you here what |
|
343 |
|
344 76 |
|
345 00:03:47,060 --> 00:03:49,325 |
|
346 the meaning of these |
|
347 expressions are. |
|
348 |
|
349 77 |
|
350 00:03:49,325 --> 00:03:52,700 |
|
351 So you will always know what's |
|
352 on the right-hand side. |
|
353 |
|
354 78 |
|
355 00:03:52,700 --> 00:03:54,515 |
|
356 This is completely determined. |
|
357 |
|
358 79 |
|
359 00:03:54,515 --> 00:03:56,720 |
|
360 You essentially |
|
361 have to fill something |
|
362 |
|
363 80 |
|
364 00:03:56,720 --> 00:03:58,910 |
|
365 equivalent on the left-hand side. |
|
366 |
|
367 81 |
|
368 00:03:58,910 --> 00:04:02,105 |
|
369 That's the main task |
|
370 of the coursework. |
|
371 |
|
372 82 |
|
373 00:04:02,105 --> 00:04:08,090 |
|
374 And you can start from the |
|
375 files I provided on KEATS. |
|
376 |
|
377 83 |
|
378 00:04:08,090 --> 00:04:12,125 |
|
379 So you would just have to |
|
380 add these additional cases. |
|
381 |
|
382 84 |
|
383 00:04:12,125 --> 00:04:15,020 |
|
384 When you add these |
|
385 additional cases |
|
386 |
|
387 85 |
|
388 00:04:15,020 --> 00:04:17,330 |
|
389 and you're using the Scala language, |
|
390 |
|
391 86 |
|
392 00:04:17,330 --> 00:04:18,980 |
|
393 you can do me a favour. |
|
394 |
|
395 87 |
|
396 00:04:18,980 --> 00:04:22,550 |
|
397 You can call these |
|
398 constructors for |
|
399 |
|
400 88 |
|
401 00:04:22,550 --> 00:04:25,400 |
|
402 these different regular |
|
403 expressions as RANGE, |
|
404 |
|
405 89 |
|
406 00:04:25,400 --> 00:04:28,985 |
|
407 PLUS, OPTIONAL and NTIMES, |
|
408 UPTO, FROM and BETWEEN. |
|
409 |
|
410 90 |
|
411 00:04:28,985 --> 00:04:31,370 |
|
412 Remember I have this |
|
413 convention to always |
|
414 |
|
415 91 |
|
416 00:04:31,370 --> 00:04:34,025 |
|
417 use capital letters |
|
418 for regular expressions. |
|
419 |
|
420 92 |
|
421 00:04:34,025 --> 00:04:36,680 |
|
422 It would be nice if you could use |
|
423 |
|
424 93 |
|
425 00:04:36,680 --> 00:04:39,260 |
|
426 these names because |
|
427 then it will be |
|
428 |
|
429 94 |
|
430 00:04:39,260 --> 00:04:42,335 |
|
431 very easy for me |
|
432 to test your code. |
|
433 |
|
434 95 |
|
435 00:04:42,335 --> 00:04:44,690 |
|
436 If you're using a different |
|
437 programming language |
|
438 |
|
439 96 |
|
440 00:04:44,690 --> 00:04:46,370 |
|
441 like let's say Rust, |
|
442 |
|
443 97 |
|
444 00:04:46,370 --> 00:04:48,860 |
|
445 I expect some people will use that, where I |
|
446 |
|
447 98 |
|
448 00:04:48,860 --> 00:04:51,380 |
|
449 have no idea what are the |
|
450 conventions in your language, |
|
451 |
|
452 99 |
|
453 00:04:51,380 --> 00:04:53,420 |
|
454 how you have to call |
|
455 these constructors, |
|
456 |
|
457 100 |
|
458 00:04:53,420 --> 00:04:56,420 |
|
459 we will see whatever you |
|
460 submit. I'll have a look. |
|
461 |
|
462 101 |
|
463 00:04:56,420 --> 00:04:59,360 |
|
464 There's one more |
|
465 constraint I have to |
|
466 |
|
467 102 |
|
468 00:04:59,360 --> 00:05:02,194 |
|
469 impose to make this |
|
470 coursework interesting. |
|
471 |
|
472 103 |
|
473 00:05:02,194 --> 00:05:04,730 |
|
474 I do not want you |
|
475 that you just take |
|
476 |
|
477 104 |
|
478 00:05:04,730 --> 00:05:07,370 |
|
479 these extended regular |
|
480 expressions and that you |
|
481 |
|
482 105 |
|
483 00:05:07,370 --> 00:05:10,550 |
|
484 expand them using |
|
485 their definition. |
|
486 |
|
487 106 |
|
488 00:05:10,550 --> 00:05:12,320 |
|
489 Because that would make the regular |
|
490 |
|
491 107 |
|
492 00:05:12,320 --> 00:05:13,955 |
|
493 expression matcher very slow. |
|
494 |
|
495 108 |
|
496 00:05:13,955 --> 00:05:16,160 |
|
497 So for example, if |
|
498 you want to define |
|
499 |
|
500 109 |
|
501 00:05:16,160 --> 00:05:18,785 |
|
502 what's the derivative for r-plus, |
|
503 |
|
504 110 |
|
505 00:05:18,785 --> 00:05:21,560 |
|
506 then what does not |
|
507 count as a solution: |
|
508 |
|
509 111 |
|
510 00:05:21,560 --> 00:05:24,770 |
|
511 if you just expand that |
|
512 to the definition that r |
|
513 |
|
514 112 |
|
515 00:05:24,770 --> 00:05:28,935 |
|
516 plus is equivalent to |
|
517 r followed by r-star. |
|
518 |
|
519 113 |
|
520 00:05:28,935 --> 00:05:32,845 |
|
521 So in code, what I |
|
522 would like to not see, |
|
523 |
|
524 114 |
|
525 00:05:32,845 --> 00:05:35,680 |
|
526 I would not give any |
|
527 marks for that is, |
|
528 |
|
529 115 |
|
530 00:05:35,680 --> 00:05:37,780 |
|
531 if you add the plus as follows, |
|
532 |
|
533 116 |
|
534 00:05:37,780 --> 00:05:39,910 |
|
535 so that is still perfectly fine. |
|
536 |
|
537 117 |
|
538 00:05:39,910 --> 00:05:42,160 |
|
539 There's nothing you |
|
540 can do differently. |
|
541 |
|
542 118 |
|
543 00:05:42,160 --> 00:05:44,065 |
|
544 That would be the constructor. |
|
545 |
|
546 119 |
|
547 00:05:44,065 --> 00:05:46,480 |
|
548 But when you define nullable, |
|
549 |
|
550 120 |
|
551 00:05:46,480 --> 00:05:49,180 |
|
552 I do not want to see |
|
553 that you defined |
|
554 |
|
555 121 |
|
556 00:05:49,180 --> 00:05:51,790 |
|
557 this plus r as nullable |
|
558 |
|
559 122 |
|
560 00:05:51,790 --> 00:05:54,385 |
|
561 of the sequence of r and star-r, |
|
562 |
|
563 123 |
|
564 00:05:54,385 --> 00:05:58,075 |
|
565 just unfolding |
|
566 the definition. |
|
567 |
|
568 124 |
|
569 00:05:58,075 --> 00:06:00,415 |
|
570 As you can see, when you come |
|
571 |
|
572 125 |
|
573 00:06:00,415 --> 00:06:02,815 |
|
574 up with a much better |
|
575 solution for that. |
|
576 |
|
577 126 |
|
578 00:06:02,815 --> 00:06:05,110 |
|
579 This is a very inefficient way |
|
580 |
|
581 127 |
|
582 00:06:05,110 --> 00:06:07,090 |
|
583 for how to calculate nullable. |
|
584 |
|
585 128 |
|
586 00:06:07,090 --> 00:06:10,825 |
|
587 And so the same for derivative |
|
588 would not be allowed. |
|
589 |
|
590 129 |
|
591 00:06:10,825 --> 00:06:13,895 |
|
592 If you, I have the plus r here, |
|
593 |
|
594 130 |
|
595 00:06:13,895 --> 00:06:16,685 |
|
596 you can't just unfold |
|
597 the definition, |
|
598 |
|
599 131 |
|
600 00:06:16,685 --> 00:06:19,790 |
|
601 with r-plus |
|
602 being defined as r |
|
603 |
|
604 132 |
|
605 00:06:19,790 --> 00:06:21,350 |
|
606 followed by r star and |
|
607 |
|
608 133 |
|
609 00:06:21,350 --> 00:06:23,375 |
|
610 then just build the |
|
611 derivative of that. |
|
612 |
|
613 134 |
|
614 00:06:23,375 --> 00:06:25,280 |
|
615 You have to find something more |
|
616 |
|
617 135 |
|
618 00:06:25,280 --> 00:06:28,730 |
|
619 primitive that involves |
|
620 only the derivative of r, |
|
621 |
|
622 136 |
|
623 00:06:28,730 --> 00:06:31,790 |
|
624 essentially, nothing |
|
625 more involved. The same |
|
626 |
|
627 137 |
|
628 00:06:31,790 --> 00:06:33,815 |
|
629 if you have n-times, for example, |
|
630 |
|
631 138 |
|
632 00:06:33,815 --> 00:06:39,935 |
|
633 you can't just unfold this |
|
634 n-times in this sequence |
|
635 |
|
636 139 |
|
637 00:06:39,935 --> 00:06:43,310 |
|
638 of .... n-copies |
|
639 |
|
640 140 |
|
641 00:06:43,310 --> 00:06:47,285 |
|
642 of this r. You have to get |
|
643 something more primitive. |
|
644 |
|
645 141 |
|
646 00:06:47,285 --> 00:06:49,760 |
|
647 Otherwise, as you've |
|
648 seen earlier, |
|
649 |
|
650 142 |
|
651 00:06:49,760 --> 00:06:53,135 |
|
652 your regular expression matcher |
|
653 would be totally slow. |
|
654 |
|
655 143 |
|
656 00:06:53,135 --> 00:06:55,475 |
|
657 When you submit your solution, |
|
658 |
|
659 144 |
|
660 00:06:55,475 --> 00:06:58,445 |
|
661 please submit this |
|
662 solution in the PDF. |
|
663 |
|
664 145 |
|
665 00:06:58,445 --> 00:07:01,580 |
|
666 So give the cases of your definition |
|
667 |
|
668 146 |
|
669 00:07:01,580 --> 00:07:06,004 |
|
670 in a form like that for |
|
671 nullable and derivative. |
|
672 |
|
673 147 |
|
674 00:07:06,004 --> 00:07:08,405 |
|
675 And also implement it in code. |
|
676 |
|
677 148 |
|
678 00:07:08,405 --> 00:07:10,910 |
|
679 That is just helping me to |
|
680 |
|
681 149 |
|
682 00:07:10,910 --> 00:07:13,400 |
|
683 find out whether you have |
|
684 the correct solution or not. |
|
685 |
|
686 150 |
|
687 00:07:13,400 --> 00:07:15,710 |
|
688 So you needed a kind of |
|
689 mathematical notation of |
|
690 |
|
691 151 |
|
692 00:07:15,710 --> 00:07:18,695 |
|
693 your solution, and |
|
694 an actual code. |
|
695 |
|
696 152 |
|
697 00:07:18,695 --> 00:07:22,414 |
|
698 And then once you |
|
699 have your definition, |
|
700 |
|
701 153 |
|
702 00:07:22,414 --> 00:07:25,699 |
|
703 also make sure you try |
|
704 it out on some examples. |
|
705 |
|
706 154 |
|
707 00:07:25,699 --> 00:07:28,970 |
|
708 These will be the examples |
|
709 I will definitely try out, |
|
710 |
|
711 155 |
|
712 00:07:28,970 --> 00:07:30,560 |
|
713 but probably also more |
|
714 |
|
715 156 |
|
716 00:07:30,560 --> 00:07:33,215 |
|
717 depending on what |
|
718 definitions you give me. |
|
719 |
|
720 157 |
|
721 00:07:33,215 --> 00:07:38,300 |
|
722 There's one more question I |
|
723 ask you to do. You've seen |
|
724 |
|
725 158 |
|
726 00:07:38,300 --> 00:07:40,130 |
|
727 there's some regular |
|
728 expressions that |
|
729 |
|
730 159 |
|
731 00:07:40,130 --> 00:07:42,215 |
|
732 are involved for characters, |
|
733 |
|
734 160 |
|
735 00:07:42,215 --> 00:07:46,925 |
|
736 for character ranges or |
|
737 character sets essentially. |
|
738 |
|
739 161 |
|
740 00:07:46,925 --> 00:07:48,665 |
|
741 And sometimes I also want to have |
|
742 |
|
743 162 |
|
744 00:07:48,665 --> 00:07:51,665 |
|
745 just a reg expression which |
|
746 can match any character!! |
|
747 |
|
748 163 |
|
749 00:07:51,665 --> 00:07:56,195 |
|
750 And I could have for all of |
|
751 them separate definitions. |
|
752 |
|
753 164 |
|
754 00:07:56,195 --> 00:07:57,710 |
|
755 And that would mean I also need |
|
756 |
|
757 165 |
|
758 00:07:57,710 --> 00:07:59,645 |
|
759 separate definitions for nullable, |
|
760 |
|
761 166 |
|
762 00:07:59,645 --> 00:08:02,405 |
|
763 separate definitions |
|
764 for derivative. |
|
765 |
|
766 167 |
|
767 00:08:02,405 --> 00:08:05,825 |
|
768 But actually they can be |
|
769 generalized to just one constructor. |
|
770 |
|
771 168 |
|
772 00:08:05,825 --> 00:08:08,210 |
|
773 And that is if we introduce |
|
774 |
|
775 169 |
|
776 00:08:08,210 --> 00:08:11,630 |
|
777 a constructor for regular expressions, |
|
778 |
|
779 170 |
|
780 00:08:11,630 --> 00:08:13,760 |
|
781 which not just takes |
|
782 a single character |
|
783 |
|
784 171 |
|
785 00:08:13,760 --> 00:08:15,469 |
|
786 or set of characters. |
|
787 |
|
788 172 |
|
789 00:08:15,469 --> 00:08:18,245 |
|
790 But, which I call here CFUN. |
|
791 |
|
792 173 |
|
793 00:08:18,245 --> 00:08:23,165 |
|
794 And it takes a function from |
|
795 characters to booleans. |
|
796 |
|
797 174 |
|
798 00:08:23,165 --> 00:08:24,470 |
|
799 So if I want to match |
|
800 |
|
801 175 |
|
802 00:08:24,470 --> 00:08:26,900 |
|
803 just a single character |
|
804 then this function f |
|
805 |
|
806 176 |
|
807 00:08:26,900 --> 00:08:29,060 |
|
808 would only say true |
|
809 |
|
810 177 |
|
811 00:08:29,060 --> 00:08:32,225 |
|
812 if it gets as argument |
|
813 the single character. |
|
814 |
|
815 178 |
|
816 00:08:32,225 --> 00:08:34,850 |
|
817 If I have a character set, |
|
818 |
|
819 179 |
|
820 00:08:34,850 --> 00:08:36,515 |
|
821 then I would say, well, |
|
822 |
|
823 180 |
|
824 00:08:36,515 --> 00:08:38,300 |
|
825 I need a function |
|
826 which says true |
|
827 |
|
828 181 |
|
829 00:08:38,300 --> 00:08:40,970 |
|
830 for all the characters |
|
831 in this set. |
|
832 |
|
833 182 |
|
834 00:08:40,970 --> 00:08:43,460 |
|
835 And otherwise it's false. |
|
836 |
|
837 183 |
|
838 00:08:43,460 --> 00:08:46,205 |
|
839 And if I want to |
|
840 match any character!!, |
|
841 |
|
842 184 |
|
843 00:08:46,205 --> 00:08:48,500 |
|
844 then they should get a function |
|
845 |
|
846 185 |
|
847 00:08:48,500 --> 00:08:53,450 |
|
848 which says true for |
|
849 all characters. |
|
850 |
|
851 186 |
|
852 00:08:53,450 --> 00:08:56,630 |
|
853 Okay? If I have |
|
854 such a constructor |
|
855 |
|
856 187 |
|
857 00:08:56,630 --> 00:09:00,140 |
|
858 that actually I can save |
|
859 myself a bit of work. |
|
860 |
|
861 188 |
|
862 00:09:00,140 --> 00:09:03,200 |
|
863 And I can just have one case |
|
864 |
|
865 189 |
|
866 00:09:03,200 --> 00:09:06,920 |
|
867 for nullable and one |
|
868 case for CFUNS. |
|
869 |
|
870 190 |
|
871 00:09:06,920 --> 00:09:09,800 |
|
872 So that would then subsume |
|
873 the character ranges and |
|
874 |
|
875 191 |
|
876 00:09:09,800 --> 00:09:13,385 |
|
877 the character and also |
|
878 this ALL regular expression. |
|
879 |
|
880 192 |
|
881 00:09:13,385 --> 00:09:15,380 |
|
882 Ok, this was not explicitly |
|
883 included at the beginning, |
|
884 |
|
885 193 |
|
886 00:09:15,380 --> 00:09:17,510 |
|
887 but that's what I can now define. |
|
888 |
|
889 194 |
|
890 00:09:17,510 --> 00:09:21,740 |
|
891 And then I can define |
|
892 this for characters, |
|
893 |
|
894 195 |
|
895 00:09:21,740 --> 00:09:23,885 |
|
896 just as special cases |
|
897 for these functions. |
|
898 |
|
899 196 |
|
900 00:09:23,885 --> 00:09:25,610 |
|
901 And I told you already |
|
902 what this function |
|
903 |
|
904 197 |
|
905 00:09:25,610 --> 00:09:28,025 |
|
906 should look like in |
|
907 these three cases. |
|
908 |
|
909 198 |
|
910 00:09:28,025 --> 00:09:30,350 |
|
911 So I let ... |
|
912 |
|
913 199 |
|
914 00:09:30,350 --> 00:09:33,515 |
|
915 you decide how you're |
|
916 going to implement that. |
|
917 |
|
918 200 |
|
919 00:09:33,515 --> 00:09:35,450 |
|
920 If you first define |
|
921 |
|
922 201 |
|
923 00:09:35,450 --> 00:09:38,495 |
|
924 your implementation is |
|
925 all these explicit cases. |
|
926 |
|
927 202 |
|
928 00:09:38,495 --> 00:09:41,900 |
|
929 Because in these cases it's |
|
930 actually more intuitive, |
|
931 |
|
932 203 |
|
933 00:09:41,900 --> 00:09:45,980 |
|
934 what nullable and |
|
935 derivative should be. |
|
936 |
|
937 204 |
|
938 00:09:45,980 --> 00:09:48,035 |
|
939 And then in a second step, |
|
940 |
|
941 205 |
|
942 00:09:48,035 --> 00:09:51,140 |
|
943 you implement these |
|
944 more general cases. |
|
945 |
|
946 206 |
|
947 00:09:51,140 --> 00:09:53,360 |
|
948 And just keep the original ones |
|
949 |
|
950 207 |
|
951 00:09:53,360 --> 00:09:54,500 |
|
952 in your implementation if you |
|
953 |
|
954 208 |
|
955 00:09:54,500 --> 00:09:57,710 |
|
956 want, or if you feel |
|
957 adventurous enough, |
|
958 |
|
959 209 |
|
960 00:09:57,710 --> 00:09:59,225 |
|
961 and want to be lazy, |
|
962 |
|
963 210 |
|
964 00:09:59,225 --> 00:10:01,115 |
|
965 that you just implement that. |
|
966 |
|
967 211 |
|
968 00:10:01,115 --> 00:10:02,660 |
|
969 And then you have already done |
|
970 |
|
971 212 |
|
972 00:10:02,660 --> 00:10:06,530 |
|
973 at least two constructors |
|
974 by just implementing one. |
|
975 |
|
976 213 |
|
977 00:10:06,530 --> 00:10:08,915 |
|
978 But as said that |
|
979 requires a bit skill |
|
980 |
|
981 214 |
|
982 00:10:08,915 --> 00:10:11,450 |
|
983 of generalizing how |
|
984 that should be. |
|
985 |
|
986 215 |
|
987 00:10:11,450 --> 00:10:14,180 |
|
988 And the other questions |
|
989 are just then |
|
990 |
|
991 216 |
|
992 00:10:14,180 --> 00:10:16,745 |
|
993 trying out your |
|
994 expression matcher on |
|
995 |
|
996 217 |
|
997 00:10:16,745 --> 00:10:19,580 |
|
998 some examples, more |
|
999 power examples, |
|
1000 |
|
1001 218 |
|
1002 00:10:19,580 --> 00:10:22,400 |
|
1003 and they should be |
|
1004 very easy to solve. |
|
1005 |
|
1006 219 |
|
1007 00:10:22,400 --> 00:10:24,605 |
|
1008 So I hope it's not |
|
1009 too much asked for |
|
1010 |
|
1011 220 |
|
1012 00:10:24,605 --> 00:10:26,615 |
|
1013 and I hope you have fun. |
|
1014 |
|
1015 221 |
|
1016 00:10:26,615 --> 00:10:29,675 |
|
1017 It is not too much ask |
|
1018 because you can, |
|
1019 |
|
1020 222 |
|
1021 00:10:29,675 --> 00:10:32,420 |
|
1022 I hope it's not too much |
|
1023 because you can start from |
|
1024 |
|
1025 223 |
|
1026 00:10:32,420 --> 00:10:35,615 |
|
1027 my definitions or |
|
1028 my implementation. |
|
1029 |
|
1030 224 |
|
1031 00:10:35,615 --> 00:10:39,425 |
|
1032 It's really essentially |
|
1033 mostly is brain work, |
|
1034 |
|
1035 225 |
|
1036 00:10:39,425 --> 00:10:42,170 |
|
1037 how these nullable and |
|
1038 derivative should work. |
|
1039 |
|
1040 226 |
|
1041 00:10:42,170 --> 00:10:44,510 |
|
1042 If you're in a |
|
1043 language like Scala, |
|
1044 |
|
1045 227 |
|
1046 00:10:44,510 --> 00:10:48,875 |
|
1047 the implementation should |
|
1048 then be easy-peasy. |
|
1049 |
|
1050 228 |
|
1051 00:10:48,875 --> 00:10:51,365 |
|
1052 If you are in a different language |
|
1053 |
|
1054 229 |
|
1055 00:10:51,365 --> 00:10:52,610 |
|
1056 I assume you also |
|
1057 |
|
1058 230 |
|
1059 00:10:52,610 --> 00:10:54,890 |
|
1060 dedicated to that |
|
1061 language to start with, |
|
1062 |
|
1063 231 |
|
1064 00:10:54,890 --> 00:10:58,475 |
|
1065 so it should be also very |
|
1066 easy for you to translate |
|
1067 |
|
1068 232 |
|
1069 00:10:58,475 --> 00:11:01,100 |
|
1070 my Scala code into whatever |
|
1071 language you want to |
|
1072 |
|
1073 233 |
|
1074 00:11:01,100 --> 00:11:04,280 |
|
1075 do, first and then |
|
1076 start from there. |
|
1077 |
|
1078 234 |
|
1079 00:11:04,280 --> 00:11:06,230 |
|
1080 If there are any questions, |
|
1081 |
|
1082 235 |
|
1083 00:11:06,230 --> 00:11:08,360 |
|
1084 please ask me or the TAs. |
|
1085 |
|
1086 236 |
|
1087 00:11:08,360 --> 00:11:10,040 |
|
1088 We are trying to be as helpful |
|
1089 |
|
1090 237 |
|
1091 00:11:10,040 --> 00:11:12,900 |
|
1092 as possible with the coursework. |
|
1093 |
|
1094 238 |
|
1095 00:11:13,240 --> 00:11:17,885 |
|
1096 Bear in mind, this is the |
|
1097 first step in our compiler. |
|
1098 |
|
1099 239 |
|
1100 00:11:17,885 --> 00:11:21,005 |
|
1101 The coursework 2 will then |
|
1102 build on top of that. |
|
1103 |
|
1104 240 |
|
1105 00:11:21,005 --> 00:11:25,770 |
|
1106 So it's better to get this |
|
1107 correct first. Thanks. |